Technische Universität Braunschweig
  • Studium & Lehre
    • Vor dem Studium
      • Informationen für Studieninteressierte
      • Studiengänge von A-Z
      • Bewerbung
      • Fit4TU - Self-Assessment
      • Beratungsangebote für Studieninteressierte
      • Warum Braunschweig?
    • Im Studium
      • Erstsemester-Hub
      • Semestertermine
      • Lehrveranstaltungen
      • Studien-ABC
      • Studienorganisation
      • Beratungsnavi
      • Zusatzqualifikationen
      • Finanzierung und Kosten
      • Besondere Studienbedingungen
      • Gesundheit & Wohlbefinden
      • Campusleben
    • Nach dem Studium
      • Exmatrikulation und Vorlegalisation
      • Nach dem Abschluss
      • Alumni*ae
    • Strategien und Qualitätsmanagement
      • Strategiepapiere für Studium und Lehre
      • Studienqualitätsmittel
      • Studiengangsentwicklung
      • Qualitätsmanagement
      • Systemakkreditierung
      • Rechtliche Grundlagen
      • TU Lehrpreis
    • Für Lehrende
      • Informationen für Lehrende
      • Konzepte
      • Lernmanagementsystem Stud.IP
    • Kontakt
      • Studienservice-Center
      • Sag's uns - in Studium und Lehre
      • Zentrale Studienberatung
      • Immatrikulationsamt
      • Abteilung 16 - Studium und Lehre
      • Career Service
      • Projekthaus
  • Forschung
    • Forschungsprofil
      • Forschungsschwerpunkte
      • Exzellenzcluster der TU Braunschweig
      • Forschungsprojekte
      • Forschungszentren
      • Forschungsprofile der Professuren
    • Frühe Karrierephase
      • Förderung in den frühen Phasen der wissenschaftlichen Karriere
      • Promotion
      • Postdocs
      • Nachwuchsgruppenleitung
      • Junior Professur und Tenure-Track
      • Habilitation
      • Service-Angebote für Wissenschaftler*innen
    • Forschungsdaten & Transparenz
      • Transparenz in der Forschung
      • Forschungsdaten
      • Open Access Strategie
      • Digitale Forschungsanzeige
    • Forschungsförderung
      • Netzwerk Forschungsförderung
      • Datenbanken und Stiftungen
    • Kontakt
      • Forschungsservice
      • Graduiertenakademie
  • International
    • Internationale Studierende
      • Warum Braunschweig?
      • Studium mit Abschluss
      • Austauschstudium
      • TU Braunschweig Summer School
      • Geflüchtete
      • International Student Support
    • Wege ins Ausland
      • Studium im Ausland
      • Praktikum im Ausland
      • Lehren und Forschen im Ausland
      • Arbeiten im Ausland
    • Internationale Forschende
      • Welcome Support
      • Promotionsstudium
      • Service für gastgebende Einrichtungen
    • Sprachen und interkulturelle Kompetenzvermittlung
      • Deutsch lernen
      • Fremdsprachen lernen
      • Interkulturelle Kompetenzvermittlung
    • Internationales Profil
      • Internationalisierung
      • Internationale Kooperationen
      • Strategische Partnerschaften
      • Internationale Netzwerke
    • International House
      • Wir über uns
      • Kontakt & Sprechstunden
      • Aktuelles und Termine
      • International Days
      • 5. Studentische Konferenz: Internationalisierung der Hochschulen
      • Newsletter, Podcast & Videos
      • Stellenausschreibungen
  • Die TU Braunschweig
    • Unser Profil
      • Ziele & Werte
      • Ordnungen und Leitlinien
      • Allianzen & Partner
      • Hochschulentwicklung 2030
      • Stiftungsuniversität
      • Internationale Strategie
      • Fakten & Zahlen
      • Unsere Geschichte
    • Karriere
      • Arbeiten an der TU
      • Stellenmarkt
      • Berufsausbildung an der TU
    • Wirtschaft & Unternehmen
      • Unternehmensgründung
      • Freunde & Förderer
    • Öffentlichkeit
      • Veranstaltungskalender
      • Check-in für Schüler*innen
      • Hochschulinformationstag (HIT)
      • Kinder-Uni
      • Das Studierendenhaus
      • Gasthörer*innen & Senior*innenstudium
      • Nutzung der Universitätsbibliothek
    • Presse & Kommunikation
      • Stabsstelle Presse und Kommunikation
      • Medienservice
      • Ansprechpartner*innen
      • Tipps für Wissenschaftler*innen
      • Themen und Stories
    • Kontakt
      • Allgemeiner Kontakt
      • Anreise
      • Für Hinweisgeber
  • Struktur
    • Leitung & Verwaltung
      • Das Präsidium
      • Stabsstellen
      • Verwaltung
      • Organe, Statusgruppen und Kommissionen
    • Fakultäten
      • Carl-Friedrich-Gauß-Fakultät
      • Fakultät für Lebenswissenschaften
      • Fakultät Architektur, Bauingenieurwesen und Umweltwissenschaften
      • Fakultät für Maschinenbau
      • Fakultät für Elektrotechnik, Informationstechnik, Physik
      • Fakultät für Geistes- und Erziehungswissenschaften
    • Institute
      • Institute von A-Z
    • Einrichtungen
      • Universitätsbibliothek
      • Gauß-IT-Zentrum
      • Zentrale Personalentwicklung
      • International House
      • Projekthaus
      • Transferservice
      • Hochschulsportzentrum
      • Einrichtungen von A-Z
    • Studierendenschaft
      • Studierendenparlament
      • Fachschaften
      • Studentische Wahlen
    • Lehrer*innenbildung
      • Lehrer*innenfortbildung
      • Forschung
    • Chancengleichheit
      • Gleichstellung
      • Familie
      • Diversität
    • Kontakt
      • Personensuche
  • Suche
  • Schnellzugriff
    • Personensuche
    • Webmail
    • cloud.TU Braunschweig
    • Messenger
    • Mensa
    • TUconnect (Studierendenportal)
    • Lehrveranstaltungen
    • Im Notfall
    • Stud.IP
    • UB Katalog
    • Status GITZ-Dienste
    • Störungsmeldung GB3
    • IT Dienste
    • Informationsportal (Beschäftigte)
    • Beratungsnavi
    • Linksammlung
    • DE
    • EN
    • IBR YouTube
    • Facebook
    • Instagram
    • YouTube
    • LinkedIn
    • Mastodon
Menü
  • Struktur
  • Fakultäten
  • Carl-Friedrich-Gauß-Fakultät
  • Institute
  • Institut für Betriebssysteme und Rechnerverbund
  • Prof. Dr.-Ing. Christian Dietrich
  • Advent(2)
  • Pipes Full of Gravy
Logo IBR
IBR Login
  • Institut für Betriebssysteme und Rechnerverbund
    • News
    • Wir über uns
      • Gesamtes Team
      • Anreise
      • Raumplan
      • Projekte
      • Veröffentlichungen
      • Software
      • News Archiv
    • Connected and Mobile Systems
      • Team
      • Lehrveranstaltungen
      • Abschlussarbeiten
      • Projekte
      • Veröffentlichungen
      • Software
      • Datensätze
    • Verlässliche Systemsoftware
      • Übersicht
      • Team
      • Lehre
      • Arbeiten & Jobs
      • Forschung
      • Publikationen
    • Algorithmik
      • Team
      • Lehrveranstaltungen
      • Abschlussarbeiten
      • Projekte
      • Veröffentlichungen
    • Mikroprozessorlabor
    • Studium
      • Sommersemester 2025
      • Wintersemester 2024/2025
      • Abschlussarbeiten
    • Service
      • Bibliothek
      • Mailinglisten
      • Webmail
      • Knowledgebase
      • Wiki
      • Account Management
      • Service-Status
    • Spin-Offs
      • Docoloc
      • bliq (formerly AIPARK)
      • Confidential Technologies
    • Forschungsverbünde
      • IST.hub
  • Task Overview
  • Git repository
  • Mailing list
  • Matrix-Channel

Pipes Full of Gravy

☃️
Git-Repository: Template Solution Solution-Diff (Solution is posted at 18:00 CET)
Workload: 131 lines of code
Important System-Calls: epoll_create(2), epoll_ctl(2), epoll_wait(2), pipe(7), splice(2)
Recommended Reads:
  • Dec. 7: Select a Gift file 110 lines [select(2)]
Illustration for this exercsie

Christmas is also the season, where the kitchen in the ELF village is a very busy place. All the food and all the cookies for Christmas eve are prepared ahead of time and stored for the big feast when Santa comes home from the tour. And there is one thing that Santa is especially fond of. There is only one thing that both quenches the thirst and fills up the stomach: Gravy! Immense amount of gravy to drench meat and vegetables alike. But making this special ELF gravy is a very peculiar process as it involves multiple process and filter steps to ensure the delicious clarity of the brown molten gold.

However, last year, while the individiual steps of the process were already optimized, there were a few difficulties with the overall gravy production: the transfers stalled and blocked as there was only a single ELF with a single cart that moved the proto-gravy around. For this year, the ELF senate decided to come up with a better gravy-transportation process.

Of Pipes and Polls

Today, we want to look at two different system call interfaces, combine them, and help our elves with the gravy problem. We will start with epoll(7), which is a more modern variant of select(2), and splice(2), which allows the user to directly redirect data from one pipe to another one. In essence, our ELF gets a mechanism to determine which process of the gravy production is producing proto-gravy at its output (epoll), and a mechanism to directly establish pipelines between the output and the input buckets of the gravy production without loading or unloading of the cart.

epoll

Let's first have a look at epoll() and the problem that it solves: With select, we passed a bit-set of queried file descriptors to the kernel and got back the same set where only the bits of the "ready" descriptors are set. So, we have to rejuvenate the poll set for every select invocation and we have to iterate over the set to determine which descriptors are actually ready. Both steps require time and are inefficient as usually the set of polled descriptors is very large and does not change that often.

To solve this, epoll(7) was created in Linux 2.5.44 (2002): In a nutshell, we create a epoll-descriptor with epoll_create(2), which is a file-descriptor itself, and register the file-descriptors that are interesting for us with epoll_ctl(2). With epoll_wait(2), we query the set of registered descriptors and the kernel returns one or multiple "events" (becoming ready is an event) that we can work on. This is fundamentally different than select in two respects: (1) Unless we register a file descriptor explicitly as EPOLLONESHOT, it remains registered even if it already produced an event. (2) The return value is a dense stream of events with much fewer elements than the number of registered fds. In its functionality, epoll is similar to the poll(2) system call, which already solves (1) and allows us to reuse the set of interesting fds. Nevertheless, poll still forces us to iterate over the fd set to find the interesting ones.

In essence, an epoll event loop looks like this:

// Create epoll object
int epoll_fd = epoll_create(...);

// Register interesting FDs
epoll_ctl(epoll_fd, EPOLL_ADD, fd, ...)

// The event loop
while (true) {
    struct epoll_event events[10];
    int n = epoll_wait(epoll_fd, events,....);
    for (int i = 0; i < n; i++) {
        process(events[i])
    }
}

However, one thing is special about epoll: It does not tell us the file descriptor that produced an event, but it only hands back user-defined data (struct epoll_data_t) that we gave it when registering the fds. This could be the file descriptor number, but it also could be a pointer to some fancy context object.

Pipes and splice

The other part of today are Unix pipes (pipe(7) that we want to splice(2) together. For this, we first have to understand that a Unix pipe, which was championed by Doug Mellroy, is an connection between to file descriptors. These fds are not connected to a real file, but if we write some data into one fd, the data can be read from the other FD. This comes with the constraint that pipe are unidirectional and we have a "write end" and a "read end". And this is reflected in the API to create an pipe object with pipe(2) (or with flags: pipe2(2)):

int pipe(int pipefd[2]);
....
int P[2];
pipe(P);
// P[0]: read end
// P[1]: write end

If you use your shell to spawn a command with pipes, this system call is used by the shell to establish the connection between the processes. For example, for cat file | grep bar, the shell invokes pipe2() once, gives the write end (P[1]) as stdout to the cat-process and the read end (P[0]) to the grep process. Thereby, both processes communicate directly with each other and the shell is no longer involved.

However, sometimes, we want to have a more fine grained control over how and when data is moved between two processes. For example, if we want to measure the amount of gravy that flows between two production steps. For this, the shell must become an intermediator, for example by creating two pipe pairs: one from cat to the shell and one from the shell to grep. Nevertheless, it is not desirable to read() the data into the shells address space first, just to write() it out to the grep-pipe. And this is where splice(2) comes in handy:

ssize_t splice(int fd_in,  off64_t *off_in, 
               int fd_out, off64_t *off_out, 
               size_t len, unsigned int flags);

splice() connects two file descriptors (one or both of them must be a pipe) for len bytes. Furthermore, we can add some flags and set input and output file offsets. As a result, we get the number of bytes that were actually written.

Task

Write a single-threaded program that takes multiple filter commands as separate arguments:

seq 1 100000000000000 | ./epoll "grep [0-5]" "grep 1$"  > /dev/null

The program spawns both filters, splices its stdin data with splice() to the first filter, splices the first filter's output to the second filter's input, and so on, until the output of the last filter is written to stdout of the ./epoll program. Meanwhile, the epoll program should record and regularly print the throughput of the filter pipeline to determine how much gravy is flowing. This is similar to the behavior of the pv(1) program. As an output, you should expect something like:

[grep [0-5]] Started filter as pid 458367
[grep 1$] Started filter as pid 458368
 73.63MiB/s 72.70MiB/s 7.29MiB/s
 85.22MiB/s 85.06MiB/s 8.52MiB/s
 92.25MiB/s 92.25MiB/s 9.22MiB/s
 92.44MiB/s 92.46MiB/s 9.25MiB/s

Hints

  • Use epoll to find those pipe write ends that are ready and have data.
  • Start out with an implementation of copy_splice that uses read()/write()
  • As a simplification, use splice eagerly when data is ready and do not try to identify read ends that are ready to accept new data.
  • Enumerate the connected pipe pairs with an index and set that number as user data when adding file descriptors to the epoll object.

Last modified: 2023-12-01 15:52:28.025765, Last author: , Permalink: /p/advent-10-epoll


aktualisiert am 01.12.2023, 15:52 von Prof. Dr.-Ing. Christian Dietrich

Für alle

Stellen der TU Braunschweig
Jobbörse des Career Service
Merchandising
Sponsoring- & Spendenleistungen
Drittmittelgeförderte Forschungsprojekte
Vertrauenspersonen für Hinweisgeber

Für Studierende

Semestertermine
Lehrveranstaltungen
Studiengänge von A-Z
Informationen für Erstsemester
TUCard

Interne Tools

Status GITZ-Dienste
Handbuch für TYPO3 (Intern)
Corporate Design-Toolbox (Intern)
Glossar (DE-EN)
Meine Daten ändern
Hochschulöffentliche Bekanntmachungen

Kontakt

Technische Universität Braunschweig
Universitätsplatz 2
38106 Braunschweig
Postfach: 38092 Braunschweig
Telefon: +49 (0) 531 391-0

Anreise

© Technische Universität Braunschweig
Impressum Datenschutz Barrierefreiheit