ELFs are C programmers. They like the language, they dream of pointers, and they are sceptical of any new language feature that those shiny newcomers bring. But sometimes they see features that they really would like to have, yet they cannot admit that their scepticism was misplaced. Instead, they implement the feature on their own in C. For example, in the Mnemosyne system, the user can annotate a global variable with the persistent attribute and it will retain its value across program invocations.
Today, we will look at the virtual-memory subsystem of Linux and the mmap(2) system call, which we can use to manipulate the virtual address space of the current process. As we discussed yesterday, most processes have their very own address space. This means that the same pointer (e.g., 0x45004) resolves to different memory cells in two different processes, making it a virtual address. The translation between virtual and physical addresses is done by the MMU, which is configured by the operating system. The MMU is hardware circuitry that translates (or maps) at the granularity of pages (usually 4096 bytes): (virtual) pages to (physical) page frames.
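To make the page granularity concrete, here is a minimal sketch that splits a virtual address into a page number and an offset within that page. The names va_parts and split_address are made up for this illustration, and the real MMU works on multi-level page tables rather than a single division; the arithmetic only shows which bits select the page and which select the byte inside it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper type: the two parts of a virtual address. */
struct va_parts {
    uintptr_t page_number; /* which (virtual) page the address lies in */
    uintptr_t offset;      /* byte offset inside that page */
};

/* Split addr for a given page size (must be a power of two). */
static struct va_parts split_address(uintptr_t addr, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    struct va_parts parts = { addr / page_size, addr % page_size };
    return parts;
}

/* Print the split for an arbitrary pointer, using the page size
 * that the running system reports. */
static void describe_pointer(const void *ptr)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return;
    struct va_parts parts = split_address((uintptr_t)ptr, (uintptr_t)page_size);
    printf("%p -> page 0x%lx, offset 0x%lx\n", ptr,
           (unsigned long)parts.page_number, (unsigned long)parts.offset);
}
```

With 4096-byte pages, the example pointer 0x45004 lies in page 0x45 at offset 4. Calling describe_pointer() on the address of a local variable prints the same split for a real pointer of the current process.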
While we can use mmap() to simply request (anonymous) memory for our amusement, the virtual-memory subsystem of Linux (like that of other operating systems) is much more flexible. Today, we will use one of these features: file-mapped I/O, where we instruct the operating system to make the contents of a file visible within our address space. Unlike pread(), a file mapping does not merely transfer the file contents into a memory region once; the mapping is kept synchronized with the actual file. Therefore, a write into a shared file mapping will actually change the file; no explicit write-back call (such as pwrite()) is required.
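The following sketch illustrates file-mapped I/O under a few assumptions: the helper names write_via_mapping and read_via_pread are invented for this example, and error handling is kept to a bare minimum. The first function changes a file purely by storing into a MAP_SHARED mapping; the second reads the file back through the ordinary pread() path so that we can see that the store reached the file.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Store msg into the file at path through a shared mapping.
 * Returns 0 on success, -1 on error. */
static int write_via_mapping(const char *path, const char *msg)
{
    assert(path != NULL && msg != NULL);
    size_t len = strlen(msg);
    if (len == 0)
        return -1; /* mmap() rejects a length of zero */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    /* The file must be large enough to back the mapped bytes. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    memcpy(map, msg, len); /* a plain memory write, no write()/pwrite() */
    return munmap(map, len);
}

/* Read up to cap bytes from the start of the file with pread(). */
static ssize_t read_via_pread(const char *path, char *buf, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = pread(fd, buf, cap, 0);
    close(fd);
    return n;
}
```

If MAP_SHARED were replaced with MAP_PRIVATE, the memcpy() would only modify a private copy of the page and the file would keep its old (zero-filled) contents.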
Internally, an mmap() call creates a new struct vm_area_struct object and adds it to the struct mm_struct of the current process. And if the new mapping is anonymous, that's about it. However, for a file mapping, we also set the vm_file pointer to the file that we address with mmap()'s fd argument.
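The kernel structures themselves are not visible from user space, but on Linux the file /proc/self/maps lists the mappings of the current process, roughly one line per vm_area_struct. The sketch below, with the made-up helper names find_mapping and demo_new_mapping, creates an anonymous mapping and looks up the line that covers it. Note that the kernel may merge adjacent mappings with identical properties, so the reported range can be larger than the page we asked for.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Find the /proc/self/maps line whose address range contains addr.
 * Returns 1 if found (line is filled in), 0 if not, -1 on error. */
static int find_mapping(const void *addr, char *line, size_t cap)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    int found = 0;
    while (fgets(line, (int)cap, maps) != NULL) {
        unsigned long start, end;
        /* Each line starts with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
            (uintptr_t)addr >= start && (uintptr_t)addr < end) {
            found = 1;
            break;
        }
    }
    fclose(maps);
    return found;
}

/* Map one anonymous page and report the line that describes it. */
static int demo_new_mapping(char *line, size_t cap)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    void *mem = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    int found = find_mapping(mem, line, cap);
    if (found == 1)
        line[strcspn(line, "\n")] = '\0'; /* strip the trailing newline */
    munmap(mem, (size_t)page_size);
    return found;
}
```

For a file mapping, the same line would additionally show the file offset, the device and inode numbers, and the path of the mapped file, which is the user-visible counterpart of the vm_file pointer.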
But "Wait!", you might ask: "When is the file content actually read into memory?" For this, we have to understand that Linux fills its page tables, which are the MMU-specific representation of an mm_struct, lazily on demand. So, whenever you access a page that is not yet present in the page table, the MMU will raise a page fault, the OS will check whether the access was legitimate, and, if so, allocate a page frame and install it at the faulting address. It is during this process that the OS inspects the vm_area_struct for a vm_file pointer and reads in (we say it "pages in") the respective file content. The process is paused until the accessed page is actually present in memory. From this, we can also conclude that mapping a huge (multiple GiB) file is not expensive in itself, as we only pay for those accesses that we actually perform.
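Lazy population can be observed from user space with mincore(2), which reports whether a page is currently resident in memory. The following is a small sketch with invented helper names (page_is_resident, demo_lazy_paging). It uses an anonymous mapping to stay self-contained, but the same check works for file mappings. On a typical Linux system one would expect the page to be reported as non-resident before the first access, although some sandboxed or emulated environments report every page as resident.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the page starting at page_addr is resident in memory,
 * 0 if it is not, and -1 on error. page_addr must be page-aligned. */
static int page_is_resident(void *page_addr)
{
    long page_size = sysconf(_SC_PAGESIZE);
    unsigned char vec = 0;
    if (page_size <= 0 || mincore(page_addr, (size_t)page_size, &vec) != 0)
        return -1;
    return vec & 1; /* the lowest bit carries the residency flag */
}

/* Map 16 pages, then check the first page before and after touching it.
 * Returns 0 on success, -1 if the mapping could not be created. */
static int demo_lazy_paging(int *before, int *after)
{
    assert(before != NULL && after != NULL);
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    size_t len = 16 * (size_t)page_size;

    char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    *before = page_is_resident(mem); /* no access has happened yet */
    mem[0] = 1;                      /* first access triggers a page fault */
    *after = page_is_resident(mem);  /* the page frame is installed now */

    munmap(mem, len);
    return 0;
}
```

The other 15 pages of the mapping are never touched, so the kernel never has to allocate page frames for them; this is the sense in which we only pay for the accesses we actually perform.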
As mapping a file into our virtual address space is rather boring, we've come up with a more interesting challenge: In this task, you should make some specially marked variables persistent. For this, we replace the anonymous memory where the variables live at process start with a shared file mapping. Thereby, the OS will synchronize those variables with a backing store. For example, in the following (simplified) program, we use the file mmap.persistent as a backing store:
persistent int count = 10;
...
int main(int argc, char *argv[]) {
    setup_persistent("mmap.persistent");
    printf("count = %d\n", count++);
}
If we invoke this program three times, we expect the following output:
$> ./test
count = 10
$> ./test
count = 11
$> ./test
count = 12
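One possible way to get this behaviour is sketched below. It is not the reference solution of the task and it simplifies heavily: instead of a persistent attribute, all persistent variables are collected by hand in one page-aligned, page-sized object called store (a name invented here), so that remapping the page cannot clobber unrelated data. The sketch also assumes a page size of 4096 bytes and refuses to run otherwise.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define REGION_SIZE 4096 /* assumption: the system page size is 4096 bytes */

/* All persistent variables share one page; the padding makes sure
 * that nothing else is placed in the same page. */
static union {
    struct { int count; } vars;
    char pad[REGION_SIZE];
} store __attribute__((aligned(REGION_SIZE))) = { .vars = { .count = 10 } };

/* Replace the memory behind `store` with a shared mapping of the file
 * at path. Returns 0 on success, -1 on error. */
static int setup_persistent(const char *path)
{
    assert(path != NULL);
    if (sysconf(_SC_PAGESIZE) != REGION_SIZE)
        return -1;

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size != REGION_SIZE) {
        /* First run (or unusable file): seed the backing store with
         * the initial values from the program image. */
        if (ftruncate(fd, 0) != 0 ||
            pwrite(fd, &store, REGION_SIZE, 0) != REGION_SIZE) {
            close(fd);
            return -1;
        }
    }

    /* MAP_FIXED replaces whatever was mapped at &store before. */
    void *p = mmap(&store, REGION_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? -1 : 0;
}
```

After a successful call, store.vars.count++ is an ordinary memory write that lands in the file, so the next invocation of the program starts from the incremented value. A fuller solution would still have to place the annotated variables into such a region automatically, for example with a dedicated linker section.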
After setup_persistent() has been called, the contents of mmap.persistent can be inspected with the xxd tool (hexdump).
Last modified: 2023-12-01 15:52:27.502376, Last author: , Permalink: /p/advent-03-mmap
Technische Universität Braunschweig