Today: Address Translation
-
Levels of indirection.
-
Physical and virtual addresses.
-
Virtual address properties.
ASST2 Checkpoint
-
If you have not finished ASST2.1, you’re way, way behind.
-
If you have not finished the file system system calls sys_{open,close,lseek}…, you’re behind.
-
If you finished one of sys_{fork,wait,exec,exit}, you’re OK.
Convention
-
Process layout is specified by the Executable and Linker Format (ELF) file. (Remember ELF?)
-
Some layout is the function of convention.
-
Example: why not load the code at 0x0?
-
To catch possibly the most common programmer error: NULL pointer problems!
-
Leaving a large portion of the process address space starting at 0x0 empty allows the kernel to catch these errors, including offsets against NULL caused by NULL structure pointers:
-
struct bar * foo = NULL;
foo->bar = 10;
Segmentation fault
Core dumped
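The faulting address in a case like this is not necessarily 0x0 itself: it is NULL plus the offset of the field being touched. A minimal sketch, assuming a hypothetical layout for struct bar (the slide does not show one) and assuming NULL is represented as address 0:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the field we touch is not the first member. */
struct bar {
    char name[64];
    int  bar;
};

/*
 * Address the hardware would see for `foo->bar` when foo is NULL,
 * assuming NULL is address 0 (true on common platforms).
 */
static uintptr_t null_field_fault_address(void)
{
    return (uintptr_t)0 + offsetof(struct bar, bar);
}
```

With this layout the store lands a little above 0x0 rather than at it, which is one reason to leave a large empty region at the bottom of the address space rather than a single byte.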
Aside: ASST2
int32_t sys_open(userptr_t pathname, ...) {
    if (pathname == NULL) {
        return EINVAL;
    }
    ...
This is also why you should not check userptr_t values for NULL in ASST2:
-
0x0 can be a valid user address. ("Look, Mom, I did my own linking.")
-
There are 2^31 ways that address can be bogus…and you just checked one of them.
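A range check covers far more of those cases than a NULL comparison does. The sketch below is illustrative only: user_range_plausible and USER_SPACE_TOP are hypothetical names, and the 0x80000000 boundary assumes the System/161 layout. It still cannot tell whether an address is actually mapped; in OS/161 that is normally left to the copyin/copyinstr family, which attempts the copy and reports failure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: user addresses occupy [0, 0x80000000), as on System/161. */
#define USER_SPACE_TOP 0x80000000u

/*
 * Hypothetical helper: true if [addr, addr + len) lies entirely in the
 * user half of a 32-bit address space. It rules out kernel addresses and
 * wraparound only; it says nothing about whether the pages are mapped.
 */
static bool user_range_plausible(uint32_t addr, size_t len)
{
    if (len > USER_SPACE_TOP) {
        return false;
    }
    if (addr >= USER_SPACE_TOP) {
        return false;
    }
    return (uint32_t)len <= USER_SPACE_TOP - addr;
}
```

Note that address 0x0 passes this check, consistent with the point above that it can be a valid user address.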
Destined To Ever Meet?
-
The stack starts at the top of the address space and grows ↓.
-
The heap starts towards the bottom and grows ↑.
-
Will they ever meet?
-
Probably not! That would mean that either the stack or, more likely, the heap had grown huge.
Relocation
int data[128];
...
data[5] = 8; // Where the heck is data[5]?
...
result = foo(data[5]); // Where the heck is foo?
So given our address space model, no more problems with locating things, right?
Not quite! Dynamically-loaded libraries still need to be relocated at run time. Cool, but not something we’ll cover in this course.
Sounds great
What’s the catch?
Address Spaces: A Great Idea?
-
The address space abstraction sounds powerful and useful. (It would be better if it cooked breakfast.)
-
But can we implement it?
Your mission
Implement address spaces
Implementing Address Spaces
-
Address translation: 0x10000 to Process 1 is not the same as 0x10000 to Process 2 is not the same as…
-
Protection: address spaces are intended to provide a private view of memory to each process.
-
Memory management: taken together, processes may have more address space allocated than there is physical memory on the machine.
-
In a way, we are encouraging processes to spread out and let us handle the details.
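To make the first two requirements concrete, here is a toy sketch in which each process gets one contiguous region of physical memory. This is only a stand-in for illustration: toy_addrspace and toy_translate are hypothetical names, and real translation schemes are a separate topic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process translation state: one contiguous region. */
struct toy_addrspace {
    uint32_t phys_base; /* where this process's memory starts physically */
    uint32_t size;      /* bytes of address space the process may use    */
};

/*
 * Translate a virtual address for one process. Returns false if the
 * address falls outside the process's private region (protection).
 */
static bool toy_translate(const struct toy_addrspace *as, uint32_t vaddr,
                          uint32_t *paddr)
{
    assert(as != NULL && paddr != NULL);
    if (vaddr >= as->size) {
        return false;
    }
    *paddr = as->phys_base + vaddr;
    return true;
}
```

Two processes with different phys_base values can both use 0x10000 and end up at different physical locations, and neither can name an address inside the other's region.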
Guess What?
-
Your entire (programming) life has been a lie.
-
You believe in things that are not actually true.
-
Today your view of the world will change forever.
0x10000
Also not real
Your Mission: Implement Address Spaces
-
Clearly implementing address spaces requires breaking the direct connection between a memory address and physical memory.
-
Introducing another level of indirection is a classic systems technique. We have seen it before. Where?
-
File handles!
Translation is Control
Forcing processes to translate a reference to gain access to the underlying object provides the kernel with a great deal of control.
References can be revoked, shared, moved, and altered.
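The control that indirection buys can be sketched with a tiny handle table, in the spirit of file handles. Everything here is hypothetical (toy_handle_table, toy_lookup, toy_retarget, toy_revoke); it is not how any particular kernel stores its handles.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_HANDLES 8

/* Hypothetical kernel-side table: a handle is just an index into it. */
struct toy_handle_table {
    void *objects[TOY_MAX_HANDLES]; /* NULL means "not valid" */
};

/* Every access goes through the table, so the kernel sees every use. */
static void *toy_lookup(const struct toy_handle_table *t, int handle)
{
    assert(t != NULL);
    if (handle < 0 || handle >= TOY_MAX_HANDLES) {
        return NULL;
    }
    return t->objects[handle];
}

/* Moving or sharing: point an existing handle at a different object. */
static void toy_retarget(struct toy_handle_table *t, int handle, void *obj)
{
    assert(t != NULL);
    if (handle >= 0 && handle < TOY_MAX_HANDLES) {
        t->objects[handle] = obj;
    }
}

/* Revoking: the process still holds the number, but lookups now fail. */
static void toy_revoke(struct toy_handle_table *t, int handle)
{
    toy_retarget(t, handle, NULL);
}
```

The process only ever holds the integer, so the kernel can change what it refers to without the process's cooperation.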
Memory Interface
We don’t usually think about memory as having an interface, but it does:
-
load(address): load data from the given address, usually into a register or possibly into another memory location.
-
store(address, value): store value to the given address, where value may be in a register or another memory location.
Virtual v. Physical Addresses
-
The address space abstraction requires breaking the connection between a memory address and physical memory.
-
We say that data accessed via the memory interface is accessed using virtual addresses:
-
A physical address points to memory.
-
A virtual address points to something that acts like memory.
-
Virtual addresses have much richer semantics than physical addresses, encapsulating location, permanence and protection.
Welcome
To the real world
Virtual Addresses: Location
The data referenced by a virtual address might be:
-
in memory! (Duh.) But…the kernel may have moved it to the disk.
Virtual Address → Physical Address
-
on disk, but…the kernel may be caching it in memory.
Virtual Address → Disk, Block, Offset
-
in memory on another machine.
Virtual Address → IP Address, Physical Address
-
a port on a hardware device.
Virtual Address → Device, Port
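One way to picture the four cases above is as a tagged record that the kernel keeps for each virtual address. This is a sketch with hypothetical names and field choices (toy_mapping, toy_backing, toy_move_to_disk), mirroring the arrows in the list rather than any real kernel structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tags for the four cases listed above. */
enum toy_backing { TOY_IN_MEMORY, TOY_ON_DISK, TOY_REMOTE, TOY_DEVICE };

struct toy_mapping {
    enum toy_backing kind;
    union {
        struct { uint32_t paddr; } memory;
        struct { uint32_t disk_id, block, offset; } disk;
        struct { uint32_t ip_addr, paddr; } remote;
        struct { uint32_t device_id, port; } device;
    } where;
};

/*
 * The kernel moves the data out to disk. The process's virtual address
 * is unchanged; only the kernel's record of where the data lives changes.
 */
static void toy_move_to_disk(struct toy_mapping *m, uint32_t disk_id,
                             uint32_t block, uint32_t offset)
{
    assert(m != NULL);
    m->kind = TOY_ON_DISK;
    m->where.disk.disk_id = disk_id;
    m->where.disk.block = block;
    m->where.disk.offset = offset;
}
```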
Virtual Addresses: Permanence
Processes expect data written to virtual addresses that point to physical memory to store values transiently.
Processes expect data written to virtual addresses that point to disk to store values permanently.
-
Hardware may change its registers independently, so a read will not necessarily return the last value written.
Virtual Addresses: Permissions and Protection
-
Some virtual addresses may only be used by the kernel while in kernel mode.
-
Virtual addresses may also be assigned read, write or execute permissions.
-
read/write: a process can load/store to this address.
-
execute: a process can load and execute instructions from this address.
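The permission rules above amount to a small check performed on every access. A minimal sketch, with hypothetical flag and function names (TOY_PERM_*, toy_access_allowed):

```c
#include <stdbool.h>

/* Hypothetical permission bits attached to a virtual address. */
enum {
    TOY_PERM_R      = 1, /* loads allowed              */
    TOY_PERM_W      = 2, /* stores allowed             */
    TOY_PERM_X      = 4, /* instruction fetch allowed  */
    TOY_PERM_KERNEL = 8  /* usable only in kernel mode */
};

enum toy_access { TOY_LOAD, TOY_STORE, TOY_FETCH };

/* Decide whether one access is allowed under the given permissions. */
static bool toy_access_allowed(unsigned perms, enum toy_access kind,
                               bool in_kernel_mode)
{
    if ((perms & TOY_PERM_KERNEL) && !in_kernel_mode) {
        return false;
    }
    switch (kind) {
    case TOY_LOAD:  return (perms & TOY_PERM_R) != 0;
    case TOY_STORE: return (perms & TOY_PERM_W) != 0;
    case TOY_FETCH: return (perms & TOY_PERM_X) != 0;
    }
    return false;
}
```

For example, a code region marked read and execute would reject stores, and a data region marked read and write would reject instruction fetches.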
Creating Virtual Addresses: exec()
-
exec() uses a blueprint from an ELF file to determine how the address space should look when exec() completes.
-
Specifically, exec() creates and initializes virtual addresses that (mainly) point to memory:
-
code, usually marked read-only.
-
data, marked read-write, but not executable.
-
heap, an area used for dynamic allocations, marked read-write.
-
stack space for the first thread.
-
$ pmap
# memory mappings
Creating Virtual Addresses: fork()
fork() copies the address space of the calling process.
The child has the same virtual addresses as the parent but they point to different memory locations.
int i = 2;
ret = fork();
if (ret != 0) {
    printf("%p\n", (void *)&i); // parent: prints a virtual address, e.g. 0x20010
    i = 4;
    printf("%d\n", i); // prints 4. virtual address points to private memory.
} else {
    printf("%p\n", (void *)&i); // child: prints the same virtual address
    i = 3;
    printf("%d\n", i); // prints 3. virtual address points to private memory.
}
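The same experiment can be written so that it checks itself instead of printing. This is a sketch for a POSIX system, not OS/161 code; fork_private_copy_demo is a hypothetical name, and the child reports its value of i through its exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if parent and child each saw only their own write to `i`,
 * 0 if not, and -1 if fork() or waitpid() failed.
 */
static int fork_private_copy_demo(void)
{
    int i = 2;
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        i = 3;     /* child writes its own copy */
        _exit(i);  /* report the child's value via the exit status */
    }

    i = 4;         /* parent writes its own copy */
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        return -1;
    }
    return (i == 4 && WEXITSTATUS(status) == 3) ? 1 : 0;
}
```

Whichever process runs first, the parent still holds 4 and the child still exits with 3, because each write went to that process's private memory.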
Issues with fork()
Copying all that memory is expensive!
-
Especially when the next thing that a process frequently does is load a new binary, which destroys most of the state fork() has carefully copied!
We will come back to this problem next week when we talk about clever memory-management tricks.
Creating Virtual Addresses: sbrk()
-
Dynamic memory allocation is supported by the sbrk() system call.
-
sbrk() asks the kernel to move the break point, or the point at which the process heap ends.
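A small sketch of moving the break on a Unix-like system, assuming the traditional sbrk() from unistd.h is available (it is a legacy interface on some platforms). grow_break is a hypothetical helper name.

```c
#define _DEFAULT_SOURCE /* exposes sbrk() on glibc under strict -std= modes */
#include <stdint.h>
#include <unistd.h>

/*
 * Grow the heap by `bytes` and return how far the break actually moved,
 * or -1 if the kernel refused.
 */
static intptr_t grow_break(intptr_t bytes)
{
    void *old_break = sbrk(bytes); /* returns the previous break */
    if (old_break == (void *)-1) {
        return -1;
    }
    void *new_break = sbrk(0);     /* sbrk(0) just reports the current break */
    return (intptr_t)((char *)new_break - (char *)old_break);
}
```

The addresses between the old and new break become valid for the process to use once the call returns; user-level allocators typically carve them up further.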
Creating Virtual Addresses: mmap()
-
mmap() is a system call that creates virtual addresses that map to a portion of a file.
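As an illustration on a POSIX system, the sketch below writes some text to a temporary file, maps the file, and then reads the bytes back with ordinary loads rather than read(). read_through_mapping is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Write `text` to a temporary file, map the file, and copy the mapped
 * bytes into `out` (which must hold at least strlen(text) + 1 bytes).
 * Returns 0 on success, -1 on any failure.
 */
static int read_through_mapping(const char *text, char *out)
{
    assert(text != NULL && out != NULL);
    size_t len = strlen(text);
    if (len == 0) {
        return -1; /* mmap() rejects zero-length mappings */
    }

    FILE *fp = tmpfile();
    if (fp == NULL) {
        return -1;
    }
    if (fwrite(text, 1, len, fp) != len || fflush(fp) != 0) {
        fclose(fp);
        return -1;
    }

    char *mapped = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fileno(fp), 0);
    if (mapped == MAP_FAILED) {
        fclose(fp);
        return -1;
    }

    memcpy(out, mapped, len); /* plain loads from virtual addresses */
    out[len] = '\0';

    munmap(mapped, len);
    fclose(fp);
    return 0;
}
```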
Example Machine Memory Layout: System/161
-
System/161 emulates a 32-bit MIPS architecture.
-
Addresses are 32 bits wide: from 0x0 to 0xFFFFFFFF.
-
0x0–0x7FFFFFFF: process virtual addresses. Accessible to user processes, translated by the kernel. 2 GB.
-
0x80000000–0x9FFFFFFF: kernel direct-mapped addresses. Only accessible to the kernel, translated by subtracting 0x80000000. 512 MB. Cached.
-
0xA0000000–0xBFFFFFFF: kernel direct-mapped addresses. Only accessible to the kernel, translated by subtracting 0xA0000000. 512 MB. Uncached.
-
0xC0000000–0xFFFFFFFF: kernel virtual addresses. Only accessible to the kernel, translated by the kernel. 1 GB.
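The layout above can be written down as a short classifier. The segment names follow the usual MIPS convention (kuseg, kseg0, kseg1, kseg2); the function names are hypothetical, and the 0xA0000000 subtraction for the uncached region is the standard MIPS rule rather than something stated on the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_segment { TOY_KUSEG, TOY_KSEG0, TOY_KSEG1, TOY_KSEG2 };

/* Which of the four regions does a 32-bit virtual address fall in? */
static enum toy_segment toy_classify(uint32_t vaddr)
{
    if (vaddr < 0x80000000u) return TOY_KUSEG; /* user, translated   */
    if (vaddr < 0xA0000000u) return TOY_KSEG0; /* direct, cached     */
    if (vaddr < 0xC0000000u) return TOY_KSEG1; /* direct, uncached   */
    return TOY_KSEG2;                          /* kernel, translated */
}

/*
 * For the direct-mapped regions, the physical address is a fixed
 * subtraction. Returns false for regions that need real translation.
 */
static bool toy_direct_map(uint32_t vaddr, uint32_t *paddr)
{
    assert(paddr != NULL);
    switch (toy_classify(vaddr)) {
    case TOY_KSEG0: *paddr = vaddr - 0x80000000u; return true;
    case TOY_KSEG1: *paddr = vaddr - 0xA0000000u; return true;
    default:        return false;
    }
}
```

Under these rules 0x80001000 and 0xA0001000 name the same physical location; the two views differ only in whether accesses go through the cache.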
Mechanism v. Policy
-
We will get to the details of virtual address translation next time.
-
The hardware memory management unit (MMU) speeds up translation: once the kernel has told it how to translate an address, or when architectural conventions fix the translation, it can translate without involving the kernel. The MMU is the mechanism.
-
The operating system memory management subsystem manages translation policies by telling the MMU what to do.
-
Goal: the system follows the policies established by the operating system while involving the operating system directly as rarely as possible.
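The split between mechanism and policy can be sketched as a small translation cache that only calls into the operating system on a miss. This is a toy model under stated assumptions: the names (toy_mmu, toy_mmu_translate, toy_policy_offset), the 4 KB granularity, and the round-robin replacement are all hypothetical choices for illustration, not a description of real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE     4096u
#define TOY_CACHE_ENTRIES 4

struct toy_cache_entry { bool valid; uint32_t vpage; uint32_t ppage; };

struct toy_mmu {
    struct toy_cache_entry entries[TOY_CACHE_ENTRIES];
    unsigned next_victim;               /* round-robin replacement      */
    unsigned kernel_calls;              /* how often the OS was involved */
    uint32_t (*policy)(uint32_t vpage); /* OS-supplied translation rule  */
};

/* Mechanism: translate from the cache, asking the OS only on a miss. */
static uint32_t toy_mmu_translate(struct toy_mmu *mmu, uint32_t vaddr)
{
    assert(mmu != NULL && mmu->policy != NULL);
    uint32_t vpage = vaddr / TOY_PAGE_SIZE;
    uint32_t offset = vaddr % TOY_PAGE_SIZE;

    for (unsigned i = 0; i < TOY_CACHE_ENTRIES; i++) {
        if (mmu->entries[i].valid && mmu->entries[i].vpage == vpage) {
            return mmu->entries[i].ppage * TOY_PAGE_SIZE + offset;
        }
    }

    /* Miss: the operating system decides, and the answer is cached. */
    mmu->kernel_calls++;
    struct toy_cache_entry *slot = &mmu->entries[mmu->next_victim];
    mmu->next_victim = (mmu->next_victim + 1) % TOY_CACHE_ENTRIES;
    slot->valid = true;
    slot->vpage = vpage;
    slot->ppage = mmu->policy(vpage);
    return slot->ppage * TOY_PAGE_SIZE + offset;
}

/* Example policy (hypothetical): place each page 0x100 pages higher. */
static uint32_t toy_policy_offset(uint32_t vpage)
{
    return vpage + 0x100u;
}
```

Repeated accesses to the same region are served from the cache, so the kernel_calls counter stays low even though every access is translated.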
Next Time: Address Translation
-
Multiple approaches to translating addresses.
-
How to do it fast.