Processes and File Handles
Process Example: Firefox
-
Waiting for and processing interface events: mouse clicks, keyboard input, etc.
-
Redrawing the screen as necessary in response to user input, web page loading, etc.
-
Loading web pages—usually multiple parts in parallel to speed things up.
-
Firefox.exe: the executable code of Firefox itself.
-
Shared libraries for web page parsing, security, etc.
-
Stacks storing local variables for running threads.
-
A heap storing dynamically-allocated memory.
-
Configuration files.
-
Fonts.
Finding bash
-
ps aux gives me all processes, then grep for the one I’m after.
-
…or, do it all in one shot using pgrep.
-
…or, if I know it’s running in my current session, a bare ps will do.
$ ps -Lf # thread information
-
UID: user the process is running as.
-
PID: process ID.
-
PPID: parent process ID.
-
PRI: scheduling priority.
-
SZ: size of the core image of the process (kB).
-
WCHAN: if the process is not running, description of what it is waiting on.
-
RSS: total amount of resident memory in use by the process (kB).
-
TIME: measure of the amount of time that the process has spent running.
-
If bash had multiple threads running, this view would show them, so bash does not have multiple threads.
$ lsof # open files
-
/home/challen/.bashrc was not actually open when I ran this command.
-
bash didn’t have any interesting files open and I was embarrassed.
Let’s imagine we caught bash during startup when it is reading its configuration parameters.
Aside: the /proc/ file system
-
How do top, ps, pmap, lsof, and other process examination utilities gather information?
-
Linux reuses the file abstraction for this purpose.
OS Abstraction Cheat Sheet
-
Threads save processor state.
-
Address spaces map the addresses used by processes (virtual addresses) to real memory addresses (physical addresses).
-
Files map offsets into a file to blocks on disk.
-
File-like objects look like files to a process but are not actually stored on disk and may not completely obey file semantics.
-
You can’t seek on a network socket or open certain network-mounted files.
-
Processes organize these other operating system abstractions.
Updated Process Model
-
For today’s material, being precise about how processes use files becomes important.
File Handles
-
The file descriptor that processes receive from open() and pass to other file-related system calls is just an int, an index into the process file table.
-
That int refers to a file handle object maintained by the kernel.
-
That file handle object contains a reference to a separate file object also maintained by the kernel.
-
The file system then maps that file object to blocks on disk.
-
So three levels of indirection:
-
file descriptor → file handle.
-
file handle → file object.
-
file object → blocks on disk.
-
Why?
Sharing File State
-
File descriptors are private to each process.
-
File handles are private to each process, but a parent and its child share them after process creation.
-
File handles store the current file offset, or the position in the file that the next read will come from or write will go to. File handles can be deliberately shared between two processes.
-
File objects hold other file state and can be shared transparently between many processes.
Operating System Design Principles
-
Separate policy from mechanism.
-
Facilitate control or sharing by adding a level of indirection.
fork() # create a new process
-
fork() is the UNIX system call that creates a new process.
-
fork() creates a new process that is a copy of the calling process.
-
After fork() we refer to the caller as the parent and the newly-created process as the child. This relationship enables certain capabilities.
fork() Semantics
-
Generally fork() tries to make an exact copy of the calling process.
-
Recent versions of UNIX have relaxed this requirement and there are now many flavors of fork() that copy different amounts of state and are suitable for different purposes.
-
For the purposes of this class, ignore them.
-
Threads are a notable exception!
fork() Against Threads
-
Single-threaded fork() has reliable semantics because the only thread the process had is the one that called fork().
-
So nothing else is happening while we complete the system call.
-
Multi-threaded fork() creates a host of problems that many systems choose to ignore.
-
Linux will only copy state for the thread that called fork().
Multi-Threaded fork()
fork()
-
Another thread could be blocked in the middle of doing something (uniprocessor systems), or
-
another thread could be actually doing something (multiprocessor systems).
fork()
-
fork() copies one thread: the caller.
-
fork() copies the address space.
-
fork() copies the process file table.
After fork()
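pid_t returnCode = fork();
if (returnCode == 0) {
// I am the child.
} else if (returnCode > 0) {
// I am the parent; returnCode is the child's PID.
} else {
// fork() failed; no child was created.
}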
-
The child thread resumes executing at the exact same point where its parent called fork().
-
With one exception: fork() returns twice, the child's PID to the parent and 0 to the child.
-
All contents of memory in the parent and child are identical.
-
Both child and parent have the same files open at the same position.
-
But, since they are sharing file handles, changes to the file offset made by the parent/child will be reflected in the child/parent!