Today's plan
- Minix fork, exit, and wait
- Minix exec
- modules
Minix fork
- fork can fail if:
- there is no free process table entry (the last few table entries are
reserved for root), or
- the required memory cannot be allocated
if either condition holds, fork fails before allocating
any resources (check first, then allocate -- similar to, but not the
same as, two-phase commit)
- copy the data and stack segments (if the text is not shared, it
is included in the data segment) to the newly allocated memory
- copy the parent's process table entry to the child's
- modify the child's parent, traced flags, exit and signal status,
and PID (next free ID < 30000)
- ask the kernel to fork the process
- ask the FS to duplicate the file descriptors
- ask the kernel to map the process
- send a message to the child with return value 0 (p. 755, line 16904)
- return to the parent with the child's PID
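The check-first, then-allocate order above can be sketched in C. This is a toy model under stated assumptions, not the Minix source: toy_fork, NR_SLOTS, RESERVED_SLOTS, free_clicks, and the error values are all hypothetical names chosen for illustration.

```c
#include <assert.h>

/* Toy model of the "check everything, then allocate" order in fork.
 * All names and constants are hypothetical, not taken from Minix. */
#define NR_SLOTS        8      /* size of the toy process table */
#define RESERVED_SLOTS  2      /* slots only root may use */
#define TOY_EAGAIN    (-11)    /* no process table entry available */
#define TOY_ENOMEM    (-12)    /* not enough memory */

struct toy_proc { int in_use; int pid; int parent; unsigned clicks; };

struct toy_proc table[NR_SLOTS];
unsigned free_clicks = 100;    /* memory still available, in clicks */
int procs_in_use = 0;
int next_pid = 1;

/* Returns the child's PID, or a negative error without changing any state. */
int toy_fork(int parent_slot, unsigned parent_clicks, int is_root)
{
    int limit = is_root ? NR_SLOTS : NR_SLOTS - RESERVED_SLOTS;
    int slot;

    /* Phase 1: checks only, nothing is allocated yet. */
    if (procs_in_use >= limit) return TOY_EAGAIN;
    if (parent_clicks > free_clicks) return TOY_ENOMEM;

    /* Phase 2: both checks passed, so the steps below cannot fail. */
    for (slot = 0; slot < NR_SLOTS; slot++)
        if (!table[slot].in_use) break;
    assert(slot < NR_SLOTS);   /* guaranteed by the first check */

    free_clicks -= parent_clicks;      /* memory for the copied segments */
    procs_in_use++;
    table[slot].in_use = 1;
    table[slot].pid = next_pid++;
    table[slot].parent = parent_slot;
    table[slot].clicks = parent_clicks;
    return table[slot].pid;
}
```

In this sketch a failed call leaves free_clicks and procs_in_use untouched, which is the property the notes describe: nothing has to be undone on failure.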
Minix exit
- a process terminates when it is killed (depending on the signal
and the signal handler) or when it calls exit
- no error checking on exit -- exit always succeeds
- cancel alarms and signals
- tell FS to close all the open files and free its process slot
- tell the kernel to free its process slot and cancel any pending message
sends
- free the memory of the process
- if the parent is waiting, wake up the parent and free the process slot
- if no parent is waiting, leave the process in a zombie state,
with only a process table slot still allocated
- either way, reparent any of this process's children to init
- in-class discussion: is it more important for exit to be fast, or
for fork to be fast, or are they equally important?
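The zombie and reparenting rules above can be illustrated with a small C model. It is only a sketch: the process table, the state names, and the use of slot 0 as init are assumptions made for this example, and the FS and kernel notifications are left out.

```c
#include <assert.h>

#define NR_SLOTS  8
#define INIT_SLOT 0   /* assumption: slot 0 plays the role of init */

enum toy_state { TOY_FREE, TOY_RUNNING, TOY_WAITING, TOY_ZOMBIE };

struct toy_proc {
    enum toy_state state;
    int parent;        /* slot number of the parent */
    int exit_status;   /* kept while the process is a zombie */
    int child_status;  /* status delivered to a parent that was waiting */
};

struct toy_proc procs[NR_SLOTS];

/* Model of exit: there is no error path, so nothing is returned. */
void toy_exit(int slot, int status)
{
    int parent = procs[slot].parent;
    int i;

    assert(slot != INIT_SLOT);   /* init never exits in this model */

    if (procs[parent].state == TOY_WAITING) {
        /* Parent is blocked in wait: wake it and release the slot. */
        procs[parent].child_status = status;
        procs[parent].state = TOY_RUNNING;
        procs[slot].state = TOY_FREE;
    } else {
        /* Nobody is waiting: keep only the table slot, as a zombie. */
        procs[slot].exit_status = status;
        procs[slot].state = TOY_ZOMBIE;
    }

    /* Either way, this process's children are handed to init. */
    for (i = 0; i < NR_SLOTS; i++)
        if (i != slot && procs[i].state != TOY_FREE && procs[i].parent == slot)
            procs[i].parent = INIT_SLOT;
}
```

Note that every branch ends in a valid state, which matches the point that exit does no error checking and always succeeds.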
Minix wait
- wait takes an argument that selects which children to wait for:
- a specific process (PID)
- any child
- any child within a given process group
- a wait completes when one of the selected processes has terminated,
or is being traced and has stopped
- first loop through the process table to find a child that matches
the arguments and is a zombie -- if one is found, clean up that process,
send a message back to the caller, and return
- if a matching child is found that is stopped, send a message
back to the caller and return
- if no child is ready to report, but one or more matching processes
are running, leave the caller waiting (on receive in the system
call), mark the caller as waiting, and return, unless WNOHANG
was specified
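The wait logic above can be sketched as a single pass over a toy process table. Everything here is illustrative: the table layout, the state names, the TOY_BLOCKED marker, and the argument encoding (positive for a PID, -1 for any child, below -1 for a process group) are assumptions, not the Minix definitions. The TOY_ECHILD case for "no matching child at all" is not in the notes; it is added from the usual POSIX behaviour of wait.

```c
#include <assert.h>

#define NR_SLOTS     8
#define TOY_WNOHANG  1
#define TOY_ECHILD   (-10)    /* no matching child exists (POSIX behaviour) */
#define TOY_BLOCKED  (-1000)  /* hypothetical: caller left waiting, no reply */

enum w_state { W_FREE, W_RUNNING, W_STOPPED, W_ZOMBIE, W_WAITING };

struct w_proc { enum w_state state; int pid; int pgrp; int parent; };

struct w_proc wtab[NR_SLOTS];

/* pidarg > 0: that PID; -1: any child; < -1: any child in group -pidarg. */
int w_matches(const struct w_proc *p, int pidarg)
{
    if (pidarg > 0)   return p->pid == pidarg;
    if (pidarg == -1) return 1;
    return p->pgrp == -pidarg;
}

/* Returns a child's PID, 0 for WNOHANG with nothing ready, or a marker. */
int toy_wait(int caller, int pidarg, int options)
{
    int i, children = 0;

    assert(pidarg != 0);   /* pidarg == 0 is not modelled in this sketch */

    for (i = 0; i < NR_SLOTS; i++) {
        struct w_proc *p = &wtab[i];
        if (i == caller || p->state == W_FREE || p->parent != caller)
            continue;
        if (!w_matches(p, pidarg))
            continue;
        children++;
        if (p->state == W_ZOMBIE) {
            int pid = p->pid;
            p->state = W_FREE;          /* clean up the zombie's slot */
            return pid;                 /* reply to the caller */
        }
        if (p->state == W_STOPPED)
            return p->pid;              /* traced child has stopped */
    }
    if (children == 0) return TOY_ECHILD;
    if (options & TOY_WNOHANG) return 0;
    wtab[caller].state = W_WAITING;     /* exit of a child will wake it */
    return TOY_BLOCKED;
}
```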
Minix exec
- part of exec is implemented in the exec library call,
/usr/src/lib/posix/_exec.c -- this builds the initial
stack in a buffer
- it might be dangerous to trust a user library to build the stack
right, but the stack is (over)writable by the user code anyway, so there
is no loss of security
- exec:
- checks to see that the stack size is reasonable
- checks to see that the file is accessible (by asking the file system)
- reads the executable file header (via FS)
- copies the stack into an internal buffer (before freeing the memory)
- looks to see if the text can be shared (find_share, p 765)
- allocates the new memory (p. 762 and below) -- this is the commit point:
once this succeeds, exec can no longer return to the caller, because there
is no longer an old process image to return to
- relocates the stack (see below) and copies it to the newly allocated
stack segment
- reads in the text (unless shared) and data
- changes the effective user/group ID if executing a setuid/setgid file
- sets default signal handlers
- asks FS to close "close-on-exec" files
- asks the system to initialize the new stack pointer, with the
initial valid return address
- returns, sending a message to the process which is now ready to execute
Minix exec allocation
- new_mem, page 762
- compute needed sizes, all in multiples of clicks (look at the
arithmetic carefully -- ceiling computation)
- make sure sizes are reasonable, e.g. the data+bss segment does
not overlap the stack segment (gap < 0, line 17398)
- check to see if there is a free area big enough for the whole process --
if not, fail. This is overly conservative, since the memory about to be
freed is not counted, but once the old memory has been returned exec can
no longer fail, so the check must come first.
- free the current text (unless shared) and data + bss + stack segments
- allocate the new text (if necessary) and data + bss + stack segments
- initialize the three segment descriptors (T,
D, and S)
- ask the kernel to map these segments
- clear the uninitialized parts of these segments
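The ceiling computation and the gap check described above can be sketched as follows. This is a minimal illustration, not the new_mem source: the 256-byte click size, the struct, and the function names are assumptions, and the sketch ignores overflow for very large byte counts.

```c
#include <assert.h>

#define CLICK_SHIFT 8                    /* assumed: 256-byte clicks */
#define CLICK_SIZE  (1UL << CLICK_SHIFT)

/* Round a byte count up to whole clicks (ceiling division). */
unsigned long bytes_to_clicks(unsigned long bytes)
{
    return (bytes + CLICK_SIZE - 1) >> CLICK_SHIFT;
}

struct toy_sizes { unsigned long text, data, stack; long gap; };

/* Computes segment sizes in clicks. total_bytes covers data + bss + gap +
 * stack. Returns 0 if the layout is sane, -1 if data+bss would overlap
 * the stack (negative gap). */
int compute_sizes(unsigned long text_bytes, unsigned long data_bytes,
                  unsigned long bss_bytes, unsigned long stack_bytes,
                  unsigned long total_bytes, struct toy_sizes *out)
{
    unsigned long total;

    assert(out != 0);
    total      = bytes_to_clicks(total_bytes);
    out->text  = bytes_to_clicks(text_bytes);
    out->data  = bytes_to_clicks(data_bytes + bss_bytes);
    out->stack = bytes_to_clicks(stack_bytes);
    out->gap   = (long)total - (long)out->data - (long)out->stack;
    return out->gap < 0 ? -1 : 0;
}
```

Adding CLICK_SIZE - 1 before shifting is what turns the truncating shift into a ceiling: 256 bytes still fit in one click, while 257 bytes need two.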
Minix stack relocation
- patch_ptr, page 764
- stack built by user library assumes the initial
stack pointer will be zero
- the initial stack pointer is not at virtual address zero -- the
virtual address depends on the size of the text, data, bss, gap, and
stack segments
- MM assumes that anything in the initial stack is either a 0 --
a null pointer, used e.g. to mark the end of an arguments list -- or
a relative pointer
- starting from the top of the stack
- patch_ptr adds the segment base to any nonzero entry (pointer)
- until it has seen two null pointers, one to terminate the end of
the args list, the other to terminate the environment list
- sanity check is done to make sure this update doesn't run off the
end of the stack
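The relocation loop can be sketched in C as below. It follows the steps listed above but is not the patch_ptr source: the stack image is modelled as an array of unsigned long words with argc in word 0, and the function name and signature are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Relocate the pointer area of an initial stack image.
 * words[0] holds argc; the argv and envp pointers start at words[1].
 * Every nonzero entry is taken to be a pointer relative to the start
 * of the image and has base added to it. */
void toy_patch_ptr(unsigned long *words, size_t nwords, unsigned long base)
{
    size_t i = 1;        /* skip argc */
    int nulls = 0;       /* null pointers seen so far */

    assert(words != NULL);
    while (nulls < 2) {
        if (i >= nwords)
            return;                 /* sanity check: ran off the stack */
        if (words[i] != 0)
            words[i] += base;       /* relative pointer -> virtual address */
        else
            nulls++;                /* end of argv, then end of envp */
        i++;
    }
}
```

The loop stops after the second null, so the string data stored after the pointer area is never mistaken for pointers.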
Program execution
- when a C program begins execution, it starts in the C run-time
start-off routine crtso, whose entire function is to call main
- crtso pushes the addresses of the three arguments to
main (argc, argv, and envp), then calls main
- the final stack is as in Figure 4-39d (page 370), remembering
that the stack grows downward
Modules
- exec and the stack relocation
are somewhat similar to what is needed for a module
- a module is a loadable piece of code that executes in kernel
space
- the module must be relocated when loaded, or must be position-independent
code, since it is loaded dynamically at an arbitrary location in kernel space
- entry points for each module must be recorded inside the kernel,
for example an initialization routine which sets everything up, including
e.g. interrupt handlers and read/write functions
- however, a module (e.g. in Linux) is not a separate process/task,
so it does not need its own stack -- it executes on the kernel stack that
is active when it is called
- if the kernel is multithreaded, however, the module must be
coded in a thread-safe way, e.g. locking global data structures
before modifying them
- because it executes with kernel privilege, a module can be a
correctness or security risk
- modules make it much easier to install new drivers, since no
kernel recompilation is needed
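Recording a module's entry points inside the kernel can be sketched as a table of function pointers. This is a user-space illustration only, not the Linux module interface: struct toy_module, toy_register_module, toy_find_module, and the example "zero" module are all hypothetical, and relocation and locking are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODULES 4

/* Entry points a module hands to the (toy) kernel when it is loaded. */
struct toy_module {
    const char *name;
    int  (*init)(void);                /* sets everything up, 0 on success */
    long (*read)(char *buf, long n);   /* example operation */
};

const struct toy_module *modules[MAX_MODULES];
int module_count = 0;

/* Record the entry points, then run the module's initialization routine. */
int toy_register_module(const struct toy_module *m)
{
    if (module_count >= MAX_MODULES) return -1;
    if (m->init() != 0) return -1;     /* init failed: do not record it */
    modules[module_count++] = m;
    return 0;
}

const struct toy_module *toy_find_module(const char *name)
{
    int i;
    for (i = 0; i < module_count; i++)
        if (strcmp(modules[i]->name, name) == 0) return modules[i];
    return NULL;
}

/* Example module: a device whose read returns zero bytes. */
int zero_ready = 0;
int zero_init(void) { zero_ready = 1; return 0; }
long zero_read(char *buf, long n)
{
    assert(n >= 0);
    memset(buf, 0, (size_t)n);
    return n;
}
const struct toy_module zero_module = { "zero", zero_init, zero_read };
```

The caller reaches the module only through the recorded pointers, so the module's code runs on whatever stack the caller is already using, as the notes describe.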