1. C Sample program and Assembly Code
in C, if address of i is 4096,
int i = 0;
is the same as
lw $1, 4096 #loading what in address 4096
addi $1, $0, 0 #$1 = $0 + 0
sw $1, 4096 #i=0
in Assembly Code.
i = i + 20 in Assembly Code is,
lw $2, 4096 #load 4096 to $2,
addi $3, $2, 20 #add 20 to $2 and store it in $3
sw $3, 4096 #replace 4096 value with $3.
2. Program Execution (memory)
When a program is run, CPU reads program stored in hard disk, then loads the program onto RAM.
The program loaded onto RAM is called process.
OS allocates spaces in RAM.
The 4 types of spaces that OS allocates to RAM are code segment(.code), initialized data segment(.data), uninitialized data segment(.bss), stack
, and heap.
Life time of the variables is the same as the life time of the function.
3. Computer Organisation
The computer contains 2 pieces - memory and processor (CPU).
With the development of technology, processor (CPU) performs so fast that the memory (RAM) cannot keep up with it.
Here is where cache plays its role. Cache is located on the processor and stores frequently used data, so that processor can access cache data instead of accessing RAM which takes longer time.
There are L1, L2, and L3 caches. The higher the number, the longer it takes to retrieve data.
There are data cache and instruction cache. They used to be together (von neumann's architecture), but they are now separated after Harvard architecture, due to instructions like mov.
Memory:
- Storage for instruction and data
- Managed by OS
Cache:
- Duplicate part of memory for fast access
- Hardware managed
- data & instruction cache
CPU:
Fetch Unit:
- loads instruction to memory
- location of the instructions' addresses are indicated by Program Counter
Functional Units:
- Carry out instruction execution (ALU, Memory Unit...)
- Different units are dedicated to different instruction types.
Registers:
- Internal storage for fastest access speed
- General Purpose Register (GPR): Accessible by User Program to store small data, execute calculation and indicate data address.
- Special Register: PC, Stack Pointer, Frame Pointer, ...
4. Process Abstraction
4-1. Stack Memory
- When process is created, stack memory is also created in the user space of memory.
Control Flow Issues:
- How can I jump from one function body to another? (calling function inside a function)
- How can I resume after function call is done?
- How can I store the PC of the caller?
Data Storage Issues:
- How can I pass in the parameters to the function?
- How can I store result?
- How can I store local variables?
Stack Memory is the Solution.
The stack memory region stores information for function invocations.
Stack Frame contains:
- Return address of the caller
- Arguments (parameters) for the function
- Storage for local variables
- Stack Pointer:
- Pointer that points to the top of the stack region (where the new function invocation will be stacked on top.)
- Most CPU has a special register for this.
- Program Counter points at instructions, stack pointers point at data.
Stack Frame Setup Procedure:
(Known as function call convention)
Let's say that f is the caller and g is the callee function.
Prepare to make a function call (done by the compiler):
- Caller: Pass parameters (a and b) using registers and/or stack
- Caller: Save Return PC (next line of instruction after the c = g(a, b))
- Transfer control from caller to callee
- Callee: Save the old stack pointer (SP)
(In order to keep the order of stack frames. REMEMBER, PC is not equal to Stack Pointer)
- Callee: Allocate space for local variables of callee on stack (local var "a")
- Callee: Adjust SP to point to new stack top
Teardown Call (Returning from function call):
- Callee: Place return result in register (if needed)
- Callee: Restore saved Stack Pointer
- Transfer control back to caller using saved PC.
- Caller: Utilize return result (if needed)
- Caller: continues execution in caller.
Frame Pointer
- used to facilitate the access of various stack items.
- to access something in the middle of the frame.
- FP points to a fixed location, other items are accessed by calculating offsets.
- The usage of FP is platform dependent, some platforms may just use stack pointer.
When do we use stack or frame pointer?
Stack Pointer:
- Referencing values at the top of the stack.
- Returning the frame pointer to the original position
- Setting the frame pointer to the middle of the stack
Frame Pointer
- Accessing values in the middle of the frame
- Accessing parameters
- Can be used to store
Saved Registers
-
Subset of General Purpose Registers (GPR).
-
Also known as callee-saved registers, saved registers hold values that need to be maintained during the execution of a function to ensure consistency and allow for proper resumption of the calling code.
-
Purposes of Saved Registers are:
- Preserving the caller's state
- Maintaining variables across function calls
- Supporting nested function calls
Register Spilling
- The number of GPRs are limited.
- When GPRs are exhausted, we:
- use the memory to store the register values,
- reuse the GPRs for other purposes,
- then restore values afterwards.
- This is known as Register Spilling.
Stack Frame Setup/Teardown Updated with FP
On function call:
- Caller: Pass parameters (a and b) using registers and/or stack
- Caller: Save Return PC (next line of instruction after the c = g(a, b))
- Transfer control from caller to callee
- Callee: Save registers used by callee. Save old SP and FP.
- Callee: Allocate space for local variables of callee on stack (local var "a")
- Callee: Adjust SP to point to new stack top
On return call:
- Callee: Place return result in register (if needed)
- Callee: Restore saved SP and FP.
- Transfer control back to caller using saved PC.
- Caller: Utilize return result (if needed)
- Caller: continues execution in caller.
Dynamically Allocated Memory
- when we do not know exactly how much memory my program will need at compile-time, DAM is used to acquire memory space during execution time
- In C: malloc() function.
- In Java and C++: new keyword.
- DAM should use heap:
- Since the size is not known during the compilation time, we cannot store it in the data segment (.data)
- Since we don't know when the data should be deallocated, we cannot store it in stack as well because the stack may be deallocated after the function call.
Managing DAM
- is a lot harder due to:
- variable size,
- variable allocation & deallocation timing,
- which could lead to fragmentation, causing gaps in between memory blocks.
4-2. Process Abstraction
Process:
- is an abstraction that provides a simplified view of program execution within an operating system.
- Below are information involved while a program is running (process is running).
Process Identification:
- refers to the assignment of a unique identifier or identification number to each process running within an operating system.
- common approach is using process ID (PID).
- PIDs are reused since PID length is 32 bits, so if there are more processes than 32 bits can identify, PID must be reused.
Process State:
- A program can be running or not running.
- Process state indicates whether it is running or not.
2 States:
The process of switching from one process to executing is called context switch.
- Context means all the info needed for one process to start again (PC, value of registers, etc).
- the current execution context is backup-ed in Process Control Block (PCB), then the next process uses its PCB to execute it again.
5 States Process Model:
New:
- New process created
- Might be initializing -> might not be ready yet.
Ready:
- waiting for its turn to run
- waiting for CPU to be allocated to it.
Running:
- Process being executed on CPU by scheduler (scheduler is a crucial component in OS kernel that schedules the processes.)
Blocked:
-
Process waiting for event (Sleeping) Cannot execute until event is available
-
We CANNOT go from blocked state to running, we need to go ready. The process will wait to be picked up by the scheduler.
-
While we are waiting for some input or data (ie user i/o), the process will not occupy the CPU and switch to blocked state, letting other process to continue.
Terminated:
- Can be terminated at any point, gracefully or forcefully. May require OS cleanup
Transitions:
- Create: New process is created.
- Admit: process ready to be scheduled for running
- Switch Scheduled: process selected to run
- Switch Release CPU: Process gives up CPU voluntarily or prompted by scheduler (Time Quanta: the maximum amount of time that a process can execute. After the time has passed, the program is preempted by the OS.)
- Event Wait: Process requests events/resources which is not available at the moment.
- Event Occurs: process can continue.
With 1 CPU core:
- <= 1 process in running state
- Conceptually 1 transition at a time
With m CPU cores:
- <= m process in running state
- Possibly parallel transition
Queuing theory
- When the process is admitted, goes into the queue.
- When process released the CPU, it can either go into block, return to queue or terminate. The block itself is a queue.
- When the event occurs, the process in the blocked queue gets admitted again.
Process Table & Process Control Block
- Kernel maintains PCB for all processes.
- When process is created PCB is created in kernel, and when the process is done PCB is gone.
- Stored in PCB are:
- PID
- GPRs
- Process state
- CPU scheduling info
- Memory REgion Info
4-3. System Calls
- API to OS.
- Function wrapper vs function adaptor??
General System Call Mechanism:
-
User program invokes the lib call
-
Lib call places the system call number (system call id) in a designated location e.g register
-
Lib call executes a special instruction to switch from user mode to kernal mode (instruction known as Trap)
-
In kernel mode, the appropriate system call handler is determined:
a. Save CPU state??
b. Using system call number as index, find syscall handler.
c. Step B is usually handled by a dispatcher
- System call handler is executed. We cannot directly call the function, we need a wrapper or interface to allow us to move to the kernal mode to execute the code in the syscall. The program cannot directly invoke the kernel mode.
- Sys call handler ended
- Lib call return to user program
- Executing a machine level instruction can cause exception
- Examples:
- Arithmetic errors: Overflow, division by zero
- Memory errors: illegal memory address, misaligned memory access
- Exception is synchronous
- Occur due to program execution.
- Effects of exception:
- Have to execute an exception handler.
- Similar to a forced function call.
- External events can interrupt the execution of a program.
- Usually hardware related e.g: timer, mouse movement, keyboard press
- Interrupt is asynchronous
- Event that occurs independent of program execution.
- Timer can end at any point of time.
- Effects of interrupt:
- program execution is suspended.
- Interrupt handler executed automatically.
4-4. Unix Case Study
- unistd.h provides access to POSIX OS (UNIX).
Fork()
- Fork creates a new process, a copy of the same process but it is a child.
- Child has PPID value stored as well.
- Fork() itself is not useful (still need to provide full code for child process)
- Use exec() system calls family!
- Execv, execl, execle, execlv etc.
Execl()
- Replace current executing process image with a new one.
- known as Code replacement (aka. process image replacement)
- PID is still intact.
- Only necessary information to run the executable file is overwritten.
Path: location of executable Arg0,…
arg[n] : Command Line Argument(s)
NULL: to end list of argument
Fork() + Exec()
- Spawn off a child process using fork()
- Let child process perform a task using exec().
Master Process
- is a special initial process.
- every process has parent process.
- going up the hierarchy, at the top in UNIX there is init process.
- created at bootup time.
- traditionally has PID = 1.
- watches for other processes and respawns when needed.
Process Termination in Unix
- Status is returned to the parent process
- Parent process then takes necessary action based on the exit status (if failed do something)
- Unix convention for exit status:
- exit(0): normal termination (success)
- exit(!0): termination failure
- function does not return:
- subsequent codes will not execute.
Process on exit (or finishing execution):
- Most system resources used by process are released on exit (like file descriptor)
- Some basic process resources are not releasable:
- PID & status needed.
- for parent and child synchronization.
- process accounting info (like CPU time)
- Implicit exit():
- Most programs have no explicit exit() call.
- Return from main implicitly calls exit:
- Open files also get flushed automatically
- Any buffered info in the memory must be written in the storage before exit,
- resulting in flushing the buffered data.
Parent/Child Synchronisation
- parent process can wait child process to terminate.
- Returns PID of terminated child process
- Behavior:
- The call is blocking:
- Parent process blocks until at least one child terminates.
- The call updates the internal process management structures and releases the process table entry associated with the terminated child process
- Process table entry is not removed on exit()
- the parent process can obtain exit status and handle it appropriately.
- On exit, process becomes "zombie process".
Process state diagram in UNIX
Zombie Process and Orphan Process
- Orphan: Parent process terminates before child process
- Child does not terminate
- Init process becomes “pseudo” parent for child
- Child termination signals init which utilizes wait() to clean up
- Zombie: Child process terminates before parent but parent does not call wait.
- Child process becomes a zombie process
- Fills up process table (becomes messy to need reboot to clear table)
Issues with fork()
- Memory copy is expensive -> potentially need to copy whole memory space
- Copy on write – optimization for memory copy operation
- child pointer to point to the same location
- If child tries to modify (write), they will have individual copies
Clone()
- creates a copy of process without copying the entire address space
- Can decide what to share (key or stack)