
There are three cases where control must be transferred from a user program to the kernel: a system call, an exception, and an interrupt.
These three cases are handled by a single hardware mechanism called a "trap".
For example, on the x86 a program invokes a system call by generating an interrupt with the "int" instruction.
An interrupt stops the normal processor loop and starts executing a new sequence called an "interrupt handler".
xv6 uses the term "trap" for all of these. Strictly speaking, a trap is caused by the currently running process, while an interrupt is generated by a device.
On the x86, interrupt handlers are defined in the interrupt descriptor table (IDT). The IDT has 256 entries, each giving the %cs and %eip to be used when handling the corresponding interrupt.
The IDT is loaded with the LIDT assembly instruction, whose operand is a pointer to an IDT descriptor structure holding the table's size and base address.
x86.h
static inline void
lidt(struct gatedesc *p, int size)
{
  volatile ushort pd[3];

  pd[0] = size-1;
  pd[1] = (uint)p;
  pd[2] = (uint)p >> 16;

  asm volatile("lidt (%0)" : : "r" (pd)); // load the IDT descriptor
}
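The three 16-bit words that lidt() fills in form the 6-byte operand that LIDT reads: a 16-bit limit followed by a 32-bit base address. A minimal host-side sketch of that packing follows; pack_idt_descriptor is a hypothetical helper used only for illustration, not part of xv6.

```c
#include <stdint.h>

// Hypothetical helper mirroring what xv6's lidt() stores in pd[].
// pd[0] = limit (table size in bytes, minus 1)
// pd[1] = low 16 bits of the table's base address
// pd[2] = high 16 bits of the table's base address
static void pack_idt_descriptor(uint16_t pd[3], uint32_t base, uint32_t size)
{
    pd[0] = (uint16_t)(size - 1);
    pd[1] = (uint16_t)(base & 0xffff);
    pd[2] = (uint16_t)(base >> 16);
}
```

With 256 gate descriptors of 8 bytes each, size is 2048 and the stored limit is 2047.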
When the protection level changes from user to kernel mode, the kernel shouldn’t use the stack of the user process, because it may not be valid.
Instead, the hardware uses the stack specified in the task segment, which is set up by the kernel. xv6 programs the x86 hardware to perform a stack switch on a trap by setting up a task segment descriptor, through which the hardware loads a stack segment selector and a new value for %esp.
When a trap occurs, the processor hardware does the following.
If the processor was executing in user mode, it loads %esp and %ss from the task segment descriptor and pushes the old user %ss and %esp onto the new stack.
The processor then pushes the %eflags, %cs, and %eip registers. For some traps (e.g., a page fault), the processor also pushes an error word.
The processor then loads %eip and %cs from the relevant IDT entry.
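The sequence above can be modeled on a host machine with a small simulation. This is only a sketch under simplifying assumptions: it covers the user-mode case, skips all privilege checks, and uses hypothetical names (struct cpu, push, trap_entry) that do not exist in xv6 or in the hardware.

```c
#include <stdint.h>

// Simulated CPU state: only the registers involved in trap entry.
struct cpu {
    uint32_t esp, ss, eflags, cs, eip;
};

// Push one 32-bit word; the stack grows down and mem is indexed by word.
static void push(uint32_t *mem, struct cpu *c, uint32_t v)
{
    c->esp -= 4;
    mem[c->esp / 4] = v;
}

// Hypothetical model of trap entry from user mode.
// kstack_ss/kstack_esp stand for the values held in the task segment;
// handler_cs/handler_eip stand for the contents of the IDT entry.
static void trap_entry(uint32_t *mem, struct cpu *c,
                       uint32_t kstack_ss, uint32_t kstack_esp,
                       uint32_t handler_cs, uint32_t handler_eip,
                       int has_err, uint32_t err)
{
    uint32_t old_ss = c->ss, old_esp = c->esp;

    c->ss = kstack_ss;          // switch to the kernel stack
    c->esp = kstack_esp;
    push(mem, c, old_ss);       // save the user stack location
    push(mem, c, old_esp);
    push(mem, c, c->eflags);
    push(mem, c, c->cs);
    push(mem, c, c->eip);
    if (has_err)
        push(mem, c, err);      // only for some traps, e.g. a page fault
    c->cs = handler_cs;         // continue at the handler
    c->eip = handler_eip;
}
```

After a call without an error word, five words sit on the simulated kernel stack, with the saved %eip at the lowest address.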

tvinit() is called from main() and sets up the 256 entries of the IDT.
Interrupt i is handled by the code at the address in vectors[i].
tvinit() also sets up the entry for the system call interrupt specially, so that a user program can generate the trap with an explicit int instruction.
trap.c
void
tvinit(void)
{
  int i;

  for(i = 0; i < 256; i++)
    SETGATE(idt[i], 0, SEG_KCODE<<3, vectors[i], 0);
  SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER);

  initlock(&tickslock, "time");
}
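To make the SETGATE arguments concrete, here is a host-side sketch of what a gate descriptor holds. The field layout is modeled on xv6's struct gatedesc in mmu.h as best I recall it, and set_gate is a hypothetical function standing in for the SETGATE macro, so treat the details as assumptions rather than as the exact xv6 definition.

```c
#include <stdint.h>

#define STS_IG32 0xE   // 32-bit interrupt gate
#define STS_TG32 0xF   // 32-bit trap gate

// Gate descriptor layout (8 bytes), modeled on xv6's mmu.h.
struct gatedesc {
    uint32_t off_15_0 : 16;   // low 16 bits of the handler offset
    uint32_t cs : 16;         // code segment selector
    uint32_t args : 5;        // unused for interrupt/trap gates
    uint32_t rsv1 : 3;        // reserved
    uint32_t type : 4;        // STS_IG32 or STS_TG32
    uint32_t s : 1;           // 0 = system descriptor
    uint32_t dpl : 2;         // privilege level required to use "int"
    uint32_t p : 1;           // present
    uint32_t off_31_16 : 16;  // high 16 bits of the handler offset
};

// Hypothetical function equivalent of SETGATE(gate, istrap, sel, off, d).
static void set_gate(struct gatedesc *g, int istrap, uint16_t sel,
                     uint32_t off, unsigned dpl)
{
    g->off_15_0 = off & 0xffff;
    g->cs = sel;
    g->args = 0;
    g->rsv1 = 0;
    g->type = istrap ? STS_TG32 : STS_IG32;
    g->s = 0;
    g->dpl = dpl;
    g->p = 1;
    g->off_31_16 = off >> 16;
}
```

The system call entry differs from the others in two ways: it is a trap gate (istrap is 1), and its privilege level is DPL_USER, which is what permits user code to reach it with an int instruction.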
Each entry point listed in vectors[] pushes an error code if the processor didn’t, pushes the interrupt number, and then jumps to alltraps.
vectors.S
# generated by vectors.pl - do not edit
# handlers
.globl alltraps
.globl vector0
vector0:
  pushl $0    # dummy error code (the processor pushes none for this vector)
  pushl $0    # interrupt number
  jmp alltraps
.globl vector1
..... up to vector255; 256 entries in total.
alltraps continues to save processor registers: it pushes %ds, %es, %fs, %gs, and the general-purpose registers. (The processor or the trap vector pushes an error number, and alltraps pushes the rest.)
The result of alltraps is a 'struct trapframe' on the kernel stack, which holds everything needed to restore the user process state on return.
alltraps:
  # Build trap frame.
  pushl %ds
  pushl %es
  pushl %fs
  pushl %gs
  pushal    # includes %eax, which holds the system call number
After constructing the trap frame, alltraps passes a pointer to it as the argument to the trap() function.
  # Call trap(tf), where tf=%esp
  pushl %esp       # push the argument onto the stack
  call trap
  addl $4, %esp    # pop the argument off the stack
Devices generate interrupts, and xv6 sets up the hardware to receive them.
Periodic polling would also work, but interrupts are preferable when events are relatively rare, because polling would then waste CPU time.
With the advent of multiprocessor machines, a new way of handling interrupts was needed.
It has two parts: one in the I/O system (the IO APIC, ioapic.c) and one attached to each processor (the local APIC, lapic.c).
The timer chip is in the local APIC and generates 100 interrupts per second.
lapic.c
  lapicw(TDCR, X1);
  lapicw(TIMER, PERIODIC | (T_IRQ0 + IRQ_TIMER)); // program the timer
  lapicw(TICR, 10000000);
The second line tells the LAPIC to periodically generate an interrupt at IRQ_TIMER, which is IRQ 0.
The timer interrupt arrives through vector 32, that is, T_IRQ0 + IRQ_TIMER.
traps.h
#define T_IRQ0 32 // IRQ 0 corresponds to int T_IRQ

#define IRQ_TIMER 0
trap.c
  case T_IRQ0 + IRQ_TIMER:
    if(cpuid() == 0){
      acquire(&tickslock);
      ticks++;
      wakeup(&ticks);
      release(&tickslock);
    }
    lapiceoi();
    break;
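The case above is one arm of a switch on the trap number stored in the trap frame. A minimal sketch of that mapping follows. T_IRQ0 and IRQ_TIMER come from the listing above; T_SYSCALL = 64 and IRQ_IDE = 14 are values I recall from xv6's traps.h and are not shown in these notes, and classify_trap and enum trap_kind are hypothetical names for illustration.

```c
#include <stdint.h>

#define T_IRQ0     32   // IRQ 0 corresponds to int T_IRQ0
#define IRQ_TIMER   0
#define IRQ_IDE    14   // assumption: value recalled from xv6's traps.h
#define T_SYSCALL  64   // assumption: value recalled from xv6's traps.h

enum trap_kind { TRAP_SYSCALL, TRAP_TIMER, TRAP_DISK, TRAP_OTHER };

// Hypothetical helper: map a trap number to the kind of event it represents.
static enum trap_kind classify_trap(uint32_t trapno)
{
    switch (trapno) {
    case T_SYSCALL:
        return TRAP_SYSCALL;
    case T_IRQ0 + IRQ_TIMER:    // vector 32
        return TRAP_TIMER;
    case T_IRQ0 + IRQ_IDE:      // vector 46
        return TRAP_DISK;
    default:
        return TRAP_OTHER;
    }
}
```

Device IRQ n always arrives at vector T_IRQ0 + n, which keeps device interrupts clear of the first 32 vectors that the processor reserves for exceptions.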
A driver is the code in an operating system that manages a particular device:
it tells the device to perform operations and configures it to generate an interrupt when done,
and it handles the resulting interrupts.
The disk driver copies data to and from the disk.
It represents a disk block with a "struct buf".
The flags field tracks the relationship between memory and disk: B_VALID means the data has been read in from disk, and B_DIRTY means the data needs to be written out.
buf.h
struct buf {
  int flags;
  uint dev;
  uint blockno;
  struct sleeplock lock;
  uint refcnt;
  struct buf *prev; // LRU cache list
  struct buf *next;
  struct buf *qnext; // disk queue
  uchar data[BSIZE];
};
#define B_VALID 0x2 // buffer has been read from disk
#define B_DIRTY 0x4 // buffer needs to be written to disk
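The two flag bits are enough to decide what the disk driver has to do with a buffer. A small sketch of that decision follows; needed_op and enum disk_op are hypothetical names, not part of xv6.

```c
#define B_VALID 0x2 // buffer has been read from disk
#define B_DIRTY 0x4 // buffer needs to be written to disk

enum disk_op { OP_NONE, OP_READ, OP_WRITE };

// Hypothetical helper: which disk operation does a buffer with these flags need?
static enum disk_op needed_op(int flags)
{
    if (flags & B_DIRTY)
        return OP_WRITE;    // memory is newer than disk
    if (!(flags & B_VALID))
        return OP_READ;     // memory does not yet hold the block
    return OP_NONE;         // valid and clean: nothing to do
}
```

The OP_NONE case corresponds to a buffer whose flags are exactly B_VALID, which the driver treats as a caller error.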
The kernel initializes the disk driver at boot time by calling ideinit() from main() in main.c.
ideinit() calls ioapicenable() to enable the IRQ_IDE interrupt.
It then calls idewait() to wait until the disk is ready to accept commands (the status bits are read from I/O port 0x1f7),
and checks whether disk 1 is present by writing to I/O port 0x1f6 to select disk 1 and then waiting for its status bits to show up. (Disk 0 is always available because the boot loader and the kernel were loaded from it.) Finally it switches back to disk 0.
ide.c
void
ideinit(void) // initialize the disk driver at boot time
{
  int i;

  initlock(&idelock, "ide");
  ioapicenable(IRQ_IDE, ncpu - 1); // enable the IRQ_IDE interrupt
  idewait(0); // poll the status bits until the busy bit is clear and the ready bit is set;
              // the status bits are read from I/O port 0x1f7

  // Check if disk 1 is present; disk 0 is always available because the boot loader is stored on it.
  outb(0x1f6, 0xe0 | (1<<4)); // select disk 1, then wait for its status bits to show up
  for(i=0; i<1000; i++){
    if(inb(0x1f7) != 0){
      havedisk1 = 1;
      break;
    }
  }

  // Switch back to disk 0.
  outb(0x1f6, 0xe0 | (0<<4));
}
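The readiness test that idewait() applies to the status byte can be tried out on a host by feeding it a recorded sequence of status values instead of real port reads. In this sketch the bit masks are the values I recall from xv6's ide.c and are assumptions here, and wait_ready, along with its -2 return for running out of samples, is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

// Status bits of I/O port 0x1f7 (assumed values, recalled from xv6's ide.c).
#define IDE_BSY  0x80   // controller busy
#define IDE_DRDY 0x40   // drive ready
#define IDE_DF   0x20   // drive fault
#define IDE_ERR  0x01   // error

// Hypothetical stand-in for idewait(): reads[] replaces repeated inb(0x1f7).
// Returns 0 when ready, -1 on an error (if checkerr is set),
// and -2 if the simulated reads run out before the disk is ready.
static int wait_ready(const uint8_t *reads, size_t n, int checkerr)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t r = reads[i];
        if ((r & (IDE_BSY | IDE_DRDY)) != IDE_DRDY)
            continue;   // still busy, or not yet ready: keep polling
        if (checkerr && (r & (IDE_DF | IDE_ERR)) != 0)
            return -1;
        return 0;
    }
    return -2;
}
```

The real driver loops on the port without a bound; the bounded array here only exists so the logic can be exercised without hardware.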
After initialization, the disk is not used again until iderw() is called.
iderw() updates a locked buffer as indicated by its flags: if B_DIRTY is set, it writes the buffer to the disk; if B_VALID is not set, it reads the buffer from the disk.
iderw() keeps the list of pending disk requests in a queue.
It appends the buffer to the queue and, if it is the first pending request, calls idestart() to send it to the disk hardware.
void
iderw(struct buf *b) // updates the locked buffer as indicated by its flags
{
  struct buf **pp;

  if(!holdingsleep(&b->lock))
    panic("iderw: buf not locked");
  if((b->flags & (B_VALID|B_DIRTY)) == B_VALID)
    panic("iderw: nothing to do");
  if(b->dev != 0 && !havedisk1)
    panic("iderw: ide disk 1 not present");

  acquire(&idelock); //DOC:acquire-lock

  // Append b to idequeue.
  b->qnext = 0;
  for(pp=&idequeue; *pp; pp=&(*pp)->qnext) //DOC:insert-queue
    ;
  *pp = b;

  // Start disk if necessary.
  if(idequeue == b)
    idestart(b);
....
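The append loop in iderw() walks a pointer to a pointer, so the empty queue and the non-empty queue are handled by the same code. Here is a stand-alone sketch of that idiom; struct req and enqueue are hypothetical names, and the return value is added only to make the "first pending request" condition visible.

```c
#include <stddef.h>

struct req {
    int id;
    struct req *qnext;
};

// Append r to the singly linked queue *queue.
// Returns 1 if r became the head (the queue was empty), else 0.
static int enqueue(struct req **queue, struct req *r)
{
    struct req **pp;

    r->qnext = NULL;
    // Advance pp until it points at the NULL link that ends the list.
    for (pp = queue; *pp; pp = &(*pp)->qnext)
        ;
    *pp = r;
    return *queue == r;
}
```

When enqueue() returns 1, the caller is in the situation where iderw() calls idestart(); otherwise the request waits until the interrupt handler reaches it.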
If B_DIRTY is set, idestart() issues a write and must move the data into a buffer in the disk controller with the outsl instruction.
Otherwise it issues a read, and the interrupt handler later reads the data from the controller.
idestart() encapsulates the detailed knowledge of the IDE device, such as which ports to write.
static void
idestart(struct buf *b)
{
....
  if(b->flags & B_DIRTY){
    outb(0x1f7, write_cmd);
    outsl(0x1f0, b->data, BSIZE/4);
  } else {
    outb(0x1f7, read_cmd);
  }
}
While the read or write is in progress, iderw() sleeps, yielding the CPU to other processes.
When the disk has finished the operation, it generates an interrupt, and trap() handles it by calling ideintr().
ideintr() reads the data from the disk controller with insl() if the request was a read, wakes the process waiting for this buf, and passes the next waiting buffer to the disk.
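The bookkeeping side of that handler, without the port I/O, locking, or wakeup, can be sketched on a host as follows. struct sim_buf and complete_head are hypothetical names, and the sketch assumes the head of the queue is always the request the disk has just finished.

```c
#include <stddef.h>

#define B_VALID 0x2 // buffer has been read from disk
#define B_DIRTY 0x4 // buffer needs to be written to disk

// Simplified stand-in for struct buf: only the fields the queue logic touches.
struct sim_buf {
    int flags;
    struct sim_buf *qnext;
};

// Hypothetical model of the completion step: dequeue the finished request,
// mark it valid and clean, and return the next request to start (or NULL).
static struct sim_buf *complete_head(struct sim_buf **queue)
{
    struct sim_buf *b = *queue;

    if (b == NULL)
        return NULL;        // interrupt with nothing queued: ignore it
    *queue = b->qnext;
    b->flags |= B_VALID;    // memory now matches the disk
    b->flags &= ~B_DIRTY;
    return *queue;
}
```

Whatever complete_head() returns is the buffer that would be handed to idestart() next, so the disk stays busy for as long as the queue is non-empty.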
void
ideintr(void)
{
  struct buf *b;

  // First queued buffer is the active request.
  acquire(&idelock);

  if((b = idequeue) == 0){
    release(&idelock);
    return;
  }
  idequeue = b->qnext;

  // Read data if needed.
  if(!(b->flags & B_DIRTY) && idewait(1) >= 0)
    insl(0x1f0, b->data, BSIZE/4);

  // Wake process waiting for this buf.
  b->flags |= B_VALID;
  b->flags &= ~B_DIRTY;
  wakeup(b);

  // Start disk on next buf in queue.
  if(idequeue != 0)
    idestart(idequeue);