Understanding xv6: Trap, Interrupt and Drivers

1231 · April 10, 2024

There are three cases in which control must transfer from a user program to the kernel:

System call
when a user program asks the OS to do something on its behalf.
Illegal action (Exception)
when a user program performs an illegal action, e.g. a segmentation fault or a divide by zero.
Interrupt
when a device needs attention from the OS, e.g. a clock chip may generate an interrupt every 100 ms so that the kernel can implement time sharing.

All three cases are handled by a single hardware mechanism called a "trap".
For example, on x86 a program invokes a system call by generating an interrupt with the "int" instruction.
An interrupt stops the normal processor loop and starts executing a new sequence called an "interrupt handler".
xv6 uses the two terms loosely, but the distinction is worth keeping in mind: a trap is caused by the current process, while an interrupt is generated by a device.
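
The three cases can be told apart by the vector number that reaches the kernel. The following is a minimal sketch, not xv6 code: classify_trap() and enum trap_kind are hypothetical names, the values 32 and 64 are the ones xv6 uses for T_IRQ0 and T_SYSCALL, and treating every other vector from 32 upward as a device interrupt is a simplifying assumption.

```c
#include <assert.h>

/* Vector numbers as xv6 defines them on x86. */
enum { T_IRQ0 = 32, T_SYSCALL = 64 };

/* Hypothetical classification of the three ways into the kernel. */
enum trap_kind { TRAP_EXCEPTION, TRAP_INTERRUPT, TRAP_SYSCALL };

static enum trap_kind
classify_trap(unsigned trapno)
{
  assert(trapno < 256);            /* the IDT has 256 entries */
  if (trapno == T_SYSCALL)
    return TRAP_SYSCALL;           /* raised by an explicit int instruction */
  if (trapno < T_IRQ0)
    return TRAP_EXCEPTION;         /* vectors 0..31: processor exceptions */
  return TRAP_INTERRUPT;           /* assumption: the rest come from devices */
}
```

For instance, classify_trap(14) (a page fault) yields TRAP_EXCEPTION, while classify_trap(32) (the timer) yields TRAP_INTERRUPT.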

On the x86, interrupt handlers are defined in the interrupt descriptor table (IDT). The IDT has 256 entries, each giving the %cs and %eip to be used when handling the corresponding interrupt.
The IDT is loaded with the LIDT instruction, whose operand is a pointer to a 6-byte IDT descriptor structure: a 16-bit limit followed by the 32-bit base address of the table.

x86.h

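static inline void
lidt(struct gatedesc *p, int size)
{
  volatile ushort pd[3];    // 6-byte descriptor: limit, then base

  pd[0] = size-1;           // limit: offset of the last valid byte
  pd[1] = (uint)p;          // low 16 bits of the table address
  pd[2] = (uint)p >> 16;    // high 16 bits of the table address

  asm volatile("lidt (%0)" : : "r" (pd)); // load the IDT register
}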
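
To make the packing in lidt() concrete, here is a small user-space sketch of the same arithmetic. pack_idt_descriptor() is a hypothetical helper, and the table address in the usage note is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Fill the three 16-bit words that the lidt instruction reads:
 * a 16-bit limit followed by a 32-bit base address. */
static void
pack_idt_descriptor(uint16_t pd[3], uint32_t base, int size)
{
  assert(size > 0 && size <= 65536);
  pd[0] = (uint16_t)(size - 1);       /* limit = last valid byte offset */
  pd[1] = (uint16_t)(base & 0xffff);  /* base, bits 0..15 */
  pd[2] = (uint16_t)(base >> 16);     /* base, bits 16..31 */
}
```

With 256 gates of 8 bytes each, size is 2048 and the limit word becomes 2047; a table at the illustrative address 0x80112345 is split into 0x2345 and 0x8011.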

When the protection level changes from user to kernel mode, the kernel shouldn't use the stack of the user process, because it may not be valid.
Instead, the hardware switches to a stack named in the task segment, which the kernel sets up: xv6 programs the x86 hardware to perform a stack switch on a trap by installing a task segment descriptor through which the hardware loads a stack segment selector and a new value for %esp.

When a trap occurs, the processor hardware does the following.
If the processor was executing in user mode, it loads %esp and %ss from the task segment descriptor and pushes the old user %ss and %esp onto the new stack.
The processor then pushes the %eflags, %cs, and %eip registers. For some traps (e.g., a page fault), the processor also pushes an error word.
The processor then loads %eip and %cs from the relevant IDT entry.

Trap Handlers

tvinit() is called from main() and sets up the 256 entries of the IDT.
Interrupt i is handled by the code at the address in vectors[i].
It also sets up the system call entry specially: the gate is a trap gate (second argument 1) with privilege DPL_USER, which allows a user program to generate the trap with an explicit int instruction.

trap.c

void
tvinit(void)
{
  int i;

  // Default: every vector is an interrupt gate reachable only from the kernel.
  for(i = 0; i < 256; i++)
    SETGATE(idt[i], 0, SEG_KCODE<<3, vectors[i], 0);
  // The system call vector is a trap gate that user code may invoke.
  SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER);

  initlock(&tickslock, "time");
}
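
What SETGATE fills in can be sketched in plain C. This is an illustration rather than the xv6 macro itself: struct gate and set_gate() are hypothetical names, the bit-field layout follows the x86 gate descriptor format as xv6 declares it, and 0xE and 0xF are the 32-bit interrupt and trap gate types.

```c
#include <assert.h>
#include <stdint.h>

enum { STS_IG32 = 0xE, STS_TG32 = 0xF };  /* 32-bit interrupt / trap gate */

/* One 8-byte IDT entry. */
struct gate {
  unsigned off_15_0  : 16;  /* low 16 bits of the handler address */
  unsigned cs        : 16;  /* code segment selector */
  unsigned args      : 5;   /* unused for interrupt and trap gates */
  unsigned rsv1      : 3;   /* reserved */
  unsigned type      : 4;   /* STS_IG32 or STS_TG32 */
  unsigned s         : 1;   /* 0 for system descriptors */
  unsigned dpl       : 2;   /* privilege needed to use int on this gate */
  unsigned p         : 1;   /* present */
  unsigned off_31_16 : 16;  /* high 16 bits of the handler address */
};

static void
set_gate(struct gate *g, int istrap, unsigned sel, uint32_t off, unsigned dpl)
{
  assert(g != 0 && dpl <= 3);
  g->off_15_0 = off & 0xffff;
  g->cs = sel;
  g->args = 0;
  g->rsv1 = 0;
  g->type = istrap ? STS_TG32 : STS_IG32;
  g->s = 0;
  g->dpl = dpl;
  g->p = 1;
  g->off_31_16 = off >> 16;
}
```

A trap gate differs from an interrupt gate in that the processor leaves the interrupt-enable flag alone on entry, which is why the system call entry passes 1 as the second argument.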

Each entry point in vectors pushes an error code if the processor didn't, pushes the interrupt number, and then jumps to alltraps.
vectors.S

# generated by vectors.pl - do not edit
# handlers
.globl alltraps
.globl vector0
vector0:
  pushl $0
  pushl $0
  jmp alltraps
.globl vector1
..... up to vector 255; total 256 entries.
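
The stubs are generated by a script. A rough C equivalent of that generation step is sketched below, assuming the same set of vectors for which the processor pushes its own error code as xv6's vectors.pl uses (8, 10 to 14, and 17). format_vector_stub() and cpu_pushes_error_code() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Vectors for which the x86 pushes an error code by itself
 * (assumption: the same set vectors.pl special-cases). */
static int
cpu_pushes_error_code(int i)
{
  return i == 8 || (i >= 10 && i <= 14) || i == 17;
}

/* Write the assembly stub for vector i into buf; returns the length
 * snprintf reports. A dummy error code of 0 is pushed only when the
 * processor did not push one, so every trap frame has the same shape. */
static int
format_vector_stub(char *buf, size_t len, int i)
{
  assert(buf != 0 && i >= 0 && i < 256);
  if (cpu_pushes_error_code(i))
    return snprintf(buf, len,
                    ".globl vector%d\nvector%d:\n  pushl $%d\n  jmp alltraps\n",
                    i, i, i);
  return snprintf(buf, len,
                  ".globl vector%d\nvector%d:\n  pushl $0\n  pushl $%d\n  jmp alltraps\n",
                  i, i, i);
}
```

This explains the two pushl $0 lines in vector0 above: the first is the placeholder error code, the second is the trap number, which happens to be 0.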

alltraps continues to save processor registers: it pushes %ds, %es, %fs, %gs, and the general-purpose registers. (The processor or the trap vector pushes an error number, and alltraps pushes the rest.)
The result is a 'struct trapframe' on the kernel stack, which holds everything needed to restore the user process's state when the kernel returns to it.

alltraps:
  # Build trap frame.
  pushl %ds
  pushl %es
  pushl %fs
  pushl %gs
  pushal  # includes %eax, which holds the system call number

Result: the trap frame laid out on the kernel stack (figure not shown).
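
The layout produced by these pushes can be written down as a C struct. The sketch below follows the push order described above, read from the lowest address upward; it uses fixed-width types so it also compiles on a 64-bit host, and syscall_number() is a hypothetical accessor added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct trapframe {
  /* pushed by pushal (pushed last, so at the lowest addresses) */
  uint32_t edi, esi, ebp, oesp, ebx, edx, ecx, eax;

  /* pushed by alltraps; each 16-bit selector occupies a 32-bit slot */
  uint16_t gs, padding1;
  uint16_t fs, padding2;
  uint16_t es, padding3;
  uint16_t ds, padding4;

  /* pushed by the vector stub */
  uint32_t trapno;

  /* pushed by the processor or by the vector stub */
  uint32_t err;

  /* pushed by the processor */
  uint32_t eip;
  uint16_t cs, padding5;
  uint32_t eflags;

  /* pushed by the processor only when coming from user mode */
  uint32_t esp;
  uint16_t ss, padding6;
};

/* For a system call, the number travels in the saved %eax. */
static uint32_t
syscall_number(const struct trapframe *tf)
{
  assert(tf != NULL);
  return tf->eax;
}
```

Because the stack grows downward, the registers pushed first (%ss, %esp) end up at the highest offsets, and the pointer left in %esp after pushal is exactly the address of the struct.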

After the trap frame is constructed, a pointer to it is passed as the argument to the trap() function.

  # Call trap(tf), where tf=%esp
  pushl %esp      # push the argument onto the stack
  call trap
  addl $4, %esp   # pop the argument off the stack

Interrupts

Devices generate interrupts, and xv6 sets up the hardware to receive them.

Periodic polling would also work, but interrupts are preferable when events are relatively rare, since polling would then waste CPU time.

With the advent of multiprocessor systems, a new way of handling interrupts was needed.
The interrupt hardware has two parts: one that lives in the I/O system (the IO APIC, ioapic.c), and one that is attached to each processor (the local APIC, lapic.c).

The timer chip is inside the local APIC, so each processor can receive timer interrupts independently. xv6 aims for roughly 100 timer interrupts per second, although it does not calibrate the count it writes to TICR.
lapic.c

  lapicw(TDCR, X1);
  lapicw(TIMER, PERIODIC | (T_IRQ0 + IRQ_TIMER)); // program the timer
  lapicw(TICR, 10000000);

The second line tells the LAPIC to periodically generate an interrupt at IRQ_TIMER, which is IRQ 0.

The timer interrupt arrives through vector 32, that is, T_IRQ0 + IRQ_TIMER.
traps.h

#define T_IRQ0          32      // IRQ 0 corresponds to int T_IRQ

#define IRQ_TIMER        0

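
The mapping between device IRQ numbers and IDT vectors is a fixed offset. A minimal sketch, with hypothetical helper names; IRQ_IDE = 14 is the value xv6 uses for the disk and is included here as an assumption, since the excerpt above only shows the timer.

```c
#include <assert.h>

enum { T_IRQ0 = 32, IRQ_TIMER = 0, IRQ_IDE = 14 };

/* IRQ n is delivered through IDT vector T_IRQ0 + n. */
static int
irq_to_vector(int irq)
{
  assert(irq >= 0);
  return T_IRQ0 + irq;
}

/* Inverse mapping, valid only for vectors that carry device IRQs. */
static int
vector_to_irq(int vector)
{
  assert(vector >= T_IRQ0);
  return vector - T_IRQ0;
}
```

The offset exists because vectors 0 to 31 are reserved for processor exceptions, so device interrupts have to start at 32.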
trap.c

  case T_IRQ0 + IRQ_TIMER:
    if(cpuid() == 0){       // only CPU 0 advances the global tick count
      acquire(&tickslock);
      ticks++;
      wakeup(&ticks);       // wake processes sleeping on ticks
      release(&tickslock);
    }
    lapiceoi();             // acknowledge the interrupt
    break;

Drivers: Disk Driver

A driver is the code in an operating system that manages a particular device:
it tells the device to perform operations, configures it to generate an interrupt when done,
and handles the resulting interrupts.

The disk driver copies data to and from the disk.
It represents a disk block with "struct buf".
The flags record the relationship between the in-memory copy and the disk: B_VALID means the data has been read in, and B_DIRTY means the data needs to be written out.
buf.h

struct buf {
  int flags;
  uint dev;
  uint blockno;
  struct sleeplock lock;
  uint refcnt;
  struct buf *prev; // LRU cache list
  struct buf *next;
  struct buf *qnext; // disk queue
  uchar data[BSIZE];
};
#define B_VALID 0x2  // buffer has been read from disk
#define B_DIRTY 0x4  // buffer needs to be written to disk
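
How the two flags translate into disk work can be sketched as a pair of small functions. The names needs_io(), disk_op_for(), and enum disk_op are hypothetical; needs_io() uses the same test that iderw() applies before queuing a buffer.

```c
#include <assert.h>

enum { B_VALID = 0x2, B_DIRTY = 0x4 };
enum disk_op { DISK_NONE, DISK_READ, DISK_WRITE };

/* A buffer that is valid and not dirty already matches the disk. */
static int
needs_io(int flags)
{
  return (flags & (B_VALID | B_DIRTY)) != B_VALID;
}

/* Dirty buffers are written; buffers that are not yet valid are read. */
static enum disk_op
disk_op_for(int flags)
{
  enum disk_op op;

  if (flags & B_DIRTY)
    op = DISK_WRITE;
  else if (!(flags & B_VALID))
    op = DISK_READ;
  else
    op = DISK_NONE;
  assert((op != DISK_NONE) == needs_io(flags));  /* both views agree */
  return op;
}
```

Note that B_DIRTY wins over B_VALID: a buffer that is both valid and dirty is written, never re-read.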

The kernel initializes the disk driver by calling ideinit() from main() in main.c.
ideinit() calls ioapicenable() to enable the IRQ_IDE interrupt.
It then calls idewait() to wait until the disk is ready; the status bits are read from I/O port 0x1f7.
Finally it checks whether disk 1 is present by writing to I/O port 0x1f6 to select disk 1 and then waiting a while for the status bits to show that the disk is there. (Disk 0 is always available because the boot loader and the kernel are loaded from it.) Afterwards it switches back to disk 0.

ide.c

void
ideinit(void) // initialize the disk driver at boot time
{
  int i;

  initlock(&idelock, "ide");
  ioapicenable(IRQ_IDE, ncpu - 1); // enable the IRQ_IDE interrupt on the last CPU
  idewait(0); // poll the status bits (I/O port 0x1f7) until busy is clear and ready is set

  // Check if disk 1 is present; disk 0 is always available
  // because the boot loader is stored on disk 0.
  outb(0x1f6, 0xe0 | (1<<4)); // select disk 1, then wait for its status bits to show up
  for(i=0; i<1000; i++){
    if(inb(0x1f7) != 0){
      havedisk1 = 1;
      break;
    }
  }

  // Switch back to disk 0.
  outb(0x1f6, 0xe0 | (0<<4));
}
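
The readiness test behind idewait() is a mask-and-compare on the status byte. The sketch below runs in user space on a list of sampled status values instead of reading port 0x1f7. The bit values are the standard IDE status bits (busy 0x80, ready 0x40, device fault 0x20, error 0x01); poll_until_ready() and its bounded loop are assumptions made so the example terminates, whereas the real driver keeps polling.

```c
#include <assert.h>
#include <stdint.h>

enum { IDE_BSY = 0x80, IDE_DRDY = 0x40, IDE_DF = 0x20, IDE_ERR = 0x01 };

/* Ready means: busy bit clear AND ready bit set. */
static int
ide_ready(uint8_t status)
{
  return (status & (IDE_BSY | IDE_DRDY)) == IDE_DRDY;
}

static int
ide_failed(uint8_t status)
{
  return (status & (IDE_DF | IDE_ERR)) != 0;
}

/* Walk through sampled status bytes until one reports ready.
 * Returns 0 on success, -1 on a reported error (when checkerr is set)
 * or if no sample was ready. */
static int
poll_until_ready(const uint8_t *samples, int n, int checkerr)
{
  assert(samples != 0 && n > 0);
  for (int i = 0; i < n; i++) {
    if (ide_ready(samples[i]))
      return (checkerr && ide_failed(samples[i])) ? -1 : 0;
  }
  return -1;
}
```

A status of 0xC0 has both the busy and ready bits set and therefore does not count as ready; only the busy bit clearing makes the ready bit meaningful.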

After initialization, the disk is not used again until iderw() is called.
iderw() updates a locked buffer as indicated by its flags: if B_DIRTY is set it writes the buffer to disk, and if B_VALID is not set it reads the buffer from disk.
iderw() keeps the pending disk requests in a queue.
It appends the buffer to the queue and, if it is the first pending request, calls idestart() to send it to the disk hardware.

void
iderw(struct buf *b) // update the locked buffer as indicated by its flags
{
  struct buf **pp;

  if(!holdingsleep(&b->lock))
    panic("iderw: buf not locked");
  if((b->flags & (B_VALID|B_DIRTY)) == B_VALID)
    panic("iderw: nothing to do");
  if(b->dev != 0 && !havedisk1)
    panic("iderw: ide disk 1 not present");

  acquire(&idelock);  //DOC:acquire-lock

  // Append b to idequeue.
  b->qnext = 0;
  for(pp=&idequeue; *pp; pp=&(*pp)->qnext)  //DOC:insert-queue
    ;
  *pp = b;

  // Start disk if necessary.
  if(idequeue == b)
    idestart(b);
....
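
The append loop uses a pointer-to-pointer so that an empty queue needs no special case. Here is the same idiom in isolation, as a user-space sketch with a hypothetical struct req standing in for struct buf and no locking.

```c
#include <assert.h>
#include <stddef.h>

struct req {
  int id;
  struct req *qnext;
};

/* Append r to the tail of the queue. Returns 1 if r is now the head,
 * which is the case in which the driver would start the disk. */
static int
enqueue(struct req **head, struct req *r)
{
  struct req **pp;

  assert(head != NULL && r != NULL);
  r->qnext = NULL;
  /* pp walks the "next" slots; it stops at the slot that holds NULL,
   * which is either *head itself or the last element's qnext. */
  for (pp = head; *pp; pp = &(*pp)->qnext)
    ;
  *pp = r;
  return *head == r;
}
```

The first request queued on an idle disk returns 1; later requests return 0 and wait for the interrupt handler to start them.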

If B_DIRTY is set, the request is a write, and idestart() must move the data into a buffer in the disk controller with the outsl instruction.
Otherwise the request is a read: idestart() only issues the read command, and the interrupt handler reads the data later.
idestart() encapsulates the detailed knowledge of the IDE device, such as which ports to write to.

static void
idestart(struct buf *b)
{
....
  if(b->flags & B_DIRTY){
    outb(0x1f7, write_cmd);
    outsl(0x1f0, b->data, BSIZE/4); // write: copy the data out, 4 bytes at a time
  } else {
    outb(0x1f7, read_cmd);          // read: the data arrives later, in ideintr()
  }
}

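While the read or write is in progress, iderw() sleeps until the interrupt handler records in the buffer's flags that the operation is done, so the CPU is free to run other processes.
When the disk has finished, it generates an interrupt, and trap() handles it by calling ideintr().
ideintr() takes the first buffer off the queue, reads the data from the disk controller with insl() if the request was a read, wakes the process waiting for this buffer, and passes the next waiting buffer to the disk.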

void
ideintr(void)
{
  struct buf *b;

  // First queued buffer is the active request.
  acquire(&idelock);

  if((b = idequeue) == 0){
    release(&idelock);
    return;
  }
  idequeue = b->qnext;

  // Read data if needed.
  if(!(b->flags & B_DIRTY) && idewait(1) >= 0)
    insl(0x1f0, b->data, BSIZE/4);

  // Wake process waiting for this buf.
  b->flags |= B_VALID;
  b->flags &= ~B_DIRTY;
  wakeup(b);

  // Start disk on next buf in queue.
  if(idequeue != 0)
    idestart(idequeue);

  release(&idelock);
}
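
The bookkeeping half of ideintr() can be tried out in user space. This is a sketch under stated assumptions: struct req is a hypothetical stand-in for struct buf, complete_head() is a made-up name, and the data transfer, wakeup, and locking are left out.

```c
#include <assert.h>
#include <stddef.h>

enum { B_VALID = 0x2, B_DIRTY = 0x4 };

struct req {
  int flags;
  struct req *qnext;
};

/* Retire the active request at the head of the queue and mark it as
 * in sync with the disk. Returns the request the driver should start
 * next, or NULL if the queue is now empty (or was empty already). */
static struct req *
complete_head(struct req **head)
{
  struct req *b;

  assert(head != NULL);
  if ((b = *head) == NULL)
    return NULL;            /* interrupt with nothing pending */
  *head = b->qnext;
  b->flags |= B_VALID;      /* memory copy now matches the disk */
  b->flags &= ~B_DIRTY;
  return *head;
}
```

After completion every request ends with exactly B_VALID set, whether it was a read or a write, which is the condition the sleeping iderw() caller is waiting for.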
