Understanding xv6: Page table

1231·2024년 4월 1일
0

Understanding_xv6

목록 보기
2/6

Page table: page table is mechanism through which operating system takes control over the address space and make isolation between each processes.

Paging Hardware


in x86, page table is array of 2^20(1048576) of page table entires(PTEs).
Each PTE has 20 bits of the Physical Page Number(PPN) and some flags.

Paging hardware replace top 20 bits of the virtual address with PPN, and rest of the bits, low 12 bits are replaced with translated physical address(offset).
-> because the size of the page is 4096 bytes (2^12).

Page table is usually done in two-level tree.
1. top 10 virtual address bits to find Page Directory.
->Page Directory contains 1024 PTEs, each points to page table that contains other 1024 PTEs: 2^10^10, total 2^20 PTEs can be used.
2. next 10 bits to find Page table
3. next, lowest 12 bits to indicate offset, replaced with translated physical address.

Each PTEs has flags that indicates its permissions:
PTE_P indicates whether the PTE is present
PTE_W controls whether instructions are allowed to issue writes to the page
PTE_U controls whether user programs are allowed to use the page

note: Physical memory refers to storage cell called DRAM and paging hardware translate virtual address to physical address of the DRAM. It means there is no 'virtual memory', there is only 'virtual address'.

Process Address Space:


user process address space can grow up to KERNBASE, kernel related data is stored from the KERNBASE.
kernel stack is mapped as KERNBASE:KERNBASE+PHYSTOP - 0:PHYSTOP.

By having the kernel stack in every process's page table, the need for switching page table in every interrupt or system call is removed.
Kernel itself does not have own page table. It always borrows from some process's page table.
Every byte of RAM can consume 2 bytes of virtual address space(same physical frame can be referred to two different virtual address), so xv6 cannot
use more than 2GB of RAM (since max 32-bit virtual address space is 4GB).

Creating address space

First, main call kvmalloc() to create and switch to a page table required to run the kernel.

17 int
 18 main(void)
 19 {
 20   kinit1(end, P2V(4*1024*1024)); // phys page allocator
 21   kvmalloc();      // kernel page table

kvmalloc() calls setupkvm().
setupkvm() set up kernel part of a page table then it calls mappages() to install translation needed for kernel. It does not install any translation needed for user.
Each virtual and physical address is described in kmap array.

vm.c

140 void
141 kvmalloc(void)
142 {
143   kpgdir = setupkvm();
...

mappages() installs translation for instructions, data and device I/O related memory. It separately install for each virtual address at page intervals.
For each mapped address, mappages() uses walkpgdir() to find a PTE of the virtual address(this will be filled with PPN later).

105 static struct kmap {
106   void *virt;
107   uint phys_start;
108   uint phys_end;
109   int perm;
110 } kmap[] = {
111  { (void*)KERNBASE, 0,             EXTMEM,    PTE_W}, // I/O space
112  { (void*)KERNLINK, V2P(KERNLINK), V2P(data), 0},     // kern text+rodata
113  { (void*)data,     V2P(data),     PHYSTOP,   PTE_W}, // kern data+memory
114  { (void*)DEVSPACE, DEVSPACE,      0,         PTE_W}, // more devices
115 };
116
117 // Set up kernel part of a page table.
118 pde_t*
119 setupkvm(void)
120 {
121   pde_t *pgdir;
122   struct kmap *k;
123
124   if((pgdir = (pde_t*)kalloc()) == 0) //kalloc() allocates a page and returns the page table.
125     return 0;
126   memset(pgdir, 0, PGSIZE);
127   if (P2V(PHYSTOP) > (void*)DEVSPACE)
128     panic("PHYSTOP too high");
129   for(k = kmap; k < &kmap[NELEM(kmap)]; k++)
130     if(mappages(pgdir, k->virt, k->phys_end - k->phys_start,
...

Physical Memory Allocator

after boot up, RAM contains free list of the pages.
main() calls kinit1() and kinit2().
These functions create free page list of the memory.
kinit1() creates free page of the first 4MB memory that is mapped in "entrypgdir" for kernel use.
next, kinit2() gathers many more free pages into its free page list.
because it is difficult for machine to know the how much physical memory is available, the machine assume it has PHYSTOP(224 MB) of physical memory.
main.c

 20   kinit1(end, P2V(4*1024*1024)); // phys page allocator
 25   kinit2(P2V(4*1024*1024), P2V(PHYSTOP)); // must come after startothers()

kalloc.c

 27 // 1. main() calls kinit1() while still using entrypgdir to place just
 28 // the pages mapped by entrypgdir on free list.
 29 // 2. main() calls kinit2() with the rest of the physical pages
 30 // after installing a full page table that maps them on all cores.
 31 void
 32 kinit1(void *vstart, void *vend)
 33 {
 34   initlock(&kmem.lock, "kmem");
 35   kmem.use_lock = 0; //lockless allocation 
 36   freerange(vstart, vend);
 37 }
 38
 39 void
 40 kinit2(void *vstart, void *vend)
 41 {
 42   freerange(vstart, vend);
 43   kmem.use_lock = 1;
 44 }

freerange() add specifc range of the page to free list using kfree().
Anyone who needs a free page calls kalloc(),
When memory needs to be freed up, kfree() is called.

 46 void
 47 freerange(void *vstart, void *vend)
 48 {
 49   char *p;
 50   p = (char*)PGROUNDUP((uint)vstart);
 51   for(; p + PGSIZE <= (char*)vend; p += PGSIZE)
 52     kfree(p);
 53 }
 54 //PAGEBREAK: 21
 55 // Free the page of physical memory pointed at by v,
 56 // which normally should have been returned by a
 57 // call to kalloc().  (The exception is when
 58 // initializing the allocator; see kinit above.)
 59 void
 60 kfree(char *v)
 61 {
 62   struct run *r;
 63
 64   if((uint)v % PGSIZE || v < end || V2P(v) >= PHYSTOP)
 65     panic("kfree");
 66
 67   // Fill with junk to catch dangling refs.
 68   memset(v, 1, PGSIZE);
 69  
 70   if(kmem.use_lock)
 71     acquire(&kmem.lock); 
 // Add free page to head of free list, update free list pointer
 72   r = (struct run*)v;
 73   r->next = kmem.freelist;
 74   kmem.freelist = r;
 75   if(kmem.use_lock)
 76     release(&kmem.lock);
 77 }
 
 
 79 // Allocate one 4096-byte page of physical memory.
 80 // Returns a pointer that the kernel can use.
 81 // Returns 0 if the memory cannot be allocated.
 82 char*
 83 kalloc(void)
 84 {
 85   struct run *r;
 86
 87   if(kmem.use_lock)
 88     acquire(&kmem.lock);
 // Sets free list pointer to next page and returns first free page on list
 89   r = kmem.freelist; 
 90   if(r)
 91     kmem.freelist = r->next;   
 92   if(kmem.use_lock)
 93     release(&kmem.lock);
 94   return (char*)r;
 95 }
 96

User part of an address space

exec() first reads ELF header from the file and check it is executable.
first, exec() sets up the kernel stack of the process using setupkvm().
Then uses allocuvm() and loaduvm() for allocating page and load the data into the memory from disk; allocuvm() uses kalloc() and mappages(), loaduvm() uses readi().

10 int
 11 exec(char *path, char **argv)
 12 {
 13   char *s, *last;
 14   int i, off;
 15   uint argc, sz, sp, ustack[3+MAXARG+1];
 16   struct elfhdr elf;
 17   struct inode *ip;
 18   struct proghdr ph;
 19   pde_t *pgdir, *oldpgdir;
 20   struct proc *curproc = myproc();
 21
 22   begin_op();
 23
 24   if((ip = namei(path)) == 0){
 25     end_op();
 26     cprintf("exec: fail\n");
 27     return -1;
 28   }
 29   ilock(ip);
 30   pgdir = 0;
 31
 32   // Check ELF header
 33   if(readi(ip, (char*)&elf, 0, sizeof(elf)) != sizeof(elf))
 34     goto bad;
 35   if(elf.magic != ELF_MAGIC)
 36     goto bad;
 37
 38   if((pgdir = setupkvm()) == 0)
 39     goto bad;
 40
 41   // Load program into memory.
 42   sz = PGSIZE;
 43   for(i=0, off=elf.phoff; i<elf.phnum; i++, off+=sizeof(ph)){
 44     if(readi(ip, (char*)&ph, off, sizeof(ph)) != sizeof(ph))
 45       goto bad;
 46     if(ph.type != ELF_PROG_LOAD)
 47       continue;
 48     if(ph.memsz < ph.filesz)
 49       goto bad;
 50     if(ph.vaddr + ph.memsz < ph.vaddr)
 51       goto bad;
 52     if((sz = allocuvm(pgdir, sz, ph.vaddr + ph.memsz)) == 0)
 53       goto bad;
 54     if(ph.vaddr % PGSIZE != 0)
 55       goto bad;
 56     if(loaduvm(pgdir, (char*)ph.vaddr, ip, ph.off, ph.filesz) < 0)
 57       goto bad;
 58   }

After executable is copied to memory image, allocate 2 pages for stack; one is guard page, permissions cleared, access will trap.
Push exec arguments onto user stack for main function of new program; Stack has return address, argc, argv array (pointers to variable sized arguments), and the arguments themselves.

63   // Allocate two pages at the next page boundary.
 64   // Make the first inaccessible.  Use the second as the user stack.
 65   sz = PGROUNDUP(sz);
 66   if((sz = allocuvm(pgdir, sz, sz + 2*PGSIZE)) == 0)
 67     goto bad;
 68   clearpteu(pgdir, (char*)(sz - 2*PGSIZE));
 69   sp = sz;
 70
 71   // Push argument strings, prepare rest of stack in ustack.
 72   for(argc = 0; argv[argc]; argc++) {
 73     if(argc >= MAXARG)
 74       goto bad;
 75     sp = (sp - (strlen(argv[argc]) + 1)) & ~3;
 76     if(copyout(pgdir, sp, argv[argc], strlen(argv[argc]) + 1) < 0)
 77       goto bad;
 78     ustack[3+argc] = sp;
 79   }
 80   ustack[3+argc] = 0;
 81
 82   ustack[0] = 0xffffffff;  // fake return PC
 83   ustack[1] = argc;
 84   ustack[2] = sp - (argc+1)*4;  // argv pointer
 85
 86   sp -= (3+argc+1) * 4;
 87   if(copyout(pgdir, sp, ustack, (3+argc+1)*4) < 0)
 88     goto bad;

and switch to new page table that is pointing to new memory image..

0개의 댓글