🖥️ Background
- Importance of memory management
- In multitasking system, several processes may exist in memory
- Memory is central of modern computer system
- Memory structure
- A large array of bytes each with its own address
- Memory unit is interested only in the sequence of memory addresses
Basic Hardware
- Protection from other processes
- Each process has a separate memory space.
- Can be implemented by base register and limit register
→ Why register? : memory에 넣으면 참조하는데 오래걸림. in CPU에 두고 빠르게 참조
- CPU H/W compares every address generated
- Base / limit registers can be loaded by privileged instruction
Address Binding
- Most systems allow a user process to reside in any part of the physical memory
- Representation of address
- Source program: symbolic address
Ex) count (variable), function name
- Compiler
- Symbol → relocatable address(지역적 주소) (binding)
Ex) 14 bytes from beginning of the module
- Linkage editor or loader
- Relocatable address → absolute address(전역적 주소, 절대적 주소) (binding)
Ex) 0x74014
- Address binding(변환과정): mapping one address space to another
→ 최종 binding을 의미
- Compile ➔ linking ➔ loading ➔ execution (linking, loading을 OS에 따라 하나로 볼 수도 있음)
- Address binding time
- Compile time: absolute code ( relocation X )
- MS-DOS .com file
- If starting location changes, the program should be recompiled
- Load time: relocatable code
- If starting location changes, the program should be reloaded.
- Execution time → binding
Ex) Systems that support logical addressing
- Used by most general-purpose OS’s
- Requires H/W support for address mapping
Logical vs. Physical Address Space
- Logical address (in Process): address generated by CPU in common cases
- Physical address (전역적 주소): address seen by memory unit
- Logical address vs. physical address
- In compile-time and load-time address binding
- Logical address = physical address
- In execution-time address binding
- Logical (virtual) address = physical address
- Address spaces
- Logical address space: set of all logical addresses generated by a program
- Physical address space: set of all physical addresses corresponding to these logical addresses
- Run-time address mapping is done by memory-management unit (MMU)
- A simple method of address mapping using relocation register (base register)
- User program never sees physical addresses
→ Why? Process 별로 memory 영역이 다르기 때문 (독립적인 영역)
Dynamic Linking and Shared Library
→ 다시 load되지 않고 shared : 공유해서 사용시 주의!!
- Static linking: all object modules are combined into the binary program image
- Dynamic linking: linking is postponed until execution time
→ 불필요한 중복을 막고, 효율적으로
Ex) System libraries
- Stub: a small piece of code that indicates ...
→ 본체 대신 link, 필요할 때 load and link
- How to locate the appropriate library routine
- How to load the library if it is not ready
- After a library routine is dynamically linked, that routine is executed directly
- stub replaces itself with address of the routine and execute it.
- Advantages of dynamic linking
- Library code can be shared among processes.
- Multiple versions of a library can coexist
- Library update doesn’t require re-linking.
- Dynamic linking requires help from OS
🖥️ Swapping
- When a process starts, it must be in memory for execution
- However, a process can be swapped to a backing store and back into memory to continue execution
- A process swapped-out will be swapped back into the same memory space it occupied previously
- Exception: execution-time binding → MMU relocation register만 바꿔주면 된다
- Backing store: fast disk (usually, separated from file system → 다른 partition 필요)
- Should be large enough to accommodate copies of all memory images for all users
- Must provide direct access to memory images
- Dispatching → context switcing에 무슨 일이 일어나야 하는지
- Firstly see if the process selected by scheduler is in memory.
- If not, it swaps out a process and swaps in the desired process
- Context-switch time in swapping system
- Ex) size of a process: 100 MB → 쪼개서 time ⬇️
Disk transfer rate: 50 MB/s → 보통 SSD는 10~30배
Average latency: 8 ms → find data
Then, swapping time: (100MB / 50MB/s + 8ms) * 2(swap out, in) = 4,016 ms
- Swapping time is affected by amount of memory swapped
- We would need to swap only what is actually used.
- Time slice should be long relative to swap time
→ swap time > time slice : CPU가 놀고, Disk만 돌게 된다
- Many systems use modified versions of swapping
- Swapping is enabled only when memory usage exceeds some threshold
🖥️ Contiguous memory allocation
→ Contiguous(연속적인)
→ Externel fragmentation
- Memory is usually divided into two partitions
- Resident OS
- High (or low) memory with interrupt vector
- User processes
- The other part of memory
- Several user processes can reside in memory at the same time
→ How to allocate memory to processes waiting in input queue?
Memory Mapping and Protection
- Memory mapping and protection are provided by relocation register and limit register
- Add memory access, MMU maps the logical address dynamically.
- Relocation and limit registers are loaded by dispatcher during context switching
Memory Allocation
- Variable partition scheme (MVT) → Memory allocation 정책
- Initially all available memory forms a large block of free memory (hole)
- If a process arrives and needs memory, search a hole large enough for this process.
- If a process terminates, it releases its memory
- Adjacent holes can be merged
- MVT: Multiprogramming with a Variable number of Tasks
Fragmentation
- External fragmentation: free memory space broken into little pieces
→ Process memory 밖에 fragmentation
- Sometimes, it’s impossible to allocate large continuous memory although total size of free memory larger than the required memory
- 50-percent rule (analysis of first fit)
- Given N allocated blocks, another 0.5N blocks will be lost to fragmentation. (1/3 of memory may be unusable)
- Fragmentation of main memory also affects backing store
- Memory allocated to a process may slightly larger than requested memory.
Ex) allocating 18462 bytes(require) from hole whose size is 18464 bytes → 2 byte : internel framentation
- Reduce overhead to keep small holes.
- Generally, physical memory can be broken into fixed-size blocks.
- Internal fragmentation
- Difference between the amounts of allocated memory and the requested memory
- Remedy of external fragmentation: compaction
- Possible only if relocation is dynamic and is done at execution time
- Expensive
- Alternative remedy: permit logical address space of the processes to be noncontiguous
- Paging → 일정한 단위로 쪼갬 → Externel fragmentation 해결, but internel fragmentation
- Segmentation
🖥️ Paging
→ Internel fragmentation
- A memory-management scheme that permits the physical address space of a process to be noncontiguous.
- Physical memory: consists of frames (fixed-size blocks) → location
- Logical memory: consists of pages (fixed-size blocks) → mem content
- Pages are mapped with frames through ⭐️ page table (mapping table)
- Paging model of logical and physical memory
- Logical address consists of a page number and a page offset → page 내에서 상대적 좌표
- Size of logical address space: 2m
- Page size: 2n → Ex 256 → page offset : 8-bit
- An example of paging
- Paging hardware
- Overhead of paging
- Page table → In memory, size가 상당히 크기 때문, lookup 해서 physical memory에 접근하기 때문에 memory 2번 access
- Internal fragmentation
- In average, half page per process
Note! In paging, no external fragmentation보다는 internel이 나음
- In many systems ...
- Page size (=offset): 4 KB ~ 4 MB
- 4KB = 4096 bytes(212) = 12-bit (offset), 32-12 = 20-bit (page#) → 220 = 1,000,000개의 table
- 4MB = 222 = 22-bit(offset), 32-22 = 10(page#) → 210 = 1024개의 table
- Page table entry: 4 bytes (32 bits) → 20 + 12(속성 flag)
X86 Page-Table Entry
Frame Table
- OS keeps information about frames in a frame table
- # of total fames
- Which frames are allocated
- Which frames are available
- OS also maintains a list of free frames
Hardware Support
- Most OS’s allocate a page table to each process
- Pointer to the page table is stored with register(PTBR) values (in PCB)
- In context-switching, dispatcher also loads the correct H/W page table
- H/W implementation of page table
- A set of dedicated registers
- Efficiency is very important.
Ex) DEC PDP-11: page table (8 entries) is kept in fast registers → not feasible if page table is large
- Page-table base register (PTBR) → page table 시작 주소 저장
- Page table is stored in main memory
- PTBR points the page table (eg. CR3 register of x86)
cf. Remaining problem: overhead to access page table in main memory (requires two accesses to access a byte.)
- Translation look-aside buffer (TLB): small fast look up H/W cache
→ address translation을 위한 cache memory
- Associative, high-speed memory, consists of pairs (key(page #), value(frame #)
- If page number of logical address is in TLB, delay is less than 10% of unmapped memory access
- Otherwise (TLB miss), access page table in main memory
- Insert (page number, frame number) into TLB
- If TLB is full, OS select one for replacement
- Some TLB’s store address-space identifiers (ASID) in each entry (ex: MIPS)
→ Context switch시 page table 교체, TLB도 의미 X → ASID 사용해서 구분
→ page# + frame# + ASID
- ASID identifies process
- Processors w/o ASID (ex: x86) flush entire TLB on process switch
- Paging using TLB
Associative Memory
- Associative memory (like hash) – parallel search (H/W search, not S/W)
- Address translation (p, d)
- If p is in associative register, get frame # out
- Otherwise, get frame # from page table in memory
Protection
- Each frame has a protection bits
- Read-write / read only
- Attempt to write to a read-only page causes H/W trap
→ Extension: read-only, read-write, execute-only, ...
- Each entry of page table has valid-invalid bit
- OS does not allows to access the page if it is invalid
- Rarely does a process use all of its address range
- Page table can be wasting memory
→ some system provides page-table length register (PTLR) (valid한 table length 저장)
- Valid-invalid bit in a page table
Shared Pages
- Paging provides possibility to share common code
Ex) Process P1, P2, P3 runs the same text editor with different data (editor should be non-self-modifying code)
🖥️ Structure of page table
Hierarchical Paging
- Motivation: If a system supports large logical address space, page table itself becomes significant overhead
Ex) 32-bit address space, page size is 4 KB(212)
size of page table = 2(32−12) > 1 Million
- Difficult to allocate contiguous memory → page table은 연속적인 공간에 있어야함
- Hierarchical paging
- Page table itself is also paged
- Structure of logical address
Hashed Page Table
- Use hashing to for page table
- Common approach for large address spaces (> 32bits) → 크기 때문에
- Each location of hash table consists of linked list of page table entries
🖥️ Segmentation
- Memory from user’s view: collection of variable-size segments, with no necessary ordering
Ex) stack, math library, main program, ...
- Segmentation: memory-management scheme that supports user view of memory
- Logical address space is a collection of segments
- Each segment has a name (number) and length
Logical address = <segment-number, offset>
Ex) C compiler might create segments such as ...
- Code
- Global variables, heap, stacks
- Standard C library
- Example
Hardware for Segmentation
- 2D address should be mapped 1D sequence of bytes
- Segment table
- Consists of segment base and segment limit
→ base address and size→ memory mapping을 segment 단위로
Segmentation Architecture
- Segment table – maps two-dimensional physical addresses; each table entry has:
- base – contains the starting physical address where the segments reside in memory
- limit – specifies the length of the segment
- Segment-table base register (STBR) points to the segment table’s location in memory
- Segment-table length register (STLR) indicates number of segments used by a program;
segment number s is legal if s < STLR
- Protection and sharing is also possible for segmentation
🖥️ Example: Intel Pentium
- Some architectures supports both paging and segmentation
- Intel x86: pure segmentation, segmentation with paging
Pentium Segmentation
- Maximum size of a segment: 4 GB → 32-bit 머신이 가지는 가장 큰 address (232 = 4G)
- Maximum # of segments: 16 K (=16,384)
- 1st partition for private segments (8 K)
→ kept in local descriptor table (LDT) → process마다 8K segment table
- 2nd partition 8 K for shared segments (8 K)
→ kept in global descriptor table (GDT)
- Pentium has ...
- Six segment registers (CS(Code), SS(Stack), DS(Data), ES(Extra), FS, GS)
→ segment #를 저장, offset은 pointer
- Six 8-byte microprogram registers to hold descriptors from LDT or GDT
- Cache for LDT/GDT entry → segment table을 caching
- Segment descriptor
- 64 bit information that describes attribute of a segment
- Logical address: pair of (selector, offset)
- Selector: 16 bit number
- Segment number (13 bits=8 k)
- GDT or LDT (1 bit)
- Protection (2 bits)
- Offset: 32-bit number
Pentium Paging
- Pentium supports a page size of 4 KB or 4 MB
- Page directory entry includes page size flag
- 4 KB page: two-level paging (page size flag = 0)
- PSE bit of CR4 register
- 4 MB page (page size flag = 1)
HGU 전산전자공학부 김인중 교수님의 23-1 운영체제 수업을 듣고 작성한 포스트이며, 첨부한 모든 사진은 교수님 수업 PPT의 사진 원본에 필기를 한 수정본입니다.