Linker and Loaders

김관주·2023년 10월 18일
0

시스템 프로그래밍

목록 보기
7/12

Translating and Starting a Program

  • A translation hierarchy for high-level language program like C

    • First compiled into an assembly language program and then assembled into an object module in machine language
    • 어셈블리 언어 프로그램으로 컴파일된 다음 기계 언어로 객체 모듈로 assemble 된다.
    • The linker combines multiple modules with library routines to resolve all references
    • The loader then places the machine code into the proper memory locations for execution by the processor
  • To speed up the translation process, some steps are skipped or combined, but these are the logical 4 phases that programs go through:

    • Some compilers produce object modules directly, and some systems use linking loaders that perform the last two steps
    • 일부 컴파일러는 객체 모듈을 직접 제작하고, 일부 시스템은 마지막 두 단계를 수행하는 연결 로더를 사용합니다

Linkers

  • Separate compilation
    • Splitting a program across many files, each of which contains a logically related collection of subroutines and data structures that form a module in a larger program
    • 프로그램을 여러 파일로 분할하는 것은 프로그램의 큰 모듈에서 논리적으로 관련된 하위 루틴과 데이터 구조의 모음을 포함하는 각 파일을 말합니다. 이를 통해 프로그램을 더 모듈화하고 유지보수하기 쉽게 만들 수 있습니다. 각 파일은 특정 작업이나 기능을 담당하며, 전체 프로그램은 이러한 모듈들의 조합으로 이루어집니다. 이로써 코드를 구조화하고 협업을 용이하게 하며, 큰 규모의 프로젝트를 관리하기 훨씬 쉽게 만들 수 있습니다.
    • A file, also called a module, can be compiled and assembled without knowledge of what is in the other files, so changes to one module do not require recompiling the entire program
    • Separate compilation necessitates the additional step of linking to combine object files from separate modules and fix their unresolved references

  • Linker
    • A system program that combines independently assembled machine
      language programs, and resolves all undefined labels into an executable file (by using the relocation information and symbol table in each object module)
    • The linker produces an executable file that can be run on a computer. Typically, this file has the same format as an object file, except that it contains no unresolved references
  • In summary, a linker performs 3 tasks:
    • Searches the program libraries(= prewritten routines) to find library routines used by the program
    • Determines the memory locations that each object module will occupy and relocates its instructions by adjusting absolute references
    • Resolves external references among object files

링커를 이용하는 이유?

Loader

  • A program produced by the linker can be run (i.e., executable). Before being run, the program resides in a file on secondary storage, such as a disk. The loader in an operating system brings a program into memory and starts it running

  • Thus, to start a program, the loader performs the following steps:

  1. Read the executable file’s header to determine the size of the text and data segments
  2. Creates a new address space (for the program) large enough for the text and data segments, along with a stack segment
  3. Copies the instructions and data from the executable file into the new address space (i.e., memory)
  4. Copies the parameters (if any) to the main program onto the stack
  5. Initializes the machine registers and sets the stack pointer to the first free location
  6. Jumps to a start-up routine that copies the parameters from the stack to registers and calls the main routine of the program. When returning from the main routine, the start-up routine terminates the program with the exit system call
  • Loader
    • A system program of performing the loading function, which places an object program in main memory for execution
  • Many loaders also support relocation and linking.
    • Relocation modifies the absolute addresses in the object program.
    • Linking combines two or more separate object programs.
  • Two kinds of a loader:
    • Absolute loader
    • Linking loader

Basic Loader Functions

  • Absolute Loader
    • The simplest loader with neither linking nor program relocation operations
  • All functions are accomplished in a single pass.
    • Header record is checked to verify that the correct program has been presented for loading and that it will fit into the available memory
    • As each Text record is read, the object code it contains is moved to the indicated(표시된) address in memory
    • When the End record is encountered, the loader jumps to the specified address to begin execution of the loaded program

Fig. 3.1 Loading of Absolute Program

  • (a) Example of the object program (same as Fig. 2.3)

  • (b) The program loaded in memory

Fig. 3.2 Absolute Loader Algorithm

Boot Loader (Bootstrap loader)

OS를 컴퓨터에 로딩하는 역할을 하므로 absolute loader 중의 하나.

  • A special type of absolute loaders, executed when a computer is first turned on.
    • Loads the first program to be executed (usually, an operating system)
  • The h/w logic reads the boot loader from address 0 of ROM
    • ROM is installed by the manufacturer
    • ROM contains bootstrapping program (boot loader) and some other routines that controls h/w (e.g., BIOS)
  • No relocation & no linking

Absolute Loader: Pros & Cons

  • Pros
    • Simplicity & Efficiency
  • Cons
    • Programmers must specify (when the program is assembled) the actual address at which the program will be loaded into memory.
      • Not possible with larger and more advanced machines
    • All subroutines must have pre-assigned absolute addresses.
      • Makes it difficult to use subroutine libraries efficiently
    • Cannot run several programs together

Machine-Dependent Loader Features

  • Linking Loader
    • More complex loader, suitable for use on a SIC/XE system and on some advanced computers
    • Program relocation, linking, and simple loading functions
  • Relocation
    • The way relocation is implemented in a loader is dependent on machine characteristics.
    • Two methods for specifying relocation as part of the object program -> use of “Modification record” and “Relocation bit

Relocation with M record

  • A general method for specifying relocation as part of object program: “Modification record scheme” for SIC/XE
    • Fig. 3.4, the source program equal to Fig. 2.6
      • Most of the instructions use relative or immediate addressing
    • Fig. 3.5, the object program corresponding to Fig. 3.4
  • Not well suited for all machine architectures, such as SIC basic version
    • Fig. 3.6, the relocatable program written for the standard SIC
      • Most instructions must be modified for program relocation
      • This requires 31 Modification records, which results in an object program more than twice as large as the one in Fig.3.5

Fig. 3.5 Object Program with Relocation by Modification Records

  • There is one Modification record for each value that must be changed during relocation (in this case, 3 instructions)
    • Each Modification record specifies the starting address and length of the field whose value is to be altered. It then describes the modification to be performed
    • Here, all modifications add the value of the symbol COPY, which represents the starting address of the program

Fig. 3.6 Relocatable Program for a standard SIC Machine



Relocation with Bit Mask

  • An alternative method for specifying relocation as part of object program: “Relocation bit mask” for SIC
    • Useful for a machine primarily using direct addressing and having a fixed instruction format
    • Observation of modification record
      • Record format: [address to be modified, length]
        • If instruction format is fixed, the length part can be removed
        • Instead of address field, bit-vector (= relocation bit) can be used.
    • The relocation bit can be set either to 1 or 0:
      • “1” -> program’s starting address is to be added to the address field when relocated
      • “0” -> no modification is necessary

Fig. 3.7

10개의 비트만 사용하고 나머지 2개 비트는 사용하지 않는비트이므로 항상 0.

Program linking

  • You should review Section 2.3.5 Control Sections and Program Linking
  • Brief Summary on “program linking”
    • A program can be thought as a logical entity that combines all of the related control sections.
    • However, from the loader’s point of view, there are only control sections that are to be linked, relocated, and loaded – i.e., no such thing as a program in this sense

EXTDEF & EXTREF

  • EXTDEF (External Definition)
    • The EXTDEF statement in a control section names symbols, called external symbols, that are defined in this (present) control section and may be used by other sections
    • e.g.) EXTDEF BUFFER, BUFFEND, LENGTH
      EXTDEF LISTA, ENDA
  • EXTREF (External Reference)
    • The EXTREF statement names symbols used in this (present) control section and are defined elsewhere.
    • e.g.) EXTREF RDREC, WRREC
      EXTREF LISTB, ENDB, LISTC, ENDC

Fig. 3.8

  • Sample programs illustrating linking and relocation
  • Each control section defines a list of items and the end of the list:
    • Control section PROGA: LISTA, ENDA
    • Control section PROGB: LISTB, ENDB
    • Control section PROGC: LISTC, ENDC
  • Each control section contains exactly the same set of references to these lists:
    • REF1 through REF3: instruction operands 3개
    • REF4 through REF8: values of data words 5개

이전에는 하나의 프로그램에서 CSECT를 통해 여러 control section으로 나누었다. 이번에는 여러 프로그램끼리 분리되어 있으면 CSECT를 사용하지 않아도 자동으로 control section이 분리 된다는 것을 기억하자.

Program A

Program B

Program C

각 control section이 동일한 expression을 가지고 있지만 어떻게 다르게 처리하는지 Fig. 3.9에서 확인해보자.

Fig. 3.9

  • Object programs corresponding to Fig. 3.8
  • Find the differences in the way these identical expressions are handled within the three control sections.
    • REF1: Simply a local reference in control section A (thus, it can be assembled using a PC-relative addressing, and no modification for relocation or linking is necessary), but an external reference in control sections B and C (thus, it cannot be assembled and just can be modified later by the loader)
    • Also, look at REF2 and REF3


REF6과 REF8에 expression에 사용된 LISTA는 로컬심볼 (주의. 변수가 아니라 심볼이에요)이긴 하나, 이 expression들은 PROGA에 relative한 expression들이에요.
즉, PROGA의 시작위치에 따라 그 값들이 바뀔 수 있는 값들입니다.
따라서 REF6과 REF8을 위한 modification record를 생성하여 PROGA의 시작 주소를 더해서 relocation을 해줘야 돼요.
반면, REF3, REF4, REF7 같은 경우에는 PROGA의 시작주소에 따라 값이 바뀌지는 않죠. ENDA - LISTA 이 부분은 absolute 값으로 evaluate될테니깐요.
그리고 REF1은 format 3 instruction이니깐 relocation 대상이 아니죠.

수정 레코드에서 어떨때는 1을 더하고 어떨때는 1을 더하지 않는지 반드시 확인

Fig. 3.10(a)

  • General approach
    • Assembler evaluates as much of the expression as it can.
      LISTB+4에서 +4는 처리 가능하니 address field에 넣어준다는 의미
    • The remaining terms are passed on to the loader via M records.
    • As an example, consider REF4
  • Fig 3.10(a) shows the programs of Fig 3.8 in memory after linking and loading.
    • After relocation and linking are performed, each of REF4 through REF8 has resulted in the same value in each of the 3 control sections.
    • However, for the references that are instruction operands, the calculated values after loading do not always appear to be equal -> why?

아래에 녹색 형광펜으로 칠한 부분을 유의해서 보자.
이 부분은 LDT LISTB+4 instruction을 object code로 나타낸 것이다.

program A와 C의 경우 format 4를 사용하고, B의 경우 format 3를 사용하므로 다른 것을 확인할 수 있다.

Fig. 3.10(b)

  • It shows how the value for reference REF4 is computed with relocation and linking operations

Algorithm & Data Structures for a Linking Loader

  • We use Modification records for relocation so that the linking and relocation functions are performed using the same mechanism.
  • The algorithm for a linking loader is more complicated than the absolute loader algorithm.
    • Usually makes two passes over its input

Linking Loader Algorithm

  • Pass 1: Assign addresses to all external symbols
    • Decides where each module will be located (PROGADDR & CSADDR).
      • PROGADDR is the beginning address in memory where the linked program is to be loaded
        » This information may be given by OS.
      • CSADDR is the starting address assigned to the control section currently being scanned by the loader
    • Prepare an external symbol table (ESTAB), for each external symbol.
      • ESTAB is used to store information of external symbols
      • [control section, symbol name, symbol address]

Fig. 3.11

  • Pass 2: Perform actual loading, relocation, and linking
    • For a modification record, search the external symbol table
    • Modify absolute addresses as defined in modification record in the code itself.
    • Transfer the control to the program loaded
  • Refer to Fig 3.11 for the linking loader algorithm

Dynamic Linking

  • Postpone linking process until the object code is actually used
    • A subroutine or library is loaded and linked to the rest of the program when it is first called
  • Achieve substantial savings of time and memory space
    • No need to load and link parts of a library that will not be used by the program
    • Allow several executing programs to share one copy of a subroutine or library
  • Object codes are stored in a “dynamic link library”

An Example of Dynamic Linking

user program이 실행되고 있는 상황에서 error handle이라는 subroutine을 호출하고 싶음.

A) error handle은 링킹이 이루어지지 않아 다이나믹 링킹이 필요하다.
user program은 os에 load-and-call request를 보낸다.

B)os에서는 dynamic linking의 로직을 수행하는 dynamic loader가 호출되서 library에 저장되어 있는 error handle을 메모리에 대신 load 해준다.

c) dynamic loader가 flow를 error handle로 jump 시켜서 error handle을 실행하게 한다.

d) error handle의 결과 값을 dynamic loader에 전달이 되고 다시 그 값을 user program으로 전달하는 방식

e) user program이 load-and-call 호출 -> error handle 실행하는 방식으로 subroutine이 진행된다.

링커를 이용하는 이유
링커와 로더의 역할.
boot loader, absolute loader 장단점
왜 비트 마스크를 적용할 수 없는지. 문제를 어떻게 해결하는지 이해하기
프로그램 링킹 뒤 이야기는 했던 이야기라 안중요
다른 예제 3개 각 컨트롤 섹션이 동일한 expression을 사용하지만,
컨트롤 섹션에 따라 처리되는 결과도 다름
3.10(a)
링킹이 일어나는 과정 3.10(b)
링킹 로더 알고리즘 2pass 처리
슈도 코드 외울 필요 없는데 어떤 의미인지 알아야한다!!!
다이나믹 링킹과 스태틱 링킹

0개의 댓글