[CSAPP] 8.1 Exceptions

JunHyeok Kim · April 23, 2024

8.1 Exceptions

Exceptions are a form of exceptional control flow that are implemented partly by the hardware and partly by the operating system. Because they are partly implemented in hardware, the details vary from system to system. However, the basic ideas are the same for every system. Our aim in this section is to give you a general understanding of exceptions and exception handling and to help demystify what is often a confusing aspect of modern computer systems.

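In other words, exceptions are a kind of exceptional control flow implemented partly in hardware and partly by the operating system. Because part of the mechanism lives in hardware, the details differ from system to system, but the basic ideas are the same everywhere.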

The goal of this section is a general understanding of exceptions and exception handling, an area of modern computer systems that often feels murky.

An exception is an abrupt change in the control flow in response to some change in the processor’s state. Figure 8.1 shows the basic idea.
In the figure, the processor is executing some current instruction Icurr when a significant change in the processor’s state occurs. The state is encoded in various bits and signals inside the processor. The change in state is known as an event.

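An exception is an abrupt change in control flow in response to a change in the processor's state. In the figure, the processor is executing instruction Icurr when a significant change in processor state occurs.
That state is encoded in various bits and signals inside the processor, and the change in state is called an event.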

The event might be directly related to the execution of the current instruction. For example, a virtual memory page fault occurs, an arithmetic overflow occurs, or an instruction attempts a divide by zero. On the other hand, the event might be unrelated to the execution of the current instruction. For example, a system timer goes off or an I/O request completes.

The event may be directly related to the current instruction: a virtual memory page fault, an arithmetic overflow, or an attempt to divide by zero. Or it may have nothing to do with the current instruction: a system timer going off, or an I/O request completing!

In any case, when the processor detects that the event has occurred, it makes an indirect procedure call (the exception), through a jump table called an exception table, to an operating system subroutine (the exception handler) that is specifically designed to process this particular kind of event. When the exception handler finishes processing, one of three things happens, depending on the type of event that caused the exception:

Either way, once the processor detects the event, it makes an indirect procedure call through a jump table called the exception table to an operating system subroutine (the exception handler) designed to process that particular kind of event.

1. The handler returns control to the current instruction Icurr, the instruction that was executing when the event occurred.
2. The handler returns control to Inext, the instruction that would have executed next had the exception not occurred.
3. The handler aborts the interrupted program.

Exception Table

My own summary! The text above is a bit too abstract..!

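Event Detection : The processor detects an event!
Indirect Procedure Call :
Once an event is detected, the processor has to hand control to the code that can deal with it. It does not call that code directly by name, the way we call an ordinary function; the call goes "indirectly" through a table lookup. That is why it is called an indirect procedure call.
Exception Table :
But where should control go? That information lives in the exception table! The processor jumps to the address stored in the table entry that matches the current event, so the right handling routine runs.
Exception Handler :
The place the processor lands after the jump is the exception handler. Here the processor executes the instructions that deal with the exception.
Processing :
One important point to remember: the exception handler runs its exception-processing code only after control has been transferred to it!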

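After the handler finishes its work, one of the three outcomes listed above occurs. They fall into two groups:

Resume Execution:
If the event is non-fatal, the program safely picks up where it left off, so execution continues at Icurr or Inext!
Abort Program:
If the event is fatal, or the program can no longer run safely, the exception handler has the operating system terminate the program.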

8.1.1 Exception Handling

Exceptions can be difficult to understand because handling them involves close cooperation between hardware and software. It is easy to get confused about which component performs which task. Let’s look at the division of labor between hardware and software in more detail.

Handling exceptions involves close cooperation between hardware and software, so it is easy to lose track of which component does which job.

Each type of possible exception in a system is assigned a unique nonnegative integer exception number. Some of these numbers are assigned by the designers of the processor. Other numbers are assigned by the designers of the operating system kernel (the memory-resident part of the operating system). Examples of the former include divide by zero, page faults, memory access violations, breakpoints, and arithmetic overflows. Examples of the latter include system calls and signals from external I/O devices.

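Every possible type of exception in the system is assigned a unique "exception number". Some numbers are assigned by the processor's designers, the rest by the designers of the operating system kernel.
1. Hardware case (processor designers): divide by zero, page fault, memory access violation, breakpoint, arithmetic overflow.
2. Software case (kernel designers): system calls, signals from external I/O devices.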

At system boot time (when the computer is reset or powered on), the operating system allocates and initializes a jump table called an exception table, so that entry K contains the address of the handler for exception K. Figure 8.2 shows the format of an exception table.

At boot time, the OS allocates and initializes the exception table so that entry k holds the address of the handler for exception k.

At run time (when the system is executing some program), the processor detects that an event has occurred and determines the corresponding exception number k. The processor then triggers the exception by making an indirect procedure call, through entry k of the exception table, to the corresponding handler. Figure 8.3 shows how the processor uses the exception table to form the address of the appropriate exception handler. The exception number is an index into the exception table, whose starting address is contained in a special CPU register called the exception table base register.

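At run time, the processor detects that an event has occurred and determines the corresponding exception number k. It then triggers the exception by making an indirect procedure call to the handler through entry k of the exception table. The exception number is an index into the table, whose starting address is held in a special CPU register, the exception table base register.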
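To make the table lookup concrete, here is a minimal user-space sketch in C. It only models the idea: the array, the handler functions, and slot 32 for a timer interrupt are names and numbers I made up for illustration, and a real processor performs this lookup in hardware, not in C. Exception numbers 0 (divide error) and 14 (page fault) follow the Linux/x86-64 numbering that appears in 8.1.3.

```c
#include <assert.h>
#include <stddef.h>

/* What a handler asks for when it finishes (hypothetical names). */
typedef enum { RESUME_CURRENT, RESUME_NEXT, ABORT_PROGRAM } outcome_t;
typedef outcome_t (*handler_t)(void);

#define NUM_EXCEPTIONS 256

static outcome_t divide_error_handler(void)    { return ABORT_PROGRAM; }
static outcome_t page_fault_handler(void)      { return RESUME_CURRENT; }
static outcome_t timer_interrupt_handler(void) { return RESUME_NEXT; }
static outcome_t unexpected_handler(void)      { return ABORT_PROGRAM; }

/* The jump table: entry k holds the address of the handler for exception k. */
static handler_t exception_table[NUM_EXCEPTIONS];

/* Stands in for the exception table base register. */
static handler_t *exception_table_base = exception_table;

/* Boot time: the OS allocates and fills in the table. */
void init_exception_table(void) {
    for (size_t k = 0; k < NUM_EXCEPTIONS; k++)
        exception_table[k] = unexpected_handler;
    exception_table[0]  = divide_error_handler;     /* divide error */
    exception_table[14] = page_fault_handler;       /* page fault */
    exception_table[32] = timer_interrupt_handler;  /* assumed OS-defined slot */
}

/* Run time: index the table with the exception number and make an
 * indirect call through that entry. */
outcome_t trigger_exception(size_t k) {
    assert(k < NUM_EXCEPTIONS);
    handler_t handler = exception_table_base[k];
    return handler();
}
```

After calling init_exception_table(), trigger_exception(14) runs the page fault handler and reports RESUME_CURRENT, while trigger_exception(0) reports ABORT_PROGRAM.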

An exception is similar to a procedure call, but with some important differences!

As with a procedure call, the processor pushes a return address on the stack before branching to the handler. However, depending on the class of exception, the return address is either the current instruction (the instruction that was executing when the event occurred) or the next instruction (the instruction that would have executed after the current instruction had the event not occurred).

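As with a procedure call, the processor pushes a return address onto the stack before branching to the handler. Depending on the class of exception, though, that return address is either the current instruction or the next one.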

The processor also pushes some additional processor state onto the stack that will be necessary to restart the interrupted program when the handler returns. For example, an x86-64 system pushes the EFLAGS register containing the current condition codes, among other things, onto the stack.

The processor also pushes extra processor state that will be needed to restart the interrupted program. On x86-64, for example, it pushes the EFLAGS register, which holds the current condition codes.

When control is being transferred from a user program to the kernel, all of these items are pushed onto the kernel’s stack rather than onto the user’s stack.

When control passes from a user program to the kernel, all of these items go onto the kernel's stack, not the user's stack.

Exception handlers run in kernel mode (Section 8.2.4), which means they have complete access to all system resources.

Exception handlers run in kernel mode, which means they can access all system resources. (Is that why the handler can terminate the program when a serious exception occurs?) -> Yes.

Once the hardware triggers the exception, the rest of the work is done in software by the exception handler. After the handler has processed the event, it optionally returns to the interrupted program by executing a special “return from interrupt” instruction, which pops the appropriate state back into the processor’s control and data registers, restores the state to user mode (Section 8.2.4) if the exception interrupted a user program, and then returns control to the interrupted program.

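Once the hardware triggers the exception, the exception handler does the rest of the work in software. After processing the event, the handler optionally returns to the interrupted program by executing a special "return from interrupt" instruction. That instruction pops the saved state from the stack back into the processor's control and data registers, restores user mode if the exception interrupted a user program, and finally returns control to the interrupted program.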

8.1.2 Classes of Exceptions

Interrupts

Interrupts occur asynchronously as a result of signals from I/O devices that are external to the processor. Hardware interrupts are asynchronous in the sense that they are not caused by the execution of any particular instruction. Exception handlers for hardware interrupts are often called interrupt handlers.

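Interrupts occur asynchronously, as a result of signals from I/O devices outside the processor. Hardware interrupts are asynchronous in the sense that they are not caused by the execution of any particular instruction. Exception handlers for hardware interrupts are often called interrupt handlers.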

Figure 8.5 summarizes the processing for an interrupt. I/O devices such as network adapters, disk controllers, and timer chips trigger interrupts by signaling a pin on the processor chip and placing onto the system bus the exception number that identifies the device that caused the interrupt.

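Figure 8.5 summarizes interrupt processing. I/O devices such as network adapters, disk controllers, and timer chips trigger an interrupt by signaling a pin on the processor chip and placing on the system bus the exception number that identifies the device that caused the interrupt.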

After the current instruction finishes executing, the processor notices that the interrupt pin has gone high, reads the exception number from the system bus, and then calls the appropriate interrupt handler. When the handler returns, it returns control to the next instruction (i.e., the instruction that would have followed the current instruction in the control flow had the interrupt not occurred). The effect is that the program continues executing as though the interrupt had never happened.
The remaining classes of exceptions (traps, faults, and aborts) occur synchronously as a result of executing the current instruction. We refer to this instruction as the faulting instruction.

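After the current instruction finishes, the processor notices that the interrupt pin has gone high.

It reads the exception number from the system bus and calls the appropriate interrupt handler.

When the handler returns, control goes to the next instruction.

The effect is that the program keeps running as though the interrupt had never happened.

The remaining classes of exceptions (traps, faults, and aborts) occur synchronously, as a result of executing the current instruction, which is called the faulting instruction.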

Traps and System Calls

Traps are intentional exceptions that occur as a result of executing an instruction. Like interrupt handlers, trap handlers return control to the next instruction. The most important use of traps is to provide a procedure-like interface between user programs and the kernel, known as a system call.

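A trap is an "intentional" exception that occurs as a result of executing an instruction.
Like interrupt handlers, trap handlers return control to the next instruction. The most important use of traps is to provide a procedure-like interface between user programs and the kernel, known as a system call.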

User programs often need to request services from the kernel such as reading a file (read), creating a new process (fork), loading a new program (execve), and terminating the current process (exit). To allow controlled access to such kernel services, processors provide a special syscall "n" instruction that user programs can execute when they want to request service "n". Executing the syscall instruction causes a trap to an exception handler that decodes the argument and calls the appropriate kernel routine.

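User programs often need to ask the kernel for services such as reading a file (read), creating a new process (fork), loading a new program (execve), or terminating the current process (exit).
To allow controlled access to these services, processors provide a special syscall n instruction that a user program executes when it wants to request service n.
Executing the syscall instruction causes a trap to an exception handler that decodes the argument and calls the appropriate kernel routine.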

From a programmer’s perspective, a system call is identical to a regular function call. However, their implementations are quite different. Regular functions run in user mode, which restricts the types of instructions they can execute, and they access the same stack as the calling function. A system call runs in kernel mode, which allows it to execute privileged instructions and access a stack defined in the kernel. Section 8.2.4 discusses user and kernel modes in more detail.

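From the programmer's point of view a system call looks just like an ordinary function call, but the implementation is quite different. An ordinary function runs in user mode, which restricts the instructions it can execute, and it uses the same stack as its caller. A system call runs in kernel mode, so it can execute privileged instructions and it accesses a stack defined in the kernel.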

Faults (clearly not the same thing as errors!)

Faults result from error conditions that a handler might be able to correct. When a fault occurs, the processor transfers control to the fault handler. If the handler is able to correct the error condition, it returns control to the faulting instruction, thereby re-executing it. Otherwise, the handler returns to an abort routine in the kernel that terminates the application program that caused the fault. Figure 8.7 summarizes the processing for a fault.

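A fault results from an error condition that a handler might be able to correct. When a fault occurs, the processor transfers control to the fault handler. If the handler can correct the condition, it returns control to the faulting instruction, which is then re-executed.
If the condition cannot be corrected, the handler returns to an abort routine in the kernel that terminates the program that caused the fault.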

A classic example of a fault is the page fault exception, which occurs when an instruction references a virtual address whose corresponding page is not resident in memory and must therefore be retrieved from disk. As we will see in Chapter 9, a page is a contiguous block (typically 4 KB) of virtual memory.

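The classic example of a fault is the page fault exception! It occurs when an instruction references a virtual address whose page is not resident in memory and therefore has to be fetched from disk. A page is a contiguous block of virtual memory (typically 4 KB).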

The page fault handler loads the appropriate page from disk and then returns control to the instruction that caused the fault. When the instruction executes again, the appropriate page is now resident in memory and the instruction is able to run to completion without faulting.

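The page fault handler loads the appropriate page from disk and returns control to the faulting instruction. When that instruction runs again, the page is resident in memory, so the instruction completes without faulting!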
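One way to watch page faults happen from user space is to count them around a loop that touches freshly mapped memory. The sketch below is a rough experiment, assuming Linux with mmap and getrusage; count_faults_touching is a name I made up. The faults it sees are mostly minor faults (the kernel supplies a zeroed page without reading from disk), so it illustrates the fault-and-restart cycle rather than the disk-backed case described above, and the exact count depends on the kernel and its settings.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Maps npages of anonymous memory, writes one byte to each page, and
 * returns how many page faults the process took while doing so.
 * Returns -1 if the mapping or the measurement fails. */
long count_faults_touching(size_t npages) {
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0 || npages == 0)
        return -1;

    size_t len = npages * (size_t)page_size;
    void *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return -1;

    /* volatile keeps the compiler from dropping the stores. */
    volatile unsigned char *mem = raw;
    struct rusage before, after;

    if (getrusage(RUSAGE_SELF, &before) != 0) {
        munmap(raw, len);
        return -1;
    }

    /* The first write to each page can fault; the kernel maps a page
     * and the store instruction is restarted transparently. */
    for (size_t i = 0; i < npages; i++)
        mem[i * (size_t)page_size] = 1;

    int rc = getrusage(RUSAGE_SELF, &after);
    munmap(raw, len);
    if (rc != 0)
        return -1;

    return (after.ru_minflt - before.ru_minflt) +
           (after.ru_majflt - before.ru_majflt);
}
```

The program never notices the faults themselves. The stores simply succeed, which is exactly the "restart the faulting instruction" behavior; only the counters reveal that the kernel stepped in.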

Aborts

Aborts result from unrecoverable fatal errors, typically hardware errors such as parity errors that occur when DRAM or SRAM bits are corrupted. Abort handlers never return control to the application program. As shown in Figure 8.8, the handler returns control to an abort routine that terminates the application program.

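Aborts result from unrecoverable fatal errors, typically hardware errors such as the parity errors that occur when DRAM or SRAM bits are corrupted. An abort handler never returns control to the application program; it passes control to an abort routine that terminates it.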
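The four classes can be summed up as a small decision function. This is just my condensed model of the rules above, written in C; the enum and function names are made up for illustration and do not come from any real kernel.

```c
#include <stdbool.h>

typedef enum {
    CLASS_INTERRUPT,  /* signal from an I/O device */
    CLASS_TRAP,       /* intentional exception, e.g. a system call */
    CLASS_FAULT,      /* potentially recoverable error */
    CLASS_ABORT       /* unrecoverable error */
} exc_class_t;

typedef enum { RETURN_TO_NEXT, RETURN_TO_CURRENT, NO_RETURN } return_kind_t;

/* Only interrupts are asynchronous; the other three classes are caused
 * by executing the current instruction. */
bool is_synchronous(exc_class_t cls) {
    return cls != CLASS_INTERRUPT;
}

/* Where control goes once the handler is done. For faults the answer
 * depends on whether the handler managed to correct the error. */
return_kind_t return_behavior(exc_class_t cls, bool handler_corrected_error) {
    switch (cls) {
    case CLASS_INTERRUPT:
    case CLASS_TRAP:
        return RETURN_TO_NEXT;
    case CLASS_FAULT:
        return handler_corrected_error ? RETURN_TO_CURRENT : NO_RETURN;
    case CLASS_ABORT:
        return NO_RETURN;
    }
    return NO_RETURN;
}
```

For example, return_behavior(CLASS_FAULT, true) gives RETURN_TO_CURRENT, which is the page fault case: the faulting instruction is executed again.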

8.1.3 Exceptions in Linux/x86-64 Systems

Divide error.

A divide error (exception 0) occurs when an application attempts to divide by zero or when the result of a divide instruction is too big for the destination operand. Unix does not attempt to recover from divide errors, opting instead to abort the program. Linux shells typically report divide errors as “Floating exceptions.”

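Occurs when a program tries to divide by zero, or when the result of a divide instruction is too big for the destination operand. Unix does not try to recover and aborts the program. Linux shells report divide errors as "Floating exceptions."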
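The "Floating exception" message can be observed from C by letting a child process do the division and checking which signal killed it. This is a sketch under some assumptions: signal_from_dividing is a name I made up, and integer division by zero is undefined behavior in C, so this shows what typically happens on Linux/x86-64 (the child dies with SIGFPE) rather than something to rely on. On other architectures the division may not trap at all.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Computes num / den in a child process. Returns the number of the
 * signal that killed the child, 0 if the child exited normally, or -1
 * if fork or waitpid failed. */
int signal_from_dividing(int num, int den) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* volatile forces a real divide instruction at run time. */
        volatile int n = num, d = den;
        volatile int q = n / d;
        (void)q;
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Running the division in a child keeps the parent alive, so the parent can inspect the result the same way a shell does before it prints its message.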

General protection fault.

The infamous general protection fault (exception 13) occurs for many reasons, usually because a program references an undefined area of virtual memory or because the program attempts to write to a read-only text segment. Linux does not attempt to recover from this fault. Linux shells typically report general protection faults as “Segmentation faults.”

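The infamous general protection fault. It occurs when a program references an undefined area of virtual memory or tries to write to a read-only segment. Linux does not try to recover, and shells report it as a "Segmentation fault."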
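A similar experiment shows the "Segmentation fault" case: a child process writes to a page it mapped read-only, and the parent checks which signal ended it. This is a sketch assuming Linux; signal_from_readonly_write is a made-up name. As far as I understand, on x86-64 this particular access reaches the kernel as a page fault with a protection violation rather than as exception 13, but the signal the shell reports is SIGSEGV either way.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Writes to a read-only page in a child process. Returns the number of
 * the signal that killed the child, 0 if it exited normally, or -1 if
 * fork or waitpid failed. */
int signal_from_readonly_write(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        void *page = mmap(NULL, 4096, PROT_READ,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED)
            _exit(1);
        volatile char *p = page;
        p[0] = 'x';   /* write to a read-only page: not recoverable */
        _exit(0);     /* not reached if the write faults */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Unlike the page fault case, the handler cannot fix this error, so the faulting instruction is never restarted and the process is terminated.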

Page fault.

A page fault (exception 14) is an example of an exception where the faulting instruction is restarted. The handler maps the appropriate page of virtual memory on disk into a page of physical memory and then restarts the faulting instruction. We will see how page faults work in detail in Chapter 9.

A page fault is an exception where the faulting instruction is restarted.
The handler maps the needed page of virtual memory on disk into a page of physical memory and then restarts the faulting instruction.

Machine check.

A machine check (exception 18) occurs as a result of a fatal hardware error that is detected during the execution of the faulting instruction. Machine check handlers never return control to the application program.

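A machine check occurs as the result of a fatal hardware error detected while the faulting instruction was executing. Its handler, too, never returns control to the program.
-> In other words, the program is terminated. (Abort)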

Linux/x86-64 System Calls

Linux provides hundreds of system calls that application programs use when they want to request services from the kernel, such as reading a file, writing a file, and creating a new process. Figure 8.10 lists some popular Linux system calls. Each system call has a unique integer number that corresponds to an offset in a jump table in the kernel. (Notice that this jump table is not the same as the exception table.)

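Linux provides hundreds of system calls that user programs can use to request kernel services such as creating a process, reading a file, or writing a file. Each system call has a unique integer number that corresponds to an offset in a jump table in the kernel (this jump table is not the same as the exception table).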

C programs can invoke any system call directly by using the syscall function. However, this is rarely necessary in practice. The C standard library provides a set of convenient wrapper functions for most system calls. The wrapper functions package up the arguments, trap to the kernel with the appropriate system call instruction, and then pass the return status of the system call back to the calling program. Throughout this text, we will refer to system calls and their associated wrapper functions interchangeably as system-level functions.

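A C program can invoke any system call directly with the syscall function, but in practice this is rarely needed. The C library provides convenient wrapper functions for most system calls. A wrapper packages up the arguments, traps to the kernel with the appropriate system call instruction, and passes the return status of the system call back to the caller.
From here on, system calls and their wrapper functions are both referred to as system-level functions!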
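The difference between the two routes is easy to see side by side. The sketch below assumes Linux with glibc, where syscall() is declared in unistd.h and SYS_write comes from sys/syscall.h; the two function names are mine.

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Direct route: name the system call by number and pass the arguments
 * through the generic syscall() function. */
long write_with_syscall(int fd, const char *msg) {
    return syscall(SYS_write, fd, msg, strlen(msg));
}

/* Usual route: the write() wrapper from the C library, which ends up
 * making the same system call. */
long write_with_wrapper(int fd, const char *msg) {
    return (long)write(fd, msg, strlen(msg));
}
```

Both functions return the number of bytes written on success, and both return -1 with errno set on failure, because syscall() follows the same error convention as the wrappers.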

System calls are provided on x86-64 systems via a trapping instruction called syscall. It is quite interesting to study how programs can use this instruction to invoke Linux system calls directly. All arguments to Linux system calls are passed through general-purpose registers rather than the stack.

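On x86-64, system calls are provided through a trapping instruction called syscall. Studying how a program can use this instruction to invoke Linux system calls directly is supposed to be quite interesting (?).
All arguments to Linux system calls are passed through general-purpose registers rather than on the stack.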
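Here is a hedged sketch of what using the instruction directly can look like from C, via GCC-style inline assembly. It assumes Linux on x86-64 and a compiler that understands GCC extended asm; raw_write is a made-up name. On any other platform the code falls back to the ordinary syscall() function so that it still compiles, although the fallback reports errors as -1 with errno rather than as a raw negative value.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issues the write system call without going through the C library
 * wrapper. Returns the raw value the kernel leaves in %rax. */
long raw_write(int fd, const void *buf, size_t len) {
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                 /* return value comes back in %rax */
        : "a"((long)SYS_write),     /* syscall number goes in %rax */
          "D"((long)fd),            /* 1st argument in %rdi */
          "S"(buf),                 /* 2nd argument in %rsi */
          "d"(len)                  /* 3rd argument in %rdx */
        : "rcx", "r11", "memory");  /* %rcx and %r11 are destroyed */
    return ret;
#else
    /* Fallback for other platforms: -1 and errno on error. */
    return syscall(SYS_write, fd, buf, len);
#endif
}
```

A call such as raw_write(1, "hi\n", 3) writes three bytes to standard output and returns 3. The clobber list tells the compiler that %rcx and %r11 do not survive the instruction.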

By convention, register %rax contains the syscall number, with up to six arguments in %rdi, %rsi, %rdx, %r10, %r8, and %r9. The first argument is in %rdi, the second in %rsi, and so on. On return from the system call, registers %rcx and %r11 are destroyed, and %rax contains the return value. A negative return value between −4,095 and −1 indicates an error corresponding to negative errno.

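By convention, %rax holds the syscall number, and up to six arguments go in %rdi, %rsi, %rdx, %r10, %r8, and %r9, in that order. On return from the system call, registers %rcx and %r11 are destroyed, and %rax holds the return value. A return value between -4,095 and -1 indicates an error corresponding to negative errno.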
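The last rule is what lets a wrapper turn the kernel's raw return value into the familiar "-1 and errno" convention. The function below is a small sketch of that conversion, not the C library's actual code; to_libc_style is a made-up name.

```c
#include <errno.h>

/* Converts a raw syscall return value (as left in %rax) into the
 * convention used by the C library wrappers: values in [-4095, -1]
 * become -1 with errno set to the positive error number, and every
 * other value is passed through unchanged. */
long to_libc_style(long raw) {
    if (raw >= -4095 && raw <= -1) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

So a raw return of -2 becomes a return value of -1 with errno set to 2, while a raw return of 5 (say, five bytes written) is handed back unchanged.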
