[CA] CH2. Instructions: Language of the Computer

실버버드·2024년 10월 24일

Computer Architecture

목록 보기
2/7

Chapter 2: Instructions: Language of the Computer

Instuction: the words of a computer's language
(Instruction set: its vocabulary)
The RISC-V(Five) Instruction Set

Stored-Program Concept

: The idea that instructions and data of many types can be stored in memory as
numbers and thus be easy to change, leading to the stored-program computer.

RISC-V Assembly Language

1. Arithmetic Operations

Design Principle 1: Simplicity favors regularity.
• Regularity makes implementation simpler.
• Simplicity enables higher performance at a lower cost.

RISC-V Registers

Register Operands
32 x 32-bit register file
32-bit data; word, 64-bit data; doubleword
Design Principle 2: Smaller is faster.
• c.f. main memory: millions of locations.

2. Memory Operations

• Main memory used for composite data: Arrays, structures, dynamic data.
• To apply arithmetic operations,
Load values from memory into registers. (load)
Store result from register to memory. (store)
• Memory is byte addressed: Each address identifies an 8-bit byte.
• RISC-V is Little Endian: Least-significant byte at least address of a word. (higher number, lower address)
c.f. Big Endian: most-significant byte at least address. (opposite)
• RISC-V does not require words to be aligned in memory. (정렬x) Unlike some other ISAs.

Registers vs. Memory
Registers in processor(CPU) vs Memory
Compiler must use registers for variables as much as possible.
• Only spill to memory for less frequently used variables.
• Register optimization is important!

RISC-V Operands

Memory words: accessed only by data transfer instructions. In RISC-V, byte addresses, sequential word +4, memory holds data structures, arrays, and spilled register

Numbers

Unsigned Numbers

Range 0 ~ (2n2^n-1)
Using 32 bits: 0 ~ +4,294,967,295.
Using 64 bits: 0 ~ +18,446,774,073,709,551,615.

2s-Complement Signed Integers

Range: -2n12^{n-1}~ (2n12^{n-1}−1).
Using 32 bits: 2,147,483,648 ~ +2,147,483,647.
Using 64 bits: −9,223,372,036,854,775,808 ~ +9,223,372,036,854,775,807.
Bit 31 is sign bit: 1 for negative numbers/ 0 for non-negative numbers
Most-negative: 1000../ Most-positive: 0111...

Signed Negation

Complement and add 1
+2 = 0000 0010 (Flip bits)
-2 = 1111 1101 + 12_2
-2 = 1111 1110

Sign Extension

Replicate the sign bit to the left
-1: 1111 1101 -> 1111 1111 1111 1110

Hexadecimal
Base 16 (16진수)

Instructions: encoded in binary

RISC-V Reference Data

Instruction format

Instruction set

Opcodes

1) RISC-V R-format Instructions

Instruction fields:
• opcode: operation code.
• rd: destination register number.
• funct3: 3-bit function code (additional opcode).
• rs1: the first source register number.
• rs2: the second source register number.
• funct7: 7-bit function code (additional opcode).

2) RISC-V I-format Instructions

  • Immediate arithmetic and load instructions.
    • rs1: source or base address register number.
    • immediate: constant operand, or offset added to base address. 2s-complement, sign extended.

  • Design Principle 3: Good design demands good compromises.
    • Different formats complicate decoding, but allow 32-bit instructions uniformly.
    • Keep formats as similar as possible.

+) immediate 211 (2111)-2^11~(2^11-1)
if load instructions, immediate; a byte offset
We see that more than 32 registers would be difficult in this format, as the rd and rs1 fields would each need another bit, making it harder to fit everything in one word.

3) RISC-V S-format Instructions

Different immediate format for store instructions.
• rs1: base address register number.
• rs2: source operand register number.
• immediate: offset added to base address.
• Split so that rs1 and rs2 fields always in the same place.

Machine Language

• Machine language: Binary representation used for communication within a computer system.
• Machine code: A sequence of such instructions.
• Instruction format: Layout of the instruction

Translation Example

  • Binary compatibility: instructions and data stored in memory in binary, work on different computers

3. Logical Operations

Shift Operations

•immed: how many positions to shift.
• Shift left logical:
Shift left and fill with 0 bits.
slli by i bits multiplies by ×2i\times 2^i.
• Shift right logical.
Shift right and fill with 0 bits.
srli by i bits divides by ÷2i\div 2^i (unsigned only).

AND, OR Operations

and x9, x10, x11

XOR Operations

같으면 0 다르면 1

  • NOT Operation: xor x9, 10, x11 (x11 = 1111 1111...)

4. Conditional Operations

beq rs1, rs2, L1: branch if equal
bne rs1, rs2, L2: branch if not equal

Example1. if else

if (i == j) 
	f = (g + h);
else 
	f = g - h;

(i 22, j 23, f 19, g 20, h 21)

bne x22, x23, Else
add x19, x20, x21
beq x0, x0, Exit
Else: 
sub x19, x20, x21
Exit:

Example2. While Loop

while (save[i] == k)
	t += 1;

save address 25,i 22, k 24, t 10

lw s, i(x)
Loop:
slli x10, x22, 2 // left shift 2, *4
add x10, x10, x25 // x10 = save[i] address
lw x9, 0(x10)
bne x9, x24, Exit
addi x22, x22, 1
beq x0, x0, Loop
Exit:

basic block: a sequence of instructions without branches
In comilation, breaking the program into basic blocks compilation

More Conditional Operations
blt rs1, rs2, L1: branch if rs1 is less than rs2 (rs1 < rs2)
bge rs1, rs2, L1: branch if rs1 is greater than rs2 (rs1 >= rs2)
bltu, bgeu unsigned

Procedures

: A stored subroutine that performs a specific task based on the parameters with which it is provided.
• Six steps to follow:
1. Put parameters in a place where the procedure can access them
2. Transfer control to the procedure
3. Acquire the storage resources needed for the procedure
4. Perform the desired task
5. Put the result value in a place where the calling program can access it
6. Return control to the point or origin, since a procedure can be called from several pointers in a program
(put parameters near procedure, transfer control to procedure, make storage, perform, put result to calling program, return control

Registers
: fastest place to hold data in a computer

• x10-x17: eight parameter registers in which to pass parameters or return values.
• x1: one return address register to return to the point of origin. (원래 함수 위치 저장)

5. Unconditional Jump Instructions

  • jal x1, X: jump and link instruction- jumps to the address and saves the address of the following instruction in register x1
    X- callee, caller- the program call this instruction
  • jalr x0, 0(x1): jump and link register- jumps to the address specified in x1 (돌아감 branch)

Term
• Return address: A link to the calling site that allows a procedure to return to the proper address; in RISC-V it is stored in register x1.
• Program Counter (PC): The register containing the address of the instruction in the program being executed (instruction address register).
• Caller: The program that instigates a procedure and provides the necessary
parameter values.
• Callee: A procedure that executes a series of stored instructions based on
parameters provided by the caller and then returns control to the caller.

Leaf Procedure Example

int leaf_example(int g, int h, int i, int j){
	int f;
    f = (g + h) - (i + j);
    return f;
}

arguments g, h, i, j in x10, x11, x12, x13
f in x20
temporaries x5, x6
need to save x5, x6, x20 on stack (밑으로 쌓음)

leaf_example:
addi sp, sp, -12 
sw x5, 8(sp) // stack에 넣어놓고 계산, sp에서 +8한 위치
sw x6, 4(sp)
sw x20, 0(sp)
add x5, x10, x11 // x5 = x10 + x11
add x6, x12, x13 // x6 = x12 + x13
sub x20, x5, x6 // x20 = x5 - x6
addi x10, x20, 0 // x10 = x20
lw x20, 0(sp) // stack에서 다시 가져와서 return
lw x6, 4(sp)
lw x5, 8(sp)
addi sp, sp, 12
jalr x0, 0(x1)

Register Usage

x5-x7, x28-x31: temporary registers
x8-x9, x18-x27: saved registers- callee saves and restores them

Non-Leaf Procedures

: Procedures that call other procedures.
For nested call, caller needs to save on the stack: Its return address, Any arguments and temporaries needed after the call.
Restore from the stack after the call.

Example

int fact (int n){
	if (n < 0) return 1;
    else return n * fact(n-1);
}

argument n in x10
result in x10

fact:
addi sp,sp,-8
sw x1,4(sp)
sw x10,0(sp)
addi x5,x10,-1
bge x5,x0,L1 // if n >= 0, L1, else (n<0) continue
addi x10,x0,1 // x10 = 1
addi sp,sp,8 // stack에 store한 이번 x1, x10의미 없음, 전으로 이동
jalr x0,0(x1) // go back to next to jal x1,fact (recursion end)
L1:
addi x10,x10,-1
jal x1,fact // fact로
addi x6,x10,0 // x6 = x10 == 1
lw x10,0(sp) 
lw x1,4(sp)
addi sp,sp,8
mul x10,x10,x6 
jalr x0,0(x1) // ??

register x1, x5, x6, x10, stack 뭔지 알아보기

Character Data

• Byte-encoded character sets.
ASCII: 128 characters: 95 graphic, 33 control.
Latin-1: 256 characters: ASCII, +96 more graphic characters.
• Unicode: 32-bit character set.
Used in Java, C++ wide characters, …
Most of the world’s alphabets, plus symbols.
UTF-8, UTF-16: variable-length encodings.

Wide Immediates and Addresses

: optimization for instruction addresses used in branches

1. Wide Immediate Operand

lui: load upper immediate (U type)
To load a 20-bit constant into bits 12 through 31 (bit 12~31) of a register.
The leftmost 32 bits are filled with copies of bit 31, and the rightmost 12 bits are filled with zeros.
(leftmost 32 bits are bit 31, rightmost 12 bits are zeros, then 20-bit constant into bit to bit 12~31 and addi 12-bit constant)

Example.
wide immediate: 0000 0000 0011 1101 0000/ 0101 0000 0000
left 20-bit constant = 976
x19 = 0000 0000 0000 0000 0000 0000 0000 0000 0000
lui x19, 976
right 12-bit constant = 1280
addi x19, x19, 1280

2. Addressing in Branches

PC-relative addressing: Target address = PC + immediate x 2
bne x10,x11,2000: 2000 = immediate x 2

imm[12:1] = 0011 1110 10002_2 = 1000 (imm = 2000)

Uj-type format
jal x0, 2000: 2000 = immediate x 2

0000 0000 0111 1101 000(0임을 확신) = 2000, imm[0] == 0인줄 알고 생략

PC Relative Addressing

If addresses of the program had to fit in this 20-bit field, it would mean that no program could be bigger than 2202^{20} (far too small).
• An alternative would be to specify a register that would always be added to the branch offset.
Since the program counter (PC) contains the address of the current instruction,
we can branch within ±210\pm 2^{10} words of the current instruction,
or jump within ±218\pm 2^{18}' words of the current instruction,
if we use the PC as the register to be added to the address.

RISC-V Addressing Mode Summary

• Immediate addressing, where the operand is a constant within the instruction itself.
• Register addressing, where the operand is a register.
• Base or displacement addressing, where the operand is at the memory location whose address is the sum of a register and a constant in the instruction.
• PC-relative addressing, where the branch address is the sum of the PC and a constant in the instruction.

Fallacies, Remarks

(short, not complex, simple, modern, assembly code is good)
• Powerful instruction -> higher performance.
Fewer instructions required.
But complex instructions are hard to implement. May slow down all instructions, including simple ones.
Compilers are good at making fast code from simple instructions.
• Use assembly code for high performance.
But modern compilers are better at dealing with modern processors.
More lines of code -> more errors and less productivity.

• Design principles:
Simplicity favors regularity.
Smaller is faster.
Good design demands good compromises.
• Make the common case fast.
• Layers of software/hardware: Compiler, assembler, hardware.

0개의 댓글