Created: September 12, 2021, 6:31 PM
- Response time (latency, execution time): time elapsed between the start and end of a program
- Throughput: amount of work done in a fixed time
- A faster processor will improve both
- More processors will likely only improve throughput
- Some policies will improve throughput and worsen response time
Execution Time
- Consider a system "X" executing a fixed workload "W"
Speedup and Improvement Representation
System X executes a program in 10 seconds, system Y executes the same program in 15 seconds
- System X is 1.5 times faster than system Y
- The speedup of system X over system Y is 1.5 (the ratio)
- The performance improvement of X over Y is 1.5 - 1 = 0.5 = 50%
- The execution time reduction for the program, compared to Y is (15 - 10)/15 = 33%
- The execution time increase, compared to X is (15-10)/10 = 50%
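The four figures above all come from the same two execution times, so it can help to compute them side by side. The following is a minimal Python sketch; the function name `speedup` and the variable names are illustrative choices, not part of the original notes.

```python
def speedup(time_slow: float, time_fast: float) -> float:
    """Ratio of execution times: how many times faster the fast system is."""
    return time_slow / time_fast


time_x = 10.0  # seconds, system X
time_y = 15.0  # seconds, system Y

s = speedup(time_y, time_x)             # speedup of X over Y
improvement = s - 1                     # performance improvement of X over Y
reduction = (time_y - time_x) / time_y  # time saved, relative to Y
increase = (time_y - time_x) / time_x   # extra time on Y, relative to X

print(f"speedup of X over Y:      {s:.2f}")
print(f"performance improvement:  {improvement:.0%}")
print(f"execution time reduction: {reduction:.0%}")
print(f"execution time increase:  {increase:.0%}")
```

Note that the reduction (33%) and the increase (50%) describe the same 5-second gap; they differ only in which system's execution time is the denominator.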
Substituting into the CPU clock cycles term of Equation 1 gives the following.
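The equations themselves are not reproduced in these notes. Assuming Equation 1 is the standard CPU-time relation, and that the substituted term is cycles expressed as instruction count times CPI, the step would read roughly as follows. This is a reconstruction from the variables listed in this section, not the original formula.

```latex
\begin{align}
\text{CPU time} &= \text{CPU clock cycles} \times \text{clock cycle time} \\
\text{CPU clock cycles} &= \text{number of instructions} \times \text{CPI} \\
\text{CPU time} &= \text{number of instructions} \times \text{CPI} \times \text{clock cycle time}
\end{align}
```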
Variables used in the preceding equations
- Clock cycle time: manufacturing process (how fast is each transistor), how much work gets done in each pipeline stage (more on this later)
- Number of instrs: the quality of the compiler, the instruction set architecture
- CPI: the nature of each instruction and the quality of the architecture implementation
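To see how the three factors trade off, here is a small Python sketch that assumes CPU time is the product of instruction count, CPI, and clock cycle time. The function `cpu_time` and all the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def cpu_time(instruction_count: int, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = instructions * cycles per instruction * cycle time."""
    clock_cycle_time = 1.0 / clock_rate_hz
    return instruction_count * cpi * clock_cycle_time


# Hypothetical baseline: 2 billion instructions, CPI of 1.5, 2 GHz clock.
baseline = cpu_time(2_000_000_000, 1.5, 2.0e9)

# Each variant changes exactly one factor.
better_compiler = cpu_time(1_600_000_000, 1.5, 2.0e9)  # fewer instructions
better_design = cpu_time(2_000_000_000, 1.2, 2.0e9)    # lower CPI
faster_clock = cpu_time(2_000_000_000, 1.5, 2.5e9)     # shorter cycle time

for name, t in [
    ("baseline", baseline),
    ("better compiler", better_compiler),
    ("better design", better_design),
    ("faster clock", faster_clock),
]:
    print(f"{name:16s} {t:.2f} s")
```

In this made-up example, a 20% cut in instruction count, a 20% cut in CPI, and a 25% higher clock rate all produce the same CPU time, which is why none of the three factors can be judged in isolation.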
Benchmark Suites
- SPEC rating: SPEC stands for Standard Performance Evaluation Corporation
- The rating specifies how much faster a system is than a baseline machine
- SPEC rating 600 is 1.5 times faster than SPEC rating 400
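- How can the performance of 29 applications be expressed as a single performance number?
- SPEC uses the geometric mean (GM): the per-program values are multiplied together and the Nth root is taken
- Arithmetic mean (AM): the average of each program's execution time
- Weighted arithmetic mean: the execution times of some programs are weighted to balance priorities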
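The ways of collapsing per-program results into one number can be compared with a short Python sketch. The execution times and weights below are hypothetical; the geometric mean is included because SPEC reports a geometric mean of per-program ratios.

```python
import math


def arithmetic_mean(values):
    """Plain average of the values."""
    return sum(values) / len(values)


def weighted_arithmetic_mean(values, weights):
    """Average in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def geometric_mean(values):
    """Nth root of the product, computed via logs to avoid overflow."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical execution times (seconds) for three programs.
times = [2.0, 8.0, 4.0]
# Hypothetical weights: the first program is considered twice as important.
weights = [0.5, 0.25, 0.25]

am = arithmetic_mean(times)
wam = weighted_arithmetic_mean(times, weights)
gm = geometric_mean(times)

print(f"arithmetic mean:          {am:.2f}")
print(f"weighted arithmetic mean: {wam:.2f}")
print(f"geometric mean:           {gm:.2f}")
```

The arithmetic mean is pulled toward the longest-running program, while the weighted mean and the geometric mean both dampen that effect in different ways.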
Common principles
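- A system consumes energy even when idle (leakage energy)
- Frequency (CPU clock speed) does not directly affect leakage power; leakage energy keeps accumulating for as long as the program runs
- Performance improvements therefore generally lead to energy improvements as well
- 90-10 rule: 10% of the program accounts for 90% of execution time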
- Principle of locality : the same data/code will be used again (temporal locality), nearby data/code will be touched next (spatial locality)