Intro. Parallel Programming

skang6283 · January 31, 2021

Throughput: the number of computing tasks completed per unit of time.
Latency: the delay between invoking an operation and getting its response.
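To make the two definitions concrete, here is a minimal sketch of a toy model. It assumes identical, independent tasks and workers that never slow each other down; the function name and the 10 ms task time are made up for illustration.

```python
def latency_and_throughput(task_seconds, workers):
    """Return (latency, throughput) for identical, independent tasks.

    Assumption: each worker handles one task at a time and workers
    do not interfere with each other.
    """
    latency = task_seconds                # how long one task waits for its result
    throughput = workers / task_seconds   # tasks completed per second overall
    return latency, throughput


# Hypothetical 10 ms task, run on 1 worker and then on 8 workers.
one = latency_and_throughput(0.010, workers=1)
many = latency_and_throughput(0.010, workers=8)
print("1 worker :", one)
print("8 workers:", many)
```

In this model, adding workers raises throughput but leaves the latency of each individual task unchanged, which is the distinction the rest of these notes rely on.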

CPUs and GPUs follow fundamentally different design philosophies.

CPU

excels at executing a sequence of operations (a thread) as fast as possible
can execute tens of these threads in parallel
uses data caching and flow control to reduce instruction and data access latencies
designed to minimize latency at the cost of increased chip area and power (latency-oriented design)

GPU

excels at executing thousands of threads in parallel
devotes more of the chip to data processing than to caching and flow control
memory bandwidth is roughly 10x that of a CPU (data can be transferred from the memory system to the processors more quickly)
reducing latency costs more power and chip area than increasing throughput, so each unit is kept small and simple to fit more ALUs on the chip (throughput-oriented design)
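The CPU/GPU contrast above can be sketched with a rough back-of-the-envelope model. The core counts and per-task times below are hypothetical numbers chosen only for illustration, not measurements of any real processor, and the model ignores memory traffic and launch overhead.

```python
import math


def batch_time(num_tasks, cores, seconds_per_task):
    """Time to finish num_tasks independent tasks, one task per core at a time."""
    rounds = math.ceil(num_tasks / cores)  # number of "waves" of tasks needed
    return rounds * seconds_per_task


# Hypothetical processors (illustrative values only).
CPU = dict(cores=8, seconds_per_task=1e-9)      # few cores, each very fast
GPU = dict(cores=4096, seconds_per_task=20e-9)  # many cores, each much slower

for n in (1, 100, 10_000, 1_000_000):
    cpu_t = batch_time(n, **CPU)
    gpu_t = batch_time(n, **GPU)
    winner = "CPU" if cpu_t < gpu_t else "GPU"
    print(f"{n:>9} tasks: CPU {cpu_t:.2e} s, GPU {gpu_t:.2e} s -> {winner}")
```

With these made-up numbers the latency-oriented processor wins when there are only a few tasks, and the throughput-oriented one wins once there is enough independent work to keep its many slow cores busy.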

***

Challenges

it can be difficult to design a parallel algorithm that performs better than the sequential one.
the execution speed of many applications is limited by memory access speed.
the execution speed of a parallel program is often more sensitive to input data characteristics than that of its sequential counterpart.
many real-world problems are most naturally described with mathematical recurrences. Parallelizing these problems often requires a non-intuitive way of thinking about the problem and may require redundant work during execution.
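The last point can be illustrated with a prefix sum, a standard example of a recurrence (s[i] = s[i-1] + x[i]). The sketch below is plain Python that only simulates the parallel steps sequentially; it uses a Hillis-Steele-style scan as one possible parallel formulation, and the function names are invented for this example.

```python
def sequential_scan(xs):
    """Inclusive prefix sum via the recurrence s[i] = s[i-1] + x[i]."""
    out, adds, running = [], 0, 0
    for i, x in enumerate(xs):
        if i == 0:
            running = x
        else:
            running = running + x  # depends on the previous result
            adds += 1
        out.append(running)
    return out, adds


def stepwise_scan(xs):
    """Hillis-Steele-style inclusive scan, simulated sequentially.

    Within one step, every element update is independent of the others,
    so a parallel machine could perform the whole step at once.
    """
    cur, adds, steps, offset = list(xs), 0, 0, 1
    while offset < len(cur):
        nxt = list(cur)  # read from cur, write to nxt (double buffering)
        for i in range(offset, len(cur)):
            nxt[i] = cur[i] + cur[i - offset]
            adds += 1
        cur = nxt
        offset *= 2
        steps += 1
    return cur, adds, steps


data = list(range(1, 17))  # 16 elements
seq, seq_adds = sequential_scan(data)
par, par_adds, par_steps = stepwise_scan(data)
print("same result:", seq == par)
print("sequential additions:", seq_adds)
print("stepwise additions:", par_adds, "in", par_steps, "steps")
```

For 16 elements the recurrence needs 15 additions performed one after another, while the stepwise version performs 49 additions but in only 4 steps. The extra additions are the redundant work that buys the shorter chain of dependent steps.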

Amdahl's Law

The speedup one can achieve through parallel execution is bounded by the fraction of the application that can be parallelized: the part that stays sequential caps the overall gain, no matter how many processors are added.
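In its usual form, Amdahl's Law says that if a fraction p of the runtime can be parallelized across n workers, the overall speedup is 1 / ((1 - p) + p / n), which can never exceed 1 / (1 - p). A small sketch, with an illustrative function name and example fractions:

```python
def amdahl_speedup(parallel_fraction, workers):
    """Overall speedup when parallel_fraction of the runtime is split across workers."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Example fractions chosen for illustration, each run with 100 workers.
for p in (0.5, 0.9, 0.99):
    speedup = amdahl_speedup(p, workers=100)
    limit = 1.0 / (1.0 - p)  # upper bound as workers grow without limit
    print(f"p = {p}: speedup with 100 workers = {speedup:.2f}, upper bound = {limit:.0f}")
```

Even with 100 workers, an application that is only 50% parallelizable cannot reach a 2x speedup, which is why the sequential portion matters so much.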
