FLOps: FLoating Point Operations Per Second

Eunbin Park · September 21, 2022

FLOps: pronounced "flops"

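A unit commonly used to express computer performance as a number.
It is based on the number of floating-point operations a computer can perform in one second.
Because the amount of work done per clock cycle differs across CPU architectures, FLOPS is used when comparing performance objectively.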

Formula

$\text{FLOPS} = \text{cores} \times \text{clock} \times \frac{\text{FLOPs}}{\text{cycle}}$
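The formula above can be sketched in a few lines of Python. The function name `peak_flops` and the CPU figures (8 cores, 3.0 GHz, 16 floating-point operations per cycle per core) are hypothetical values chosen for illustration, not measurements of any real chip.

```python
def peak_flops(cores: int, clock_hz: float, flops_per_cycle: int) -> float:
    """Theoretical peak FLOPS = cores x clock x floating-point ops per cycle."""
    return cores * clock_hz * flops_per_cycle


# Hypothetical CPU: 8 cores, 3.0 GHz, 16 floating-point ops per cycle per core.
peak = peak_flops(cores=8, clock_hz=3.0e9, flops_per_cycle=16)
print(f"{peak / 1e9:.1f} GFLOPS")  # prints "384.0 GFLOPS"
```

This is a theoretical upper bound; real workloads usually reach only a fraction of it.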

Example

Figure 1.

In Deep Learning

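Strictly speaking, FLOPS means the number of floating-point operations that can be performed per second, but in deep learning the term refers to the actual amount of computation a model requires. For that reason, in deep learning it is read as Floating Point Operations.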

Linear Layers

MAC: Multiply-Accumulate
$\text{MAC} = \text{output.shape} \times \text{input.shape}$
$\text{ADD} = \text{output.shape}$ (for bias)

➡️ $\text{FLOps} = 2 \times \text{MAC} + \text{ADD}$
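As a rough sketch of the linear-layer count, the helper below applies $2 \times \text{MAC} + \text{ADD}$ directly. The function name `linear_flops` and the layer sizes (512 inputs, 10 outputs) are assumptions made for the example.

```python
def linear_flops(in_features: int, out_features: int, bias: bool = True) -> int:
    """FLOps of a fully connected layer for a single input vector."""
    mac = out_features * in_features      # one multiply-accumulate per weight
    add = out_features if bias else 0     # one extra addition per output for the bias
    return 2 * mac + add                  # each MAC counts as a multiply plus an add


# Hypothetical layer: 512 inputs -> 10 outputs, with bias.
print(linear_flops(512, 10))  # prints 10250
```

For a batch of inputs, the result would be multiplied by the batch size.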

Convolution Layers

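A 3-dimensional tensor has shape $H \times W \times C$.
With a kernel of size $K \times K$, the layer performs $K \times K \times C_{in} \times C_{out} \times H_{out} \times W_{out}$ multiply-accumulates. Under the $2 \times \text{MAC}$ convention used for linear layers, the FLOps are twice this value, plus the bias additions.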

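$\text{N conv OPS} = \frac{\text{input}_h}{\text{stride}} \times \frac{\text{input}_w}{\text{stride}} = \text{output}^2$ (for a square input padded so that the output size is input / stride)
$\text{MAC/filter} = \text{kernelsize}^2 \times \text{InputChannels} \times \text{OutputChannels}$ (per filter position, summed over all output channels)
$\text{ADD} = \text{OutputChannels}$ (for bias, per filter position)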
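The three quantities above can be combined into a total for one convolution layer. The sketch below is one way to do that and makes several assumptions: the function name `conv2d_flops` is hypothetical, padding is chosen so that each output dimension equals the input dimension divided by the stride, and the total is taken as positions times $(2 \times \text{MAC} + \text{ADD})$, following the same convention as the linear layer.

```python
def conv2d_flops(input_h: int, input_w: int, in_channels: int, out_channels: int,
                 kernel_size: int, stride: int = 1, bias: bool = True) -> int:
    """Approximate FLOps of a 2D convolution for a single input image."""
    # Assumption: padding keeps output size = input size / stride in each dimension.
    out_h = input_h // stride
    out_w = input_w // stride
    positions = out_h * out_w                                  # N conv OPS
    mac = kernel_size ** 2 * in_channels * out_channels        # MACs per position
    add = out_channels if bias else 0                          # bias adds per position
    return positions * (2 * mac + add)


# Hypothetical layer: 32x32 RGB input, 16 filters of size 3x3, stride 1.
print(conv2d_flops(32, 32, in_channels=3, out_channels=16, kernel_size=3))  # prints 901120
```

Without the factor of 2 and the bias term, the same function reduces to the MAC count $K \times K \times C_{in} \times C_{out} \times H_{out} \times W_{out}$.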