Amortized analysis (1)

KVV·2024년 11월 6일

각 연산들을 수행하는데 걸리는 평균 수행 시간을 나타낸다.

Amortized analysis 를 사용하는 이유는 T(n)=O(n) 인 경우, 항상 O(n)의 연산이 발생하는 것이 아니라 대부분 O(1)의 연산으로 처리되는데 모든 연산을 O(n)으로 처리하는 것이 아깝기 때문이다.

여러 연산 중 특정 연산의 시간복잡도가 매우 커도 개별 연산자들의 시간복잡도를 평균적으로 보면 효율적이게 동작함을 보일 수 있다.

Worst case에도 개별 연산자에 대한 평균적인 비용은 효율적임을 보일 수 있다.

Amortized Analysis 의 방법

1. Aggregate analysis

n개의 임의의 시퀀스가 있을 때, $c_i$ : i번째 연산을 수행하는 데 걸리는 실제 수행 시간이라고 하자

$T(n) \ge \displaystyle\sum_{i=1}^{n} c_i$ 를 만족하는 T(n), 즉 upper bound 를 찾아 T(n)/n 을 구하는 방법

이 방법에서는 모든 각각 연산의 평균적인 수행 시간은 동일하다.

Example of Aggregate analysis

1. Stack operation

push: O(1)
pop: O(1)
multipop: O(n), stack이 가득 차 있는 경우

MULTIPOP(S, k)
	while not STACK-EMPTY(S) and k>0
    	POP(S)
        k = k - 1
        
// cose of MULTIPOP(S, k) 는 O(k) 이다.
// 만약 k가 현재 stack 에 들어있는 데이터 개수보다 크다면  
// O(현재 stack 에 들어있는 데이터 개수)

T(n) 을 구하자

n개의 연산은 PUSH, POP, MULTIPOP 으로만 이루어져 있다.
MULTIPOP은 k개의 POP과 다를 바 없다.
$n \ge$ #(push) $\ge$ #(pop)
$2n \ge$ #(push) $+$ #(pop)
T(n) = O( $2n *1$ ) = O(n)

Amortized cost

$O(n) / n$ = $O(1)$

이를 통해 stack operation(PUSH, POP, MULTIPOP)의 worst case: O(n) (multipop(S, n)) 이지만, amortized analysis를 통해 평균적으로는 O(1)임을 확인할 수 있다.

2. Incrementing binary bit counter

A[0]이 가장 낮은 자리를 나타내는 A[0...k-1] 을 k-bit binary counter로 생각하자.

INCREMENT(A) //0...2^k-1
	i = 0
    while i < A.length and A[i] == 1
    	A[i] = 0 //Bit flip, O(1)
        i = i + 1
    if i < A.length
    	A[i] = 1 //Bit flip, O(1)

각 INCREMENT operation의 cost는 직전 binary code와 비교해서 bit flip 된 bit의 개수이다.

T(n) 을 구하자

각 INCREMENT operation의 시간 복잡도는 최대 k개의 bit flip이 발생할 수 있기 때문에 O(k)이다.
INCREMENT operation의 개수는 $\theta(n)$ 이다.
전체의 upper bound는 $T(n) = \theta(n) * O(k) = O(nk)$ 이다.

위 T(n)보다 tight한 upper bound를 구할 수 있다.

그림을 보면 A[0] = n번, A[1] = $\lfloor n/2 \rfloor$ 번, A[2] = $\lfloor n/4 \rfloor$ 번 bit flip이 발생한다.
이를 일반화하면 A[i] = $\lfloor n/2^{i} \rfloor$ 번 bit flip이 발생한다.
Tighter upper bound
- $\displaystyle\sum_{i=0}^{k-1} \lfloor n/2^{i} \rfloor < \displaystyle\sum_{i=0}^{\infty} n/2^{i} = 2n = O(n)$

Amortized cost

$O(n)/n = O(1)$

이를 통해 INCREMENT 연산의 worst case: O(k) (All bits are flipped) 이지만, amortized analysis 를 통해 평균적으로는 O(1)임을 확인할 수 있다.

2. Accounting method

Aggregate analysis와 다르게 여러 개의 연산 유형이 존재할 때, 각 유형에 맞게 amortized cost를 부여하는 방법

각 연산의 amortized cost가 actual cost 보다 크면 된다.

$\hat{c_i}$ : amortized cost, $c_i$ : actual cost 일 때,

$\displaystyle\sum_{i=1}^{n} \hat{c_i}$ $- \displaystyle\sum_{i=1}^{n}c_i \ge 0$ 을 만족해야한다.

위와 같은 부등식을 만족해야하는 이유는 특정 연산에서 amortized cost - actual cost 가 음수라면 다른 연산에서 남는 cost(credit)를 사용할 수 있도록 하여 total cost가 유지될 수 있도록 하기 위해서이다.

Credit = $\hat{c_i}$ - $c_i$
Total credit = $\displaystyle\sum_{i=1}^{n} \hat{c_i}$ $- \displaystyle\sum_{i=1}^{n}c_i \ge 0$

각 연산 진행 후 credit도 자료구조에 같이 저장한다.

Example of Accounting analysis

1. Stack operation

Actual cost
- PUSH: 1
- POP: 1
- MULTIPOP(S, k): min(k,s)
Amortized cost
- PUSH: 2
- POP, MULTIPOP = 0
PUSH credit = 2 - 1 = 1, 이 credit은 POP, MULTIPOP에 사용된다.
- POP, MULTIPOP은 항상 PUSH 이후에 실행되므로 가능하다.
Credit은 음수가 될 수 없다.
- stack은 nonnegative objects 이기 때문이다.
- 따라서 the total amortized cost 는 total actual cost의 upper bound 이다.

Total amortized cost: O(2n) = O(n)
Total actual cost: O(n+m) = O(n)

2. Incrementing binary bit counter

Actual cost
- Bit set(0->1): 1
- Bit reset(1->0): 1
Amortized cost
- Bit set(0->1): 2
- Bit reset(1->0): 0
Bit set credit = 2 - 1 = 1, 이 credit은 Bit reset에 사용된다.
- Bit reset은 항상 Bit set 이후에 수행되기 때문에 가능하다.
Credit은 음수가 될 수 없다.
- Counter의 1의 개수는 nonnegative 이기 때문이다.
- 따라서 the total amortized cost 는 total actual cost의 upper bound 이다.

Total amortized cost: O(n)
Total actual cost: O(n)

Running time: O(n)

3. Potential method

Accounting method와 비슷하지만 credit 대신, potential 을 사용한다는 점에서 차이가 있다.

Credit: 각 연산마다 데이터 구조 내의 특정 객체나 연산에 할당된다.

each credit is associated with a specific object within the data structure

Potential: 특정 객체나 연산에 대해 설정된 것이 아니라, 데이터 구조 전체의 "상태"를 나타낸다.

The potential is associated with the whole data structure

Credit은 각 연산마다 자료구조에 같이 저장된다면, Potential은 전역 변수로써 존재한다고 생각할 수 있다.

$D_0$ : Initial data structure
$D_i$ : i번째 operation을 적용한 후의 data structure
$\Phi(D_i)$ : $D_i$ 에 해당하는 potential

Potential difference
i번째 operation을 수행하고 Potential의 차이
$\Phi(D_i) - \Phi({D_i-1})$

양수
- The potential of the data structure increases
음수
- The decrease in the potential pays for the actual cost of the operation

Amortized cost
$\hat{c_i} = c_i + \Phi(D_i) - \Phi({D_i-1})$

Total amortized cost
$\displaystyle\sum_{i=1}^{n} \hat{c_i}$
= $\displaystyle\sum_{i=1}^{n} (c_i + \Phi(D_i) - \Phi({D_i-1}))$
= $\displaystyle\sum_{i=1}^{n} (c_i) + \Phi(D_n) - \Phi({D_0})$

Accounting method에서와 같이 $\displaystyle\sum_{i=1}^{k} \hat{c_i} \ge \displaystyle\sum_{i=1}^{k} {c_i}$ ( $1 \le k \le n)$ 이 성립해야 한다.

$\Phi(D_i) - \Phi({D_0}) \ge 0$ 이 모든 i에 대해 성립해야 한다.

Example of Potential method

1. stack operation

$\Phi(D_i)$ 는 i번째 operation 이후 stack에 들어있는 데이터의 개수이다.
- $\Phi(D_0)$ = 0
- stack은 nonnegative objec 이기 때문에 $\Phi(D_i) \ge 0$ 이다.
PUSH
- $\Phi(D_i) - \Phi({D_{i-1}}) = s - (s-1) = 1$
- s는 stack에 들어있는 데이터 개수
- Amortized cost
  - $\hat{c_i}$ = $c_i + \Phi(D_i) - \Phi({D_{i-1}})$
  - = $1 + (s+1) - s$ = 2
POP
- $\Phi(D_i) - \Phi({D_{i-1}}) = (s-1) - s= -1$
- Amortized cost
  - $\hat{c_i}$ = $c_i + \Phi(D_i) - \Phi({D_{i-1}})$
  - = $1 + (s-1) - s$ = 0
MULTIPOP(S, k)
- $k'$ : $min(s, k)$
- $\Phi(D_i) - \Phi({D_{i-1}}) = - min(s, k) = -k'$
- Amortized cost
  - $\hat{c_i}$ = $c_i + \Phi(D_i) - \Phi({D_{i-1}})$
  - = $k' - k'$ = 0

2. Increment binary bit counter

$\Phi(D_i)$ : i번째 INCREMENT 연산 수행 후, 1의 개수 = $b_i$
$t_i$ : i번째 INCREMENT 연산 수행 후 , reset된 bit 개수
Bit counter를 고려해보면 하나의 INCREMENT 연산이 수행될 때, 최대 한 개의 비트만 set 되고 $t_i$ 개의 비트가 reset 된다.
Amortized cost
- $\hat{c_i}$ = $c_i + \Phi(D_i) - \Phi({D_{i-1}})$

A. $c_i \le t_i + 1$ 가 성립함을 먼저 보이자.

$t_i$ 는 reset bit의 개수, + 1 은 최대 set bit의 개수를 나타낸다.
따라서 $c_i \le t_i + 1$ 는 자명하다.

B. $\Phi(D_i) - \Phi({D_{i-1}}) \le -t_i + 1$ 가 성립함을 보이자

Case 1: $b_i > 0$
- $\Phi(D_i) - \Phi({D_{i-1}}) = -t_i + 1$
- $t_i$ 개의 비트를 reset하고, 1개의 비트를 set 한다.
Case 2: $b_i = 0$
- $\Phi(D_i) - \Phi({D_{i-1}}) = -t_i$
- $t_i$ 개의 비트를 reset 하기만 하면 된다.
- 111...1 > 000...0