1.9 Some Special Expectations
$E(X^k)$: k-th moment
$\mu = E(X)$: mean
$\sigma^2 := E[(X-\mu)^2]$: variance
$E[(X-\mu)^k]$: k-th central moment
Def 1.9.3 (Moment generating function)
Moment generating function $M_X(t)$
$X$: r.v.
$M_X(t) := E[e^{tX}]$ for $|t| < h$, some $h > 0$
$= \int_{-\infty}^{\infty} e^{tx} f(x)\,dx$ (continuous case)
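The defining integral can be checked numerically. A minimal sketch (not from the notes), assuming the hypothetical example $X \sim \mathrm{Exp}(1)$ with $f(x) = e^{-x}$ on $(0, \infty)$, whose mgf is known in closed form to be $1/(1-t)$ for $t < 1$:

```python
import math

def mgf_exponential(t, upper=50.0, n=100_000):
    """Approximate E[e^{tX}] = ∫ e^{tx} e^{-x} dx over [0, upper] by the trapezoidal rule."""
    h = upper / n
    total = 0.0
    for i in range(n + 1):
        x = i * h
        w = 0.5 if i in (0, n) else 1.0   # trapezoid endpoint weights
        total += w * math.exp(t * x) * math.exp(-x)
    return total * h

print(mgf_exponential(0.5))   # ≈ 1 / (1 - 0.5) = 2
```

Truncating the integral at `upper=50` is safe here because the integrand decays like $e^{-x/2}$ for $t = 0.5$.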
Thm 1.9.1 (uniqueness of mgf)
$M_X(t)$: mgf of r.v. $X$
$M_Y(t)$: mgf of r.v. $Y$
$F_X(t) = F_Y(t)\ \forall t \iff M_X(t) = M_Y(t)$
--> Each probability distribution has exactly one mgf (provided the mgf exists).
A special case via Taylor expansion
As a special case, the Taylor series can be written as follows.
(i) $f(x) = \sum_{j=0}^{k} \frac{f^{(j)}(x_0)}{j!}(x-x_0)^j + \frac{f^{(k+1)}(\xi)}{(k+1)!}(x-x_0)^{k+1}$, where $\xi$ lies between $x$ and $x_0$.
The second term, the one outside the summation, is called the remainder term (negligible term).
When $x$ and $x_0$ are sufficiently close, i.e. $0 < |x-x_0| < 1$, the power $(x-x_0)^{k+1}$ becomes very small as $k$ grows, so the remainder can be treated as negligible.
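A small numerical sketch of (i), not from the notes: approximating $f(x) = e^x$ (for which $f^{(j)}(x_0) = e^{x_0}$ for every $j$) around $x_0 = 1$, the remainder shrinks rapidly as the order $k$ increases when $|x - x_0| < 1$.

```python
import math

def taylor_exp(x, x0, k):
    """k-th order Taylor polynomial of exp at x0, evaluated at x."""
    # each term is f^{(j)}(x0) (x-x0)^j / j! = e^{x0} (x-x0)^j / j!
    return sum(math.exp(x0) * (x - x0) ** j / math.factorial(j)
               for j in range(k + 1))

x, x0 = 1.5, 1.0
for k in (1, 3, 5):
    print(k, abs(math.exp(x) - taylor_exp(x, x0, k)))  # error shrinks with k
```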
(ii) $f: \mathbb{R}^d \to \mathbb{R}$, differentiable at $x = x_0$:

$$f(x) = f(x_0) + \nabla f(x_0)^T(x-x_0) + \frac{1}{2!}(x-x_0)^T H (x-x_0) + R_n$$

where

$$\nabla f(x) = \frac{\partial f(x)}{\partial x} = \begin{bmatrix} \frac{\partial f(x)}{\partial x_1} \\ \vdots \\ \frac{\partial f(x)}{\partial x_d} \end{bmatrix}, \qquad H \text{ (Hessian matrix)} = \frac{\partial^2 f(x)}{\partial x\,\partial x^T} = \begin{bmatrix} \frac{\partial^2 f(x)}{\partial x_1^2} & \cdots & \frac{\partial^2 f(x)}{\partial x_1 \partial x_d} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f(x)}{\partial x_d \partial x_1} & \cdots & \frac{\partial^2 f(x)}{\partial x_d^2} \end{bmatrix}$$
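A sketch of (ii), using a hypothetical example not in the notes: for $f(x, y) = x^2 + xy + y^2$ the gradient and Hessian can be computed by hand, and since $f$ itself is quadratic, the second-order expansion reproduces it exactly ($R_n = 0$).

```python
def f(v):
    x, y = v
    return x * x + x * y + y * y

def quad_approx(v, v0):
    """f(v0) + ∇f(v0)ᵀ(v-v0) + ½ (v-v0)ᵀ H (v-v0), hand-derived for this f."""
    x0, y0 = v0
    dx, dy = v[0] - x0, v[1] - y0
    grad = (2 * x0 + y0, x0 + 2 * y0)          # ∇f = (2x + y, x + 2y)
    # Hessian is the constant matrix [[2, 1], [1, 2]]:
    quad = 0.5 * (2 * dx * dx + 2 * dx * dy + 2 * dy * dy)
    return f(v0) + grad[0] * dx + grad[1] * dy + quad

print(quad_approx((1.3, -0.7), (1.0, -1.0)), f((1.3, -0.7)))  # equal: f is quadratic
```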
Mgf may not exist
$X$: r.v. with pdf $f(x) = x^{-2} I(x>1)$

$$M_X(t) = \int_1^{\infty} e^{tx} x^{-2}\,dx = \lim_{b \to \infty} \int_1^b \left(1 + tx + \frac{t^2x^2}{2} + \cdots\right) x^{-2}\,dx$$

--> For any $t > 0$ the integral diverges: already $\int_1^b t x^{-1}\,dx = t \log b \to \infty$, and from the cubic term onward positive powers of $x$ survive. So $M_X(t)$ does not exist on any interval $|t| < h$.
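The divergence can be seen numerically: the partial integrals $\int_1^b e^{tx} x^{-2}\,dx$ grow without bound as $b$ increases. A sketch with $t = 0.5$ (values not from the notes):

```python
import math

def partial_integral(t, b, n=100_000):
    """Trapezoidal approximation of ∫_1^b e^{tx} x^{-2} dx."""
    h = (b - 1.0) / n
    total = 0.0
    for i in range(n + 1):
        x = 1.0 + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * math.exp(t * x) / (x * x)
    return total * h

print([round(partial_integral(0.5, b)) for b in (10, 20, 40)])  # rapidly increasing
```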
What does "generating moments" mean?

$$\begin{aligned} M_X(t) &= E[e^{tX}] = E\left[1 + tX + \frac{t^2X^2}{2!} + \cdots\right] \\ &= 1 + tE(X) + \frac{t^2}{2}E(X^2) + \cdots \\ &= 1 + \mu t + \frac{\mu_2}{2}t^2 + \cdots \qquad (\mu_k := E(X^k)) \\ \frac{\partial M_X(t)}{\partial t} &= \mu + \mu_2 t + \cdots \implies M'_X(0) = \mu, \quad M''_X(0) = \mu_2 \end{aligned}$$

In general, $M_X^{(k)}(0) = \mu_k$, $k = 1, 2, \dots$
i.e. $M_X(t) = \sum_{j=0}^{\infty} \frac{\mu_j}{j!} t^j$: power series expansion
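The relation $M_X^{(k)}(0) = \mu_k$ can be verified with finite differences. A sketch (not from the notes) for $X \sim \mathrm{Exp}(1)$, whose mgf is $M_X(t) = 1/(1-t)$ and whose moments are $\mu_k = k!$:

```python
def mgf(t):
    """Closed-form mgf of Exp(1), valid for t < 1."""
    return 1.0 / (1.0 - t)

h = 1e-3
first = (mgf(h) - mgf(-h)) / (2 * h)                # central difference ≈ M'(0)
second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2   # ≈ M''(0)
print(first, second)   # ≈ 1 and 2, i.e. E[X] = 1! and E[X^2] = 2!
```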
(4) Characteristic function
The characteristic function exists for every distribution.
$\psi_X(t) := E[e^{itX}]$
$e^{i\theta} = \cos\theta + i\sin\theta$
i.e. $\psi_X(t) = E[\cos(tX) + i\sin(tX)]$
claim: the characteristic function always exists. (The proof uses the fact that the absolute value of an integral is at most the integral of the absolute value, and that the modulus of the complex number $a + bi$ is $\sqrt{a^2 + b^2}$.)
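Since $|e^{itX}| = 1$, we get $|\psi_X(t)| \le E[|e^{itX}|] = 1$, so the defining expectation is always finite. A Monte Carlo sketch (not from the notes) with the standard Cauchy distribution, whose mgf does not exist but whose characteristic function is known to be $e^{-|t|}$:

```python
import cmath, math, random

random.seed(0)
# Standard Cauchy samples via the inverse-CDF transform tan(π(U - 1/2)).
samples = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(200_000)]

def char_fn(t):
    """Monte Carlo estimate of ψ_X(t) = E[e^{itX}]."""
    return sum(cmath.exp(1j * t * x) for x in samples) / len(samples)

print(abs(char_fn(1.0)))   # ≈ e^{-1} ≈ 0.368, and always ≤ 1
```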
(5) cgf (cumulant generating function)
$\psi_X(t) := \log M_X(t)$: cgf of r.v. $X$
$M_X(t) = \sum_{j=0}^{\infty} \frac{\mu_j}{j!} t^j$: power series, where $\mu_j$ = $j$-th moment
-> Since taking the log preserves differentiability, the cgf admits the same kind of expansion:
$\psi_X(t) = \sum_{j=0}^{\infty} \frac{k_j}{j!} t^j$: power series, where $k_j$ = $j$-th cumulant
Relationship between moments and cumulants

$$\begin{aligned} \psi(t) &= k_0 + k_1 t + \frac{k_2}{2}t^2 + \frac{k_3}{6}t^3 + \cdots \\ &= \log(M_X(t)) = \log\!\left(1 + \mu t + \frac{\mu_2}{2}t^2 + \frac{\mu_3}{6}t^3 + \cdots\right) \\ &= \log(1+x), \qquad x := \mu t + \frac{\mu_2}{2}t^2 + \frac{\mu_3}{6}t^3 + \cdots \ \text{(all the remaining terms containing } t) \\ &= x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots = \sum_{j=1}^{\infty}\frac{(-1)^{j-1}}{j}x^j \\ &= \left(\mu t + \frac{\mu_2}{2}t^2 + \frac{\mu_3}{6}t^3 + \cdots\right) - \frac{1}{2}\left(\mu t + \frac{\mu_2}{2}t^2 + \frac{\mu_3}{6}t^3 + \cdots\right)^2 + \cdots \end{aligned}$$

--> Since this is the same power series in $t$ as the first line, all coefficients must match.
==> $k_0 = 0,\ k_1 = \mu,\ k_2 = \mu_2 - \mu^2,\ k_3 = \mu_3 - 3\mu_2\mu + 2\mu^3$
==> each cumulant $k_j$ can be written in terms of the moments $\mu_1, \dots, \mu_j$, and conversely each moment in terms of the cumulants.
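The coefficient-matching formulas can be sanity-checked numerically. A sketch (not from the notes) for $X \sim \mathrm{Exp}(1)$, whose raw moments are $\mu_k = k!$ and whose cumulants are known to be $k_j = (j-1)!$:

```python
# Raw moments of Exp(1): μ_k = E[X^k] = k!
mu1, mu2, mu3 = 1.0, 2.0, 6.0

k1 = mu1                                   # k_1 = μ
k2 = mu2 - mu1 ** 2                        # k_2 = μ_2 - μ²  (the variance)
k3 = mu3 - 3 * mu2 * mu1 + 2 * mu1 ** 3    # k_3 = μ_3 - 3μ_2μ + 2μ³

print(k1, k2, k3)   # 1.0 1.0 2.0, matching (j-1)! = 0!, 1!, 2!
```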
(6) skewness and kurtosis
$\rho_3 = E[(X-\mu)^3]/\sigma^3$: skewness
$\rho_4 = E[(X-\mu)^4]/\sigma^4 - 3$: kurtosis (subtracting 3 is optional; it standardizes the measure so that the normal distribution has kurtosis 0, i.e. "excess" kurtosis)
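These two quantities can be estimated from samples by replacing expectations with sample averages. A sketch (not from the notes) using $X \sim \mathrm{Exp}(1)$, whose skewness is known to be 2 and excess kurtosis 6:

```python
import math, random

random.seed(0)

def skew_kurt(xs):
    """Sample skewness and excess kurtosis via plug-in moment estimates."""
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    skew = sum((x - m) ** 3 for x in xs) / n / s ** 3
    kurt = sum((x - m) ** 4 for x in xs) / n / s ** 4 - 3   # -3: excess kurtosis
    return skew, kurt

exp_samples = [random.expovariate(1.0) for _ in range(200_000)]
print(skew_kurt(exp_samples))   # ≈ (2, 6)
```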