Sufficiency

deejayosamu · July 29, 2025

Basic Concepts of Statistics


Def)
A statistic $T(\underline{X})$ is a sufficient statistic for $\theta$ if the conditional distribution of $\underline{X}$ given $T(\underline{X})$ does not depend on $\theta$.

ex1) $X_1, X_2 \overset{iid}{\sim} b(1,\theta)$
$T_1(X_1,X_2) = X_1$: for $x_1 = t_1$,
$$
P(X_1=x_1, X_2=x_2 \mid T_1=t_1) = \frac{P(X_1=x_1, X_2=x_2, X_1=t_1)}{P(X_1=t_1)} = \frac{P(X_1=t_1, X_2=x_2)}{P(X_1=t_1)} = P(X_2=x_2) = \theta^{x_2}(1-\theta)^{1-x_2}
$$
: depends on $\theta$, so $T_1$ is not a sufficient statistic.

$T_2(X_1,X_2) = X_1 + X_2$: for $x_1 + x_2 = t_2$,
$$
\frac{P(X_1=x_1, X_2=x_2, X_1+X_2=t_2)}{P(X_1+X_2=t_2)} = \frac{P(X_1=x_1)\,P(X_2=t_2-x_1)}{P(X_1+X_2=t_2)} = \frac{\theta^{x_1}(1-\theta)^{1-x_1}\,\theta^{t_2-x_1}(1-\theta)^{1-t_2+x_1}}{\binom{2}{t_2}\theta^{t_2}(1-\theta)^{2-t_2}} = \frac{1}{\binom{2}{t_2}}
$$
: does not depend on $\theta$, so $T_2$ is a sufficient statistic.
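The two calculations above can be checked by brute-force enumeration. The following is a minimal sketch, assuming exact rational arithmetic and three arbitrarily chosen values of $\theta$; the helper name `conditional_pmf` is illustrative and not part of the original notes.

```python
from fractions import Fraction
from itertools import product


def conditional_pmf(theta, statistic):
    """Return {t: {(x1, x2): P(X1=x1, X2=x2 | T=t)}} for X1, X2 iid b(1, theta)."""
    joint = {}
    for x1, x2 in product((0, 1), repeat=2):
        # Joint pmf of two independent Bernoulli(theta) draws.
        joint[(x1, x2)] = theta ** (x1 + x2) * (1 - theta) ** (2 - x1 - x2)

    # Marginal pmf of T: sum the joint pmf over each level set of the statistic.
    marginal = {}
    for x, p in joint.items():
        t = statistic(x)
        marginal[t] = marginal.get(t, 0) + p

    # Conditional pmf of the sample given T = t.
    cond = {}
    for x, p in joint.items():
        t = statistic(x)
        cond.setdefault(t, {})[x] = p / marginal[t]
    return cond


# Arbitrary parameter values chosen for illustration.
thetas = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]

# T1 = X1: the conditional pmf changes with theta.
t1_tables = [conditional_pmf(th, lambda x: x[0]) for th in thetas]
# T2 = X1 + X2: the conditional pmf is the same for every theta.
t2_tables = [conditional_pmf(th, lambda x: x[0] + x[1]) for th in thetas]

t1_same = all(tab == t1_tables[0] for tab in t1_tables)
t2_same = all(tab == t2_tables[0] for tab in t2_tables)
print("T1 tables identical across theta:", t1_same)
print("T2 tables identical across theta:", t2_same)
print("P(X=(1,0) | T2=1) =", t2_tables[0][1][(1, 0)])
```

The last value agrees with $1/\binom{2}{1} = 1/2$ from the derivation.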

Theorem)
$T(\underline{X})$ is a sufficient statistic for $\theta$ if, for every $\underline{x}$, the ratio $\frac{P(x_1,\dots,x_n;\theta)}{g(t;\theta)}$ with $t = T(\underline{x})$ does not depend on $\theta$

$P(x_1,\dots,x_n;\theta)$: joint pdf or pmf of $\underline{X}$
$g(t;\theta)$: pdf or pmf of $T(\underline{X})$

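(If $g$ is a one-to-one function and $T$ is a sufficient statistic, then $g(T)$ is also a sufficient statistic. Here $g$ denotes an arbitrary one-to-one function, not the pdf or pmf of $T$ above.)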

pf)
[Image: pf]
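A sketch of the standard argument for the discrete case, which may differ from the author's own proof. It uses the fact that the event $\{\underline{X}=\underline{x}\}$ is contained in the event $\{T(\underline{X})=T(\underline{x})\}$:

```latex
\begin{aligned}
P_\theta\bigl(\underline{X}=\underline{x} \mid T(\underline{X})=T(\underline{x})\bigr)
&= \frac{P_\theta\bigl(\underline{X}=\underline{x},\ T(\underline{X})=T(\underline{x})\bigr)}
        {P_\theta\bigl(T(\underline{X})=T(\underline{x})\bigr)} \\
&= \frac{P_\theta(\underline{X}=\underline{x})}
        {P_\theta\bigl(T(\underline{X})=T(\underline{x})\bigr)} \\
&= \frac{P(x_1,\dots,x_n;\theta)}{g\bigl(T(\underline{x});\theta\bigr)}
\end{aligned}
```

So if the ratio is free of $\theta$ for every $\underline{x}$, the conditional distribution of $\underline{X}$ given $T(\underline{X})$ is free of $\theta$, which is the definition of sufficiency.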

ex1) $X_1,\dots,X_n \overset{iid}{\sim} f(x)$
The order statistics $X_{(1)},\dots,X_{(n)}$ are sufficient statistics for $f$. Writing $\underline{Y} = (X_{(1)},\dots,X_{(n)})$,
$$
\frac{f_{\underline{X}}(\underline{x})}{f_{\underline{Y}}(\underline{y})} = \frac{f(x_1)\cdots f(x_n)}{\frac{n!}{1 \cdots 1} f(y_1)\cdots f(y_n)} = \frac{1}{n!} \quad \left(\text{b/c } \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} f(y_i)\right)
$$
which does not depend on $f$.

Theorem) Factorization theorem
$T(\underline{X})$: sufficient statistic for $\theta$ <=> there exist functions $g$ and $h$ such that, $\forall \underline{x}, \theta$,
$f(\underline{x};\theta) = g(t;\theta)\,h(\underline{x})$, where $t = T(\underline{x})$
pf)
[Image: pf-fct]
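A sketch of the standard argument for the discrete case, which may differ from the author's own proof. The symbol $q(t;\theta)$ for the pmf of $T(\underline{X})$ is introduced here only to avoid a clash with the factor $g$.

```latex
% (=>) Assume T is sufficient. Choose
%   g(t;\theta) = P_\theta(T(\underline{X}) = t),
%   h(\underline{x}) = P(\underline{X}=\underline{x} \mid T(\underline{X}) = T(\underline{x})),
% where h is free of \theta by sufficiency. Then
\begin{aligned}
f(\underline{x};\theta)
&= P_\theta(\underline{X}=\underline{x}) \\
&= P_\theta\bigl(T(\underline{X})=T(\underline{x})\bigr)\,
   P\bigl(\underline{X}=\underline{x} \mid T(\underline{X})=T(\underline{x})\bigr) \\
&= g\bigl(T(\underline{x});\theta\bigr)\,h(\underline{x}).
\end{aligned}

% (<=) Assume f(\underline{x};\theta) = g(T(\underline{x});\theta) h(\underline{x}).
% The pmf of T is obtained by summing f over the level set of T:
\begin{aligned}
q(t;\theta) &= \sum_{\underline{y}:\,T(\underline{y})=t} f(\underline{y};\theta)
             = g(t;\theta) \sum_{\underline{y}:\,T(\underline{y})=t} h(\underline{y}), \\
\frac{f(\underline{x};\theta)}{q\bigl(T(\underline{x});\theta\bigr)}
&= \frac{h(\underline{x})}{\sum_{\underline{y}:\,T(\underline{y})=T(\underline{x})} h(\underline{y})}.
\end{aligned}
```

The last ratio is free of $\theta$, so the conditional distribution of $\underline{X}$ given $T(\underline{X})$ does not depend on $\theta$ and $T$ is sufficient.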

ex1) $X_1,\dots,X_n \overset{iid}{\sim} N(\mu,1)$
$$
\begin{aligned}
f(\underline{x};\mu) &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(x_i-\mu)^2}{2}\right) \\
&= \left(\frac{1}{\sqrt{2\pi}}\right)^n \exp\left(-\frac{1}{2}\sum_{i=1}^{n}(x_i-\mu)^2\right) \\
&= \left(\frac{1}{\sqrt{2\pi}}\right)^n \exp\left(-\frac{1}{2}\sum_{i=1}^{n} x_i^2\right) \exp\left(-\frac{1}{2}\left(n\mu^2 - 2\mu\sum_{i=1}^{n} x_i\right)\right)
\end{aligned}
$$
$$
\begin{aligned}
h(\underline{x}) &= \left(\frac{1}{\sqrt{2\pi}}\right)^n \exp\left(-\frac{1}{2}\sum_{i=1}^{n} x_i^2\right) \\
g(t;\mu) &= \exp\left(-\frac{1}{2}\left(n\mu^2 - 2\mu\sum_{i=1}^{n} x_i\right)\right)
\end{aligned}
$$
=> $T = \sum_{i=1}^{n} x_i$: sufficient statistic for $\mu$
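The factorization above can be checked numerically. The following is a minimal sketch: the sample values and the grid of $\mu$ values are arbitrary assumptions, and the function names are illustrative. It verifies that $f = g \cdot h$ and that two different samples with the same sum have a likelihood ratio that does not change with $\mu$.

```python
import math


def normal_likelihood(xs, mu):
    """Joint pdf of an iid N(mu, 1) sample evaluated at xs."""
    n = len(xs)
    return (2 * math.pi) ** (-n / 2) * math.exp(-0.5 * sum((x - mu) ** 2 for x in xs))


def h(xs):
    """Factor that does not involve mu."""
    n = len(xs)
    return (2 * math.pi) ** (-n / 2) * math.exp(-0.5 * sum(x * x for x in xs))


def g(t, mu, n):
    """Factor that involves the data only through t = sum(xs)."""
    return math.exp(-0.5 * (n * mu ** 2 - 2 * mu * t))


# Two made-up samples with the same sum (2.0) but different values.
xs = [0.3, -1.2, 2.0, 0.9]
ys = [1.0, 1.0, -0.5, 0.5]
mus = [-1.0, 0.0, 0.5, 2.0]

# f(x; mu) should equal g(t; mu) * h(x) for every mu.
factorization_ok = all(
    math.isclose(normal_likelihood(xs, mu), g(sum(xs), mu, len(xs)) * h(xs), rel_tol=1e-9)
    for mu in mus
)

# With equal sums, the ratio f(x; mu) / f(y; mu) = h(x) / h(y) is constant in mu.
ratios = [normal_likelihood(xs, mu) / normal_likelihood(ys, mu) for mu in mus]
ratio_constant = all(math.isclose(r, ratios[0], rel_tol=1e-9) for r in ratios)

print("factorization holds:", factorization_ok)
print("ratio constant in mu:", ratio_constant)
```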

ex2) $X_1,\dots,X_n \overset{iid}{\sim} U[0,\theta]$
$$
\begin{aligned}
f(\underline{x};\theta) &= \prod_{i=1}^{n} \frac{1}{\theta} I_{(0 \leq x_i \leq \theta)} \\
&= \left(\frac{1}{\theta}\right)^n \prod_{i=1}^{n} I_{(x_i \leq \theta)} \prod_{i=1}^{n} I_{(0 \leq x_i)} \\
&= \left(\frac{1}{\theta}\right)^n I_{(x_{(n)} \leq \theta)} \prod_{i=1}^{n} I_{(0 \leq x_i)}
\end{aligned}
$$
$$
\begin{aligned}
h(\underline{x}) &= \prod_{i=1}^{n} I_{(0 \leq x_i)} \\
g(t;\theta) &= \left(\frac{1}{\theta}\right)^n I_{(t \leq \theta)}, \quad t = x_{(n)} = \max_i x_i
\end{aligned}
$$
=> $T = X_{(n)} = \max_i X_i$: sufficient statistic for $\theta$
(The product $\prod_{i=1}^{n} I_{(x_i \leq \theta)}$ involves $\theta$, so it is not itself a statistic; it depends on the data only through $x_{(n)}$.)
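A small numerical check of this example, as a sketch under assumed values: the two samples and the grid of $\theta$ values are made up, and the names `g_unif` and `h_unif` are illustrative. Two nonnegative samples that share the same maximum should have identical likelihoods for every $\theta$.

```python
def uniform_likelihood(xs, theta):
    """Joint pdf of an iid U[0, theta] sample evaluated at xs."""
    inside = all(0 <= x <= theta for x in xs)
    return theta ** (-len(xs)) if inside else 0.0


def g_unif(t, theta, n):
    """Theta-dependent factor; uses the data only through t = max(xs)."""
    return theta ** (-n) if t <= theta else 0.0


def h_unif(xs):
    """Factor that does not involve theta."""
    return 1.0 if all(x >= 0 for x in xs) else 0.0


# Two made-up samples with the same maximum (1.5).
xs = [0.2, 1.5, 0.7]
ys = [1.5, 1.1, 0.05]
thetas = [1.0, 1.5, 2.0, 3.0]

# f(x; theta) should equal g(max(x); theta) * h(x) for every theta.
factorization_ok = all(
    uniform_likelihood(xs, th) == g_unif(max(xs), th, len(xs)) * h_unif(xs)
    for th in thetas
)

# Samples with the same maximum give the same likelihood function.
same_likelihood = all(
    uniform_likelihood(xs, th) == uniform_likelihood(ys, th) for th in thetas
)

print("factorization holds:", factorization_ok)
print("same likelihood for equal maxima:", same_likelihood)
```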

Theorem)
If $X_1,\dots,X_n \overset{iid}{\sim} f(x;\underline{\theta})$ and $f(x;\underline{\theta})$ belongs to an exponential family s.t. $f(x;\underline{\theta}) = g(x)\,c(\underline{\theta}) \exp\left(\sum_{j=1}^{k} w_j(\underline{\theta})\,t_j(x)\right)$, where $\underline{\theta} = (\theta_1,\dots,\theta_d)$ with $d \leq k$,
$\left(\sum_{i=1}^{n} t_1(x_i),\dots,\sum_{i=1}^{n} t_k(x_i)\right)$: sufficient statistic for $\underline{\theta}$
pf)
[Image: pf-exp]
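A sketch of how this follows from the factorization theorem, assuming the sample is iid from $f(x;\underline{\theta})$; this is the standard argument and may differ from the author's own proof.

```latex
\begin{aligned}
f(\underline{x};\underline{\theta})
&= \prod_{i=1}^{n} g(x_i)\,c(\underline{\theta})
   \exp\left(\sum_{j=1}^{k} w_j(\underline{\theta})\,t_j(x_i)\right) \\
&= \underbrace{\left(\prod_{i=1}^{n} g(x_i)\right)}_{h(\underline{x})}
   \;\underbrace{c(\underline{\theta})^{n}
   \exp\left(\sum_{j=1}^{k} w_j(\underline{\theta}) \sum_{i=1}^{n} t_j(x_i)\right)}_{\text{depends on } \underline{x} \text{ only through } \left(\sum_i t_1(x_i),\dots,\sum_i t_k(x_i)\right)}
\end{aligned}
```

The second factor involves the data only through the $k$ sums, so the factorization theorem gives sufficiency of $\left(\sum_{i=1}^{n} t_1(x_i),\dots,\sum_{i=1}^{n} t_k(x_i)\right)$.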
