[Mathematical Statistics] 9. Inference based on normal models

박경민 · October 9, 2024


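This section asks the following question: when a random vector $\mathbf{X} \in \mathbb{R}^{d}$ follows a normal distribution, what is the distribution of the quadratic form $\mathbf{X}^{T} \mathbf{A} \mathbf{X}$ built from it? For linear transformations we already have an answer from the linearity property of the multivariate normal distribution: if $\mathbf{X} \sim N_{d}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, then $\mathbf{A} \mathbf{X}+\mathbf{b} \sim N\left(\mathbf{A} \boldsymbol{\mu}+\mathbf{b}, \mathbf{A} \boldsymbol{\Sigma} \mathbf{A}^{T}\right)$.

The rest of this post therefore covers the following topics, in order.

Quadratic forms (a recap of what a quadratic form is)
The distributions of certain quadratic forms (which distribution does a quadratic form follow?)
Independence between two quadratic forms (how do we decide whether two quadratic forms are independent?)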

[1] Quadratic forms

As in linear algebra, a quadratic form of random variables $X_{1}, \ldots, X_{d}$ is a statistic written as

$$
Q=\mathbf{X}^{T} \mathbf{A} \mathbf{X}=\sum_{i=1}^{d} \sum_{j=1}^{d} a_{i j} X_{i} X_{j},
$$

where $\mathbf{X}=\left(X_{1}, \ldots, X_{d}\right)^{T} \in \mathbb{R}^{d}$ is a $d$-dimensional random vector and $\mathbf{A}=\left[a_{i j}\right] \in \mathbb{R}^{d \times d}$ is a $d \times d$ deterministic square matrix.

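In polynomial terms, a quadratic form is an expression made up only of degree-two terms: squares $X_{i}^{2}$ and cross products $X_{i} X_{j}$, each multiplied by a constant.
A quadratic form is always scalar-valued: plugging in a non-random vector $\mathbf{x}$ gives a real number $\mathbf{x}^{T} \mathbf{A} \mathbf{x}$, and for a random $\mathbf{X}$ it is a scalar random variable.
We may assume without loss of generality that $\mathbf{A}$ is symmetric. Because $\mathbf{X}^{T} \mathbf{A} \mathbf{X}$ is a scalar, it equals its own transpose $\mathbf{X}^{T} \mathbf{A}^{T} \mathbf{X}$, so a non-symmetric $\mathbf{A}$ can always be replaced by $(\mathbf{A}+\mathbf{A}^{T})/2$ without changing the value.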
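The symmetrization argument can be checked numerically. The sketch below is only an illustration: the matrix `A`, the vector `x`, and the helper names `quad_form` and `symmetrize` are made up for this example, and exact rational arithmetic is used so that the two values can be compared with `==`.

```python
from fractions import Fraction


def quad_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def symmetrize(A):
    """Return (A + A^T) / 2 using exact rational arithmetic."""
    n = len(A)
    return [[Fraction(A[i][j] + A[j][i], 2) for j in range(n)] for i in range(n)]


# A non-symmetric matrix and a vector, both chosen only for illustration.
A = [[1, 4, 0],
     [0, 2, -3],
     [2, 1, 5]]
x = [3, -1, 2]

A_sym = symmetrize(A)
q_original = quad_form(A, x)
q_symmetric = quad_form(A_sym, x)

# Both quadratic forms give the same scalar.
print(q_original, q_symmetric)
```

Only the symmetric part of `A` contributes to the quadratic form, which is why the later theorems can restrict attention to symmetric matrices.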

Here are some examples of quadratic forms.

$X_{1}^{2}+X_{2}^{2}+X_{3}^{2}-2 X_{1} X_{2}$ is a quadratic form of $X_{1}, X_{2}$ and $X_{3}$.
$\left(X_{1}-1\right)^{2}+\left(X_{2}-2\right)^{2}=X_{1}^{2}+X_{2}^{2}-2 X_{1}-4 X_{2}+5$ is a quadratic form of $\left(X_{1}-1\right)$ and $\left(X_{2}-2\right)$ but not of $X_{1}$ and $X_{2}$, because of the linear and constant terms.
Is the sample variance $S^{2}=\frac{1}{n-1} \sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}$ a quadratic form of $X_{1}, \ldots, X_{n}$? (sol) Yes. Let $\mathbf{X}=\left(X_{1}, \ldots, X_{n}\right)^{T}$. Then,

$$
S^{2}=\mathbf{X}^{T}\left[\frac{1}{n-1}\left(\mathbf{I}-\frac{1}{n} \mathbf{1} \mathbf{1}^{T}\right)\right] \mathbf{X}.
$$

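At first glance the sample variance does not look like a quadratic form, but expanding it carefully shows that it is one.

Write $\mathbf{a}=\left(X_{1}-\bar{X}, \ldots, X_{n}-\bar{X}\right)^{T}$ and use the identity $\sum_{i=1}^{n} a_{i}^{2}=\mathbf{a}^{T} \mathbf{a}$. Since $\bar{X}=\frac{1}{n} \mathbf{1}^{T} \mathbf{X}$, we have $\mathbf{a}=\left(\mathbf{I}-\frac{1}{n} \mathbf{1} \mathbf{1}^{T}\right) \mathbf{X}=\mathbf{B} \mathbf{X}$ with $\mathbf{B}=\mathbf{I}-\frac{1}{n} \mathbf{1} \mathbf{1}^{T}$.

Substituting back, $\sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}=\mathbf{a}^{T} \mathbf{a}=\mathbf{X}^{T} \mathbf{B}^{T} \mathbf{B} \mathbf{X}=\mathbf{X}^{T} \mathbf{B} \mathbf{X}$, which is a quadratic form with a matrix sandwiched between $\mathbf{X}^{T}$ and $\mathbf{X}$; dividing by $n-1$ gives $S^{2}$. The last equality uses that $\mathbf{B}$ is symmetric and that multiplying $\mathbf{B}$ by itself returns $\mathbf{B}$, i.e. $\mathbf{B}^{2}=\mathbf{B}$. This property is exactly the definition of an idempotent matrix, which is reviewed in the next part.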
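As a quick sanity check of this derivation, the following sketch compares the quadratic-form expression with the usual sample variance on a small data set. The data values and the helper names (`centering_matrix`, `matmul`, `quad_form`) are assumptions made for this illustration; `statistics.variance` is the standard-library sample variance with divisor $n-1$.

```python
from fractions import Fraction
from statistics import variance


def centering_matrix(n):
    """Return B = I - (1/n) 1 1^T with exact rational entries."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]


def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def quad_form(M, x):
    """Return x^T M x."""
    n = len(x)
    return sum(M[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Illustrative data only.
x = [Fraction(v) for v in (2, 4, 4, 7, 8)]
n = len(x)
B = centering_matrix(n)

s2_quadratic = quad_form(B, x) / (n - 1)   # X^T B X / (n - 1)
s2_direct = variance(x)                    # usual sample variance
is_idempotent = matmul(B, B) == B          # checks B^2 = B exactly

print(s2_quadratic, s2_direct, is_idempotent)
```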

[2] The distributions of certain quadratic forms
To derive the distribution of a quadratic form, we first need a short review of the trace and of idempotent matrices.

1) Remark (Review of trace). We make use of the trace of a square matrix. If $\mathbf{A}=\left[a_{i j}\right]$ is an $n \times n$ matrix, then we define the trace of $\mathbf{A}$, written $\operatorname{tr} \mathbf{A}$, to be the sum of its diagonal entries; i.e.,

$$
\operatorname{tr} \mathbf{A}=\sum_{i=1}^{n} a_{i i}.
$$

In other words, the trace adds up only the $(i, i)$ entries of an $n \times n$ matrix. It satisfies the following properties.

Linearity: $\operatorname{tr}(a \mathbf{A}+b \mathbf{B})=a \operatorname{tr}(\mathbf{A})+b \operatorname{tr}(\mathbf{B})$ for any scalar $a, b$ .
Commutativity (inside the operator): $\operatorname{tr}(\mathbf{A B})=\operatorname{tr}(\mathbf{B A})$ whenever both products are defined.
Interchangeability with expectation: For a random matrix $\mathbb{X}$, $E(\operatorname{tr}(\mathbb{X}))=\operatorname{tr}(E(\mathbb{X}))$.
$\operatorname{tr}(a)=a$ for any scalar $a$.

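In short: factors inside a trace can be swapped, the trace and the expectation can be interchanged, and the trace of a scalar is the scalar itself.

These properties give the following expression for the expectation of a quadratic form.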

Theorem Suppose that $\mathbf{X}$ is a $d$-dimensional random vector with mean $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$. Then, for a symmetric deterministic matrix $\mathbf{A} \in \mathbb{R}^{d \times d}$,

$$
E\left(\mathbf{X}^{T} \mathbf{A X}\right)=\operatorname{tr}(\mathbf{A} \boldsymbol{\Sigma})+\boldsymbol{\mu}^{T} \mathbf{A} \boldsymbol{\mu}
$$

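Proof. Since $\mathbf{X}^{T} \mathbf{A} \mathbf{X}$ is a scalar, the trace properties above give

$$
\begin{aligned}
E\left(\mathbf{X}^{T} \mathbf{A X}\right) &=E\left(\operatorname{tr}\left(\mathbf{X}^{T} \mathbf{A X}\right)\right)=E\left(\operatorname{tr}\left(\mathbf{A X X}^{T}\right)\right)=\operatorname{tr}\left(\mathbf{A} E\left(\mathbf{X X}^{T}\right)\right) \\
&=\operatorname{tr}\left(\mathbf{A}\left(\boldsymbol{\Sigma}+\boldsymbol{\mu} \boldsymbol{\mu}^{T}\right)\right)=\operatorname{tr}(\mathbf{A} \boldsymbol{\Sigma})+\operatorname{tr}\left(\mathbf{A} \boldsymbol{\mu} \boldsymbol{\mu}^{T}\right)=\operatorname{tr}(\mathbf{A} \boldsymbol{\Sigma})+\boldsymbol{\mu}^{T} \mathbf{A} \boldsymbol{\mu},
\end{aligned}
$$

where $E\left(\mathbf{X X}^{T}\right)=\boldsymbol{\Sigma}+\boldsymbol{\mu} \boldsymbol{\mu}^{T}$ follows from $\boldsymbol{\Sigma}=E\left(\mathbf{X X}^{T}\right)-\boldsymbol{\mu} \boldsymbol{\mu}^{T}$.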
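The identity $E(\mathbf{X}^{T} \mathbf{A X})=\operatorname{tr}(\mathbf{A} \boldsymbol{\Sigma})+\boldsymbol{\mu}^{T} \mathbf{A} \boldsymbol{\mu}$ needs no normality, only a mean and a covariance matrix, so it can be checked exactly on a toy discrete distribution. In the sketch below, $\mathbf{X}$ is assumed to be uniform on four hypothetical points in $\mathbb{R}^{2}$; the points, the matrix `A`, and the helper name `quad_form` are all made up for this example.

```python
from fractions import Fraction


def quad_form(M, x):
    """Return x^T M x."""
    n = len(x)
    return sum(M[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# X is uniform on these points (a hypothetical toy distribution).
points = [(1, 0), (0, 2), (3, 1), (-2, 1)]
N, d = len(points), 2
A = [[2, 1],
     [1, 3]]  # symmetric, chosen only for illustration

# Exact mean vector and covariance matrix of the toy distribution.
mu = [Fraction(sum(p[i] for p in points), N) for i in range(d)]
Sigma = [[sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in points) / N
          for j in range(d)] for i in range(d)]

# Left-hand side: E(X^T A X) computed directly.
lhs = Fraction(sum(quad_form(A, p) for p in points), N)

# Right-hand side: tr(A Sigma) + mu^T A mu.
trace_A_Sigma = sum(A[i][j] * Sigma[j][i] for i in range(d) for j in range(d))
rhs = trace_A_Sigma + quad_form(A, mu)

print(lhs, rhs)
```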

2) Remark (Review of idempotent matrices).

A symmetric matrix $\mathbf{A}$ is called an idempotent matrix if $\mathbf{A}^2 = \mathbf{A}$.

That is, $\mathbf{A}$ is idempotent when the product of $\mathbf{A}$ with itself returns $\mathbf{A}$. The following facts hold for idempotent matrices.

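1) If $\mathbf{A}$ is idempotent, then every eigenvalue of $\mathbf{A}$ is either 0 or 1. Indeed, let $\lambda$ be an eigenvalue of $\mathbf{A}$ with eigenvector $\mathbf{v} \neq \mathbf{0}$. Then

$$
\lambda \mathbf{v}=\mathbf{A} \mathbf{v}=\mathbf{A}^{2} \mathbf{v}=\lambda \mathbf{A} \mathbf{v}=\lambda^{2} \mathbf{v}
$$

Hence $\lambda(\lambda-1) \mathbf{v}=\mathbf{0}$. Since $\mathbf{v} \neq \mathbf{0}$, $\lambda=0$ or $1$.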

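2) If $\mathbf{A}$ is idempotent, then $\operatorname{rank}(\mathbf{A})=\operatorname{tr}(\mathbf{A})$. For a symmetric matrix, the rank is the number of non-zero eigenvalues, i.e. the number of non-zero diagonal entries of $\boldsymbol{\Lambda}$ in the spectral decomposition $\mathbf{A}=\mathbf{P} \boldsymbol{\Lambda} \mathbf{P}^{T}$ with $\mathbf{P}$ orthogonal. Since every eigenvalue is 0 or 1, the sum of the eigenvalues equals the number of eigenvalues that are 1:

$$
\operatorname{tr}(\mathbf{A})=\operatorname{tr}\left(\mathbf{P} \boldsymbol{\Lambda} \mathbf{P}^{T}\right)=\operatorname{tr}\left(\boldsymbol{\Lambda} \mathbf{P}^{T} \mathbf{P}\right)=\operatorname{tr}(\boldsymbol{\Lambda})=\sum_{i=1}^{n} \underbrace{\lambda_{i}}_{\text {either } 0 \text { or } 1}=\operatorname{rank}(\mathbf{A})
$$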

Examples of idempotent matrices:

- $\frac{1}{n} \mathbf{1} \mathbf{1}^{T}$
- $\mathbf{I}-\frac{1}{n} \mathbf{1 1}^{T}$
- $\mathbf{A}\left(\mathbf{A}^{T} \mathbf{A}\right)^{-1} \mathbf{A}^{T}$ for any $\mathbf{A}$ with linearly independent columns (so that $\mathbf{A}^{T} \mathbf{A}$ is invertible)

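The matrix $\mathbf{A}\left(\mathbf{A}^{T} \mathbf{A}\right)^{-1} \mathbf{A}^{T}$ deserves special mention: it is idempotent whenever $\mathbf{A}^{T} \mathbf{A}$ is invertible, and it appears often because of its geometric meaning. The vector $\mathbf{A}\left(\mathbf{A}^{T} \mathbf{A}\right)^{-1} \mathbf{A}^{T} \mathbf{x}$ is the orthogonal projection of $\mathbf{x}$ onto $\operatorname{Col}(\mathbf{A})$, the subspace spanned by the column vectors of $\mathbf{A}$.

Write $\mathbf{H}=\mathbf{A}\left(\mathbf{A}^{T} \mathbf{A}\right)^{-1} \mathbf{A}^{T}$. Then $\mathbf{I}-\mathbf{H}$ is idempotent as well, and $(\mathbf{I}-\mathbf{H}) \mathbf{x}$ is the orthogonal projection of $\mathbf{x}$ onto the orthogonal complement $\operatorname{Col}(\mathbf{A})^{\perp}$, i.e. the part of $\mathbf{x}$ that is orthogonal to every column of $\mathbf{A}$.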
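The projection interpretation can be illustrated with a small exact computation. The sketch below assumes a hypothetical $3 \times 2$ matrix `A` with linearly independent columns and a hypothetical vector `x`; all helper names are made up for this example, and `inverse_2x2` only handles the $2 \times 2$ case that arises here.

```python
from fractions import Fraction


def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(P):
    return [list(row) for row in zip(*P)]


def inverse_2x2(M):
    """Invert a 2x2 matrix with exact rational arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical full-column-rank matrix and vector.
A = [[1, 0],
     [1, 1],
     [1, 2]]
x = [[1], [2], [4]]  # column vector

At = transpose(A)
H = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)   # projection onto Col(A)
I3 = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
M = [[I3[i][j] - H[i][j] for j in range(3)] for i in range(3)]  # I - H

trace_H = sum(H[i][i] for i in range(3))
residual = matmul(M, x)                 # (I - H) x
orthogonality = matmul(At, residual)    # A^T (I - H) x

print(matmul(H, H) == H, matmul(M, M) == M, trace_H, orthogonality)
```

Here the trace of `H` equals the number of columns of `A`, in line with $\operatorname{rank}(\mathbf{H})=\operatorname{tr}(\mathbf{H})$, and the residual $(\mathbf{I}-\mathbf{H}) \mathbf{x}$ is orthogonal to both columns.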

Now let us see which distribution a quadratic form follows.

Theorem Suppose that $\mathbf{Z} \sim N_{d}\left(\mathbf{0}_{d}, \mathbf{I}_{d}\right)$ . Let $\mathbf{A}$ be a real symmetric matrix. Then, $\mathbf{Z}^{T} \mathbf{A Z} \sim \chi^{2}(r)$ if and only if $\mathbf{A}^{2}=\mathbf{A}$ and $r=\operatorname{rank}(A)$ .

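Proof. We prove the "if ( $\Leftarrow$ )" part only. From the spectral decomposition of the idempotent matrix $\mathbf{A}$ of rank $r$,

$$
\mathbf{A}=\mathbf{P} \boldsymbol{\Lambda} \mathbf{P}^{T}, \quad \boldsymbol{\Lambda}=\operatorname{diag}(\underbrace{1, \ldots, 1}_{r}, 0, \ldots, 0), \quad \mathbf{P}^{T} \mathbf{P}=\mathbf{I}_{d}.
$$

Let $\mathbf{W}=\mathbf{P}^{T} \mathbf{Z}$. Then $\mathbf{W} \sim N_{d}\left(\mathbf{0}_{d}, \mathbf{P}^{T} \mathbf{P}\right)=N_{d}\left(\mathbf{0}_{d}, \mathbf{I}_{d}\right)$, so $W_{1}, \ldots, W_{d}$ are i.i.d. $N(0,1)$ and

$$
\mathbf{Z}^{T} \mathbf{A Z}=\mathbf{W}^{T} \boldsymbol{\Lambda} \mathbf{W}=\sum_{i=1}^{r} W_{i}^{2} \sim \chi^{2}(r).
$$

Here $r$ is the number of eigenvalues equal to 1, which is also $\operatorname{tr}(\mathbf{A})$. In short, if $\mathbf{A}$ is idempotent with $\operatorname{rank}(\mathbf{A})=r$, then the quadratic form $\mathbf{Z}^{T} \mathbf{A Z}$ follows the chi-square distribution with $r$ degrees of freedom.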
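A small simulation can make the theorem more concrete. The sketch below is only a rough Monte Carlo check, not a proof: it assumes the idempotent matrix $\mathbf{A}=\mathbf{I}-\frac{1}{n} \mathbf{1 1}^{T}$ with $n=5$ (rank 4), a fixed seed, and 20000 draws, and compares the sample mean and variance of $\mathbf{Z}^{T} \mathbf{A Z}$ with the $\chi^{2}(4)$ values $r=4$ and $2r=8$. The helper names are made up for this example.

```python
import random
from statistics import mean, variance


def centering_matrix(n):
    """Return I - (1/n) 1 1^T as floats (symmetric, idempotent, rank n - 1)."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
            for i in range(n)]


def quad_form(M, z):
    """Return z^T M z."""
    n = len(z)
    return sum(M[i][j] * z[i] * z[j] for i in range(n) for j in range(n))


random.seed(0)            # fixed seed so the run is reproducible
n = 5
A = centering_matrix(n)
r = n - 1                 # rank(A) = tr(A) = n - 1

# Draw Z ~ N_n(0, I) repeatedly and record Z^T A Z.
draws = [quad_form(A, [random.gauss(0.0, 1.0) for _ in range(n)])
         for _ in range(20000)]

sample_mean = mean(draws)       # should be close to r
sample_var = variance(draws)    # should be close to 2 r
print(sample_mean, sample_var)
```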

[3] Independence between two quadratic forms

(i) Suppose that $\mathbf{X} \sim N_{d}\left(\boldsymbol{\mu}, \sigma^{2} \mathbf{I}\right)$ , and let $\mathbf{A}$ and $\mathbf{B}$ be real symmetric and idempotent matrices. If $\mathbf{A B}=0$ , then, $\mathbf{X}^{T} \mathbf{A} \mathbf{X}$ and $\mathbf{X}^{T} \mathbf{B} \mathbf{X}$ are independent.

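In words: if $\mathbf{X}$ is normal with covariance $\sigma^{2} \mathbf{I}$, $\mathbf{A}$ and $\mathbf{B}$ are idempotent, and $\mathbf{A B}=0$, then the two quadratic forms are independent.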

(ii) Suppose that $\mathbf{Z} \sim N_{d}(\mathbf{0}, \mathbf{I})$ , and let $\mathbf{A}$ and $\mathbf{B}$ be real symmetric matrices. Then, $\mathbf{Z}^{T} \mathbf{A} \mathbf{Z}$ and $\mathbf{Z}^{T} \mathbf{B Z}$ are independent if and only if $\mathbf{A B}=0$ .

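In words: for a standard normal $\mathbf{Z}$ and symmetric $\mathbf{A}$, $\mathbf{B}$ (not necessarily idempotent), the two quadratic forms are independent exactly when $\mathbf{A B}=0$.

For (i), the proof does not work with the quadratic forms directly; it shows instead that the linear forms $\mathbf{A X}$ and $\mathbf{B X}$ are independent. They are jointly normal with $\operatorname{Cov}(\mathbf{A X}, \mathbf{B X})=\sigma^{2} \mathbf{A} \mathbf{B}^{T}=\sigma^{2} \mathbf{A B}=0$, hence independent. Since $\mathbf{A}$ and $\mathbf{B}$ are symmetric and idempotent, $\mathbf{X}^{T} \mathbf{A X}=(\mathbf{A X})^{T}(\mathbf{A X})$ and $\mathbf{X}^{T} \mathbf{B X}=(\mathbf{B X})^{T}(\mathbf{B X})$ are functions of $\mathbf{A X}$ and $\mathbf{B X}$ respectively, so they are independent too.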
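The classical instance of (i) is the pair $\mathbf{A}=\frac{1}{n} \mathbf{1 1}^{T}$ and $\mathbf{B}=\mathbf{I}-\frac{1}{n} \mathbf{1 1}^{T}$, for which $\mathbf{X}^{T} \mathbf{A X}=n \bar{X}^{2}$ and $\mathbf{X}^{T} \mathbf{B X}=\sum_{i}\left(X_{i}-\bar{X}\right)^{2}$. The sketch below checks $\mathbf{A B}=0$ exactly and then estimates the sample correlation of the two quadratic forms by simulation. The values of `n`, `mu`, `sigma`, the seed, and the helper names are assumptions for this example, and a near-zero correlation is only consistent with independence, it does not establish it.

```python
import random
from fractions import Fraction
from statistics import mean


def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def quad_form(M, x):
    """Return x^T M x."""
    n = len(x)
    return sum(M[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def sample_corr(u, v):
    """Pearson sample correlation of two equally long lists."""
    mu_u, mu_v = mean(u), mean(v)
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    var_u = sum((a - mu_u) ** 2 for a in u)
    var_v = sum((b - mu_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


n = 4
A = [[Fraction(1, n) for _ in range(n)] for _ in range(n)]
B = [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
     for i in range(n)]
product_is_zero = all(e == 0 for row in matmul(A, B) for e in row)

# Float copies for the simulation.
A_f = [[float(e) for e in row] for row in A]
B_f = [[float(e) for e in row] for row in B]

random.seed(1)               # fixed seed so the run is reproducible
mu, sigma = 2.0, 1.5         # hypothetical common mean and standard deviation
q_a, q_b = [], []
for _ in range(20000):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    q_a.append(quad_form(A_f, x))   # n * xbar^2
    q_b.append(quad_form(B_f, x))   # sum of squared deviations

corr = sample_corr(q_a, q_b)
print(product_is_zero, corr)
```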

Theorem (Fisher-Cochran). Assume that $\mathbf{Z} \sim N_{d}(\mathbf{0}, \mathbf{I})$, and that $\mathbf{A}$ and $\mathbf{B}$ are idempotent with $\operatorname{rank}(\mathbf{A})=r_{1}$ and $\operatorname{rank}(\mathbf{B})=r_{2}$. Then, the following three conditions are equivalent.
(i) $\mathbf{Z}^{T} \mathbf{A} \mathbf{Z}$ and $\mathbf{Z}^{T} \mathbf{B Z}$ are independent $\chi^{2}\left(r_{1}\right)$ and $\chi^{2}\left(r_{2}\right)$ random variables, respectively.
(ii) $\mathbf{Z}^{T}(\mathbf{A}+\mathbf{B}) \mathbf{Z} \sim \chi^{2}(r)$ and $r=\operatorname{rank}(\mathbf{A})+\operatorname{rank}(\mathbf{B})$ .
(iii) $\mathbf{Z}^{T}(\mathbf{A}+\mathbf{B}) \mathbf{Z} \sim \chi^{2}(r), \mathbf{Z}^{T} \mathbf{A} \mathbf{Z} \sim \chi^{2}\left(r_{1}\right)$ and $\mathbf{B}$ is positive semidefinite.

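For instance, by (iii): if the sum $\mathbf{Z}^{T}(\mathbf{A}+\mathbf{B}) \mathbf{Z}$ follows $\chi^{2}(r)$ and $\mathbf{Z}^{T} \mathbf{A} \mathbf{Z}$ follows $\chi^{2}\left(r_{1}\right)$, then the remaining part $\mathbf{Z}^{T} \mathbf{B} \mathbf{Z}$ follows $\chi^{2}\left(r_{2}\right)$, where $r_{2}=r-r_{1}$.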
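A standard worked instance of the theorem, added here as an illustration, takes $d=n$, $\mathbf{A}=\frac{1}{n} \mathbf{1 1}^{T}$ and $\mathbf{B}=\mathbf{I}-\frac{1}{n} \mathbf{1 1}^{T}$, both of which appear in the list of idempotent examples above. Then $\mathbf{A}+\mathbf{B}=\mathbf{I}$ and the ranks add up, so condition (ii) holds:

$$
\begin{aligned}
\mathbf{Z}^{T}(\mathbf{A}+\mathbf{B}) \mathbf{Z} &=\mathbf{Z}^{T} \mathbf{Z}=\sum_{i=1}^{n} Z_{i}^{2} \sim \chi^{2}(n), \\
\operatorname{rank}(\mathbf{A})+\operatorname{rank}(\mathbf{B}) &=\operatorname{tr}(\mathbf{A})+\operatorname{tr}(\mathbf{B})=1+(n-1)=n .
\end{aligned}
$$

By the equivalence with (i), the two pieces

$$
\mathbf{Z}^{T} \mathbf{A} \mathbf{Z}=n \bar{Z}^{2} \sim \chi^{2}(1), \qquad \mathbf{Z}^{T} \mathbf{B} \mathbf{Z}=\sum_{i=1}^{n}\left(Z_{i}-\bar{Z}\right)^{2} \sim \chi^{2}(n-1)
$$

are independent, which is the familiar statement that the sample mean and the sample variance of a normal sample are independent.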
