Linear Algebra Summary

유승우 · September 26, 2023

Math

Notion link

Vector space

Conditions for a vector space

There exists an additive identity $0$ (a zero vector) such that $x + 0 = x$ for all $x \in V$
For each $x \in V$ , there exists an additive inverse $-x$ such that $x + (-x) = 0$
There exists a multiplicative identity $1 \in \mathbb{R}$ such that $1x = x$ for all $x \in V$
Commutativity: $x + y = y + x$ for all $x, y \in V$
Associativity: $(x + y) + z = x + (y+z)$ and $\alpha(\beta x) = (\alpha\beta)x$ for all $x, y, z\in V$ and $\alpha,\beta \in \mathbb{R}$
Distributivity: $\alpha (x+y)=\alpha x+\alpha y$ and $(\alpha + \beta)x = \alpha x +\beta x$ for all $x, y\in V$ and $\alpha,\beta \in \mathbb{R}$

sparsity

A vector is sparse if many of its entries are 0.

span

The span of a set of vectors is the set of all possible linear combinations of those vectors.

$\operatorname{span}\{v_1, ...,v_n\} = \{v \in V:\exists \alpha _1,...,\alpha_n \in \mathbb{R}$ such that $\alpha_1v_1 + ... + \alpha_nv_n = v\}$

Superposition

superposition(linear function)

$f: \mathbb{R}^n \rightarrow \mathbb{R}$ satisfies the superposition property if, for all $x, y \in \mathbb{R}^n$ and $\alpha, \beta \in \mathbb{R}$ ,

$$f(\alpha x+\beta y)=\alpha f(x) + \beta f(y)$$

→ $f$ is a linear function!

inner product function

$$f(x) = a^Tx$$

$$f(\alpha x+\beta y) = a^T(\alpha x + \beta y) = a^T(\alpha x) + a^T(\beta y) = \alpha(a^T x) + \beta (a^T y) = \alpha f(x) + \beta f(y)$$

→ $f$ is a linear function!

Affine function

A function that is linear plus a constant:

$$f(x) = a^Tx+b$$

An affine function with $b \neq 0$ satisfies $f(\alpha x+\beta y)=\alpha f(x) + \beta f(y)$ only when $\alpha + \beta = 1$ , that is, when $\alpha x + \beta y$ lies on the line through $x$ and $y$ (an internally or externally dividing point).
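The restricted superposition property of affine functions can be checked numerically. The following is a minimal sketch; the coefficients `a` and `b`, the test points, and the helper name `superposition_gap` are made up for illustration. Expanding both sides shows that the gap equals $|b(1 - \alpha - \beta)|$.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.normal(size=3), 2.0   # hypothetical coefficients of f(x) = a^T x + b


def f(v):
    """Affine function f(v) = a^T v + b."""
    return a @ v + b


x, y = rng.normal(size=3), rng.normal(size=3)


def superposition_gap(alpha, beta):
    """|f(alpha x + beta y) - (alpha f(x) + beta f(y))| = |b (1 - alpha - beta)|."""
    return abs(f(alpha * x + beta * y) - (alpha * f(x) + beta * f(y)))


print(superposition_gap(0.3, 0.7))   # alpha + beta = 1: gap is zero up to rounding
print(superposition_gap(1.0, 1.0))   # alpha + beta = 2: gap is approximately |b| = 2
```

With `b = 0.0` the gap vanishes for every pair of coefficients, which recovers the linear case.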

Norm

Euclidean norm

$$\Vert x\Vert = \sqrt{x_1^2+x_2^2+...+x_n^2} = \sqrt{x^Tx}$$

homogeneity: $\Vert \beta x \Vert = \vert\beta\vert\Vert x\Vert$
triangle inequality: $\Vert x+y\Vert \leq \Vert x\Vert + \Vert y \Vert$
non-negativity: $\Vert x\Vert \geq 0$
definiteness: $\Vert x \Vert = 0$ only if $x = 0$

Norms

Various ways of defining the size of a vector.

$$\Vert x \Vert_p = \left(\sum_{i=1}^n\vert x_i\vert^p\right)^{\frac{1}{p}}$$

$$\Vert x\Vert_\infty = \max_{1\leq i \leq n} \vert x_i \vert$$

For any two norms $\Vert \cdot \Vert_A$ and $\Vert \cdot \Vert_B$ on $\mathbb{R}^n$ , there exist constants $\alpha > 0, \beta > 0$ such that, for all $x$ : $\alpha \Vert x \Vert _A \leq \Vert x \Vert_B \leq \beta \Vert x \Vert _A$

→ In other words, $\Vert x \Vert _A \approx \Vert x \Vert _B$ up to constant factors (all norms on $\mathbb{R}^n$ are equivalent).
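As a small illustration of the p-norms and of norm equivalence, the sketch below computes the 1-, 2- and ∞-norms of a made-up vector and checks one standard chain of bounds, $\Vert x \Vert_\infty \leq \Vert x \Vert_2 \leq \Vert x \Vert_1 \leq n\Vert x \Vert_\infty$ . The helper `p_norm` is a hypothetical name; `np.linalg.norm` is used only as a cross-check.

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0, 1.0])   # example vector (made up)
n = x.size


def p_norm(v, p):
    """p-norm for finite p >= 1: (sum_i |v_i|^p)^(1/p)."""
    return np.sum(np.abs(v) ** p) ** (1.0 / p)


norm_1 = p_norm(x, 1)           # |3| + |-4| + |0| + |1| = 8
norm_2 = p_norm(x, 2)           # sqrt(9 + 16 + 0 + 1) = sqrt(26)
norm_inf = np.max(np.abs(x))    # largest absolute entry = 4

# Cross-check against NumPy's built-in vector norms.
assert np.isclose(norm_1, np.linalg.norm(x, 1))
assert np.isclose(norm_2, np.linalg.norm(x, 2))
assert np.isclose(norm_inf, np.linalg.norm(x, np.inf))

# One instance of norm equivalence.
assert norm_inf <= norm_2 <= norm_1 <= n * norm_inf
```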

Linear independence

Linear dependence

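A set of n-vectors $\{a_1, ... , a_k\}$ (with $k \geq 1$ ) is linearly dependent if

$$\beta_1a_1 + ... + \beta_ka_k = 0$$

holds for some $\beta_1, ..., \beta_k$ that are not all zero!

→ At least one $a_i$ is a linear combination of the other vectors. (It can be built from the vectors already in the set.)

A plane through the origin in 3-dimensional space is just a 2-dimensional subspace. (So any 3 vectors lying in it are necessarily dependent.)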

Linear independence

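Definition

A set of vectors is linearly independent if it is not linearly dependent.

$\beta_1a_1 + ... + \beta_ka_k = 0$ holds only when $\beta_1 = ... =\beta_k = 0$

→ No $a_i$ is a linear combination of the other vectors.

ex) The standard unit vectors $e_1, ..., e_n$ are linearly independent

Properties

If $x = \beta_1\mathbf{a}_1 + ...+\beta_k\mathbf{a}_k$ with linearly independent $\mathbf{a}_1, ..., \mathbf{a}_k$ , the coefficients $\beta_1,...,\beta_k$ are unique.
- pf) If there were coefficients $\gamma$ with $\mathbf{x} = \gamma_1\mathbf{a}_1 + ... +\gamma_k\mathbf{a}_k$ and $\gamma_i \neq \beta_i$ for some $i$ , then $(\beta_1-\gamma_1)\mathbf{a}_1+...+(\beta_k-\gamma_k)\mathbf{a}_k = 0$ with coefficients that are not all zero, so the vectors would be linearly dependent, a contradiction.
A linearly independent set of n-vectors can have at most n elements.
- ex) In 3 dimensions, any 4 vectors are always dependent.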
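One practical way to test linear independence numerically is to stack the vectors as columns and compare the rank with the number of vectors. This is a sketch under that assumption (it relies on `np.linalg.matrix_rank`, which uses a numerical tolerance); the function name `is_linearly_independent` is hypothetical.

```python
import numpy as np


def is_linearly_independent(vectors):
    """True if the given 1-D arrays are linearly independent (up to numerical tolerance)."""
    A = np.column_stack(vectors)                    # one vector per column
    return bool(np.linalg.matrix_rank(A) == A.shape[1])


e1, e2, e3 = np.eye(3)                              # standard unit vectors of R^3
extra = np.array([1.0, 2.0, 3.0])

print(is_linearly_independent([e1, e2, e3]))         # unit vectors: independent
print(is_linearly_independent([e1, e2, e3, extra]))  # 4 vectors in R^3: dependent
```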

Dimension

Basis

Definition

A basis is a set of n linearly independent n-vectors $a_1,...,a_n$
Any n-vector $x$ can be expressed as a linear combination of them: $x = \beta_1a_1 + ...+\beta_na_n$ . As mentioned above, the coefficients are unique.
A basis generates the vector space.

Dimension

Definition

For $\mathbf{v}_1, ..., \mathbf{v}_n \in V$ , if $\mathbf{v}_1,...,\mathbf{v}_n$ are linearly independent and $\operatorname{span}\{\mathbf{v}_1,...,\mathbf{v}_n\} = V$ , then $\{\mathbf{v}_1,...,\mathbf{v}_n\}$ forms a basis for $V$
The number of vectors in a basis for a vector space $V$ is the dimension of $V$ , written dim $V$

Subspace

Definition

$S \subseteq V$ is a subspace of $V$ (where $V$ is a vector space) if

$0 \in S$
$S$ is closed under addition : $x, y \in S \implies x+y \in S$
$S$ is closed under scalar multiplication : $x \in S, \alpha \in \mathbb{R} \implies \alpha x \in S$

Operations and dimension

If $U, W$ are subspaces of $V$ , then their sum $U + W = \{\mathbf{u} + \mathbf{w} \vert \mathbf{u} \in U, \mathbf{w} \in W \}$ is also a subspace of $V$ ( $U + W \subseteq V$ )
- If $U \cap W = \{0\}$ , the sum is said to be a direct sum, written $U \oplus W$

dimension
- dim( $U+W$ ) = dim $U$ + dim $W$ - dim( $U \cap W$ )
- dim $(U \oplus W )$ = dim $U$ + dim $W$ ( $\because \dim(\{0\}) = 0$ )

Linear Maps

Definition

A map $T :V \rightarrow W$ , where $V$ and $W$ are vector spaces, is linear if
1. $T(\mathbf{x}+\mathbf{y}) = T\mathbf{x} + T\mathbf{y}, \forall \mathbf{x},\mathbf{y} \in V$
2. $T(\alpha \mathbf{x}) = \alpha T\mathbf{x}, \forall \mathbf{x} \in V, \forall\alpha \in \mathbb{R}$
If $T: V \rightarrow V$ , then $T$ is called a linear operator
A matrix $A \in \mathbb{R}^{m \times n}$ is a linear map $T : \mathbb{R}^n \rightarrow \mathbb{R}^m$ .
- Proof: this follows directly from $A(\mathbf{x}+\mathbf{y}) = A\mathbf{x} + A\mathbf{y}$ and $A(\alpha\mathbf{x}) = \alpha A\mathbf{x}$ .

Null space, Range

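null( $T$ ) = $\{\mathbf{v} \in V \vert T\mathbf{v}=0\}$
- For a matrix, this is the orthogonal complement of the row space (easiest to understand visually)
range( $T$ ) = $\{\mathbf{w} \in W \vert \exists\mathbf{v} \in V$ such that $T \mathbf{v} = \mathbf{w}\}$

column space of a matrix $A \in \mathbb{R}^{m \times n}$ :
- span of its columns
- $A = [\mathbf{a}_1,...,\mathbf{a}_n]$ , column space = $\operatorname{span}\{\mathbf{a}_1,...,\mathbf{a}_n\}$
- range( $A$ )
  - the entries of $x$ in $Ax$ act as the weights on the columns
row space of a matrix $A \in \mathbb{R}^{m \times n}$ :
- range( $A^T$ )
rank $(A)$ = dim range( $A$ ) = dim range( $A^T$ )
- The last two are necessarily equal: the number of linearly independent rows always equals the number of linearly independent columns, so the rows and columns cannot separately supply $m$ and $n$ independent vectors when $m \neq n$ .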
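The statements about rank and null space can be checked on a small matrix. In this sketch the matrix `A` is made up (its second row is twice the first), and the null-space basis is read off the SVD, which is one common numerical approach and not necessarily the one intended in these notes.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])            # example: second row = 2 * first row

rank = np.linalg.matrix_rank(A)
rank_T = np.linalg.matrix_rank(A.T)         # row rank equals column rank

# Right singular vectors belonging to zero singular values span null(A).
_, s, Vt = np.linalg.svd(A)                 # Vt is 3 x 3 (full_matrices=True by default)
null_basis = Vt[rank:].T                    # shape (3, 3 - rank)

print(rank, rank_T, null_basis.shape)
print(np.allclose(A @ null_basis, 0.0))     # every basis vector is orthogonal to the rows of A
```

Each column of `null_basis` is orthogonal to every row of `A`, which is the "orthogonal complement of the row space" picture.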

Orthogonal

Orthogonal Complement

Definition

When $V$ is an inner product space (a vector space on which an inner product is defined) and $S \subseteq V$ , the orthogonal complement $S^\bot$ of $S$ is the set of vectors in $V$ that are orthogonal to every element of $S$ (equivalently, to every vector in a basis of $S$ ).

$$S^\bot = \{\mathbf{v} \in V \mid \mathbf{v} \bot \mathbf{s} \ \text{ for all } \mathbf{s} \in S\}$$

Properties

If $S$ is a subspace, every $\mathbf{v} \in V$ can be expressed uniquely in the form $\mathbf{v} = \mathbf{v}_S + \mathbf{v}_\bot$ . ( $\mathbf{v}_S \in S, \mathbf{v}_\bot \in S^\bot$ )
- $V = S \oplus S^\bot$ (for reference)

Orthonormal

Orthonormal vectors

Orthogonal set:
- set of n-vectors $a_1,..., a_k$ with $a_i \bot a_j$ (i.e. $a_i^Ta_j = 0$ ) for $i \neq j$
Orthonormal set:
- an orthogonal set in which additionally $\Vert a_i \Vert = 1$
$a_i^Ta_j = \begin{cases} 1 & i=j \\ 0& i\neq j\end{cases}$
Orthonormal sets are linearly independent.
$k \leq n$ (for reference); if $k = n$ , the set is an orthonormal basis!

Orthonormal expansion

If $a_1, ..., a_n$ is an orthonormal basis, any $x \in \mathbb{R}^n$ can be expressed as

$$x = (a_1^Tx)a_1 + ... + (a_n^Tx)a_n$$

This is called the orthonormal expansion of $x$ .

ex) $x = \begin{pmatrix} 1 \\ 2 \\ 3\\4 \end{pmatrix} = \left(\begin{pmatrix} 1&2&3&4 \end{pmatrix} \begin{pmatrix} 1\\0\\0\\0\end{pmatrix}\right)\begin{pmatrix} 1\\0\\0\\0\end{pmatrix} + \cdots$

Gram-schmidt orthogonalization

A method for orthonormalizing linearly independent vectors $a_1, ..., a_k$ . For $i = 1, ..., k$ :

orthogonalize: $\tilde{q_i} = a_i - (q_1^Ta_i)q_1 - ... - (q^T_{i-1}a_i)q_{i-1}$
normalize: $q_i = \tilde{q_i} / \Vert\tilde{q_i}\Vert$
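The two steps above translate almost line by line into code. This is a minimal sketch of the classical Gram-Schmidt procedure exactly as written in the formulas; the function name `gram_schmidt`, the tolerance `tol`, and the input vectors are illustrative assumptions. In floating point, a modified Gram-Schmidt or a QR factorization is usually preferred for stability.

```python
import numpy as np


def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize linearly independent 1-D arrays; returns them as columns of Q."""
    qs = []
    for a in vectors:
        # Orthogonalize: subtract the components along the previously found q_1..q_{i-1}.
        q_tilde = a - sum((q @ a) * q for q in qs)
        norm = np.linalg.norm(q_tilde)
        if norm < tol:
            raise ValueError("input vectors are (numerically) linearly dependent")
        # Normalize.
        qs.append(q_tilde / norm)
    return np.column_stack(qs)


a1 = np.array([1.0, 1.0, 0.0])
a2 = np.array([1.0, 0.0, 1.0])
a3 = np.array([0.0, 1.0, 1.0])

Q = gram_schmidt([a1, a2, a3])
print(np.allclose(Q.T @ Q, np.eye(3)))   # columns are orthonormal
```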

Orthogonal Projection

$$P_S\mathbf{v} = \langle\mathbf{v}, \mathbf{u}_1\rangle\mathbf{u}_1 + ... + \langle\mathbf{v}, \mathbf{u}_m\rangle\mathbf{u}_m$$

( $\mathbf{u}_1, ..., \mathbf{u}_m$ is an orthonormal basis of $S$ , $\mathbf{v} \in V$ )

This is the orthogonal projection onto $S$ .

$\mathbf{v} - P_S\mathbf{v} \bot S$
$P_S\mathbf{x} = \sum_{i=1}^{m}(x^Tu_i)u_i = \sum_{i=1}^{m}u_iu_i^Tx = \left(\sum_{i=1}^{m}u_iu_i^T\right)x = UU^Tx$ ( $U = [u_1, ..., u_m]$ )
- $U^T$ computes the size of the projection onto each basis vector, and $U$ turns those sizes back into the component vectors along the basis directions.
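The projection formula $UU^Tx$ can be tried out on a random subspace. In this sketch the subspace is hypothetical: its orthonormal basis is produced with `np.linalg.qr` from a random 4×2 matrix, which is merely a convenient way to get orthonormal columns.

```python
import numpy as np

rng = np.random.default_rng(0)

# Columns of U: an orthonormal basis of a made-up 2-dimensional subspace S of R^4.
U, _ = np.linalg.qr(rng.normal(size=(4, 2)))

P = U @ U.T                    # projection matrix onto S
v = rng.normal(size=4)
p = P @ v                      # orthogonal projection of v onto S

coeffs = U.T @ v               # U^T: sizes of the projections onto each basis vector
assert np.allclose(U @ coeffs, p)            # U: rebuilds the component vectors
assert np.allclose(U.T @ (v - p), 0.0)       # the residual v - P v is orthogonal to S
assert np.allclose(P @ P, P)                 # projecting twice changes nothing
```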

Matrix

columns and row

For $A \in \mathbb{R}^{m \times n}$ :
the jth column is an m-vector
the ith row is an n-row-vector
slice of a matrix: $A_{p:q,r:s}$ is a $(q - p +1) \times (s - r + 1)$ matrix

Block matrix

Special matrices

zero matrix: all entries are 0, $0_{m \times n}$
identity matrix: a square matrix with $I_{ii} = 1, I_{ij} = 0$ ( $i \neq j$ )
- ex) $\begin{pmatrix} 1 & 0&0 \\ 0 & 1&0 \\ 0&0&1 \end{pmatrix}$
sparse matrix: most entries are 0
diagonal matrix: $A_{ij} = 0 (i \neq j)$
- ex) $diag(0.2, 3) = \begin{pmatrix} 0.2 & 0 \\ 0 & 3 \end{pmatrix}$
triangular matrix
- lower triangular matrix: $A_{ij} = 0 (i < j)$
- upper triangular matrix: $A_{ij} = 0 (i > j)$
  - ex) lower: $\begin{pmatrix} 3 & 0&0 \\ 2.4 & 3.7&0 \\ 280&300&1 \end{pmatrix}$ upper: $\begin{pmatrix} 128 & -2.7&6.5 \\ 0 & 8&7 \\ 0&0&9.1 \end{pmatrix}$

Norm (Frobenius)

$$\Vert A \Vert_F = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n}a_{ij}^2}$$

Naturally, it also satisfies the norm properties (homogeneity, triangle inequality, non-negativity, definiteness)!

distance between two matrices: $\Vert A-B\Vert_F$

Matrix-vector

product

$y = Ax$ can be written in the form $y = x_1a_1 + ... + x_na_n$ , where $a_1,...,a_n$ are the columns of $A$ . ex) $Ae_j = a_j$
- If the columns of $A$ are linearly independent, then $Ax = 0 \rightarrow x=0$
If $A = \begin{pmatrix} 1 - {1\over n}& - {1\over n}&\cdots&- {1\over n} \\ - {1\over n} & 1- {1\over n}&\cdots&- {1\over n} \\ \vdots & & \ddots &\vdots\\ -{1\over n}&- {1\over n}&\cdots&1- {1\over n} \end{pmatrix}$ , then $\tilde{x} = Ax$ is the de-meaned version of $x$ (each entry minus the mean of all entries).
If $D = \begin{pmatrix} -1& 1&0&\cdots&0&0 \\ 0 & -1&1&\cdots&0&0 \\ \vdots & & \ddots&\ddots & &\vdots\\ 0&0&0&\cdots&-1&1 \end{pmatrix}$ , $D \in \mathbb{R}^{(n-1)\times n}$ , then $Dx = \begin{pmatrix} x_2-x_1 \\ x_3-x_2 \\ \vdots\\x_n-x_{n-1} \end{pmatrix}$ .
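Both matrices are easy to build and test. The sketch below constructs the de-meaning matrix $A = I - \frac{1}{n}\mathbf{1}\mathbf{1}^T$ and the difference matrix $D$ for a made-up vector of length 5, and compares the products with `x - x.mean()` and `np.diff(x)`.

```python
import numpy as np

x = np.array([1.0, 4.0, 2.0, 8.0, 5.0])   # example data (made up)
n = x.size

# De-meaning matrix: 1 - 1/n on the diagonal, -1/n everywhere else.
A = np.eye(n) - np.ones((n, n)) / n

# Difference matrix: -1 on the main diagonal, +1 on the superdiagonal, shape (n-1, n).
D = np.eye(n - 1, n, k=1) - np.eye(n - 1, n)

x_demeaned = A @ x
x_diff = D @ x

assert np.allclose(x_demeaned, x - x.mean())
assert np.allclose(x_diff, np.diff(x))
print(x_demeaned.sum())    # entries of the de-meaned vector sum to (numerically) zero
```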

Eigenvalue, Eigenvector

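Definition

For a square matrix $A \in \mathbb{R}^{n \times n}$ , if

$$Ax= \lambda x$$

holds for a nonzero vector $x \in \mathbb{R}^n$ , then $x$ is called an eigenvector and the corresponding scalar $\lambda$ is called an eigenvalue.

Properties

For any real number $\gamma$ , $x$ is an eigenvector of $A + \gamma I$ , with eigenvalue $\lambda + \gamma$
If $A$ is invertible (i.e. $A^{-1}$ exists), $x$ is an eigenvector of $A^{-1}$ , with eigenvalue $\lambda^{-1}$
$A^kx = \lambda^kx$ for any $k \in \mathbb{Z}$ ( $A^0 = I$ ; negative $k$ requires $A$ to be invertible)
Multiplying by $A$ only scales $x$ by a real number.
A single eigenvalue can have several eigenvectors: the space they span ( $\operatorname{span}(v_1, ...)$ ) can have dimension greater than one.

The eigenvalue tells us how much the matrix shrinks or stretches vectors along the direction of its eigenvector!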
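The listed properties can be verified numerically on a small example. This sketch uses a made-up symmetric 2×2 matrix (so that `np.linalg.eigh` applies and all eigenvalues are real); the shift `gamma` and the power 3 are arbitrary choices.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])               # example symmetric matrix, eigenvalues 1 and 3
lam, X = np.linalg.eigh(A)               # eigenvalues in ascending order, eigenvectors as columns

gamma = 0.5
A_inv = np.linalg.inv(A)
A_cubed = np.linalg.matrix_power(A, 3)

for lam_i, x in zip(lam, X.T):
    assert np.allclose(A @ x, lam_i * x)                                  # definition
    assert np.allclose((A + gamma * np.eye(2)) @ x, (lam_i + gamma) * x)  # shift by gamma
    assert np.allclose(A_inv @ x, x / lam_i)                              # inverse
    assert np.allclose(A_cubed @ x, lam_i ** 3 * x)                       # powers

print(lam)
```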

trace, determinant

trace:
- the sum of the diagonal entries of a square matrix
- tr( $A$ ) = $\underset{i=1}{\overset{n}{\sum}}A_{ii}$
  - tr( $A +B)$ = tr( $A$ ) + tr( $B$ )
  - tr( $\alpha A)$ = $\alpha$ tr( $A$ )
  - tr( $A^T$ ) = tr( $A$ )
  - tr( $ABCD$ ) = tr( $BCDA$ ) = $\cdots$
  - tr( $A$ ) = $\underset{i}{{\sum}}\lambda_i(A)$
    - for symmetric $A = Q\Lambda Q^T = \sum_i \lambda_iq_iq_i^T$ → tr( $A$ ) = $\lambda_1\Vert q_1\Vert^2 + \cdots + \lambda_n\Vert q_n\Vert^2 = \underset{i}{{\sum}}\lambda_i(A)$ , since $\Vert q_i \Vert = 1$
Determinant:
- det( $I$ ) = 1
- det( $A^T$ ) = det( $A$ )
- det( $AB$ ) = det( $A$ )det( $B$ )
- det( $A^{-1}$ ) = det( $A$ ) $^{-1}$
- det( $\alpha A$ ) = $\alpha ^n$ det( $A$ )
- det( $A$ ) = $\prod_{i} \lambda_i(A)$
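A quick numerical check of the trace and determinant identities, as a sketch on random 4×4 matrices. `A` is made symmetric here only so that its eigenvalues are real and `np.linalg.eigvalsh` can be used; the identities themselves do not need symmetry.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.normal(size=(n, n))
A = M + M.T                       # symmetric, so eigenvalues are real
B = rng.normal(size=(n, n))
alpha = 2.0

lam = np.linalg.eigvalsh(A)

# Trace identities.
assert np.isclose(np.trace(A + B), np.trace(A) + np.trace(B))
assert np.isclose(np.trace(A @ B), np.trace(B @ A))          # cyclic property
assert np.isclose(np.trace(A), lam.sum())                    # tr(A) = sum of eigenvalues

# Determinant identities.
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))
assert np.isclose(np.linalg.det(alpha * A), alpha ** n * np.linalg.det(A))
assert np.isclose(np.linalg.det(A), np.prod(lam))            # det(A) = product of eigenvalues
```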

Linear equation

$$Ax = b$$

particular and general solution

$$x_{\text{general}} = x_{\text{particular}} + x_{\text{null}}$$

Once one particular solution and the null space are found, the system of linear equations is solved.

particular solution

Row-Echelon Form

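If there are all-zero rows, they are at the bottom of the matrix, and every row above them contains at least one nonzero element.
The first nonzero element of each row (the pivot) is strictly to the right of the pivot of the row above it.
An all-zero row appears when the rows are linearly dependent.

Once the augmented matrix is brought into row-echelon form, as in $\left( \begin{array} {ccccc | c} 1 & -2&1&-1 &1&0 \\ 0 & 0&1&-1&3&-2\\ 0&0&0&1&-2&1\\ 0&0&0&0&0&0 \end{array} \right)$ , a particular solution is easy to find (provided the right-hand side of the all-zero last row is also 0; otherwise the system has no solution), because the system then takes the form

$$\begin{cases} x_1 - 2x_2 + x_3-x_4+x_5 = 0\\ \quad\quad\quad\quad\quad x_3 - x_4 + 3x_5 = -2 \\ \quad\quad\quad\quad\quad\quad\quad\ x_4 -2x_5 = 1 \end{cases}$$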

general solution

Reduced Row-Echelon Form

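$$\begin{pmatrix} 1 & -2&0&0&-2 \\ 0&0 & 1&0&1 \\ 0&0&0&1&-2 \end{pmatrix}$$

A form like this, in which every pivot is 1 and all other entries in each pivot column are 0.

-1 Trick to solve $Ax = 0$
- $A = \begin{pmatrix} 1 & -2&0&0&-2 \\ 0&0 & 1&0&1 \\ 0&0&0&1&-2 \end{pmatrix}, \ \tilde{A} = \begin{pmatrix} 1 & -2&0&0&-2 \\0&-1&0&0&0\\ 0&0 & 1&0&1 \\ 0&0&0&1&-2\\ 0&0&0&0&-1 \end{pmatrix}$ : insert rows so that the diagonal of $\tilde{A}$ has $-1$ at the positions of the non-pivot columns. Then read off the non-pivot columns of $\tilde{A}$ .
- Solution of $Ax = 0 :\left\{x = \lambda_1\begin{pmatrix} -2\\-1\\0\\0\\0 \end{pmatrix}+\lambda_2\begin{pmatrix} -2\\0\\1\\-2\\-1 \end{pmatrix},\quad \lambda_1,\lambda_2 \in \mathbb{R}\right\}$
  - It is a simple trick, but to make sense of it: each non-pivot column corresponds to a free variable, in terms of which the pivot variables are expressed. The trick places the coefficients from the non-pivot column into the positions of the pivot variables.
  - The null space has many more elements, but knowing $n - \operatorname{rank}(A)$ of them (here $5 - 3 = 2$ ) is enough: their span gives everything. (Any other solution is linearly dependent on them.)
  - number of unknowns (the number of $\lambda$ 's) = dim null( $A$ )
  - null( $A$ ) is the orthogonal complement of the row space ( range( $A^{T}$ ) $^\bot$ )
  - number of independent equations (knowns) = dim range( $A$ )
  - (Fundamental theorem of linear algebra)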
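The -1 trick is mechanical enough to write as a short function. This is a sketch that assumes the input is already in reduced row-echelon form with no all-zero rows and that the pivot column indices are supplied by the caller; the name `minus_one_trick` and its signature are hypothetical.

```python
import numpy as np

# The reduced row-echelon matrix from the example; pivots are in columns 0, 2, 3 (0-based).
A = np.array([[1, -2, 0, 0, -2],
              [0,  0, 1, 0,  1],
              [0,  0, 0, 1, -2]])
pivot_cols = [0, 2, 3]


def minus_one_trick(R, pivot_cols):
    """Return a basis of null(R) as columns, for R in RREF without zero rows."""
    n = R.shape[1]
    # Start from -I: every row that is NOT overwritten below keeps its -1 on the diagonal.
    R_tilde = -np.eye(n, dtype=R.dtype)
    # Place each RREF row so that its pivot lands on the diagonal.
    for row, col in zip(R, pivot_cols):
        R_tilde[col] = row
    # The non-pivot columns of the augmented matrix span the null space.
    free_cols = [j for j in range(n) if j not in pivot_cols]
    return R_tilde[:, free_cols]


N = minus_one_trick(A, pivot_cols)
print(N.T)        # each row printed here is one null-space basis vector
print(A @ N)      # all zeros: both vectors solve Ax = 0
```

The number of columns of `N` is $n - \operatorname{rank}(A) = 5 - 3 = 2$ , matching the two free parameters in the solution set.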

Inverse Matrix

If $A$ is square and invertible

$$x = A^{-1}b$$

Gaussian Elimination

A technique for finding the inverse matrix

$$(A\vert I_n) \rightarrow \cdots \rightarrow (I_n\vert A^{-1})$$

This works because applying the same elementary row operation $E$ to both sides of an equality preserves it: if $A = B$ , then $EA = EB$ .

example $A = \begin{pmatrix} 1 & 0&1&0 \\ 0 & 1&1&0 \\ 1&1&0&1 \\ 1&1&1&0 \end{pmatrix}$

$\left( \begin{array} {cccc | cccc} 1 & 0&1&0 &1&0&0&0 \\ 0 & 1&1&0&0&1&0&0 \\ 1&1&0&1&0&0&1&0 \\ 1&1&1&0&0&0&0&1 \end{array} \right)$

→ ( $R_4 \leftarrow R_4 - R_3$ ) $\left( \begin{array} {cccc | cccc} 1 & 0&1&0 &1&0&0&0 \\ 0 & 1&1&0&0&1&0&0 \\ 1&1&0&1&0&0&1&0 \\ 0&0&1&-1&0&0&-1&1 \end{array} \right)$

→ ( $R_3 \leftarrow R_3 + R_4$ ) $\left( \begin{array} {cccc | cccc} 1 & 0&1&0 &1&0&0&0 \\ 0 & 1&1&0&0&1&0&0 \\ 1&1&1&0&0&0&0&1 \\ 0&0&1&-1&0&0&-1&1 \end{array} \right)$

→ ( $R_3 \leftarrow R_3 - R_1 - R_2$ ) $\left( \begin{array} {cccc | cccc} 1 & 0&1&0 &1&0&0&0 \\ 0 & 1&1&0&0&1&0&0 \\ 0&0&-1&0&-1&-1&0&1 \\ 0&0&1&-1&0&0&-1&1 \end{array} \right)$

→ ( $R_1 \leftarrow R_1 + R_3,\ R_2 \leftarrow R_2 + R_3,\ R_4 \leftarrow R_4 + R_3$ ) $\left( \begin{array} {cccc | cccc} 1 & 0&0&0 &0&-1&0&1 \\ 0 & 1&0&0&-1&0&0&1 \\ 0&0&-1&0&-1&-1&0&1 \\ 0&0&0&-1&-1&-1&-1&2 \end{array} \right)$

→ ( $R_3 \leftarrow -R_3,\ R_4 \leftarrow -R_4$ ) $\left( \begin{array} {cccc | cccc} 1 & 0&0&0 &0&-1&0&1 \\ 0 & 1&0&0&-1&0&0&1 \\ 0&0&1&0&1&1&0&-1 \\ 0&0&0&1&1&1&1&-2 \end{array} \right)$

Therefore $A^{-1} = \begin{pmatrix} 0&-1&0&1 \\ -1&0&0&1 \\ 1&1&0&-1 \\ 1&1&1&-2 \end{pmatrix}$

If any row is eliminated entirely (the left block gets an all-zero row) during the process, $A$ is not invertible! (It is not full rank.)
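The $(A \vert I_n) \rightarrow (I_n \vert A^{-1})$ procedure can be sketched in a few lines. The version below is an illustrative Gauss-Jordan implementation, not the exact sequence of row operations used in the hand computation; it uses Python's `fractions.Fraction` so the result is exact, and the function name `gauss_jordan_inverse` is hypothetical.

```python
from fractions import Fraction


def gauss_jordan_inverse(A):
    """Invert a square matrix (list of rows) by reducing (A | I) to (I | A^-1)."""
    n = len(A)
    # Augmented matrix (A | I_n) with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible (not full rank)")
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    # The right block now holds the inverse.
    return [row[n:] for row in M]


A = [[1, 0, 1, 0],
     [0, 1, 1, 0],
     [1, 1, 0, 1],
     [1, 1, 1, 0]]
A_inv = gauss_jordan_inverse(A)
for row in A_inv:
    print([int(v) for v in row])
```

A singular input, for instance a matrix with two identical rows, ends in the `ValueError` branch, which corresponds to a row being eliminated entirely.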

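If $A$ is not square and $A^TA$ is invertible,

$$Ax = b \ \Rightarrow\ A^TAx=A^Tb \ \Leftrightarrow\ x=(A^TA)^{-1}A^Tb$$

This is a way of thinking about the problem in a lower dimension (an $n \times n$ system instead of an $m \times n$ one). When $Ax = b$ has no exact solution, this $x$ is the least-squares solution.

When is $A^TA$ invertible? For $A \in \mathbb{R}^{m \times n}$ we need $m \geq n$ (a tall matrix): the number of equations must be at least the number of unknowns. If $m < n$ (a flat matrix), then rank( $A^TA$ ) = rank( $A$ ) $\leq m < n$ , so the $n \times n$ matrix $A^TA$ cannot be full rank. $m \geq n$ alone is not sufficient, though: the precise condition is that the columns of $A$ are linearly independent (null( $A$ ) = {0}).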
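To make the non-square case concrete, here is a sketch with a made-up tall system (4 equations, 2 unknowns). It solves the normal equations $A^TAx = A^Tb$ and compares the result with `np.linalg.lstsq`; it also shows that for a flat matrix the product $W^TW$ is rank-deficient.

```python
import numpy as np

# Hypothetical overdetermined system: tall A with linearly independent columns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.9, 5.1, 7.0])

# Normal equations: solve (A^T A) x = A^T b instead of forming the inverse explicitly.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Reference: NumPy's least-squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_normal, x_lstsq)

# The residual is orthogonal to the columns of A.
assert np.allclose(A.T @ (A @ x_normal - b), 0.0)

# Flat (wide) matrix: W^T W is 4 x 4 but has rank at most 2, so it is not invertible.
W = A.T
rank_WtW = np.linalg.matrix_rank(W.T @ W)
print(x_normal, rank_WtW)
```

In practice `np.linalg.lstsq` (or a QR factorization) is usually preferred over forming $A^TA$ , since the normal equations square the condition number.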

Eigen Decomposition

Symmetric/Orthogonal matrix

Symmetric matrix

: if $A \in \mathbb{R}^{n \times n}$ and $A = A^T$ , then $A$ is a symmetric matrix

Properties

The eigenvectors are orthogonal (they can be chosen to be orthonormal).
The eigenvectors give a basis: all $n$ of them together form a basis of $\mathbb{R}^n$ , and those with nonzero eigenvalues span the column space of $A$ .

$$A[q_1,q_2,\cdots,q_n] = [\lambda_1q_1,\lambda_2q_2, \cdots,\lambda_nq_n] = [q_1,q_2,\cdots,q_n]\begin{pmatrix} \lambda_1 &0&\cdots&0\\ 0&\lambda_2&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&\lambda_n \end{pmatrix} = Q\Lambda$$

This can be written as $AQ = Q\Lambda$ , and if $A$ is a symmetric matrix (with the $q_i$ normalized), the following holds.

$Q^TQ = QQ^T = I$
- $Q^TQ = \begin{pmatrix} \Vert q_1\Vert^2&q_1^Tq_2&\cdots&q_1^Tq_n\\ \vdots&\vdots&\ddots&\vdots\\ q_n^Tq_1&q_n^Tq_2&\cdots&\Vert q_n\Vert^2 \end{pmatrix} = I_n$

Orthogonal matrix

If $Q^TQ = QQ^T = I$ holds, $Q$ is called an orthogonal matrix.

ex) the matrix whose columns are the (orthonormal) eigenvectors of a symmetric matrix

Properties

$Q^T=Q^{-1}$
$(Qx)^T(Qy)=x^TQ^TQy=x^TIy=x^Ty$
- inner product is preserved
$\Vert Qx\Vert_2 = \sqrt{(Qx)^T(Qx)} = \sqrt{x^Tx} = \Vert x\Vert _2$
- norm is preserved
Therefore, multiplying by an orthogonal matrix rotates or reflects a vector.

Eigen decomposition

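This concerns symmetric matrices. (If the matrix is not symmetric, the eigenvalues may be complex and the eigenvectors need not be orthogonal, so that case is not discussed here.)

$$A = Q\Lambda Q^T = \sum_{i=1}^{n}\lambda_iq_iq_i^T$$

(the columns of $Q$ are the eigenvectors, $\Lambda$ is the diagonal matrix of eigenvalues)

Let's understand what $Ax = Q\Lambda Q^{-1}x$ means.
1. Let $y = Q^{-1}x$ ; then $Qy=x$ . That is, $y$ holds the values by which each eigenvector must be multiplied to build $x$ . In the example here, $y = (2, 1)$ .
2. Let $z=\Lambda y$ ; this multiplies each direction by its eigenvalue. In the example this gives $(-2, 2)$ .
3. $Qz$ applies the linear transformation $Q$ to the resulting vector.

  In summary, the transformation by the eigenvectors is undone, the eigenvalues are applied, and then the transformation is applied again.

  Put differently, the vector is converted to coordinates with respect to the eigenvectors and then converted back to coordinates with respect to the original basis. (This is if we view multiplying by $A$ as a transformation in the coordinates of the standard basis.)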

If $A$ is a symmetric matrix,

$$QQ^T=Q^TQ=I \quad \because q_i \bot q_j \ (i \neq j),\ \Vert q_i \Vert = 1$$

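solving a system of linear equations

$$\begin{aligned} Ax &= b\\ (Q\Lambda Q^T)x&=b\\ \Lambda Q^Tx &= Q^{-1}b=Q^T b \\ x &= Q\Lambda^{-1}Q^Tb \quad (\Lambda^{-1}=\operatorname{diag}(\lambda_1^{-1},\cdots,\lambda_n^{-1})) \end{aligned}$$

If $A$ is invertible, $\Lambda$ is also invertible, because no eigenvalue of $A$ is 0. Reason: link

Once the decomposition is available, the time complexity of the computation drops considerably.

$n^3 + n^2 \Rightarrow n^2 +n+n^2$ (a computation between matrices costs $n^3$ , a matrix-vector product costs $n^2$ )

( $O(n^3) \Rightarrow O(n^2)$ )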
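The three cheap steps ( $Q^Tb$ , division by the eigenvalues, multiplication by $Q$ ) look like this in code. This is a sketch on a made-up symmetric, invertible 3×3 matrix; note that computing the decomposition itself with `np.linalg.eigh` still costs $O(n^3)$ , so the saving applies when the same $A$ is reused for many right-hand sides.

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # example symmetric, invertible matrix
b = np.array([1.0, 2.0, 3.0])

lam, Q = np.linalg.eigh(A)               # A = Q diag(lam) Q^T, computed once

# x = Q Lambda^{-1} Q^T b, evaluated right to left:
y = Q.T @ b                              # matrix-vector product, O(n^2)
z = y / lam                              # apply Lambda^{-1}, O(n)
x = Q @ z                                # matrix-vector product, O(n^2)

assert np.allclose(A @ x, b)
assert np.allclose(x, np.linalg.solve(A, b))
print(x)
```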

Fundamental theorem of linear algebra

For $A \in \mathbb{R}^{m \times n}$ :

null( $A$ ) = range( $A^T$ ) $^\bot$
null( $A$ ) $\oplus$ range( $A^T$ ) = $\mathbb{R}^{n}$
dim range( $A$ ) + dim null( $A$ ) = n ( $\because$ dim( $\mathbb{R}^n$ ) = n, dim range( $A$ ) = dim range( $A^T$ ))

null( $A$ ) is the orthogonal complement of the row space.

Every $x \in \mathbb{R}^n$ can be expressed uniquely as

$$x = A^Tv +w \quad (v \in \mathbb{R}^m, \ w \in \operatorname{null}(A))$$

(The two components $A^Tv$ and $w$ are unique; $v$ itself need not be.)

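What it means for $A$ to be invertible

$A$ is an injective map. (Every vector in the range is expressed uniquely in terms of the columns.)
dim null( $A$ ) = dim null( $A^T$ ) = 0 (null( $A$ ) = { $0$ })
$A$ does not have 0 as an eigenvalue.
$Ax=b$ has a unique solution.
$Ax = b$ can be solved using the eigendecomposition when $A$ is also symmetric. (No need to compute $A^{-1}$ directly.)
The rows are linearly independent, and so are the columns.
$A$ has full rank. (rank( $A$ ) = n)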

Singular value decomposition

Whereas the eigendecomposition above dealt with square (symmetric) matrices, the singular value decomposition applies to any matrix, including non-symmetric and non-square ones.

$A \in \mathbb{R}^{m \times n}$

$$A=U\Sigma V^T =\sum_{i=1}^{r}\sigma_iu_iv_i^T$$

$U \in \mathbb{R}^{m \times m}$ : left singular vectors

$V \in \mathbb{R}^{n \times n}$ : right singular vectors

$\Sigma \in \mathbb{R}^{m \times n}$ : singular values

$r$ = rank( $A$ )

$U, V$ are orthogonal matrices, and $\Sigma$ is a (rectangular) diagonal matrix.

Only the first $r$ singular values are nonzero, and they are sorted in descending order.

$\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > \sigma_{r+1} = \cdots = 0$

SVD by Eigen decomposition

Using $AA^T$ or $A^TA$ instead of $A$ gives a square (symmetric) matrix, so the eigendecomposition can be used.

$A^TA$

$A^TA = (U\Sigma V^T)^T(U\Sigma V^T) =V\Sigma^T U^TU\Sigma V^T =V\Sigma^T\Sigma V^T$
- The $V$ (right singular vectors) of $A$ are the eigenvectors of $A^TA$ .
- The $\Sigma$ (singular values) of $A$ are the positive square roots of the eigenvalues of $A^TA$ .
$AA^T$
$AA^T= (U\Sigma V^T)(U\Sigma V^T)^T =U\Sigma V^TV\Sigma^T U^T =U\Sigma\Sigma^TU^T$

The $U$ (left singular vectors) of $A$ are the eigenvectors of $AA^T$ .
The $\Sigma$ (singular values) of $A$ are the positive square roots of the nonzero eigenvalues of $AA^T$ .

$\lambda_i(A^TA)$ and $\lambda_i(AA^T)$ are always greater than or equal to 0! (Reason: both matrices are positive semi-definite.)
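The link between the SVD of $A$ and the eigendecompositions of $A^TA$ and $AA^T$ can be checked directly. This sketch uses a random 5×3 matrix (an arbitrary choice); eigenvectors are compared only up to sign, since both decompositions determine them only up to a factor of ±1 when the singular values are distinct.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))                      # example matrix, rank 3 with probability 1

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # s is sorted in descending order

# Eigendecomposition of A^T A (symmetric): eigh returns ascending order, so reverse it.
lam, V = np.linalg.eigh(A.T @ A)
lam, V = lam[::-1], V[:, ::-1]

# Singular values are the square roots of the eigenvalues of A^T A.
assert np.allclose(np.sqrt(lam), s)

# Eigenvectors of A^T A match the right singular vectors up to sign.
assert np.allclose(np.abs(V.T @ Vt.T), np.eye(3), atol=1e-6)

# A A^T (5 x 5) has the same nonzero eigenvalues, plus 5 - 3 = 2 zeros.
lam_left = np.linalg.eigvalsh(A @ A.T)[::-1]
assert np.allclose(lam_left[:3], s ** 2)
assert np.allclose(lam_left[3:], 0.0)
```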

Rayleigh quotient

Let $A \in \mathbb{R}^{n\times n}$ be a symmetric matrix

quadratic form: $x^TAx$ ← a scalar
Rayleigh quotient:
$R_A(x) = {{x^TAx}\over{x^Tx}}$
- scale invariance: $R_A(x) = R_A(\alpha x)\quad (x \neq 0, \alpha\neq0)$
- If $x$ is an eigenvector with eigenvalue $\lambda$ , then $R_A(x) = \lambda$
- For all $x \neq 0$ , $\lambda_{min}(A) \leq R_A(x) \leq\lambda_{max}(A)$
  - Equality holds only when $x$ is an eigenvector (for $\lambda_{min}$ or $\lambda_{max}$ , respectively).
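The bounds on the Rayleigh quotient are easy to probe by sampling. This is a sketch with a random symmetric 4×4 matrix and 1000 random test vectors (both arbitrary); the helper name `rayleigh` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
A = (M + M.T) / 2                        # example symmetric matrix


def rayleigh(A, x):
    """Rayleigh quotient R_A(x) = x^T A x / x^T x for x != 0."""
    return (x @ A @ x) / (x @ x)


lam, Q = np.linalg.eigh(A)               # ascending: lam[0] = lambda_min, lam[-1] = lambda_max

xs = rng.normal(size=(1000, 4))
values = np.array([rayleigh(A, x) for x in xs])

# Every sampled value lies between the smallest and largest eigenvalue.
assert lam[0] - 1e-12 <= values.min() and values.max() <= lam[-1] + 1e-12

# The bounds are attained at the corresponding eigenvectors.
assert np.isclose(rayleigh(A, Q[:, 0]), lam[0])
assert np.isclose(rayleigh(A, Q[:, -1]), lam[-1])

# Scale invariance.
assert np.isclose(rayleigh(A, 3.0 * xs[0]), values[0])
```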

Positive (semi-)definite matrix

$A \succeq 0$

:positive semi-definite

for all $x \in \mathbb{R}^n, \ x^TAx \geq 0$

↔ all eigenvalues of $A$ are greater than or equal to 0 $(\because{{x^TAx}\over{x^Tx}} \geq0 \Rightarrow \lambda_{min}(A) \geq 0)$
$A \succ0$

: positive definite

for all non-zero $x \in \mathbb{R}^n, \ x^TAx > 0$

↔ all eigenvalues of $A$ are greater than 0
- pf) Claim: $x^TAx \geq 0$ for all $x$ $\leftrightarrow$ $\lambda \geq 0$ for every eigenvalue $\lambda$ of $A$ .
  - i) ( $\rightarrow$ ) Let $x$ be an eigenvector of $A$ with eigenvalue $\lambda$ . Then $0 \leq x^TAx =x^T(\lambda x)=\lambda x^Tx = \lambda\Vert x \Vert^2_2$ . Since $x$ is an eigenvector, $x\neq 0$ , $\therefore \lambda\geq 0$ .
  - ii) ( $\leftarrow$ ) For $x \neq 0$ , $0 \leq \lambda_{min}(A) \leq R_A(x)$ , so $x^TAx = R_A(x)\, x^Tx \geq 0$ .

For $A \in \mathbb{R}^{m \times n}$ , $A^TA$ is positive semi-definite, and if null( $A$ ) = {0} then $A^TA$ is positive definite.

pf) For $x \in \mathbb{R}^n$ , $x^T(A^TA)x = (Ax)^TAx=\Vert Ax\Vert^2_2 \geq 0$ , so $A^TA$ is positive semi-definite. If null( $A$ ) = {0}, then $Ax \neq 0$ for $x \neq 0$ , so $\Vert Ax \Vert_2^2 > 0$ . Therefore $A^TA$ is positive definite.

→ The eigenvalues of $A^TA$ , i.e. the squares of the singular values of $A$ , are always non-negative.

Operator norm of a matrix

If $T : V \rightarrow W$ is a linear map, the operator norm is $\Vert T \Vert _{op} = \max_{x\in V,x \neq 0}{\Vert Tx\Vert_W \over \Vert x\Vert_V}$ (for reference only)

For a matrix $A \in \mathbb{R}^{m \times n}$ the matrix p-norm is

$$\Vert A \Vert_p = \max_{x\neq0} {\Vert Ax\Vert_p \over \Vert x\Vert_p}$$

$\Vert A \Vert_1 = \max_{1 \leq j\leq n}\ \sum_{i=1}^{m} |A_{ij}|$ : the maximum absolute column sum
$\Vert A \Vert_\infty = \max_{1 \leq i\leq m}\ \sum_{j=1}^{n} |A_{ij}|$ : the maximum absolute row sum
$\Vert A \Vert_2 = \sigma_1(A)$ : the largest singular value of $A$
- $\Vert A \Vert_2 = \max_{x\neq0} {\Vert Ax\Vert_2 \over \Vert x\Vert_2} = \max_{x\neq0}\sqrt{x^TA^TAx\over x^Tx} = \sqrt{\max_{x\neq0} R_{A^TA}(x)} = \sqrt{\lambda_{max}(A^TA)} = \sigma_1(A)$

Properties

$\Vert Ax\Vert_p \leq \Vert A\Vert_p\Vert x \Vert_p$
$\Vert AB\Vert_p \leq \Vert A\Vert_p\Vert B \Vert_p$
- pf) $\Vert ABx\Vert_p \le \Vert A\Vert_p\Vert Bx\Vert_p \le \Vert A\Vert_p\Vert B\Vert_p\Vert x\Vert_p$ , so $\Vert AB\Vert_p = \max_{x\neq0} {\Vert ABx\Vert_p \over \Vert x\Vert_p} \le \max_{x\neq0} {\Vert A\Vert_p\Vert B\Vert_p\Vert x\Vert_p \over \Vert x\Vert_p} = \Vert A\Vert_p\Vert B \Vert_p$

Relationship with the Frobenius norm

$$\Vert A \Vert_F = \sqrt{\sum_{i=1}^{m}\sum_{j=1}^{n}A^2_{ij}}=\sqrt{tr(A^TA)} = \sqrt{\sum_{i}\lambda_i(A^TA)} = \sqrt{\sum_{i=1}^{\min(m,n)}\sigma_i^2(A)}$$

Reference

Additional explanation: $tr(A^TA) = tr(V\Sigma^T\Sigma V^T) = tr(V^TV\Sigma^T\Sigma) = tr(\Sigma^T\Sigma) =\sum_{i=1}^{\min(m,n)}\sigma_i^2(A)$
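The closed forms for the induced norms, the two inequalities, and the Frobenius relation can all be cross-checked against `np.linalg.norm`. This is a sketch on random matrices of arbitrary, compatible shapes.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))
B = rng.normal(size=(4, 2))
x = rng.normal(size=4)

sigma = np.linalg.svd(A, compute_uv=False)      # singular values, descending

# Closed forms of the induced 1-, inf- and 2-norms.
assert np.isclose(np.linalg.norm(A, 1), np.abs(A).sum(axis=0).max())        # max column sum
assert np.isclose(np.linalg.norm(A, np.inf), np.abs(A).sum(axis=1).max())   # max row sum
assert np.isclose(np.linalg.norm(A, 2), sigma[0])                           # largest singular value

# ||Ax|| <= ||A|| ||x|| and ||AB|| <= ||A|| ||B|| (small slack for rounding).
for p in (1, 2, np.inf):
    assert np.linalg.norm(A @ x, p) <= np.linalg.norm(A, p) * np.linalg.norm(x, p) + 1e-12
    assert np.linalg.norm(A @ B, p) <= np.linalg.norm(A, p) * np.linalg.norm(B, p) + 1e-12

# Frobenius norm: sqrt(tr(A^T A)) = sqrt(sum of squared singular values).
fro = np.linalg.norm(A, "fro")
assert np.isclose(fro, np.sqrt(np.trace(A.T @ A)))
assert np.isclose(fro, np.sqrt(np.sum(sigma ** 2)))
```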

Not yet covered

Low-rank approximation

Moore-Penrose pseudoinverse
