Random Vector

Rainy Night for Sapientia · November 4, 2023

Mathematics for AI


Random variable

A random variable is a variable whose value depends on the outcome of a random phenomenon.
Formally, it is a measurable function defined on a probability space that maps the sample space to the real numbers. Its distribution is described by a probability mass function for discrete variables or by a probability density function for continuous variables.
A random vector is the multi-dimensional generalisation of a random variable: a vector whose elements are random variables.
In general, a random vector can be described with measures similar to those defined for scalar random variables, such as the mean and the variance.
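
To make the "function on a sample space" view concrete, here is a minimal sketch using two fair dice. The dice setting and the names `total`, `maximum`, and `random_vector` are assumptions chosen for illustration; they do not come from the text above.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes of rolling two fair six-sided dice.
sample_space = list(product(range(1, 7), repeat=2))

def total(outcome):
    """Random variable: maps an outcome (a, b) to the real number a + b."""
    return outcome[0] + outcome[1]

def maximum(outcome):
    """A second random variable defined on the same sample space."""
    return max(outcome)

def random_vector(outcome):
    """Random vector: a tuple whose elements are random variables."""
    return (total(outcome), maximum(outcome))

# Probability mass function of `total`: every outcome has probability 1/36.
pmf_total = {}
for outcome in sample_space:
    value = total(outcome)
    pmf_total[value] = pmf_total.get(value, Fraction(0)) + Fraction(1, 36)

print(pmf_total[7])           # 1/6
print(random_vector((3, 5)))  # (8, 5)
```

Because both variables here are discrete, their distribution is a table of probabilities; a continuous variable would need a density instead.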

Multivariate Statistics

For a random vector of $d$ random variables, $\chi = (\chi_1, \chi_2, \chi_3, \ldots, \chi_d)$, a sample of $\chi$ with $N$ points is the matrix $X = (X_{ij})_{d \times N}$, where $X_{ij}$ is the value of the $i$-th variable at the $j$-th sample point.
This matrix has $d$ features (rows) and $N$ examples (columns).

Mean Vector (empirical)
$\mathbf{m} = \begin{bmatrix} m_1\\ m_2\\ m_3\\ \vdots\\ m_d \end{bmatrix} = \begin{bmatrix} \frac{1}{N} \sum_{n=1}^N X_{1n}\\ \frac{1}{N} \sum_{n=1}^N X_{2n}\\ \frac{1}{N} \sum_{n=1}^N X_{3n}\\ \vdots\\ \frac{1}{N} \sum_{n=1}^N X_{dn} \end{bmatrix} = \frac{1}{N} \sum_{n=1}^N \mathbf{x}_{n}, \quad \text{where } \mathbf{x}_{n} = \begin{bmatrix} X_{1n}\\ X_{2n}\\ X_{3n}\\ \vdots\\ X_{dn} \end{bmatrix}, \quad n = 1, 2, 3, \ldots, N$
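
The empirical mean vector can be sketched in plain Python as follows. The sample values and the function names `mean_vector` and `mean_of_columns` are hypothetical. The data is stored as a list of $d$ rows with $N$ entries each, matching the $d \times N$ layout used here.

```python
# Hypothetical sample: d = 2 features (rows), N = 4 examples (columns).
X = [
    [1.0, 2.0, 3.0, 4.0],  # feature 1
    [2.0, 4.0, 6.0, 8.0],  # feature 2
]

def mean_vector(X):
    """Return m, where m[i] is the average of row i over its N entries."""
    N = len(X[0])
    return [sum(row) / N for row in X]

def mean_of_columns(X):
    """Same vector, computed as the average of the column vectors x_n."""
    d, N = len(X), len(X[0])
    m = [0.0] * d
    for n in range(N):      # loop over examples
        for i in range(d):  # accumulate each component of x_n / N
            m[i] += X[i][n] / N
    return m

m = mean_vector(X)
print(m)  # [2.5, 5.0]
```

The two functions correspond to the two forms of the formula: averaging each feature row by row, or averaging the example vectors $\mathbf{x}_n$.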
Covariance Matrix (empirical)
$S_{d \times d} = \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1d}\\ C_{21} & C_{22} & \cdots & C_{2d}\\ C_{31} & C_{32} & \cdots & C_{3d}\\ \vdots & \vdots & \ddots & \vdots\\ C_{d1} & C_{d2} & \cdots & C_{dd} \end{bmatrix}$ $C_{ij} = \begin{cases} \frac{1}{N-1} \sum_{n=1}^{N}(X_{in} - m_i) (X_{jn} - m_j) & \text{if } i \neq j \text{ (covariance)}\\ \frac{1}{N-1} \sum_{n=1}^{N}(X_{in} - m_i)^2 = \Sigma_i^2 & \text{if } i = j \text{ (variance)} \end{cases}$
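
A minimal sketch of the empirical covariance matrix, using the $\frac{1}{N-1}$ normalisation from the formula. The data and the name `covariance_matrix` are assumptions for illustration, and the double loop favours readability over speed.

```python
def covariance_matrix(X):
    """Empirical covariance of a d x N data matrix given as a list of rows."""
    d, N = len(X), len(X[0])
    m = [sum(row) / N for row in X]  # mean of each feature
    S = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            # Sum of products of deviations, divided by N - 1.
            S[i][j] = sum(
                (X[i][n] - m[i]) * (X[j][n] - m[j]) for n in range(N)
            ) / (N - 1)
    return S

# Hypothetical sample: d = 3 features, N = 4 examples.
X = [
    [1.0, 2.0, 3.0, 4.0],  # feature 1
    [2.0, 4.0, 6.0, 8.0],  # feature 2: rises with feature 1
    [4.0, 3.0, 2.0, 1.0],  # feature 3: falls as feature 1 rises
]

S = covariance_matrix(X)
for row in S:
    print(row)
```

The diagonal entries are the variances of the features, and the matrix is symmetric because $C_{ij}$ and $C_{ji}$ sum the same products.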
The covariance elements can be written as $C_{ik} = \rho_{ik} \Sigma_i \Sigma_k$, where $\Sigma_i$ is the standard deviation of $\chi_i$ and $\rho_{ik}$ is the correlation coefficient between $\chi_i$ and $\chi_k$.
The covariance therefore has several important properties:
- If $C_{ik} > 0$, $\chi_i$ and $\chi_k$ tend to increase together.
- If $C_{ik} < 0$, $\chi_i$ tends to increase when $\chi_k$ decreases.
- If $C_{ik} = 0$, $\chi_i$ and $\chi_k$ are uncorrelated.
  (This does not mean they are independent: zero covariance only rules out a linear relationship, and the variables can still depend on each other non-linearly.)
- $-1 \leq \rho_{ik} \leq 1$, hence $|C_{ik}| \leq \Sigma_i \Sigma_k$.
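
These properties can be checked on a small example. The sketch below uses hypothetical data and hypothetical helper names `cov` and `corr`. It recovers $\rho$ from the covariance and the standard deviations, and it shows a pair that is uncorrelated even though one variable is completely determined by the other.

```python
import math

def cov(a, b):
    """Empirical covariance of two equally long samples (1 / (N - 1) form)."""
    N = len(a)
    ma, mb = sum(a) / N, sum(b) / N
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (N - 1)

def corr(a, b):
    """Correlation coefficient: covariance divided by both standard deviations."""
    return cov(a, b) / math.sqrt(cov(a, a) * cov(b, b))

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_up = [2 * v + 1 for v in x]  # increases with x
y_down = [-v for v in x]       # decreases as x increases
y_sq = [v * v for v in x]      # fully determined by x, but not linearly

print(corr(x, y_up))    # close to +1
print(corr(x, y_down))  # close to -1
print(cov(x, y_sq))     # 0.0
```

Here `y_sq` is a deterministic function of `x`, yet their covariance is zero, which is the non-linear dependence that the covariance cannot detect.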
