Random variable
- A random variable is a variable whose values depend on the outcomes of a random phenomenon.
- Formally, it is a measurable function defined on a probability space that maps the sample space to the real numbers. Its distribution is described by probabilities for discrete variables or by a density for continuous variables.
- A random vector is the multi-dimensional generalisation of a random variable: a vector whose elements are random variables.
- In general, a random vector can be described with measures similar to those defined for scalar random variables.
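
To make the "function on a sample space" view concrete, here is a minimal sketch. The two-coin experiment and the names `num_heads` and `first_is_head` are assumptions chosen for illustration; they do not come from the notes above.

```python
from fractions import Fraction
from itertools import product

# Sample space: all outcomes of two fair coin tosses, each equally likely.
sample_space = list(product("HT", repeat=2))
prob = {omega: Fraction(1, len(sample_space)) for omega in sample_space}

def num_heads(omega):
    """Random variable X: maps an outcome to the number of heads."""
    return omega.count("H")

def first_is_head(omega):
    """Random variable Y: 1 if the first toss is a head, otherwise 0."""
    return 1 if omega[0] == "H" else 0

# Distribution of X (probabilities, since X is discrete).
pmf = {}
for omega, p in prob.items():
    x = num_heads(omega)
    pmf[x] = pmf.get(x, Fraction(0)) + p

# A random vector: several random variables evaluated on the same outcome.
random_vector = {omega: (num_heads(omega), first_is_head(omega))
                 for omega in sample_space}

print(pmf)
print(random_vector)
```

Both `num_heads` and `first_is_head` are ordinary functions of the outcome. The randomness comes only from which outcome occurs.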
Multivariate Statistics
For a random vector of $d$ random variables, $\chi = (\chi_1, \chi_2, \chi_3, \dots, \chi_d)$, a sample of $N$ points is collected in the matrix $X = (X_{ij})_{d \times N}$.
We say that this matrix has $d$ features (rows) and $N$ examples (columns).
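- Mean Vector (empirical)

$$
m = \begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ \vdots \\ m_d \end{bmatrix}
= \begin{bmatrix}
\frac{1}{N}\sum_{n=1}^{N} X_{1n} \\
\frac{1}{N}\sum_{n=1}^{N} X_{2n} \\
\frac{1}{N}\sum_{n=1}^{N} X_{3n} \\
\vdots \\
\frac{1}{N}\sum_{n=1}^{N} X_{dn}
\end{bmatrix}
= \frac{1}{N}\sum_{n=1}^{N} x_n,
\quad \text{where } x_n = \begin{bmatrix} X_{1n} \\ X_{2n} \\ X_{3n} \\ \vdots \\ X_{dn} \end{bmatrix},\ n = 1, 2, 3, \dots, N
$$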
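
As a small sketch of the empirical mean vector, the code below averages each row of a $d \times N$ matrix and then repeats the calculation as an average of the column vectors $x_n$. The data values and the helper name `mean_vector` are made up for illustration.

```python
# Hypothetical d x N data matrix: d = 3 features (rows), N = 4 examples (columns).
X = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [5.0, 3.0, 1.0, 3.0],
]

def mean_vector(X):
    """Return m, where m[i] is the average of row i over the N examples."""
    N = len(X[0])
    return [sum(row) / N for row in X]

m = mean_vector(X)

# Equivalent view: average the N column vectors x_n component by component.
columns = list(zip(*X))
m_from_columns = [sum(x_n[i] for x_n in columns) / len(columns)
                  for i in range(len(X))]

print(m)
print(m_from_columns)
```

The two results agree because summing over the second index of $X_{in}$ is the same operation in both forms.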
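- Covariance Matrix (empirical)

$$
S_{d \times d} = \begin{bmatrix}
C_{11} & C_{12} & \cdots & C_{1d} \\
C_{21} & C_{22} & \cdots & C_{2d} \\
C_{31} & C_{32} & \cdots & C_{3d} \\
\vdots & \vdots & \ddots & \vdots \\
C_{d1} & C_{d2} & \cdots & C_{dd}
\end{bmatrix}
$$

$$
C_{ij} = \begin{cases}
\frac{1}{N-1}\sum_{n=1}^{N}(X_{in}-m_i)(X_{jn}-m_j) & \text{if } i \neq j \\
\frac{1}{N-1}\sum_{n=1}^{N}(X_{in}-m_i)^2 \leftrightarrow \Sigma_i^2 & \text{if } i = j
\end{cases}
$$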
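
The definition of $C_{ij}$ can be sketched directly in code. This is an illustrative pure-Python version with the $\frac{1}{N-1}$ normalisation. The data and the function name `covariance_matrix` are assumptions, not part of the notes.

```python
# Hypothetical d x N data matrix: d = 3 features (rows), N = 4 examples (columns).
X = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [5.0, 3.0, 1.0, 3.0],
]

def covariance_matrix(X):
    """Return the d x d empirical covariance matrix of a d x N data matrix."""
    d, N = len(X), len(X[0])
    # Empirical mean of each feature (row).
    m = [sum(row) / N for row in X]
    # Subtract the mean of each feature from its row.
    centered = [[value - m[i] for value in X[i]] for i in range(d)]
    # C[i][j] = 1/(N-1) * sum_n (X_in - m_i)(X_jn - m_j)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (N - 1)
         for j in range(d)]
        for i in range(d)
    ]

S = covariance_matrix(X)
for row in S:
    print(row)
```

The diagonal entries `S[i][i]` are the variances of the individual features, and the matrix is symmetric because the product inside the sum does not depend on the order of `i` and `j`.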
- The covariance elements can be expressed as $C_{ik} = \rho_{ik}\Sigma_i\Sigma_k$, where $\Sigma_i$ is the standard deviation of $X_i$ and $\rho_{ik}$ is the correlation coefficient between $X_i$ and $X_k$.
- The covariance therefore has several important properties:
  - If $C_{ik} > 0$, $X_i$ and $X_k$ tend to increase together.
  - If $C_{ik} < 0$, $X_i$ tends to increase when $X_k$ tends to decrease.
  - If $C_{ik} = 0$, $X_i$ and $X_k$ are uncorrelated. This does not mean that they are independent: they have no linear relationship, but they can still depend on each other non-linearly.
  - $-1 \le \rho_{ik} \le 1$, hence $|C_{ik}| \le \Sigma_i\Sigma_k$.
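
The properties above can be checked on a toy example. The following is a minimal sketch with hypothetical samples and helper names (`cov`, `std`, `corr`). It computes the correlation coefficient as $C_{ik} / (\Sigma_i\Sigma_k)$ and includes a pair that is uncorrelated yet clearly dependent.

```python
import math

def cov(x, y):
    """Empirical covariance with the 1/(N-1) normalisation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def std(x):
    """Standard deviation: the square root of the variance C_ii."""
    return math.sqrt(cov(x, x))

def corr(x, y):
    """Correlation coefficient rho_ik = C_ik / (Sigma_i * Sigma_k)."""
    return cov(x, y) / (std(x) * std(y))

# Hypothetical samples, chosen only for illustration.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
up = [2.0 * v + 1.0 for v in x]   # increases together with x
down = [-3.0 * v for v in x]      # decreases as x increases
square = [v * v for v in x]       # fully determined by x, but not linearly

print(cov(x, up), corr(x, up))      # positive covariance
print(cov(x, down), corr(x, down))  # negative covariance
print(cov(x, square))               # zero covariance despite the dependence

# The bound |C_ik| <= Sigma_i * Sigma_k, with a small floating-point tolerance.
for other in (up, down, square):
    assert abs(cov(x, other)) <= std(x) * std(other) + 1e-12
```

Here `square` is a deterministic function of `x`, so the two are as dependent as they can be, yet their covariance vanishes because the relationship is symmetric around the mean of `x` and has no linear component.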
