# [BigData] Statistics

u_u · October 11, 2022

## BigData


### Information Theory

• Information Theory : Quantifying how much information is present in a signal

• Self-Information of x
- I(x) = -log P(x).
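A minimal sketch of self-information in Python (the function name is my own, using log base 2 so the result is in bits):

```python
import math

def self_information(p: float) -> float:
    """Self-information I(x) = -log P(x), in bits (log base 2)."""
    return -math.log2(p)

# A rare event carries more information than a common one.
print(self_information(0.5))    # fair coin flip -> 1.0 bit
print(self_information(0.125))  # 1-in-8 event   -> 3.0 bits
```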

### Information Theory : Entropy

Entropy is a measure of uncertainty.

• Entropy : Expectation of self-information
-> weight -log(p) by the probability that each outcome occurs, i.e. H(x) = E[-log P(x)]
• For Bernoulli variable,
- H(x) = -plogp - (1-p)log(1-p)
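The Bernoulli entropy formula above can be sketched directly (function name assumed for illustration):

```python
import math

def bernoulli_entropy(p: float) -> float:
    """H(x) = -p*log(p) - (1-p)*log(1-p), in bits; defined as 0 at p=0 or p=1."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Uncertainty is maximal for a fair coin (p = 0.5).
print(bernoulli_entropy(0.5))  # -> 1.0 bit
print(bernoulli_entropy(0.9))  # lower: the outcome is more predictable
```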

### KL Divergence

• Measure the difference of two distributions P(x) and Q(x)
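A sketch of the discrete KL divergence, D_KL(P||Q) = Σ P(x) log(P(x)/Q(x)) (distributions represented as plain probability lists for illustration):

```python
import math

def kl_divergence(p, q):
    """D_KL(P||Q) = sum_x P(x) * log(P(x) / Q(x)) for discrete distributions."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q))  # positive: the distributions differ
print(kl_divergence(p, p))  # -> 0.0: identical distributions
```

Note that KL divergence is not symmetric: D_KL(P||Q) generally differs from D_KL(Q||P), so it is not a true distance metric.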

### Cross-Entropy

• H(P, Q) = H(P) + D_KL(P||Q) = -E_{x~P}[log Q(x)]
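The identity above can be verified numerically with a small sketch (helper names are my own):

```python
import math

def entropy(p):
    """H(P) = -sum_x P(x) * log P(x)."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl(p, q):
    """D_KL(P||Q) = sum_x P(x) * log(P(x) / Q(x))."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def cross_entropy(p, q):
    """H(P, Q) = -E_{x~P}[log Q(x)]."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
# Identity: H(P, Q) = H(P) + D_KL(P||Q)
assert abs(cross_entropy(p, q) - (entropy(p) + kl(p, q))) < 1e-9
```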

### Pearson Correlation Coefficient

• A measure of the strength of the linear relation between two variables x and y

Close to 1 for a strong positive relation, close to -1 for a strong negative relation.
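A minimal from-scratch sketch of the Pearson coefficient, r = cov(x, y) / (σ_x · σ_y) (function name assumed for illustration):

```python
import math

def pearson_r(x, y):
    """Pearson r = cov(x, y) / (std(x) * std(y)); always in [-1, 1]."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # ~ 1.0: perfect positive linear relation
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # ~ -1.0: perfect negative linear relation
```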

## Hypothesis Testing

• The purpose of a hypothesis test is to decide between two competing explanations: the null hypothesis and the alternative hypothesis.

• All statistical tests have five elements:
assumptions, hypotheses, test statistic, p-value, and conclusion.

### Assumptions

• The statements that a statistical test relies on
• Every test has assumptions, and it cannot be used if its assumptions are violated
• There are also statistical tests for checking whether the assumptions hold
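As an example of a test that checks an assumption, the Shapiro-Wilk test checks the normality assumption that many tests (e.g. the t-test) rely on. A sketch assuming SciPy is available:

```python
import random
from scipy import stats

# Draw a sample that actually is normal, then check the normality assumption.
random.seed(0)
sample = [random.gauss(0, 1) for _ in range(100)]

stat, p_value = stats.shapiro(sample)
# A large p-value means we cannot reject normality, so a
# normality-based test's assumption is not contradicted.
print(f"W = {stat:.3f}, p = {p_value:.3f}")
```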

### Hypothesis

• Consider two hypotheses about the value of a population parameter.
• Two-sided vs. One-sided: the alternative hypothesis states "not equal to" vs. "less than (or greater than)" the hypothesized value.
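The two-sided vs. one-sided distinction can be sketched with a one-sample t-test, assuming SciPy is available (the sample data here is made up for illustration):

```python
from scipy import stats

sample = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.4, 5.1]
mu0 = 5.0  # hypothesized population mean

# Two-sided: H1 says the population mean is NOT EQUAL to mu0.
t_two, p_two = stats.ttest_1samp(sample, mu0)

# One-sided: H1 says the population mean is GREATER than mu0.
t_one, p_one = stats.ttest_1samp(sample, mu0, alternative='greater')

print(f"two-sided p = {p_two:.3f}, one-sided p = {p_one:.3f}")
```

For the same data and direction, the one-sided p-value is half the two-sided one, so a one-sided test rejects more easily in its chosen direction.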