[P&R] 02. Random Variable(2)

Bumjin Kim · October 5, 2023

■ Expectation

  • For a discrete RV, $E[X] = \sum_{x} x \times P_X(x)$
    This indicates the center of gravity of the PMF.

  • For a continuous RV, $E[X] = \int_{-\infty}^{\infty} x \times f_X(x)\,dx$
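
As a quick sanity check, here is a short numerical sketch of both formulas (my own example, not from the lecture; a fair die for the discrete case and an exponential density for the continuous one):

```python
import numpy as np

# Discrete: fair six-sided die, E[X] = sum_x x * P_X(x).
x = np.arange(1, 7)
pmf = np.full(6, 1 / 6)
print(x @ pmf)                # 3.5

# Continuous: f_X(x) = lam * exp(-lam * x) on x > 0, so E[X] = 1/lam.
lam = 2.0
xs = np.linspace(0, 20, 400_000)
dx = xs[1] - xs[0]
pdf = lam * np.exp(-lam * xs)
print(np.sum(xs * pdf) * dx)  # ~ 0.5, a Riemann-sum approximation
```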

■ Expectation of a function of a RV

Let $Y = g(X)$; then $Y$ is also a RV.
However, what is $E[Y]$?

  • For a discrete RV, $E[Y] = \sum_{y} y \times P_Y(y)$
    $\Leftrightarrow \sum_{x}g(x) \times P_X(x)$ → We can use this when we don't know $P_Y(y)$!

  • Similarly, for a continuous RV, $E[Y] = \int_{-\infty}^{\infty} g(x)\times f_X(x)\,dx$
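
A minimal sketch of this rule (my own example, with $g(x) = (x-3)^2$ on a fair die so that several $x$ map to the same $y$), computing $E[Y]$ both through $P_Y$ and directly through $P_X$:

```python
import numpy as np

x = np.arange(1, 7)
pmf = np.full(6, 1 / 6)
g = (x - 3) ** 2                 # g(x) = (x - 3)^2

# Route 1: derive P_Y first, then E[Y] = sum_y y * P_Y(y).
ys, inverse = np.unique(g, return_inverse=True)
pmf_y = np.zeros(len(ys))
np.add.at(pmf_y, inverse, pmf)   # P_Y(y) = sum of P_X(x) over {x : g(x) = y}
print(ys @ pmf_y)                # 3.1666...

# Route 2: sum g(x) * P_X(x) directly — no P_Y needed.
print(g @ pmf)                   # same value
```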


■ Properties

  • $E[\cdot]$ is a linear operator (it is computed as a summation or an integral, which are linear)
  • $E[aX] = a \times E[X]$
  • $E[aX + b] = a \times E[X] + b$
  • In general, $E[g(X)] \ne g(E[X])$
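
A short numerical illustration of these properties (my own numbers): linearity holds exactly, while $E[g(X)]$ and $g(E[X])$ differ for the nonlinear $g(x) = x^2$:

```python
import numpy as np

x = np.arange(1, 7)
pmf = np.full(6, 1 / 6)
EX = x @ pmf                          # E[X] = 3.5

a, b = 3.0, 2.0
print((a * x + b) @ pmf, a * EX + b)  # 12.5 12.5 — linearity holds
print((x ** 2) @ pmf, EX ** 2)        # 15.17 vs 12.25 — not equal
```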

■ Variance (Spread around the mean)

  • $Var[X] = E[(X-\mu_X)^2] = \int_{-\infty}^{\infty} (x-\mu_X)^2 \times f_X(x)\,dx$
  • $\sigma^2_X = E[X^2 -2X\mu + \mu^2] = E[X^2] - 2\mu E[X] + \mu^2$ $∴ E[X^2] - \mu^2$ $(\because E[X] = \mu)$

  • $\sigma_X = \sqrt{E[X^2]-\mu^2}$
  • Variance measures the deviation of $X$ from its mean
  • $Var[\cdot]$ is NOT a linear operator
  • $Var[aX + b] = Var[Y]$, where $Y = aX + b$
    $\Leftrightarrow E[(Y-\mu_Y)^2]$
    $\Leftrightarrow E[(aX + b - (a\mu_X + b))^2]$ $(\because \mu_Y = E[Y] = E[aX+b] = a\mu_X + b)$
    $\Leftrightarrow E[a^2(X-\mu_X)^2]$
    $\Leftrightarrow a^2\times E[(X-\mu_X)^2]$
    $∴ a^2\,Var[X]$

    Interestingly, the result shows that the variance is not changed by the bias $b$:
    if the RV shifts by $b$, the mean also shifts by $b$,
    so the deviation (distance from the mean) does not change.
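
A simulation sketch of $Var[aX+b] = a^2\,Var[X]$ (my own example, an exponential sample with $Var[X] = 4$):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=1_000_000)  # Var[X] = 4

a, b = 3.0, 10.0
print(np.var(a * X + b))   # ~ 36 = a^2 * Var[X]
print(a ** 2 * np.var(X))  # ~ 36 — the shift b has no effect
```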


■ Moments

  • The $n^{th}$ moment of a random variable $X$:

    $m_n = E[X^n]$
    e.g.) $m_1 = E[X] = \mu_X$ (the mean)

  • The $n^{th}$ central moment of a random variable $X$:

    $\mu_n = E[(X-\mu_X)^n]$
    e.g.) $\mu_1 = 0$, $\mu_2 = \sigma_X^2$
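
A quick numerical check of $\mu_1 = 0$ and $\mu_2 = \sigma_X^2$ (my own example, a normal sample with variance 4):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(loc=5.0, scale=2.0, size=1_000_000)

mu = X.mean()
print(np.mean((X - mu) ** 1))  # ~ 0   — first central moment
print(np.mean((X - mu) ** 2))  # ~ 4.0 — second central moment = variance
```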


■ Conditional PMF

  • $P_{X|A}(x|A) = P[X=x|A]$

  • Conditional expectation: $E[X|A] = \sum_{x} x\times P_{X|A}(x|A)$
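
A small sketch of both definitions (my own example: a fair die conditioned on $A = \{X$ is even$\}$):

```python
import numpy as np

x = np.arange(1, 7)
pmf = np.full(6, 1 / 6)

# P_{X|A}(x|A) = P_X(x) / P(A) for x in A, and 0 otherwise.
in_A = (x % 2 == 0)
P_A = pmf[in_A].sum()                    # 1/2
cond_pmf = np.where(in_A, pmf / P_A, 0)  # 1/3 on {2, 4, 6}
print(x @ cond_pmf)                      # E[X|A] = 4.0
```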

■ Conditional PDF & Conditional CDF

  • Conditional distribution: $F_{X|A}(x|A) = P[X \le x|A]$
  • Conditional density: $f_{X|A}(x|A) = \frac{d}{dx}F_{X|A}(x|A)$
  • Conditional expectation: $E[X|A] = \int_{-\infty}^{\infty} x\times f_{X|A}(x|A)\,dx$

■ Total Expectation Theorem

Recall that $P(B) = \sum_{i=1}^n P(B|A_i)\times P(A_i)$, where $[A_1, \cdots, A_n]$ is a partition.

  • For a discrete RV X

    $P_X(x) = P[X=x] = \sum_{i = 1}^n P_{X|A_i}(x|A_i)\times P(A_i)$
    $E[X] = \sum_{x} x\times P_X(x)$
    $\Leftrightarrow \sum_{i = 1}^n \sum_{x} x \times P_{X|A_i}(x|A_i)\times P(A_i)$
    $\Leftrightarrow \sum_{i = 1}^n E[X|A_i] \times P(A_i)$

  • Similarly, for a continuous RV X

    $f_X(x) = \sum_{i}f_{X|A_i}(x|A_i)\times P(A_i)$ (multiplying each side by a small $\delta$ turns each term into the probability $P[x < X \le x+\delta]$ of the corresponding event)
    $E[X] = \int_{-\infty}^{\infty} x \times f_X(x)\,dx$
    $\Leftrightarrow \sum_{i}\int_{-\infty}^{\infty} x \times f_{X|A_i}(x|A_i)\times P(A_i)\,dx$
    $\Leftrightarrow \sum_{i} E[X|A_i] \times P(A_i)$ ~ Total Expectation: $E[X]$ is the expected value of $E[X|A_i]$

    Notice that $E[X]$ is a single fixed number, while $E[X|A_i]$ varies: its value changes depending on which event $A_i$ we condition on.
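
A minimal numerical sketch of the theorem (my own example: a fair coin is chosen with probability 0.3, a coin with bias 0.9 otherwise):

```python
import numpy as np

# X = 1{head}; E[X] = sum_i E[X|A_i] * P(A_i).
P_A = np.array([0.3, 0.7])           # P(A_1), P(A_2)
E_given_A = np.array([0.5, 0.9])     # E[X|A_1], E[X|A_2]
print(E_given_A @ P_A)               # 0.78, by total expectation

# Simulation check.
rng = np.random.default_rng(2)
which = rng.random(1_000_000) < 0.7  # True -> the biased coin was chosen
p = np.where(which, 0.9, 0.5)
X = rng.random(1_000_000) < p
print(X.mean())                      # ~ 0.78
```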


■ Memorylessness of a Geometric Random Variable

  • $X$ ~ number of independent coin tosses until the first head
  • $P_X(x) = (1-p)^{x-1} \times p$ ~ PMF of a geometric random variable
  • $A$ (condition) $= \{X > 2\}$
  • $P_{X|A}(x|A)$:
    • $P(A) = \sum_{x=3}^\infty (1-p)^{x-1}\times p$
      $\Leftrightarrow 1 - \sum_{x=1}^2 (1-p)^{x-1}\times p$
      $\Leftrightarrow 1-p-(1-p)\times p$
      $∴ (1-p)^2$
    • For $P_{X|A}(x|A)$, we have to normalize so that the total probability is 1.
      $P_{X|A}(x|A) = \frac{(1-p)^{x-1} \times p}{(1-p)^{2}}$
      $∴ (1-p)^{x-3}\times p$, for $x \ge 3$

✏️ Example

Using the above condition, suppose the first head appears on the fifth toss $\Rightarrow \{T,T,T,T,H\}$
At this time, the RV $X$ is 5.
We condition on $X > 2$ and let $Y = X - 2$ $(Y > 0)$; then $P_Y(y) = P_X(y)$.
That is, the cases $X = 3, 4, 5$ (given $X > 2$) map to $Y = 1, 2, 3$.

  • Given that $X > 2$, the random variable $Y = X-2$ has the same geometric PMF as $X$
    In the PMF of $X$, the probabilities match those of $Y$.
    (e.g.) for $X = Y = 1$, the probability is $p$ in both cases
  • Hence, the geometric random variable is said to be memoryless, because the past has no bearing on its future behavior.
    It means that even if we keep observing tails, the probability of the next head does not increase.
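
A simulation sketch of this memorylessness (my own example, assuming $p = 0.3$): the PMF of $Y = X - 2$ given $X > 2$ matches the PMF of $X$:

```python
import numpy as np

rng = np.random.default_rng(3)
p = 0.3
X = rng.geometric(p, size=2_000_000)  # tosses until first head

Y = X[X > 2] - 2                      # condition on A = {X > 2}, then shift
for k in (1, 2, 3):
    print(k, (X == k).mean(), (Y == k).mean())  # the two columns agree
```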

■ Memorylessness of an Exponential Random Variable

  • $X$ ~ exponential random variable
  • $f_X(x) = \lambda \times e^{-\lambda x}$, $(x>0)$
  • $A = \{X > 2\}$
  • $f_{X|A}(x|A) = \frac{\lambda \times e^{-\lambda x}}{e^{-2 \lambda}} = \lambda \times e^{-\lambda(x-2)}$, $(x>2)$
  • Let $Y = X-2$ $(Y>0)$; then $f_Y(y) = f_X(y)$ $(y>0)$
  • Given that $X > 2$, the random variable $Y = X - 2$ has the same exponential PDF as $X$
  • Hence, the exponential random variable is also memoryless

✏️ Example
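
A simulation sketch of the exponential case (my own example, assuming $\lambda = 0.5$): the survival function of $Y = X - 2$ given $X > 2$ matches that of $X$:

```python
import numpy as np

rng = np.random.default_rng(4)
lam = 0.5
X = rng.exponential(scale=1 / lam, size=2_000_000)

Y = X[X > 2] - 2                 # condition on A = {X > 2}, then shift
for t in (1.0, 2.0, 3.0):
    print(t, (X > t).mean(), (Y > t).mean())  # survival probs agree
```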


■ Total Probability

  • Consider an event $A$ and $\{X = x\}$ (if $X$ is continuous, this probability is 0)
  • Let's consider this: $P\{A, X=x\} = P(A|X=x) \times P(X=x) = P(X=x|A)\times P(A)$
  • $P(A|X=x) \times f_X(x) \times \delta = f_{X|A}(x|A) \times \delta \times P(A)$ $(\because$ for a continuous RV, $P(x < X \le x+\delta) \approx f_X(x) \times \delta)$
    $∴ P(A|X=x) \times f_X(x) = f_{X|A}(x|A) \times P(A)$
  • Now, we can apply it like this

    $\int_{-\infty}^{\infty} P(A|X=x) \times f_X(x)\,dx \cdots (1)$
    $\Leftrightarrow \int_{-\infty}^{\infty} f_{X|A}(x|A) \times P(A)\,dx$
    $\Leftrightarrow P(A) \times\int_{-\infty}^{\infty} f_{X|A}(x|A)\,dx$ $(\because P(A)$ is a constant$)$
    $∴ P(A)$
    This means that equation $(1)$ is the continuous version of the Total Probability Theorem for $P(A)$.

✏️ Example

In coin tossing, the probability that the coin shows heads is $P$, with density $f_P(p)$, $p \in [0,1]$. Find $P(head)$. (Note that the bias $P$ is itself a random variable.)

  • We know that $P(head|P=x)=x$
    $(\because$ this conditional probability is the chance of heads once the bias $P$ is known to be $x)$
  • Now we use the total probability theorem to find $P(head)$:
    $P(head) = \int_0^1 P(head|P=x) \times f_P(x)\,dx$
  • If $P$ is uniform on $[0, 1]$:
    $\int_0^1 x \times 1\,dx = [\frac{1}{2} x^2]_0^1 = 1/2$
    $∴ P(head) = 1/2$
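
A simulation check of this result (my own sketch): draw a uniform random bias, toss once, and average:

```python
import numpy as np

rng = np.random.default_rng(5)
P = rng.random(1_000_000)          # uniform random bias for each coin
head = rng.random(1_000_000) < P   # one toss per coin
print(head.mean())                 # ~ 0.5
```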

■ Bayes' theorem (continuous version)

From equation $(1)$,
$f_{X|A}(x|A) = \frac{P(A|X=x) \times f_X(x)}{P(A)} = \frac{P(A|X=x) \times f_X(x)}{\int_{-\infty}^\infty P(A|X=x) \times f_X(x)\,dx}$
where the denominator $P(A)$ is expanded using the Total Probability Theorem.
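
For instance (my own continuation of the coin example above, not from the notes), given that a single toss came up heads under the uniform prior, the posterior density of the bias is
$f_{P|head}(x|head) = \frac{P(head|P=x) \times f_P(x)}{P(head)} = \frac{x \times 1}{1/2} = 2x$, $x \in [0,1]$
so observing a head shifts the density toward larger biases.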


This post summarizes lecture notes from Prof. 이준용's Probability and Random Variables (확률변수론) course at HGU, Fall 2023.
