[DetnEst] 3. Cramer-Rao Lower Bound (CRLB)


Review - Mean Squared Error (MSE)

  • Mean Squared Error (MSE) criterion

    $$\mathrm{mse}(\hat\theta) = E[(\hat\theta - \theta)^2] = \mathrm{var}(\hat\theta) + b^2(\theta)$$

  • Note that, in many cases, the minimum MSE criterion leads to an unrealizable estimator, one that cannot be written solely as a function of the data, e.g.,

    $$\check{A} = a\,\frac{1}{N}\sum_{n=0}^{N-1} x[n], \quad A:\text{unknown}, \quad \text{objective: estimate } A \text{ from } x[n]$$

    • where $a$ is chosen to minimize the MSE:

      $$E(\check{A}) = aA, \quad \mathrm{var}(\check{A}) = \frac{a^2\sigma^2}{N} \;\rightarrow\; \mathrm{mse}(\check{A}) = \frac{a^2\sigma^2}{N} + (a-1)^2 A^2 \quad\therefore\; a_{opt} = \frac{A^2}{A^2 + \sigma^2/N}$$

    • Since $a_{opt}$ depends on the unknown $A$, the estimator cannot be realized from the data alone.
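As a quick numerical sanity check (my own sketch, not from the lecture; the values of $A$, $\sigma^2$, $N$ are made up), the Python snippet below estimates $\mathrm{mse}(a\bar{x})$ over a grid of $a$ from Monte Carlo moments and compares the empirical minimizer with $a_{opt}$:

```python
import numpy as np

rng = np.random.default_rng(0)
A, sigma2, N, trials = 1.0, 2.0, 10, 200_000   # made-up example values

x = A + rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))
xbar = x.mean(axis=1)

# mse(a) = E[(a*xbar - A)^2] = a^2 E[xbar^2] - 2aA E[xbar] + A^2
m1, m2 = xbar.mean(), (xbar**2).mean()
a_grid = np.linspace(0.0, 1.5, 301)
mse = a_grid**2 * m2 - 2 * a_grid * A * m1 + A**2

a_opt = A**2 / (A**2 + sigma2 / N)             # analytic minimizer from above
print("empirical argmin over a:", a_grid[np.argmin(mse)])
print("analytic a_opt         :", a_opt)       # ~0.833 for these values
```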

Review - MVUE

  • Minimum variance unbiased estimator (MVUE)
  • Alternatively, constrain the bias to be zero
  • Then find the estimator which minimizes the variance $\cdots(*)$
    • Minimizing the MSE as well in the unbiased case:

      $$\mathrm{mse}(\hat\theta) = \mathrm{var}(\hat\theta) + b^2(\theta) = \mathrm{var}(\hat\theta), \quad b(\theta) = 0 \;\cdots(*)$$

  • This is the minimum variance unbiased estimator, the so-called MVUE
  • Then... how can we know whether the MVUE exists?

Outline

  • Cramer-Rao Lower Bound (CRLB)
  • Estimator Accuracy Considerations
  • CRLB for a Scalar Parameter
  • CRLB Proof
  • General CRLB for Signals in WGN
  • Transformation of Parameters
  • Vector Form of the CRLB
  • General Gaussian Case and Fisher Information

Cramer-Rao Lower Bound (CRLB)

  • Cramer-Rao Lower Bound (CRLB) and Cramer-Rao Bound (CRB) are the same thing
  • The CRLB gives a lower bound on the variance of any unbiased estimator

    It does not guarantee that the bound can be attained

  • If one finds an unbiased estimator whose variance equals the CRLB, then it is the MVUE (very, very ideal cases...)
  • Otherwise, one can use the Ch. 5 tools (Rao-Blackwell-Lehmann-Scheffe theorem and Neyman-Fisher factorization theorem) to construct a better estimator from any unbiased one, possibly the MVUE if conditions are met

Estimator Accuracy Considerations

  • All the information is in the observed data and the underlying PDF
    • Estimation accuracy depends directly on the PDF
  • Ex) PDF dependence on the unknown parameter

    Signal sample observation:

    $$x[0] = A + w[0], \quad w[0] \sim \mathcal{N}(0, \sigma^2), \quad \mathcal{N}(0, \sigma^2): \text{additive white Gaussian noise}$$

    • Good unbiased estimator: $\hat{A} = x[0]$
    • The smaller $\sigma^2$ is, the better the estimator accuracy is
    • Alternatively, viewed as a function of $A$, this is the likelihood function ($\neq$ PDF):

      $$p_i(x[0]; A) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[-\frac{1}{2\sigma_i^2}(x[0]-A)^2\right]$$

  • Likelihood: how plausible the observed data is for each candidate value $\theta = A$; its spread indicates how accurate the estimator can be
  • The sharpness of the likelihood function determines the accuracy of the estimate
  • It is measured by the curvature of the log-likelihood function
    • That is, the negative of the second derivative of the logarithm of the likelihood function at its peak, i.e.,

      $$\ln p(x[0]; A) = -\ln\sqrt{2\pi\sigma^2} - \frac{1}{2\sigma^2}(x[0]-A)^2$$

      The first term has no $A$ dependency, so

      $$-\frac{\partial^2 \ln p(x[0]; A)}{\partial A^2} = \frac{1}{\sigma^2}, \quad \mathrm{var}(\hat{A}) = \sigma^2 = \frac{1}{-\dfrac{\partial^2 \ln p(x[0]; A)}{\partial A^2}}$$

    • Average curvature: $-E\left[\dfrac{\partial^2 \ln p(x[0]; A)}{\partial A^2}\right]$
      • In general, the second derivative will depend on $x[0]$, so the curvature of the log-likelihood is a random variable; hence the expectation
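A small illustration of this point (my own sketch, with made-up numbers): for the single-sample model the curvature of the log-likelihood at its peak is exactly $1/\sigma^2$, independent of the observed $x[0]$, so a smaller $\sigma^2$ means a sharper likelihood:

```python
import numpy as np

x0 = 3.0                                  # one observed sample (arbitrary)
A = np.linspace(-2.0, 8.0, 1001)          # candidate parameter values
dA = A[1] - A[0]

def log_likelihood(A, x0, sigma2):
    """ln p(x[0]; A) for x[0] = A + w[0], w[0] ~ N(0, sigma2)."""
    return -0.5 * np.log(2 * np.pi * sigma2) - (x0 - A) ** 2 / (2 * sigma2)

for sigma2 in (1.0, 1.0 / 3.0):
    ll = log_likelihood(A, x0, sigma2)
    i = np.argmax(ll)                     # peak is at A = x0
    curv = -(ll[i + 1] - 2 * ll[i] + ll[i - 1]) / dA**2   # -d2/dA2 at the peak
    print(f"sigma2={sigma2:.3f}: peak A={A[i]:.2f}, "
          f"curvature={curv:.3f}, 1/sigma2={1/sigma2:.3f}")
```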

Theorem : Cramer-Rao Lower Bound (CRLB) - Scalar Parameter

  • Let $p(x;\theta)$ satisfy the regularity condition:

    $$E\left[\frac{\partial \ln p(x;\theta)}{\partial\theta}\right] = 0 \quad \text{for all } \theta$$

  • Then, the variance of any unbiased estimator $\hat\theta$ must satisfy

    $$\mathrm{var}(\hat\theta) \geq \frac{1}{-E\left[\dfrac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right]} = \frac{1}{E\left[\left(\dfrac{\partial \ln p(x;\theta)}{\partial\theta}\right)^2\right]}$$

  • where the derivative is evaluated at the true value of $\theta$ and the expectation is taken w.r.t. $p(x;\theta)$
  • Furthermore, an unbiased estimator may be found that attains the bound for all $\theta$ if and only if

    $$\frac{\partial \ln p(x;\theta)}{\partial\theta} = I(\theta)(g(x) - \theta)$$

  • for some functions $g(x)$ and $I(\theta)$. That estimator, which is the MVUE, is $\hat\theta = g(x)$, and the minimum variance is $1/I(\theta)$

    To sum up: if the derivative of the log-likelihood can be factored as $I(\theta)(g(x)-\theta)$,
    then $g(x)$ is the MVUE
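As a sanity check of the factorization condition (a sketch of mine using sympy, applied to the single-sample model from the accuracy example above), the score factors exactly as $I(A)(g(x)-A)$:

```python
import sympy as sp

x, A = sp.symbols('x A', real=True)
sigma = sp.symbols('sigma', positive=True)

# single-sample model from earlier: x[0] = A + w[0], w[0] ~ N(0, sigma^2)
p = sp.exp(-(x - A)**2 / (2 * sigma**2)) / sp.sqrt(2 * sp.pi * sigma**2)
score = sp.simplify(sp.diff(sp.log(p), A))
print(score)          # (x - A)/sigma**2, i.e. I(A) * (g(x) - A)
# so I(A) = 1/sigma**2, g(x) = x is the MVUE, and the min variance is sigma**2
```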

CRLB Proof (Appendix 3A)

  • CRLB for a scalar parameter $\alpha = g(\theta)$, where the PDF is parameterized by $\theta$.

  • Consider an unbiased estimator $\hat\alpha$, i.e.,

    $$E(\hat\alpha) = \int \hat\alpha\, p(x;\theta)\,dx = \alpha = g(\theta) \;\cdots(*)$$

  • Regularity condition: holds if the order of differentiation and integration may be interchanged

    $$E\left[\frac{\partial \ln p(x;\theta)}{\partial\theta}\right] = \int \frac{\partial \ln p(x;\theta)}{\partial\theta}\, p(x;\theta)\,dx = 0$$

  • Differentiating both sides of $(*)$,

    $$\int \hat\alpha\, \frac{\partial p(x;\theta)}{\partial\theta}\,dx = \frac{\partial g(\theta)}{\partial\theta} \;\rightarrow\; \int \hat\alpha\, \frac{\partial \ln p(x;\theta)}{\partial\theta}\, p(x;\theta)\,dx = \frac{\partial g(\theta)}{\partial\theta} \;\cdots(**)$$

    where $(**)$ uses $\dfrac{\partial}{\partial\theta}\ln C(\theta) = \dfrac{1}{C(\theta)}\dfrac{\partial C(\theta)}{\partial\theta}$ with $C(\theta) = p(x;\theta)$

  • By using the regularity condition (subtracting $\alpha \cdot E\left[\partial \ln p/\partial\theta\right] = 0$ from the left-hand side),

    $$\int (\hat\alpha - \alpha)\, \frac{\partial \ln p(x;\theta)}{\partial\theta}\, p(x;\theta)\,dx = \frac{\partial g(\theta)}{\partial\theta}$$

  • By using the Cauchy-Schwarz inequality,

    $$\left(\frac{\partial g(\theta)}{\partial\theta}\right)^2 \leq \int (\hat\alpha - \alpha)^2 p(x;\theta)\,dx \int \left(\frac{\partial \ln p(x;\theta)}{\partial\theta}\right)^2 p(x;\theta)\,dx$$

    $$\int (\hat\alpha - \alpha)^2 p(x;\theta)\,dx = E[(\hat\alpha - E[\hat\alpha])^2] = E[(\hat\alpha - \alpha)^2] = \mathrm{var}(\hat\alpha)$$

    $$\therefore\; \mathrm{var}(\hat\alpha) \geq \frac{\left(\dfrac{\partial g(\theta)}{\partial\theta}\right)^2}{E\left[\left(\dfrac{\partial \ln p(x;\theta)}{\partial\theta}\right)^2\right]}$$

    Cauchy-Schwarz Inequality

    $$\left[\int w(x) g(x) h(x)\,dx\right]^2 \leq \int w(x) g^2(x)\,dx \int w(x) h^2(x)\,dx$$

    • for arbitrary functions $g(x)$ and $h(x)$, with $w(x) \geq 0$ for all $x$; here $w = p(x;\theta)$, $g = \hat\alpha - \alpha$, $h = \partial \ln p(x;\theta)/\partial\theta$.
    • Equality holds if and only if
      $$g(x) = c\, h(x)$$
      for some constant $c$ not depending on $x$.

  • By differentiating the regularity condition,

    $$\frac{\partial}{\partial\theta}\int \frac{\partial \ln p(x;\theta)}{\partial\theta}\, p(x;\theta)\,dx = 0$$

    $$\int \left[\frac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\, p(x;\theta) + \frac{\partial \ln p(x;\theta)}{\partial\theta}\frac{\partial p(x;\theta)}{\partial\theta}\right] dx = 0$$

    $$-E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right] = \int \frac{\partial \ln p(x;\theta)}{\partial\theta}\frac{\partial \ln p(x;\theta)}{\partial\theta}\, p(x;\theta)\,dx = E\left[\left(\frac{\partial \ln p(x;\theta)}{\partial\theta}\right)^2\right]$$

    $$\rightarrow\; \mathrm{var}(\hat\alpha) \geq \frac{\left(\dfrac{\partial g(\theta)}{\partial\theta}\right)^2}{E\left[\left(\dfrac{\partial \ln p(x;\theta)}{\partial\theta}\right)^2\right]} = \frac{\left(\dfrac{\partial g(\theta)}{\partial\theta}\right)^2}{-E\left[\dfrac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right]}$$

  • If $\alpha = g(\theta) = \theta$,

    $$\mathrm{var}(\hat\theta) \geq \frac{1}{-E\left[\dfrac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right]} = \frac{1}{E\left[\left(\dfrac{\partial \ln p(x;\theta)}{\partial\theta}\right)^2\right]}$$

    Condition for equality (the Cauchy-Schwarz equality case):

    $$\frac{\partial \ln p(x;\theta)}{\partial\theta} = \frac{1}{c(\theta)}(\hat\alpha - \alpha), \quad \hat\alpha = \text{MVUE} \;\rightarrow\; \frac{\partial \ln p(x;\theta)}{\partial\theta} = \frac{1}{c(\theta)}(\hat\theta - \theta)$$

    To determine $c(\theta)$,

    $$\frac{\partial^2 \ln p(x;\theta)}{\partial\theta^2} = -\frac{1}{c(\theta)} + \frac{\partial}{\partial\theta}\left(\frac{1}{c(\theta)}\right)(\hat\theta - \theta) \;\rightarrow\; -E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right] = \frac{1}{c(\theta)} = I(\theta)$$

CRLB Example - DC Level in WGN

$$x[n] = A + w[n], \quad n = 0, 1, \dots, N-1$$

$$p(x;A) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n]-A)^2\right]$$

$$\frac{\partial \ln p(x;A)}{\partial A} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}(x[n]-A) = \frac{N}{\sigma^2}(\bar{x} - A) \;\rightarrow\; \frac{\partial^2 \ln p(x;A)}{\partial A^2} = -\frac{N}{\sigma^2}$$

$$\mathrm{var}(\hat{A}) \geq \frac{\sigma^2}{N}$$

The score factors as $\frac{N}{\sigma^2}(\bar{x} - A)$, so let $\hat{A} = \bar{x}$; it attains the bound and is the MVUE.
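A minimal Monte Carlo check of this result (my own sketch; $A$, $\sigma^2$, $N$ are arbitrary): the sample mean is unbiased and its variance hits the $\sigma^2/N$ bound:

```python
import numpy as np

rng = np.random.default_rng(2)
A, sigma2, N, trials = 1.0, 2.0, 20, 200_000    # made-up example values

x = A + rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))
A_hat = x.mean(axis=1)                          # the sample-mean estimator

print("bias          :", A_hat.mean() - A)      # ~ 0 (unbiased)
print("var(A_hat)    :", A_hat.var())           # ~ sigma2/N
print("CRLB sigma2/N :", sigma2 / N)
```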

CRLB Example - Phase Estimation

$$x[n] = A\cos(2\pi f_0 n + \phi) + w[n], \quad n = 0, 1, \dots, N-1, \quad A, f_0: \text{known}, \; w[n]: \text{WGN}$$

$$p(x;\phi) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}\left[x[n] - A\cos(2\pi f_0 n + \phi)\right]^2\right]$$

$$\frac{\partial \ln p(x;\phi)}{\partial\phi} = -\frac{A}{\sigma^2}\sum_{n=0}^{N-1}\left[x[n]\sin(2\pi f_0 n + \phi) - \frac{A}{2}\sin(4\pi f_0 n + 2\phi)\right]$$

$$\frac{\partial^2 \ln p(x;\phi)}{\partial\phi^2} = -\frac{A}{\sigma^2}\sum_{n=0}^{N-1}\left[x[n]\cos(2\pi f_0 n + \phi) - A\cos(4\pi f_0 n + 2\phi)\right]$$

$$-E\left[\frac{\partial^2 \ln p(x;\phi)}{\partial\phi^2}\right] = \frac{A}{\sigma^2}\sum_{n=0}^{N-1}\left[A\cos^2(2\pi f_0 n + \phi) - A\cos(4\pi f_0 n + 2\phi)\right]$$

$$= \frac{A^2}{\sigma^2}\sum_{n=0}^{N-1}\left[\frac{1}{2} + \frac{1}{2}\cos(4\pi f_0 n + 2\phi) - \cos(4\pi f_0 n + 2\phi)\right] = \frac{A^2}{\sigma^2}\left[\frac{N}{2} - \frac{1}{2}\sum_{n=0}^{N-1}\cos(4\pi f_0 n + 2\phi)\right] \approx \frac{NA^2}{2\sigma^2}$$

for $f_0$ not near $0$ or $\frac{1}{2}$ (the cosine sum is then negligible compared to $N/2$)

$$\rightarrow \mathrm{var}(\hat\phi) \geq \frac{2\sigma^2}{NA^2}$$

but the condition for the bound to hold with equality is not satisfied. An MVUE may still exist and indeed can be found with sufficient statistics (Chap. 5). See the numerical check below.
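The sketch below (made-up parameter values, my own check) compares the exact Fisher information from the sum above with the $NA^2/(2\sigma^2)$ approximation:

```python
import numpy as np

A, sigma2, N, f0, phi = 1.0, 0.5, 50, 0.1, 0.3   # made-up example values
n = np.arange(N)

# exact Fisher information from the sum above
I_exact = (A**2 / sigma2) * np.sum(0.5 - 0.5 * np.cos(4 * np.pi * f0 * n + 2 * phi))
I_approx = N * A**2 / (2 * sigma2)               # valid for f0 not near 0 or 1/2

print("exact  CRLB:", 1 / I_exact)
print("approx CRLB:", 1 / I_approx)              # = 2*sigma2/(N*A**2)
```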

Efficient Estimator and Fisher Information

  • Efficient estimator
    • An unbiased estimator that attains the CRLB
    • The MVUE may or may not be efficient

      Fisher Information

      $$I(\theta) = -E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right] = E\left[\left(\frac{\partial \ln p(x;\theta)}{\partial\theta}\right)^2\right]$$

    • Why "Information"?
      • The more information, the lower the bound
      • Non-negative
      • Additive for independent observations (a small check is sketched below)
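A small Monte Carlo sketch of the last two properties (my own, with made-up values) for the DC-level model: the information is non-negative and grows as $N/\sigma^2$, i.e., additively in the number of independent samples:

```python
import numpy as np

rng = np.random.default_rng(1)
A, sigma2, trials = 2.0, 1.5, 500_000      # made-up example values

def info_mc(N):
    """Monte Carlo estimate of E[(d ln p / dA)^2] for N iid samples."""
    x = A + rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))
    score = (x - A).sum(axis=1) / sigma2   # score of the whole data record
    return (score**2).mean()

for N in (1, 5, 10):
    print(f"N={N:2d}  I_mc={info_mc(N):.3f}  N/sigma2={N/sigma2:.3f}")
```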

    CRLB True or False

  • The CRLB always exists regardless of $p(x;\theta)$ : False, the regularity condition must hold
  • The CRLB applies to unbiased estimators only : True
  • Determining the CRLB requires the statistics of all possible estimators $\hat\theta$ : False, only the likelihood $p(x;\theta)$ is needed
  • The CRLB depends on the observations $x$ : False, the expectation removes the data dependency
  • The CRLB depends on the parameter to be estimated $\theta$ : True
  • The CRLB tells you whether or not an MVUE exists : False, it does not tell you anything about the MVUE by itself

General CRLB for Signals in WGN

$$x[n] = s[n;\theta] + w[n], \quad n = 0, 1, \dots, N-1$$

$$p(x;\theta) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n]-s[n;\theta])^2\right]$$

$$\frac{\partial \ln p(x;\theta)}{\partial\theta} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}(x[n]-s[n;\theta])\frac{\partial s[n;\theta]}{\partial\theta}$$

$$\frac{\partial^2 \ln p(x;\theta)}{\partial\theta^2} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}\left[(x[n]-s[n;\theta])\frac{\partial^2 s[n;\theta]}{\partial\theta^2} - \left(\frac{\partial s[n;\theta]}{\partial\theta}\right)^2\right]$$

$$E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right] = -\frac{1}{\sigma^2}\sum_{n=0}^{N-1}\left(\frac{\partial s[n;\theta]}{\partial\theta}\right)^2 \quad (\text{since } E[x[n]-s[n;\theta]] = 0)$$

$$\mathrm{var}(\hat\theta) \geq \frac{\sigma^2}{\sum_{n=0}^{N-1}\left(\dfrac{\partial s[n;\theta]}{\partial\theta}\right)^2}$$
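This result is easy to turn into a generic numerical tool. The sketch below is my own (the helper name crlb_wgn and the central-difference derivative are my choices, not from the lecture); it evaluates the bound for any scalar-parameter signal model:

```python
import numpy as np

def crlb_wgn(s, theta, sigma2, eps=1e-6):
    """CRLB for scalar theta in x[n] = s[n; theta] + w[n], w[n] ~ WGN(sigma2).
    s is a callable returning the length-N signal for a given theta;
    ds/dtheta is approximated by a central difference."""
    ds = (s(theta + eps) - s(theta - eps)) / (2 * eps)
    return sigma2 / np.sum(ds**2)

# example 1: DC level s[n; A] = A, recovering sigma2/N
print(crlb_wgn(lambda A: np.full(20, A), theta=1.0, sigma2=2.0))   # -> 0.1

# example 2: phase of a sinusoid, close to 2*sigma2/(N*A**2) for this f0
Amp, f0, n = 1.0, 0.1, np.arange(50)
print(crlb_wgn(lambda ph: Amp * np.cos(2 * np.pi * f0 * n + ph), 0.3, 0.5))
```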

Transformation of Parameters

If we wish to estimate $\alpha = g(\theta)$ instead of $\theta$ itself,

$$\mathrm{var}(\hat\alpha) \geq \frac{\left(\dfrac{\partial g(\theta)}{\partial\theta}\right)^2}{-E\left[\dfrac{\partial^2 \ln p(x;\theta)}{\partial\theta^2}\right]}$$

Ex) DC level in WGN, $\alpha = g(A) = A^2$ (nonlinear transformation)

$$\mathrm{var}(\hat{A^2}) \geq \frac{(2A)^2}{N/\sigma^2} = \frac{4A^2\sigma^2}{N}$$

  • Is $\bar{x}^2$ efficient for $A^2$?
    • If the variance of $\bar{x}^2$ hits the bound, it is efficient
    • The bound is now a function of $A$
      • We know the bound itself
      • But we don't yet know whether any estimator attains it (answered below)

Linear transformation: $g(\theta) = a\theta + b$

$$\hat{g(\theta)} = a\hat\theta + b, \quad \hat\theta: \text{efficient, i.e., it hits the CRLB}$$

$$E(\hat{g(\theta)}) = aE(\hat\theta) + b = a\theta + b = g(\theta): \text{unbiased}$$

$$\mathrm{var}(\hat{g(\theta)}) \geq \frac{\left(\dfrac{\partial g(\theta)}{\partial\theta}\right)^2}{I(\theta)} = \left(\frac{\partial g(\theta)}{\partial\theta}\right)^2 \mathrm{var}(\hat\theta) = a^2\,\mathrm{var}(\hat\theta), \quad \text{since } \hat\theta \text{ is efficient: } \mathrm{var}(\hat\theta) = \frac{1}{I(\theta)}$$

Since $\mathrm{var}(a\hat\theta + b) = a^2\,\mathrm{var}(\hat\theta)$, the CRLB is achieved: efficiency is preserved under linear transformations.

  • In contrast, efficiency is only approximately maintained over nonlinear transformations, and only if the data record is large enough.
    Ex) DC level in WGN, estimate $A^2$ with $\hat{A^2} = \bar{x}^2$, where $\mathrm{var}(\hat{A^2}) \geq \frac{(2A)^2}{N/\sigma^2} = \frac{4A^2\sigma^2}{N}$

$$E(\bar{x}^2) = E^2(\bar{x}) + \mathrm{var}(\bar{x}) = A^2 + \frac{\sigma^2}{N} \;\rightarrow\; A^2 \text{ as } N \to \infty: \text{asymptotically unbiased}$$

$$\mathrm{var}(\bar{x}^2) = E(\bar{x}^4) - E^2(\bar{x}^2) = \frac{4A^2\sigma^2}{N} + \frac{2\sigma^4}{N^2} \;\rightarrow\; \frac{4A^2\sigma^2}{N} \text{ as } N \to \infty: \text{asymptotically efficient}$$

using: if $\xi \sim \mathcal{N}(\mu, \sigma^2)$, then $E(\xi^2) = \mu^2 + \sigma^2$, $E(\xi^4) = \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4$, and here $\bar{x} \sim \mathcal{N}(A, \frac{\sigma^2}{N})$

Statistical linearity of the transformation: a linear approximation of $g$ is accurate when the PDF of $\bar{x}$ is concentrated about the mean $A$, which happens as $N$ grows.
So even for a nonlinear transformation, the CRLB can still be attained asymptotically. A Monte Carlo sketch of this is below.
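This sketch is my own (made-up values); it draws $\bar{x}$ directly from its $\mathcal{N}(A, \sigma^2/N)$ distribution and shows the bias shrinking and the variance approaching the bound as $N$ grows:

```python
import numpy as np

rng = np.random.default_rng(3)
A, sigma2, trials = 1.0, 2.0, 500_000              # made-up example values

for N in (10, 100, 1000):
    # xbar ~ N(A, sigma2/N), so draw it directly instead of averaging samples
    xbar = A + rng.normal(0.0, np.sqrt(sigma2 / N), size=trials)
    est = xbar**2                                  # estimator of A^2
    crlb = 4 * A**2 * sigma2 / N
    print(f"N={N:5d}  bias={est.mean() - A**2:+.4f}  "
          f"var={est.var():.5f}  CRLB={crlb:.5f}")
```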

Vector Form of the CRLB

When the parameter is a vector $\theta = [\theta_1\;\theta_2\;\cdots\;\theta_p]^T$, we now have a bound on the entire covariance matrix $C_{\hat\theta}$ of any estimator $\hat\theta$ of $\theta$.
The Fisher information matrix is the quantity of importance here; it is the generalization of $I(\theta)$ to the vector case, a $p \times p$ matrix of second partials of the log-likelihood function. Its $(i,j)$ entry is

$$[I(\theta)]_{ij} = -E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial\theta_i \partial\theta_j}\right]$$

where expectations are again taken over $p(x;\theta)$.

  • Assuming $p(x;\theta)$ satisfies the regularity condition
    $$E\left[\frac{\partial \ln p(x;\theta)}{\partial\theta}\right] = 0, \quad \text{for all } \theta$$
  • the covariance matrix of any unbiased estimator $\hat\theta$ satisfies
    $$C_{\hat\theta} - I^{-1}(\theta) \geq 0$$
    where $\geq 0$ means the matrix is positive semidefinite.
  • Furthermore, an unbiased estimator may be found that attains the bound $C_{\hat\theta} = I^{-1}(\theta)$ if and only if
    $$\frac{\partial \ln p(x;\theta)}{\partial\theta} = I(\theta)(g(x) - \theta)$$

    In that case, $\hat\theta = g(x)$ is the MVU estimator with covariance matrix $I^{-1}(\theta)$

Vector Form of the CRLB - Example

$$x[n] = A + w[n], \quad \text{unknown } A, \;\sigma^2$$

Ex) DC level in AWGN with unknown noise variance: $\theta = [A\;\sigma^2]^T$ (estimating the vector of parameters)

  • The Fisher information matrix is

    $$I(\theta) = \begin{bmatrix} -E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial A^2}\right] & -E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial A \partial\sigma^2}\right] \\ -E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial A \partial\sigma^2}\right] & -E\left[\frac{\partial^2 \ln p(x;\theta)}{\partial(\sigma^2)^2}\right] \end{bmatrix}$$

    $$\ln p(x;\theta) = -\frac{N}{2}\ln 2\pi - \frac{N}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n]-A)^2$$

    $$\frac{\partial \ln p(x;\theta)}{\partial A} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}(x[n]-A), \quad \frac{\partial \ln p(x;\theta)}{\partial\sigma^2} = -\frac{N}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{n=0}^{N-1}(x[n]-A)^2$$

    $$\frac{\partial^2 \ln p(x;\theta)}{\partial A^2} = -\frac{N}{\sigma^2}, \quad \frac{\partial^2 \ln p(x;\theta)}{\partial A \partial\sigma^2} = -\frac{1}{\sigma^4}\sum_{n=0}^{N-1}(x[n]-A), \quad \frac{\partial^2 \ln p(x;\theta)}{\partial(\sigma^2)^2} = \frac{N}{2\sigma^4} - \frac{1}{\sigma^6}\sum_{n=0}^{N-1}(x[n]-A)^2$$

    $$\rightarrow I(\theta) = \begin{bmatrix} \frac{N}{\sigma^2} & 0 \\ 0 & \frac{N}{2\sigma^4} \end{bmatrix}$$

  • The Fisher information matrix is diagonal, so $\mathrm{var}(\hat{A}) \geq \frac{\sigma^2}{N}$ and $\mathrm{var}(\hat{\sigma^2}) \geq \frac{2\sigma^4}{N}$; the sketch below verifies this numerically.
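This is my own verification sketch (made-up values), using the outer-product form of the information matrix, the vector analogue of $E[(\partial \ln p/\partial\theta)^2]$:

```python
import numpy as np

rng = np.random.default_rng(4)
A, sigma2, N, trials = 1.0, 2.0, 30, 200_000   # made-up example values

x = A + rng.normal(0.0, np.sqrt(sigma2), size=(trials, N))
# per-trial score vector [d ln p/dA, d ln p/d(sigma^2)] from the derivation above
sA = (x - A).sum(axis=1) / sigma2
sS = -N / (2 * sigma2) + ((x - A) ** 2).sum(axis=1) / (2 * sigma2**2)
scores = np.stack([sA, sS], axis=1)

I_mc = scores.T @ scores / trials              # E[score score^T]
print(np.round(I_mc, 2))                       # ~ diag(N/sigma2, N/(2 sigma2^2))
print(np.diag([N / sigma2, N / (2 * sigma2**2)]))
```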

Vector CRLB for Transformations

  • Estimate $\alpha = g(\theta)$ for $g$, an $r$-dimensional function
  • The covariance matrix $C_{\hat\alpha}$ of any unbiased estimator $\hat\alpha$ satisfies (from Appendix 3B)
    $$C_{\hat\alpha} - \frac{\partial g(\theta)}{\partial\theta} I^{-1}(\theta) \frac{\partial g(\theta)^T}{\partial\theta} \geq 0$$
  • Here, $\partial g(\theta)/\partial\theta$ is the $r \times p$ Jacobian matrix, whose elements are
    $$\left[\frac{\partial g(\theta)}{\partial\theta}\right]_{i,j} = \frac{\partial g_i(\theta)}{\partial\theta_j}$$

CRLB for General Gaussian Case

  • When the observations are Gaussian and we know how the mean and covariance matrix depend on the unknown parameters, the CRLB (equivalently, the Fisher information matrix) has a closed form:
    $$\mathbf{x} \sim \mathcal{N}(\mu(\theta), C(\theta))$$
  • Then, the Fisher information matrix $I(\theta)$ is given by
    $$[I(\theta)]_{i,j} = \left[\frac{\partial\mu(\theta)}{\partial\theta_i}\right]^T C^{-1}(\theta)\left[\frac{\partial\mu(\theta)}{\partial\theta_j}\right] + \frac{1}{2}\mathrm{tr}\left[C^{-1}(\theta)\frac{\partial C(\theta)}{\partial\theta_i} C^{-1}(\theta)\frac{\partial C(\theta)}{\partial\theta_j}\right]$$
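A direct implementation of this formula (my own sketch; gaussian_fim is a hypothetical helper and the derivatives are taken numerically), applied to the DC-level example with unknown $A$ and $\sigma^2$, reproducing the diagonal matrix from the previous section:

```python
import numpy as np

def gaussian_fim(mu, C, theta, eps=1e-6):
    """Fisher information matrix for x ~ N(mu(theta), C(theta)) via the
    formula above; partial derivatives use central differences."""
    theta = np.asarray(theta, dtype=float)
    p = theta.size
    Cinv = np.linalg.inv(C(theta))

    def d(f, i):                               # numerical d f / d theta_i
        e = np.zeros(p)
        e[i] = eps
        return (f(theta + e) - f(theta - e)) / (2 * eps)

    I = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            I[i, j] = d(mu, i) @ Cinv @ d(mu, j) \
                + 0.5 * np.trace(Cinv @ d(C, i) @ Cinv @ d(C, j))
    return I

# DC level in WGN, theta = [A, sigma2]: should give diag(N/sigma2, N/(2 sigma2^2))
N = 30
mu = lambda th: np.full(N, th[0])              # mean depends only on A
C = lambda th: th[1] * np.eye(N)               # covariance depends only on sigma2
print(gaussian_fim(mu, C, [1.0, 2.0]))
```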

Phase Estimation CRLB

  • Phase estimation example, $x[n] = A\cos(2\pi f_0 n + \phi) + w[n]$:
    $$\mathrm{var}(\hat\phi) \geq \frac{2\sigma^2}{NA^2}$$
  • Does an efficient estimator exist for this problem? The CRLB theorem says there is one only if
    $$\frac{\partial \ln p(x;\theta)}{\partial\theta} = I(\theta)(g(x) - \theta)$$
  • From the earlier result,
    $$\frac{\partial \ln p(x;\phi)}{\partial\phi} = -\frac{A}{\sigma^2}\sum_{n=0}^{N-1}\left[x[n]\sin(2\pi f_0 n + \phi) - \frac{A}{2}\sin(4\pi f_0 n + 2\phi)\right]$$
  • This cannot be put in the form $I(\phi)(g(x) - \phi)$, so the condition for the bound to hold with equality is not satisfied!
  • Still, we saw that estimators exist for which $\mathrm{var}(\hat\phi) \rightarrow \text{CRLB}$ as $N \rightarrow \infty$
  • Such an estimator is called an asymptotically efficient estimator

Range Estimation CRLB

  • Transmit a pulse $s(t)$, nonzero over $t \in [0, T_s]$

  • Receive the reflection $s(t - \tau_0)$

  • Estimate the time delay $\tau_0$; the round-trip delay gives the range via $\tau_0 = 2R/c$

    Continuous-time signal model:
    $$x(t) = s(t - \tau_0) + w(t), \quad 0 \leq t \leq T = T_s + \tau_{0,max}$$

    with $w(t)$ bandlimited Gaussian noise of autocorrelation
    $$r_{ww}(\tau) = N_0 B \frac{\sin(2\pi B\tau)}{2\pi B\tau}$$

  • Discrete-time signal model

    • Sample every $\Delta = 1/(2B)$ sec; since $r_{ww}(n\Delta) = 0$ for $n \neq 0$, the noise samples are uncorrelated, with variance $\sigma^2 = r_{ww}(0) = N_0 B$
      $$x[n] = s(n\Delta - \tau_0) + w[n], \quad n = 0, 1, \dots, N-1$$
    • $s(n\Delta - \tau_0)$ has $M$ nonzero samples starting at $n_0$:
      $$x[n] = \begin{cases} w[n] & 0 \leq n \leq n_0 - 1 \\ s(n\Delta - \tau_0) + w[n] & n_0 \leq n \leq n_0 + M - 1 \\ w[n] & n_0 + M \leq n \leq N - 1 \end{cases}$$
  • Now apply the standard CRLB result for a signal in WGN:

    $$\mathrm{var}(\hat{\tau_0}) \geq \frac{\sigma^2}{\sum_{n=0}^{N-1}\left(\frac{\partial s[n;\tau_0]}{\partial\tau_0}\right)^2} = \frac{\sigma^2}{\sum_{n=n_0}^{n_0+M-1}\left(\frac{\partial s(n\Delta - \tau_0)}{\partial\tau_0}\right)^2} = \frac{\sigma^2}{\sum_{n=n_0}^{n_0+M-1}\left(\frac{\partial s(t)}{\partial t}\Big|_{t=n\Delta - \tau_0}\right)^2} = \frac{\sigma^2}{\sum_{n=0}^{M-1}\left(\frac{\partial s(t)}{\partial t}\Big|_{t=n\Delta}\right)^2}$$

  • Assume the sample spacing is small... approximate the sum by an integral, $\sum_n f(n\Delta) \approx \frac{1}{\Delta}\int f(t)\,dt$, and use $\sigma^2\Delta = N_0 B \cdot \frac{1}{2B} = \frac{N_0}{2}$:

    $$\mathrm{var}(\hat{\tau_0}) \geq \frac{\sigma^2}{\frac{1}{\Delta}\int_0^{T_s}\left(\frac{\partial s(t)}{\partial t}\right)^2 dt} = \frac{N_0/2}{\int_0^{T_s}\left(\frac{\partial s(t)}{\partial t}\right)^2 dt} = \frac{1}{\dfrac{E_s}{N_0/2} \cdot \dfrac{1}{E_s}\int_0^{T_s}\left(\frac{\partial s(t)}{\partial t}\right)^2 dt} = \frac{1}{\mathrm{SNR} \cdot B_{\mathrm{rms}}^2} \;(\text{sec}^2)$$

  • Using these ideas we arrive at the CRLB on the delay:

    $$\mathrm{var}(\hat{\tau_0}) \geq \frac{1}{\mathrm{SNR} \cdot B_{\mathrm{rms}}^2} \;(\text{sec}^2)$$

    where $\mathrm{SNR} = E_s/(N_0/2)$, $E_s$ is the pulse energy, and $B_{\mathrm{rms}}^2 = \frac{1}{E_s}\int_0^{T_s}\left(\frac{\partial s(t)}{\partial t}\right)^2 dt$ is the mean-square (RMS) bandwidth.

  • To get the CRLB on the range, use the transformation-of-parameters result with $R = c\tau_0/2$:

    $$\mathrm{CRLB}_{\hat{R}} = \left(\frac{\partial R}{\partial\tau_0}\right)^2 \mathrm{CRLB}_{\hat{\tau_0}} \;\rightarrow\; \mathrm{var}(\hat{R}) \geq \frac{c^2/4}{\mathrm{SNR} \cdot B_{\mathrm{rms}}^2} \;(\text{m}^2)$$

  • The CRLB is inversely proportional to the SNR and to the squared RMS bandwidth.
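A small end-to-end sketch of mine (the Gaussian pulse, sampling grid, and SNR values are all made up) that computes $B_{\mathrm{rms}}^2$ from a pulse and evaluates the delay and range CRLBs:

```python
import numpy as np

# made-up Gaussian pulse; Ts, fs, and the SNRs are arbitrary choices
Ts, fs = 1e-6, 1e9                      # 1 us pulse, 1 GHz evaluation grid
t = np.arange(0.0, Ts, 1.0 / fs)
s = np.exp(-0.5 * ((t - Ts / 2) / (Ts / 10)) ** 2)

dt = 1.0 / fs
ds = np.gradient(s, dt)                 # ds/dt
Es = np.sum(s**2) * dt                  # pulse energy
Brms2 = np.sum(ds**2) * dt / Es         # mean-square bandwidth

c = 3e8                                 # speed of light (m/s)
for snr_db in (0, 10, 20):
    snr = 10.0 ** (snr_db / 10)         # SNR = Es / (N0/2)
    var_tau = 1.0 / (snr * Brms2)       # CRLB on the delay (s^2)
    var_R = (c**2 / 4) * var_tau        # CRLB on the range (m^2)
    print(f"SNR={snr_db:2d} dB  std(tau0)={np.sqrt(var_tau):.2e} s"
          f"  std(R)={np.sqrt(var_R):.2f} m")
```

As expected from the bound, every 10 dB of SNR shrinks the range standard deviation by a factor of about 3.16, and a shorter (wider-bandwidth) pulse would shrink it further.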

    All content is based on the Detection and Estimation lectures of Prof. Eui-Seok Hwang at GIST.
