$$
\begin{aligned}
\hat{\textbf{x}} &= \underset{\textbf{x}}{\textrm{argmin}} \begin{Vmatrix} \textbf{b} - A \textbf{x} \end{Vmatrix} \\
&= \underset{\textbf{x}}{\textrm{argmin}} \begin{Vmatrix} \textbf{b} - A \textbf{x} \end{Vmatrix}^2 \\
&= \underset{\textbf{x}}{\textrm{argmin}} \begin{pmatrix} \textbf{b} - A \textbf{x} \end{pmatrix}^T \begin{pmatrix} \textbf{b} - A \textbf{x} \end{pmatrix} \\
&= \underset{\textbf{x}}{\textrm{argmin}} \begin{pmatrix} \textbf{b}^T \textbf{b} - \textbf{x}^T A^T \textbf{b} - \textbf{b}^T A \textbf{x} + \textbf{x}^T A^T A \textbf{x} \end{pmatrix}
\end{aligned}
$$

Squaring the norm does not change the minimizer, because the norm is non-negative. Note that the constant term is the scalar $\textbf{b}^T \textbf{b}$, not the matrix $\textbf{b} \textbf{b}^T$.
Computing the derivative of $\begin{pmatrix} \textbf{b}^T \textbf{b} - \textbf{x}^T A^T \textbf{b} - \textbf{b}^T A \textbf{x} + \textbf{x}^T A^T A \textbf{x} \end{pmatrix}$ w.r.t. $\textbf{x}$ and setting it to zero, we obtain

$$- A^T \textbf{b} - A^T \textbf{b} + A^T A \textbf{x} + A^T A \textbf{x} = \textbf{0} \Leftrightarrow A^T A \textbf{x} = A^T \textbf{b}$$

The constant term $\textbf{b}^T \textbf{b}$ does not depend on $\textbf{x}$ and drops out. If $C = A^T A$ is invertible (equivalently, if the columns of $A$ are linearly independent), then the solution is computed as

$$\textbf{x} = (A^T A)^{-1} A^T \textbf{b}$$
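The normal equations above can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original derivation: the function name `least_squares_normal_equations` and the small example matrix are assumptions chosen for illustration. It solves $A^T A \textbf{x} = A^T \textbf{b}$ directly and compares the result with NumPy's built-in least-squares solver.

```python
import numpy as np


def least_squares_normal_equations(A, b):
    """Solve A^T A x = A^T b (hypothetical helper, for illustration only).

    Assumes C = A^T A is invertible, i.e. A has linearly independent columns.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    C = A.T @ A  # the matrix C from the text
    # Solving the linear system avoids forming the explicit inverse of C.
    return np.linalg.solve(C, A.T @ b)


# Illustrative overdetermined system: 3 equations, 2 unknowns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 2.0, 2.0])

x_hat = least_squares_normal_equations(A, b)

# Reference solution from NumPy's least-squares routine.
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)

# At the minimizer, the residual is orthogonal to the columns of A.
residual = b - A @ x_hat
print(x_hat, x_ref, A.T @ residual)
```

Using `np.linalg.solve` rather than computing $(A^T A)^{-1}$ explicitly is a design choice made here for numerical reasons. For ill-conditioned $A$, forming $A^T A$ squares the condition number, so a QR- or SVD-based solver such as `np.linalg.lstsq` is usually the safer option.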