Autoencoder

이세현 · April 13, 2024

Encoder

The encoder takes an input and encodes it into the corresponding latent code.
- $\mathbf{x}$ → encoder → $\mathbf{z}$
$\mathbf{z} = E_\theta(\mathbf{x})$
- $E$ : the encoding model, a function implemented as a neural network
  - $\theta$ : network parameters
- $\mathbf{x}$ : input tensor
- $\mathbf{z}$ : output tensor (the latent code)

The decoder takes a latent code and decodes it into an output.
- $\mathbf{z}$ → decoder → $\mathbf{x'}$
$\mathbf{x'}=D_\phi(\mathbf{z})$
- $D$ : the decoding model, a function implemented as a neural network
  - $\phi$ : network parameters
- $\mathbf{z}$ : input tensor (the latent code)
- $\mathbf{x'}$ : output tensor
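
As a rough illustration of $E_\theta$ and $D_\phi$, here is a minimal NumPy sketch. The layer sizes (784 and 32), the single affine layer per model, and the `tanh` activation are all assumptions made for the example and are not specified in the notes above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 784-dimensional input, 32-dimensional latent code.
INPUT_DIM, LATENT_DIM = 784, 32

# theta holds the encoder parameters, phi the decoder parameters.
theta = {"W": rng.normal(0.0, 0.01, (LATENT_DIM, INPUT_DIM)), "b": np.zeros(LATENT_DIM)}
phi = {"W": rng.normal(0.0, 0.01, (INPUT_DIM, LATENT_DIM)), "b": np.zeros(INPUT_DIM)}

def encode(x, theta):
    """z = E_theta(x): one affine layer followed by tanh (an assumed architecture)."""
    return np.tanh(theta["W"] @ x + theta["b"])

def decode(z, phi):
    """x' = D_phi(z): one affine layer (an assumed architecture)."""
    return phi["W"] @ z + phi["b"]

x = rng.random(INPUT_DIM)   # input tensor
z = encode(x, theta)        # latent code
x_prime = decode(z, phi)    # output tensor
print(z.shape, x_prime.shape)  # (32,) (784,)
```

The two parameter sets are kept separate on purpose, mirroring the notation: $\theta$ belongs only to the encoder and $\phi$ only to the decoder.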

An autoencoder is self-supervised because the input itself serves as the training target for the output.
$\mathbf{x'}=D_\phi(E_\theta(\mathbf{x}))=(D_\phi \circ E_\theta)(\mathbf{x})$
- $D_\phi \circ E_\theta$ : the autoencoder, implemented as a neural network
  - $\theta$ , $\phi$ : network parameters
- $\mathbf{x}$ : input
- $\mathbf{x'}$ : output prediction (the reconstruction)

Performance measurement: $l(\mathbf{x}, \mathbf{x'})=\|\mathbf{x}-\mathbf{x'}\|$
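
The composition and the performance measurement can be sketched in a few lines. This example assumes linear encoder and decoder maps with made-up sizes (8 and 3), and it reads $\|\cdot\|$ as the Euclidean norm, which the notes do not state explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
W_enc = rng.normal(0.0, 0.1, (3, 8))  # hypothetical encoder weights (theta)
W_dec = rng.normal(0.0, 0.1, (8, 3))  # hypothetical decoder weights (phi)

def autoencoder(x):
    """x' = D_phi(E_theta(x)) for linear E and D."""
    return W_dec @ (W_enc @ x)

def loss(x, x_prime):
    """l(x, x') = ||x - x'||, taken here as the Euclidean norm."""
    return np.linalg.norm(x - x_prime)

x = rng.random(8)
print(loss(x, autoencoder(x)))  # reconstruction error of the untrained model
print(loss(x, x))               # 0.0, a perfect reconstruction
```

No label appears anywhere: the only target the loss sees is `x` itself, which is what makes the setup self-supervised.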
Learning objective
$\min\sum_{k}l(\mathbf{x}_k, \mathbf{x'}_k)$
- Finding the optimal parameters: $\argmin_{\theta, \phi}\sum_{k}l(\mathbf{x}_k, \mathbf{x'}_k)$
- $\mathbf{x'}$ depends on the parameters $\theta$ and $\phi$: $l(\mathbf{x}, \mathbf{x'})=\|\mathbf{x}-\mathbf{x'}\| = \|\mathbf{x}- (D_\phi \circ E_\theta)(\mathbf{x})\|$

Backpropagation computes the gradient $\nabla_{\theta, \phi}l$, and gradient descent uses it to find values of $\theta$ and $\phi$ that minimize the objective (in practice, a local minimum).
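
The following is a minimal sketch of this optimization, not a definitive implementation. It assumes a linear encoder and decoder, toy data generated on the spot, a hand-picked learning rate, and plain gradient descent with hand-derived gradients. It also swaps in the squared Euclidean norm for $l$, because its gradient is simpler to write out; that is a different loss from the unsquared norm above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): 200 samples in 8 dimensions lying on a 2-D subspace.
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 8))
X /= X.std()

theta = rng.normal(0.0, 0.1, (2, 8))  # encoder weights, E_theta(x) = theta @ x
phi = rng.normal(0.0, 0.1, (8, 2))    # decoder weights, D_phi(z) = phi @ z

def mean_loss(X, theta, phi):
    """Mean over samples of the squared reconstruction error ||x - x'||^2."""
    X_prime = (X @ theta.T) @ phi.T
    return np.mean(np.sum((X - X_prime) ** 2, axis=1))

lr = 0.01  # assumed learning rate
initial = mean_loss(X, theta, phi)
for step in range(500):
    Z = X @ theta.T                    # latent codes, one row per sample
    X_prime = Z @ phi.T                # reconstructions
    G = 2.0 * (X_prime - X) / len(X)   # gradient of the mean loss w.r.t. X'
    grad_phi = G.T @ Z                 # chain rule through D_phi
    grad_theta = (G @ phi).T @ X       # chain rule through D_phi, then E_theta
    phi -= lr * grad_phi
    theta -= lr * grad_theta
final = mean_loss(X, theta, phi)
print(initial, final)  # the loss after training should be lower than before
```

Both gradients are computed before either parameter is updated, so each step uses a consistent $\nabla_{\theta, \phi}$ of the same objective value.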

Consider a linear operation that maps a 28 × 28 image, flattened to 784 values, to 256 dimensions.
NumPy stores arrays in row-major order by default.

Input $\mathbf{x}_{784}$ : $784 \times 1$
Output $\mathbf{y}_{256}$ : $256 \times 1$, $\mathbf{y} = \mathbf{W}\mathbf{x}$
Parameter $\mathbf{W}$ : $256 \times 784$
$\mathbf{W}^T$ : $784 \times 256$, used when the input is a row vector: $\mathbf{y}^T = \mathbf{x}^T\mathbf{W}^T$
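
The shapes above can be checked directly in NumPy. In this sketch the image and the weights are random stand-ins (assumptions for illustration), and the row-vector form with $\mathbf{W}^T$ is included to show that it gives the same numbers as the column-vector form.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((28, 28))           # stand-in for a 28 x 28 image
W = rng.normal(0.0, 0.01, (256, 784))  # parameter W: 256 x 784

# Row-major (C order) flattening: element (i, j) lands at index i * 28 + j.
x = image.reshape(784, 1)              # column vector, 784 x 1
assert x[3 * 28 + 5, 0] == image[3, 5]

y = W @ x                              # column-vector form, 256 x 1
y_row = x.T @ W.T                      # row-vector form with W^T, 1 x 256
print(y.shape, y_row.shape)            # (256, 1) (1, 256)
print(np.allclose(y.T, y_row))         # True
```

Because of the row-major layout, the flattened vector is simply the image rows laid end to end, which is why a batch of images is commonly stored with one flattened image per row and multiplied by $\mathbf{W}^T$.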
