- models that are tractable
  - can be analytically evaluated and easily fit to data
  - but struggle to capture the structure of rich datasets
- models that are flexible
  - can be molded to fit structure in arbitrary data
  - e.g., define the model in terms of any non-negative function ϕ(x), yielding the flexible distribution p(x) = ϕ(x)/Z, where Z is the normalization constant
  - but because computing Z is intractable, training, evaluating, or sampling from such a flexible model requires a very expensive Monte Carlo process (see the sketch below)
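A minimal numerical sketch of this bottleneck (not from the paper): even evaluating p(x) = ϕ(x)/Z at a single point requires a Monte Carlo estimate of Z. The double-well ϕ and the Gaussian proposal below are arbitrary illustrative choices.

```python
import numpy as np

# Sketch: an unnormalized model p(x) = phi(x) / Z. Here Z = ∫ phi(x) dx has
# no closed form, so every density evaluation needs a Monte Carlo estimate.

def phi(x):
    # hypothetical non-negative function (double-well energy), NOT from the paper
    return np.exp(-(x**2 - 1.0)**2)

def estimate_Z(n_samples=100_000, proposal_scale=3.0, seed=0):
    # naive importance sampling with a Gaussian proposal q: E_q[phi/q] = Z.
    # In one dimension this is cheap; in high dimensions the estimator's
    # variance explodes, which is exactly the intractability in question.
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, proposal_scale, n_samples)
    q = np.exp(-0.5 * (x / proposal_scale) ** 2) / (proposal_scale * np.sqrt(2 * np.pi))
    return np.mean(phi(x) / q)

Z = estimate_Z()
print("p(0) ≈", phi(0.0) / Z)  # each evaluation pays the MC cost of estimating Z
```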
1.1. Diffusion probabilistic models (DPM)
The paper presents DPM, which allows:
1. extreme flexibility in model structure
2. exact sampling
3. easy multiplication with other distributions (to compute posterior)
4. cheap evaluation of the log-likelihood and the probability of individual states
2. Algorithm
2.0. Overview
First, define a forward diffusion process that gradually converts the target (data) distribution into a simple, known (Gaussian) distribution.
Then learn a finite-time reversal of this diffusion process; the learned reversal is the generative model.
The paper also derives entropy bounds for each step of the reverse process.
2.1. Forward Trajectory
data dist. q(x(0)) --> tractable dist. π(y), via repeated application of a Markov diffusion kernel Tπ(y∣y′; β) whose equilibrium distribution is π(y), where β is the diffusion rate
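A minimal NumPy sketch of the Gaussian case, where the forward kernel is q(x(t)∣x(t−1)) = N(x(t); √(1−β_t)·x(t−1), β_t·I) with equilibrium distribution N(0, I). The linear β schedule below is an assumed illustrative choice, not prescribed by the paper.

```python
import numpy as np

# Sketch of the Gaussian forward kernel
# q(x(t) | x(t-1)) = N(x(t); sqrt(1 - beta_t) * x(t-1), beta_t * I),
# applied T times to carry a data point toward the stationary N(0, I).

def forward_trajectory(x0, betas, seed=0):
    rng = np.random.default_rng(seed)
    xs = [x0]
    for beta in betas:
        noise = rng.standard_normal(x0.shape)
        xs.append(np.sqrt(1.0 - beta) * xs[-1] + np.sqrt(beta) * noise)
    return xs  # [x(0), x(1), ..., x(T)]

T = 1000
betas = np.linspace(1e-4, 0.02, T)                 # assumed schedule
x0 = np.random.default_rng(1).standard_normal(2)   # a toy 2-D "data" point
xT = forward_trajectory(x0, betas)[-1]
print(xT)  # approximately a sample from N(0, I)
```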
2.2. Reverse Trajectory
For both Gaussian and binomial diffusion, in the continuous-diffusion limit (sufficiently small step size β), the reverse process has the same functional form as the forward process (Feller, 1949).
During training, we learn
- Gaussian: the mean fμ(x(t), t) and covariance fΣ(x(t), t) of each reverse kernel
- Binomial: the bit-flip probability fb(x(t), t) of each reverse kernel
(a minimal sampling sketch follows)
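A sketch of ancestral sampling through the Gaussian reverse kernels p(x(t−1)∣x(t)) = N(x(t−1); fμ(x(t), t), fΣ(x(t), t)). The closed-form f_mu / f_sigma below are toy placeholders standing in for trained networks; they are not the paper's parameterization.

```python
import numpy as np

# Sketch of ancestral sampling through the reverse kernels
# p(x(t-1) | x(t)) = N(x(t-1); f_mu(x(t), t), f_Sigma(x(t), t)).

def f_mu(x_t, t, betas):
    # toy placeholder mean: undo the forward shrinkage sqrt(1 - beta_t)
    return x_t / np.sqrt(1.0 - betas[t])

def f_sigma(x_t, t, betas):
    # toy placeholder std: reuse the forward noise scale
    return np.sqrt(betas[t])

def sample(shape, betas, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)  # x(T) ~ N(0, I), the known endpoint
    for t in reversed(range(len(betas))):
        noise = rng.standard_normal(shape) if t > 0 else 0.0  # no noise at the final step
        x = f_mu(x, t, betas) + f_sigma(x, t, betas) * noise
    return x  # a draw of x(0)

betas = np.linspace(1e-4, 0.02, 1000)
print(sample((2,), betas))
```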
2.3. Model Probability
The probability the generative model assigns to the data is p(x(0)) = ∫ dx(1⋯T) p(x(0⋯T))
As written, this integral is intractable, but it can be rewritten as follows:
- based on annealed importance sampling and the Jarzynski equality
- instead evaluate the relative probability of the forward and reverse trajectories, averaged over forward trajectories:
  p(x(0)) = ∫ dx(1⋯T) q(x(1⋯T)∣x(0)) p(x(T)) ∏(t=1…T) p(x(t−1)∣x(t)) / q(x(t)∣x(t−1))
  (a Monte Carlo sketch of this estimator follows)
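A sketch of the resulting Monte Carlo estimator: sample forward trajectories from q and average the importance weights in log space. `log_p_reverse` is a hypothetical hook standing in for the learned reverse kernel, and the toy stand-in at the bottom simply mirrors the forward kernel; neither is the paper's API.

```python
import numpy as np

# Sketch of the rewrite above: sample x(1...T) ~ q(.|x(0)) and average
# w = p(x(T)) * prod_t p(x(t-1)|x(t)) / q(x(t)|x(t-1)), in log space.

def log_normal(x, mean, std):
    # log density of an isotropic Gaussian, summed over dimensions
    return np.sum(-0.5 * ((x - mean) / std) ** 2 - np.log(std) - 0.5 * np.log(2 * np.pi))

def estimate_log_px0(x0, betas, log_p_reverse, n_traj=64, seed=0):
    # log_p_reverse(x_prev, x_t, t) -> log p(x(t-1)=x_prev | x(t)=x_t):
    # hypothetical interface to the learned reverse kernel (an assumption)
    rng = np.random.default_rng(seed)
    log_ws = []
    for _ in range(n_traj):
        log_w, x_prev = 0.0, x0
        for t, beta in enumerate(betas):
            mean, std = np.sqrt(1.0 - beta) * x_prev, np.sqrt(beta)
            x_t = mean + std * rng.standard_normal(x_prev.shape)  # forward step
            log_w += log_p_reverse(x_prev, x_t, t) - log_normal(x_t, mean, std)
            x_prev = x_t
        log_w += log_normal(x_prev, 0.0, 1.0)  # log p(x(T)) under N(0, I)
        log_ws.append(log_w)
    # log of the mean importance weight
    return np.logaddexp.reduce(np.array(log_ws)) - np.log(n_traj)

betas = np.linspace(1e-4, 0.02, 50)
# toy reverse kernel that mirrors the forward kernel (placeholder, untrained)
toy_reverse = lambda x_prev, x_t, t: log_normal(
    x_prev, np.sqrt(1.0 - betas[t]) * x_t, np.sqrt(betas[t]))
print(estimate_log_px0(np.zeros(2), betas, toy_reverse))
```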