(2019)Unsupervised Learning of Probabilistic Diffeomorphic Registration for Images and Surfaces

Gyuha Park · December 27, 2021

Computer Vision Deep Learning Image Registration Medical Image U-Net

Paper Review


0. Abstract

Recent learning-based methods have achieved good results by learning a spatial deformation function. However, these approaches either require supervised labels or do not guarantee diffeomorphic (topology-preserving) registration. They are also not built on a probabilistic framework that provides uncertainty estimates.

This paper bridges classical and learning-based methods by combining a probabilistic generative model with a CNN-based unsupervised learning approach.

1. Introduction

Classical registration methods have been studied extensively, but they are computationally expensive and slow. Recent learning-based registration methods achieve short runtimes, but they omit a theoretical treatment of topology preservation.

This paper proposes a method that takes the strengths of both approaches while overcoming their weaknesses, connecting a probabilistic generative model with a CNN-based unsupervised learning approach.

The key point is that diffeomorphic integration layers are added alongside the spatial transform layer to obtain diffeomorphic registration.

In experiments on 3D MR brain scans, the method reached state-of-the-art performance while providing diffeomorphic deformations. It is also applicable to registration tasks beyond MRI.

Diffeomorphic transforms have seen many technical advances, for example LDDMM, DARTEL, diffeomorphic Demons, and SyN. These tools generally demand considerable time and computation.

Probabilistic image registration specifies a prior term and a likelihood term that explains the image intensities well. This paper proposes a general variational inference strategy so that the model can efficiently output a distribution over deformations.

3. Background: Diffeomorphic Registration

The proposed method applies to a variety of deformation representations, but the paper chooses diffeomorphisms, specifically the stationary velocity field representation.

In this paper the deformation field follows the ordinary differential equation (ODE) below.

$\cfrac{\partial\phi^{(t)}}{\partial t}=v(\phi^{(t)})$

$\phi^{(0)}=Id$ is the identity transformation and $t$ is time. The final registration field $\phi^{(1)}$ is obtained by integrating the stationary velocity field $v$ over $t\in[0,1]$.

Among the various numerical integration techniques, scaling and squaring was the most effective.

In group theory, $v$ is a member of the Lie algebra and is exponentiated to produce $\phi^{(1)}$, a member of the Lie group.

$\phi^{(1)}=\exp(v)$

By the properties of one-parameter subgroups, the following holds for all scalars $t$ and $t'$.

$\exp((t+t')v)=\exp(tv)\circ\exp(t'v)$

Here $\circ$ is the composition map associated with the Lie group.

Starting from $\phi^{(1/2^T)}=p+v(p)/2^T$, where $p$ is a spatial location, the recurrence below is applied repeatedly until $\phi^{(1)}=\phi^{(1/2)}\circ\phi^{(1/2)}$ is obtained.

$\phi^{(1/2^{t-1})}=\phi^{(1/2^t)}\circ\phi^{(1/2^t)}$
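
The recurrence above can be sketched in a few lines of NumPy. This is a minimal 1D illustration, not the paper's layer: it assumes linear interpolation (`np.interp`) for the composition, and the sinusoidal velocity field and the name `integrate_velocity` are hypothetical choices made for the example.

```python
import numpy as np

def integrate_velocity(v, grid, steps=7):
    """Integrate a stationary 1D velocity field with scaling and squaring.

    Returns the displacement u such that phi(p) = p + u(p).
    """
    u = v / (2 ** steps)  # scaling: phi^(1/2^T)(p) = p + v(p) / 2^T
    for _ in range(steps):
        # squaring: (phi o phi)(p) = p + u(p) + u(p + u(p)),
        # with u evaluated off-grid by linear interpolation
        u = u + np.interp(grid + u, grid, u)
    return u

grid = np.linspace(0.0, 1.0, 201)
v = 0.1 * np.sin(2 * np.pi * grid)  # hypothetical smooth velocity, zero at both ends
u = integrate_velocity(v, grid)
phi = grid + u                       # phi^(1) sampled on the grid
```

Each squaring composes a monotone map with itself, so `phi` stays monotone, which is the 1D analogue of a deformation that does not fold.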

4. Methods

Let $f$ and $m$ be 3D images and $z$ a latent variable, and let $\phi_z:\ \mathbb{R}^3\rightarrow\mathbb{R}^3$ be the transformation function parameterized by $z$.

The paper proposes a variational inference method that uses a CNN with diffeomorphic integration and spatial transform layers.

1) Generative Model

The prior probability of the parametrization $z$ is modeled as follows.

$p(z)=\mathcal{N}(z;0,\Sigma_z)$

$\mathcal{N}(\cdot;\mu,\Sigma)$ is the multivariate normal distribution with mean $\mu$ and covariance $\Sigma$.

The paper takes $z$ to be the stationary velocity field that specifies the diffeomorphism through the ODE.

$L=D-A$ is defined as the Laplacian of the neighborhood graph on the voxel grid, where $D$ is the graph degree matrix and $A$ is the voxel neighborhood adjacency matrix.

To encourage spatial smoothness of the velocity field $z$, the paper sets $\Sigma_z^{-1}=\Lambda_z=\lambda L$, where $\Lambda_z$ is the precision matrix and $\lambda$ is a parameter that controls the scale of the velocity field.
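
To make the prior concrete, the sketch below builds $L=D-A$ for a tiny $3\times3\times3$ grid and evaluates the quadratic form $z^T\Lambda_z z$. It assumes nearest-neighbor (6-connected) edges and a dense matrix, which is only practical for toy sizes; `grid_laplacian`, `smoothness_penalty`, and the value of `lam` are hypothetical names and values chosen for illustration.

```python
import numpy as np

def grid_laplacian(shape):
    """Dense graph Laplacian L = D - A of a grid with nearest-neighbor edges."""
    n = int(np.prod(shape))
    idx = np.arange(n).reshape(shape)
    A = np.zeros((n, n))
    for axis in range(len(shape)):
        # pair every voxel with its successor along this axis
        lo = np.take(idx, np.arange(shape[axis] - 1), axis=axis).ravel()
        hi = np.take(idx, np.arange(1, shape[axis]), axis=axis).ravel()
        A[lo, hi] = 1.0
        A[hi, lo] = 1.0
    D = np.diag(A.sum(axis=1))
    return D - A, D, A

def smoothness_penalty(z, precision):
    """Quadratic form z^T Lambda_z z that appears in the Gaussian prior."""
    return float(z @ precision @ z)

lam = 20.0                                  # hypothetical lambda
L, D, A = grid_laplacian((3, 3, 3))
precision = lam * L                         # Lambda_z = lambda * L

rng = np.random.default_rng(0)
z_rough = rng.standard_normal(27)           # one scalar component per voxel
z_smooth = np.ones(27)                      # constant field
print(smoothness_penalty(z_rough, precision), smoothness_penalty(z_smooth, precision))
```

Since $z^TLz$ equals the sum of squared differences over neighboring voxels, a constant field costs nothing under this prior while a noisy field is penalized.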

$f$ is assumed to be a noisy observation of the warped image $m$:

$p(f|z;m)=\mathcal{N}(f;m\circ\phi_z,\sigma_I^2\mathbb{I})$

Under these assumptions, the goal is to estimate the posterior probability $p(z|f;m)$, from which the most likely registration field can be obtained via MAP estimation.

2) Learning

Computing the posterior probability $p(z|f;m)$ under these assumptions is intractable. To address this, an approximate posterior $q_\psi(z|f;m)$ parameterized by $\psi$ is introduced.

The paper minimizes the KL divergence between the two.

$\min\limits_{\psi}\text{KL}[q_{\psi}(z|f;m)||p(z|f;m)]$

$=\min\limits_{\psi}\text{E}_q[\log q_{\psi}(z|f;m)-\log p(z|f;m)]$

$=\min\limits_{\psi}\text{E}_q[\log q_{\psi}(z|f;m)-\log p(z,f;m)]+\log p(f;m)$

$=\min\limits_{\psi}\text{KL}[q_{\psi}(z|f;m)||p(z)]-\text{E}_q[\log p(f|z;m)]+\text{const}$

The approximate posterior $q_{\psi}(z|f;m)$ is modeled as follows.

$q_{\psi}(z|f;m)=\mathcal{N}(z;\mu_{z|m,f},\Sigma_{z|m,f})$

Here $\Sigma_{z|m,f}$ is assumed to be diagonal.

$\mu_{z|m,f}$ and $\Sigma_{z|m,f}$ are the voxel-wise mean and variance, predicted by a network $\text{def}_{\psi}(f,m)$ parameterized by $\psi$.

The full loss is as follows.

$\mathcal{L}(\psi;f,m)=-\text{E}_q[\log p(f|z;m)]+\text{KL}[q_{\psi}(z|f;m)||p(z)]$

$=\cfrac{1}{2\sigma_I^2K}\sum\limits_{k}||f-m\circ\phi_{z_k}||^2$

$+\cfrac{1}{2}\left[\text{tr}(\lambda D\Sigma_{z|m,f}-\log\Sigma_{z|m,f})+\mu_{z|m,f}^T\Lambda_z\mu_{z|m,f}\right]+\text{const}$

$K$ is the number of samples used to approximate the expectation. The first term encourages $f$ and $m\circ\phi_{z_k}$ to be similar. The second term pulls the approximate posterior toward the prior $p(z)$. In the experiments, $K=1$ was used.
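
The sketch below evaluates this loss on a toy 1D example. Only $\lambda D$ appears in the trace because the covariance is diagonal and $A$ has a zero diagonal, so the trace of $\lambda L$ times the covariance reduces to $\lambda$ times the degree-weighted sum of the variances. The function `vm_diff_loss`, the 5-voxel chain, and all numeric values are hypothetical; the velocity is treated as one scalar per voxel, and the warped samples are passed in directly rather than produced by a network.

```python
import numpy as np

def vm_diff_loss(f, warped_samples, mu, var, L, lam, sigma_i2):
    """Loss from the text above, up to the additive constant.

    f: fixed image (flattened); warped_samples: K warped moving images m o phi_{z_k}
    mu, var: voxel-wise mean and variance of q (var is the diagonal of Sigma)
    L: graph Laplacian D - A; lam: lambda; sigma_i2: image noise variance sigma_I^2
    """
    K = len(warped_samples)
    recon = sum(np.sum((f - w) ** 2) for w in warped_samples) / (2.0 * sigma_i2 * K)
    degree = np.diag(L)  # D_ii, because A has zeros on its diagonal
    kl = 0.5 * (np.sum(lam * degree * var - np.log(var)) + lam * mu @ L @ mu)
    return recon + kl

A = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)  # 5-voxel chain graph
L = np.diag(A.sum(axis=1)) - A
f = np.array([0.0, 0.2, 0.9, 0.3, 0.0])
warped = [f + 0.05]                                   # K = 1 sample, slightly off
mu = np.zeros(5)
var = np.ones(5)
loss = vm_diff_loss(f, warped, mu, var, L, lam=2.0, sigma_i2=0.02)
```

With a perfect match (`warped = [f]`) only the KL term remains, which makes the trade-off between image similarity and the smoothness prior easy to inspect.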

3) Neural Network Framework

The paper proposes a UNet-style network $\text{def}_{\psi}(f,m)$ that takes $f$ and $m$ as input and outputs $\mu_{z|m,f}$ and $\Sigma_{z|m,f}$.

To learn the parameters $\psi$ in an unsupervised manner, a sampling layer is first implemented with the re-parameterization trick as follows.

$z_k=\mu_{z|m,f}+\sqrt{\Sigma_{z|m,f}}r$

Here $r$ is a sample from the standard normal distribution, $r\sim\mathcal{N}(0,\mathbb{I})$.
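
A minimal NumPy sketch of this sampling step follows, assuming the diagonal covariance is stored as a vector of voxel-wise variances. The function name and the numbers are hypothetical, and NumPy does not track gradients; in the actual network the same expression lives inside the computation graph so that gradients reach the mean and variance.

```python
import numpy as np

def sample_velocity(mu, var, rng):
    """Draw z_k = mu + sqrt(var) * r with r ~ N(0, I); var is the diagonal of Sigma."""
    r = rng.standard_normal(mu.shape)
    return mu + np.sqrt(var) * r

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0, 0.0])
var = np.array([0.01, 0.04, 0.0])
samples = np.stack([sample_velocity(mu, var, rng) for _ in range(20000)])
print(samples.mean(axis=0), samples.var(axis=0))  # empirical moments approach mu and var
```

The randomness is isolated in `r`, so `z_k` is a deterministic function of the mean and variance for a fixed draw.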

Given $z_k$, a vector integration layer that applies the scaling and squaring operation is proposed to compute $\phi_{z_k}=\exp(z_k)$.

Starting from $\phi^{(1/2^T)}=p+z_k(p)/2^T$, computing $\phi^{(1/2^{t-1})}=\phi^{(1/2^t)}\circ\phi^{(1/2^t)}$ $T$ times yields $\phi^{(1)}\triangleq\phi_{z_k}=\exp(z_k)$.

The result is the warped volume $m\circ\phi_{z_k}$, computed with the diffeomorphic field $\phi_{z_k}$.

4) Registration

Given the learned parameters, $\hat{z}_k$ is first obtained from the network $\text{def}_\psi(f,m)$. The diagonal covariance $\Sigma_{z|m,f}$ is not used at this stage.

$\hat{z}_k=\operatorname*{arg\,max}_{z_k}p(z_k|f;m)=\mu_{z|m,f}$

The inverse deformation field can be computed as $\phi_z^{-1}=\phi_{-z}$, which follows from the identity below.

$\phi_z\circ\phi_{-z}=\exp(z)\circ\exp(-z)=\exp(z-z)=Id$
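
As a small worked check of this identity, consider a hypothetical 1D linear velocity field $z(p)=ap$, chosen only because every step has a closed form. Composing linear maps multiplies their slopes, so scaling and squaring with $T$ steps gives

$\phi_z^{(1/2^T)}(p)=\left(1+\cfrac{a}{2^T}\right)p,\qquad\phi_z^{(1)}(p)=\left(1+\cfrac{a}{2^T}\right)^{2^T}p\ \xrightarrow{T\to\infty}\ e^{a}p$

Applying the same steps to $-z$ and composing the two results gives

$(\phi_z\circ\phi_{-z})(p)=\left(1+\cfrac{a}{2^T}\right)^{2^T}\left(1-\cfrac{a}{2^T}\right)^{2^T}p=\left(1-\cfrac{a^2}{4^T}\right)^{2^T}p\ \xrightarrow{T\to\infty}\ p$

With a finite number of steps the numerically integrated inverse is therefore approximate: for $a=0.5$ and $T=7$ the factor is about $0.998$.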

5) Implementation

The framework is implemented as part of the VoxelMorph package at http://voxelmorph.csail.mit.edu.

It uses neuron and Keras, with a learning rate of $0.0001$, the Adam optimizer, and a batch size of 1 because of GPU memory limits.

5. Experiments

The paper validates the accuracy and runtime of the proposed probabilistic image registration framework against state-of-the-art methods. Because the algorithm is implemented as part of the VoxelMorph framework, it is named VoxelMorph-diff.

1) Data and Preprocessing

The paper uses 3731 brain MRI scans drawn from several large-scale datasets: OASIS, ABIDE, ADHD200, MCIC, PPMI, HABS, and Harvard GSP.

Standard pre-processing was applied to all scans: resampling to 1 mm isotropic voxels, affine spatial normalization, and brain extraction with FreeSurfer. Finally, the images were cropped to $160\times192\times224$.

The segmentation maps comprise 29 anatomical structures and were obtained with FreeSurfer.

The data were split into train, validation, and test sets of 3231, 250, and 250 scans.

2) Evaluation Metrics

To measure registration performance, volume overlap was measured with the Dice metric. The Jacobian matrix was used to assess the diffeomorphic property.

$J_{\phi}(p)=\nabla\phi(p)\in\mathbb{R}^{3\times3}$

A local deformation is diffeomorphic only where $|J_{\phi}(p)|>0$. The paper counts the number of voxels where $|J_{\phi}(p)|\leq0$.
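
Both metrics can be sketched with NumPy. This is an illustration under stated assumptions rather than the paper's evaluation code: derivatives are approximated with finite differences (`np.gradient`), the deformation is stored as the mapped coordinate of every voxel, and the function names and toy inputs are hypothetical.

```python
import numpy as np

def dice(seg_a, seg_b, label):
    """Dice overlap of one label between two integer segmentation maps."""
    a = seg_a == label
    b = seg_b == label
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom > 0 else np.nan

def count_nonpositive_jacobian(phi):
    """Count voxels where det(J_phi) <= 0 for phi of shape (3, X, Y, Z)."""
    # J[i, j] = d phi_i / d p_j, stacked to shape (3, 3, X, Y, Z)
    J = np.stack([np.stack(np.gradient(phi[i]), axis=0) for i in range(3)], axis=0)
    det = np.linalg.det(np.moveaxis(J, (0, 1), (-2, -1)))
    return int(np.sum(det <= 0))

# Toy inputs: an identity map, and a map mirrored along the first axis (folded everywhere)
identity = np.stack(np.meshgrid(*[np.arange(8.0)] * 3, indexing="ij"))
mirrored = identity.copy()
mirrored[0] = -identity[0]

seg_a = np.array([0, 1, 1, 2])
seg_b = np.array([0, 1, 2, 2])
print(dice(seg_a, seg_b, 1))
print(count_nonpositive_jacobian(identity), count_nonpositive_jacobian(mirrored))
```

The identity map has determinant 1 at every voxel, while the mirrored map has determinant -1 everywhere, so the two counts bracket the range the metric can take on this grid.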

3) Image Registration

The table above summarizes the results on the test set. All methods achieve similar Dice scores, but VoxelMorph is much faster than ANTs and NiftyReg. VoxelMorph-diff likewise achieves a competitive Dice score and a fast runtime, while producing close to zero locations with a non-positive Jacobian determinant.

The figure above shows boxplots of the Dice scores for the anatomical structures. Left- and right-hemisphere structures were combined for visualization. All methods perform comparably, with each method doing better or worse on particular structures.

4) Analysis

The smoothing precision $\lambda$ and the image noise $\sigma_I^2$ are physically meaningful. However, the two hyperparameters share a single degree of freedom in the loss function. The paper fixes $\sigma_I^2=0.02$ and varies $\lambda$ between 0.5 and 100; the best results were obtained at $\lambda=20$.

The figure above shows how accuracy, runtime, deformation regularity, and invertibility change with the number of scaling and squaring steps. VoxelMorph-diff reaches state-of-the-art performance from 5 steps onward. $T=7$ was chosen in view of memory and time.

The figure above visualizes the velocity field $z_k$ and the voxel-wise empirical variance. A larger $\lambda$ produces a smaller $\Sigma_{z|m,f}$ and a smoother $z_k$ than a smaller $\lambda$.

The table above also shows that the integration operation leads to good results across different values of $\lambda$.

6. Discussion and Conclusion

This paper proposes a probabilistic model for diffeomorphic image registration that is fast and uses CNN-based, unsupervised, end-to-end learning.

To obtain diffeomorphic transforms, the paper integrates the stationary velocity field with a new differentiable scaling and squaring network operation.

The algorithm can register new image pairs immediately and is much faster than classical registration methods. Moreover, unlike recent learning-based methods, it guarantees diffeomorphic deformations.
