(2019)f-AnoGAN: Fast unsupervised anomaly detection with generative adversarial networks

Gyuha Park · September 16, 2022



0. Abstract

When expert-labeled data are available, supervised learning gives good results. Its drawback is that it is limited to the lesions that were annotated.

This paper proposes fast AnoGAN (f-AnoGAN), a GAN-based unsupervised learning approach that can identify anomalous images and extract image segments that can serve as biomarkers.

A generative model is trained on healthy training data, and the paper proposes a way to quickly map newly arriving data into the GAN's latent space.

Experiments on optical coherence tomography data show better results than the alternative approaches.

1. Introduction

Deep learning methods are very effective at extracting biomarkers, but they require expert-annotated data.

Expert annotation has two limitations. First, it is time-consuming, and it is hard to produce annotations that are suitable for training a machine learning model. Second, even when annotated training data are available, supervised learning is limited to markers that are already known.

Anomaly detection means picking out data that do not belong to the distribution of normal data. This paper proposes a fast anomaly detection method trained on large-scale imaging data that consist only of normal images and need no annotation. Volume-level information is needed only when selecting the data before training.

The method is based on generative adversarial networks (GANs) and uses a Wasserstein GAN (WGAN), which is trained by estimating the Wasserstein distance between the generator distribution and the real data distribution.
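
For reference, the standard WGAN objective can be written as follows. This is the general WGAN formulation, not a formula quoted from this review's source; $P_r$ (real data distribution), $P_z$ (latent prior), and the 1-Lipschitz constraint on the critic $D$ are the usual WGAN notation and are assumptions here.

$\min_G \max_{||D||_L \le 1}\ \mathbb{E}_{x\sim P_r}[D(x)]-\mathbb{E}_{z\sim P_z}[D(G(z))]$

By Kantorovich-Rubinstein duality, the inner maximum equals the Wasserstein-1 distance between $P_r$ and the distribution of $G(z)$, so training the critic amounts to estimating that distance and training the generator amounts to reducing it.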

The proposed algorithm is closely related to AnoGAN, which was published as a conference paper. AnoGAN is based on DCGAN and computes its anomaly score through iterative backpropagation.

Instead of this iterative procedure, f-AnoGAN learns a mapping from images to the latent space, which raises the speed to a real-time level.
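
The difference between the two inference styles can be illustrated with a toy sketch. Everything here is an assumption for illustration: the generator is a made-up linear map `W`, the "trained encoder" is simply its pseudo-inverse, and the sizes and step count are arbitrary. It is not the networks from the paper, only a picture of "many gradient steps per image" versus "one forward pass".

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 16                      # toy latent size and pixel count
W = rng.normal(size=(n, d))       # hypothetical linear "generator" weights


def G(z):
    """Toy generator: latent vector -> flat image."""
    return W @ z


x = G(rng.normal(size=d))         # query image that lies on the generator's manifold

# AnoGAN-style inference: iterative gradient descent on z for this one image.
step = 1.0 / np.linalg.norm(W, 2) ** 2   # 1 / (largest singular value)^2
z_iter = np.zeros(d)
n_steps = 2000
for _ in range(n_steps):
    grad = W.T @ (G(z_iter) - x)  # gradient of 0.5 * ||x - G(z)||^2 w.r.t. z
    z_iter = z_iter - step * grad

# f-AnoGAN-style inference: a single forward pass through an encoder.
W_enc = np.linalg.pinv(W)         # stands in for a trained encoder


def E(image):
    """Toy encoder: flat image -> latent vector."""
    return W_enc @ image


z_fast = E(x)
print(np.abs(z_iter - z_fast).max())  # both routes reach (nearly) the same z
```

In this toy setting both routes land on the same latent code, but the iterative route pays `n_steps` generator evaluations per query image while the encoder route pays one matrix product.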

2. Fast GAN based anomaly detection

The proposed anomaly detection framework is trained in two steps. First, a GAN learns a latent representation of normal anatomical variability by training a generator and a discriminator. Then, using the trained generator, an encoder is trained to map images to the latent space.

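The encoder's image-to-latent mapping followed by the generator's latent-to-image mapping behaves approximately like an identity transform on normal images, so the degree to which an image changes under this round trip is used for anomaly scoring.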

2.1. Learning a fast mapping from images to encodings in the latent space

$ziz$ architecture
In the ziz architecture, the fixed generator $G$ maps the z-space to the image space, and the encoder $E$ is trained to map the result back to the z-space. During training, the mean squared error (MSE) between the sampled $z$ and the reconstructed $E(G(z))$ is minimized, where $d$ is the dimensionality of the z-space.

$L_{ziz}(z)=\cfrac{1}{d}||z-E(G(z))||^2$
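
A minimal NumPy sketch of this loss, assuming toy stand-ins for the networks: `W_g` and `W_e` are hypothetical random weights, and `G` and `E` are single-layer placeholders, not the architectures from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 64                            # toy latent size and pixel count
W_g = rng.normal(size=(n, d))           # hypothetical fixed generator weights
W_e = rng.normal(size=(d, n)) * 0.01    # hypothetical, untrained encoder weights


def G(z):
    """Toy generator: batch of latent vectors -> batch of flat images."""
    return np.tanh(z @ W_g.T)


def E(x):
    """Toy encoder: batch of flat images -> batch of latent vectors."""
    return x @ W_e.T


def ziz_loss(z):
    """L_ziz(z) = (1/d) * ||z - E(G(z))||^2, averaged over the batch."""
    z_rec = E(G(z))
    per_sample = np.sum((z - z_rec) ** 2, axis=1) / z.shape[1]
    return per_sample.mean()


z = rng.normal(size=(32, d))            # z-samples drawn from the latent prior
loss = ziz_loss(z)
print(loss)
```

In an actual training loop only the encoder weights would be updated to reduce this value; the generator stays fixed.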

This approach is similar to Adversarial Feature Learning (BiGAN). In contrast to the izi architecture, the encoder only receives images produced by the generator during training and never sees real input images.

$izi$ architecture
The izi architecture follows the standard autoencoder (AE) configuration. During training, a real image $x$ is mapped to a latent encoding $z$ by the encoder $E$, and $z$ is then mapped back to the image space by the fixed generator $G$.

$L_{izi}(x)=\cfrac{1}{n}||x-G(E(x))||^2$

$||\cdot||^2$ is the sum of squared pixel-wise residuals and $n$ is the number of pixels in the image.

$izi_f$ architecture
The $izi_f$ architecture, which f-AnoGAN follows, guides izi encoder training by adding to the loss a term computed on features $f$ taken from the fixed discriminator $D$.

$L_{izi_f}(x)=\cfrac{1}{n}||x-G(E(x))||^2+\cfrac{\kappa}{n_d}||f(x)-f(G(E(x)))||^2$

$n_d$ is the dimensionality of the discriminator features $f$, and $\kappa$ is a weighting factor.
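
The izi and $izi_f$ losses can be sketched side by side. This is an illustrative sketch under stated assumptions: `W_g`, `W_e`, and `W_f` are hypothetical random weights, `f` is a single ReLU layer standing in for the discriminator's feature layer, and the default `kappa=1.0` is only a placeholder value.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, n_d = 8, 64, 16                     # toy latent, pixel, and feature sizes
W_g = rng.normal(size=(n, d))             # hypothetical fixed generator weights
W_e = rng.normal(size=(d, n)) * 0.1       # hypothetical encoder weights (trainable)
W_f = rng.normal(size=(n_d, n)) * 0.1     # hypothetical fixed discriminator features


def G(z):
    """Toy generator: latent batch -> image batch."""
    return np.tanh(z @ W_g.T)


def E(x):
    """Toy encoder: image batch -> latent batch."""
    return x @ W_e.T


def f(x):
    """Stand-in for the discriminator's intermediate features."""
    return np.maximum(x @ W_f.T, 0.0)


def izi_loss(x):
    """Per-image (1/n) * ||x - G(E(x))||^2."""
    return np.sum((x - G(E(x))) ** 2, axis=1) / x.shape[1]


def izi_f_loss(x, kappa=1.0):
    """izi loss plus the kappa-weighted discriminator-feature residual."""
    x_rec = G(E(x))
    feature_term = np.sum((f(x) - f(x_rec)) ** 2, axis=1) / n_d
    return izi_loss(x) + kappa * feature_term


x = rng.uniform(-1.0, 1.0, size=(5, n))   # batch of made-up "real" images
print(izi_loss(x), izi_f_loss(x))
```

Setting `kappa=0` recovers the plain izi loss, which makes it easy to see that the feature term is the only difference between the two training schemes.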

2.2. Detection of anomalies

In f-AnoGAN, the anomaly score is defined as follows.

$\mathcal{A}(x)=\mathcal{A}_R(x)+\kappa\ \cdot\ \mathcal{A}_D(x)$

$\mathcal{A}_R(x)=\cfrac{1}{n}\ \cdot\ ||x-G(E(x))||^2,\ \mathcal{A}_D(x)=\cfrac{1}{n_d}\ \cdot\ ||f(x)-f(G(E(x)))||^2$

Pixel-wise anomaly localization is defined as follows.

$\dot{\mathcal{A}}_R(x)=|x-G(E(x))|$
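
The image-level score and the pixel-wise residual map can be tried out on a toy example. All components below are assumptions made for illustration: "normal" images are horizontally layered, the encoder `E` keeps one value per row, the generator `G` paints that value back across the row, and `f` (column means) is only a stand-in for discriminator features.

```python
import numpy as np


def E(x):
    """Toy encoder: one latent value per image row (the row mean)."""
    return x.mean(axis=1)


def G(z, width=8):
    """Toy generator: paints each latent value across a whole row."""
    return np.repeat(z[:, None], width, axis=1)


def f(x):
    """Stand-in for discriminator features: column means of the image."""
    return x.mean(axis=0)


def anomaly_score(x, kappa=1.0):
    """Return A(x) = A_R(x) + kappa * A_D(x) and the map |x - G(E(x))|."""
    x_rec = G(E(x), width=x.shape[1])
    residual_map = np.abs(x - x_rec)
    a_r = np.sum((x - x_rec) ** 2) / x.size               # (1/n) * ||.||^2
    a_d = np.sum((f(x) - f(x_rec)) ** 2) / f(x).size      # (1/n_d) * ||.||^2
    return a_r + kappa * a_d, residual_map


# A layered "normal" image and a copy with one bright "lesion" pixel.
normal = np.repeat(np.linspace(0.0, 1.0, 8)[:, None], 8, axis=1)
anomalous = normal.copy()
anomalous[3, 5] += 1.0

score_normal, _ = anomaly_score(normal)
score_anomalous, residual = anomaly_score(anomalous)
peak = np.unravel_index(np.argmax(residual), residual.shape)
print(score_normal, score_anomalous, peak)
```

The normal image is reconstructed almost exactly, so its score is close to zero, while the altered image gets a larger score and its residual map peaks at the altered pixel.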

3. Experiments

To evaluate the proposed anomaly detection framework, the experiments test whether the model can generate realistic images, and whether it can identify anomalies and localize anomalous regions.

The WGAN and the encoder were trained on about 850,000 2D image patches of 64x64 pixels extracted from 270 SD-OCT volumes of healthy subjects. Testing used 10 SD-OCT volumes from healthy subjects and 10 SD-OCT volumes from diseased cases, about 70,000 2D image patches in total.

In the figure above, rows 1, 2, and 3 show linear interpolations between two points sampled at random in the z-space. Rows 4, 5, and 6 below them show interpolations between the z-space encodings obtained from two real images.
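
Linear interpolation in the z-space is simple to write down. The sketch below only builds the latent path; the latent size of 128 and the number of steps are assumptions, and no generator is included.

```python
import numpy as np


def interpolate(z_start, z_end, steps=8):
    """Return `steps` latent vectors evenly spaced on the line z_start -> z_end."""
    t = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - t) * z_start + t * z_end


rng = np.random.default_rng(0)
z_a, z_b = rng.normal(size=(2, 128))   # two random endpoints (latent size assumed)
path = interpolate(z_a, z_b)
print(path.shape)

# Feeding each row of `path` to the generator gives one interpolation strip.
# For the real-image case, the endpoints would be E(x_a) and E(x_b) instead.
```

Smooth, realistic images along such a path are a common informal check that the generator has learned a continuous latent representation rather than memorized training samples.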

In the figure above, the first row shows the real input images and the second row shows the pixel-level anomaly annotations. The remaining rows below show, for each model, the residual between the generated image and the real image as an overlay.

The table above shows the image-level anomaly detection performance of f-AnoGAN, of the jointly trained encoder-decoder approaches AE, AdvAE, and ALI, and of WGAN $A_D$.

ROC curves for image-level anomaly detection and the corresponding AUC values. The right-hand plot uses the same pre-trained WGAN throughout and compares the encoder training schemes $izi$, $ziz$, and $izi_f$ (f-AnoGAN).
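
As a reminder of what the AUC value measures, here is a small sketch that computes it directly from anomaly scores and image-level labels. The scores and labels are made up, and `roc_auc` is a hypothetical helper written for this example, not code from the paper.

```python
import numpy as np


def roc_auc(scores, labels):
    """AUC as the probability that a random anomalous image outscores a random normal one."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


scores = [0.1, 0.4, 0.35, 0.8]   # made-up anomaly scores A(x)
labels = [0, 0, 1, 1]            # 1 = anomalous image, 0 = normal image
auc = roc_auc(scores, labels)
print(auc)
```

For these made-up values, three of the four anomalous/normal pairs are ranked correctly, so the AUC is 0.75. An AUC of 1.0 would mean every anomalous image scores higher than every normal one.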

The table above shows anomaly detection performance for each encoder training scheme.

Pixel-level localization of anomalous image regions for each encoder training scheme.

For pixel-level anomaly localization, f-AnoGAN performs best when compared with the encoder-decoder training approaches AE, AdvAE, and ALI, and with WGAN $A_D$.

Likewise, when the pre-trained WGAN is kept as is and only the encoder training scheme is varied among $izi$, $ziz$, and $izi_f$ (f-AnoGAN), the f-AnoGAN scheme performs best.

Finally, comparing pixel-level anomaly localization between the jointly trained encoder-decoder approaches (AE, AdvAE, ALI) and the encoders trained on top of the fixed, pre-trained WGAN ($izi$, $ziz$, $izi_f$ (f-AnoGAN)) suggests that building on the adversarially pre-trained WGAN helps reconstruction considerably.

4. Conclusion

This paper proposed fast anomaly detection using GANs. To build f-AnoGAN, a WGAN and an encoder that maps images to the latent space were trained on healthy examples.

During anomaly detection, an input image is turned into a reconstructed image by the encoder and the generator. The reconstruction residual in image space and the residual in discriminator feature space are then combined into a score that serves as a marker for anomalies.

Given a sufficient amount of normal medical data, the proposed method can be used for anomaly detection on various kinds of biomedical data, such as 1D, 2D, and 3D data.

One limitation is that the segmentation evaluation relies on expert annotations, which cover only the subset of anomalies that can be assessed. From the standpoint of discovering new biomarkers, the reported false positives therefore cannot be fully trusted.
