(2020)EDCNN: Edge enhancement-based Densely Connected Network with Compound Loss for Low-Dose CT Denoising

Gyuha Park · August 17, 2021

Computer Vision Deep Learning Image Enhancement Medical Image


0. Abstract

In medical imaging, denoising CT images is an important research topic. CNN-based deep learning algorithms have recently shown good performance, but they still suffer from over-smoothing.

This paper proposes EDCNN (Edge enhancement-based Densely connected Convolutional Neural Network). Its edge enhancement module is built on a newly proposed trainable Sobel convolution.

To address over-smoothing during training, the paper proposes a compound loss that combines MSE loss with a multi-scale perceptual loss. Compared with existing algorithms, EDCNN performs well at preserving detail while removing noise.

1. Introduction

CT imaging relies on X-rays, which carry a potential radiation risk, so it has to be used with care. Previous studies show that raising the X-ray dose, within a certain range, improves CT image quality; conversely, a lower dose is safer but degrades image quality. To resolve this trade-off, denoising algorithms have been proposed that raise the quality of CT images acquired at a low dose.

However, existing algorithms suffer from over-smoothing. To address this, the paper proposes EDCNN (Edge enhancement-based Densely connected Convolutional Neural Network). It makes three contributions:

An edge enhancement module based on a trainable Sobel convolution.
An FCN (Fully Convolutional Network) that uses dense connections.
A compound loss that combines MSE loss with a multi-scale perceptual loss.

2. Related Work

This section surveys prior work on CT image denoising. Current mainstream approaches fall into three categories.

1) Encoder-decoder

This architecture consists of an encoder that captures spatial information and a decoder that produces feature maps with deconvolutional layers. Prior work includes REDCNN, which adds residual connections, and CPCE, which adds conveying-path connections.

2) Fully convolutional network

The entire model consists of convolution layers. Some models fix the kernel size at 5 or 3. These models also make use of conveying-path connections.

EDCNN, the model proposed in this paper, is also a fully convolutional network.

3) GAN-based algorithms

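These algorithms consist of a generator and a discriminator. The generator produces the denoised image, and the discriminator tries to tell the generator's denoised output apart from the real normal-dose target image. The two models are trained with an adversarial strategy.

More advanced models arrange these networks in cascaded or parallel structures.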

4) Loss function

Three kinds of loss function are commonly used.

Per-pixel loss
Computes a pixel-wise loss between the output image and the target image; MSE loss is the usual choice. Its drawback is that it does not take structural information into account.
Perceptual loss
Introduced to take spatial information into account. It measures similarity in feature space, usually with a pretrained VGGNet. It helps preserve detail in the output image, but it can introduce cross-hatch artifacts.
Other loss
GAN-based algorithms use an adversarial loss. In addition, the MAP-NN model proposed a composite loss that combines adversarial loss, MSE loss, and edge incoherence loss.

3. Methodology

1) Edge enhancement Module

This module uses a trainable Sobel convolution. Its learnable parameter $\alpha$ is adjusted during training toward a value that extracts edge information well across edges of different intensity.
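To make the role of $\alpha$ concrete, here is a minimal sketch in plain Python. It assumes that $\alpha$ simply scales a standard 3x3 Sobel kernel; this review does not say exactly where $\alpha$ enters the paper's kernels, so treat that placement, and the helper names `sobel_kernel` and `conv2d_valid`, as illustrative assumptions. In the real model $\alpha$ would be updated by gradient descent rather than set by hand.

```python
def sobel_kernel(alpha=1.0):
    """3x3 Sobel kernel for left-to-right intensity changes, scaled by alpha.

    Assumption: alpha multiplies the whole kernel. The paper may place it
    differently inside the kernel.
    """
    base = [[-1, 0, 1],
            [-2, 0, 2],
            [-1, 0, 1]]
    return [[alpha * v for v in row] for row in base]


def conv2d_valid(image, kernel):
    """Plain 2-D cross-correlation with no padding and stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += kernel[i][j] * image[r + i][c + j]
            row.append(acc)
        out.append(row)
    return out


# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]

edges_a1 = conv2d_valid(image, sobel_kernel(alpha=1.0))
edges_a2 = conv2d_valid(image, sobel_kernel(alpha=2.0))
```

Under this assumption, doubling $\alpha$ doubles the edge response, which is how a learned $\alpha$ could adapt the module to weaker or stronger edges.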

2) Overall Network Architecture

When a CT image is fed in, the network first applies convolutions to extract edge information. There are 8 blocks in total and 4 types of kernel.

Dense connections are introduced, inspired by DenseNet.

The input CT image is added to the output of the final convolution block.
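The wiring described above can be sketched as simple channel bookkeeping. This is one possible reading, not the paper's confirmed layout: it assumes the input image and its edge maps are concatenated once and then conveyed to the input of every later block. The channel counts (`edge_ch=32`, `block_ch=32`) and the function name `channel_plan` are hypothetical values chosen for illustration.

```python
def channel_plan(in_ch=1, edge_ch=32, block_ch=32, num_blocks=8):
    """Return (block index, input channels, output channels) for each block.

    Assumptions (not confirmed by the review):
      - the edge module output is concatenated with the input image,
      - that concatenated tensor is fed to every block (dense connection),
      - the last block outputs in_ch channels so the input can be added to it.
    """
    dense_ch = in_ch + edge_ch          # features conveyed to every block
    plan = []
    block_in = dense_ch                 # first block sees only dense features
    for k in range(1, num_blocks + 1):
        out_ch = in_ch if k == num_blocks else block_ch
        plan.append((k, block_in, out_ch))
        block_in = out_ch + dense_ch    # concat previous output + dense features
    return plan


plan = channel_plan()
for k, c_in, c_out in plan:
    print(f"block {k}: {c_in} -> {c_out} channels")
```

The final block maps back to a single channel so that the residual addition with the input CT image is shape-compatible.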

3) Compound Loss Function

MSE loss is the most widely used choice, but it leads to over-smoothing. To address this, the paper proposes a compound loss that combines MSE loss with a multi-scale perceptual loss.

$L_{mse}=\frac{1}{N}\sum\limits_{i=1}^N||F(x_i,\theta)-y_i||^2$

$L_{multi-p}=\frac{1}{NS}\sum\limits_{i=1}^N\sum\limits_{s=1}^S||\phi_s(F(x_i,\theta),\hat{\theta})-\phi_s(y_i,\hat{\theta})||^2$

$L_{compound}=L_{mse}+w_p\cdot L_{multi-p}$

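The multi-scale perceptual loss uses a ResNet-50 pretrained on ImageNet, which consists of 4 stages, as the feature extractor $\phi$. In the formulas above, $F(x_i,\theta)$ is the network output for input $x_i$, $y_i$ is the target image, $\phi_s$ is the feature map at scale $s$ from the extractor with fixed weights $\hat{\theta}$, $N$ is the number of samples, $S$ is the number of scales, and $w_p$ weights the perceptual term.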
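The three formulas translate almost line for line into code. The sketch below works on images flattened to lists of floats and, as a loudly labeled stand-in, replaces the ResNet-50 stages $\phi_s$ with simple average pooling at different window sizes. The function names and the `avg_pool` extractors are assumptions for illustration only; they are not the paper's feature extractor.

```python
def squared_l2(a, b):
    """Squared L2 distance between two equally sized vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mse_loss(outputs, targets):
    """L_mse: squared L2 error averaged over the N samples."""
    n = len(outputs)
    return sum(squared_l2(o, t) for o, t in zip(outputs, targets)) / n


def multi_scale_perceptual_loss(outputs, targets, extractors):
    """L_multi-p: feature-space squared error averaged over N samples and S scales."""
    n, s = len(outputs), len(extractors)
    total = 0.0
    for o, t in zip(outputs, targets):
        for phi in extractors:
            total += squared_l2(phi(o), phi(t))
    return total / (n * s)


def compound_loss(outputs, targets, extractors, w_p=0.01):
    """L_compound = L_mse + w_p * L_multi-p."""
    return (mse_loss(outputs, targets)
            + w_p * multi_scale_perceptual_loss(outputs, targets, extractors))


def avg_pool(window):
    """Stand-in for one feature stage: non-overlapping average pooling."""
    def phi(v):
        return [sum(v[i:i + window]) / window for i in range(0, len(v), window)]
    return phi


outputs = [[1.0, 2.0, 3.0, 4.0]]      # F(x_i, theta), one sample
targets = [[1.0, 2.0, 3.0, 6.0]]      # y_i
extractors = [avg_pool(2), avg_pool(4)]  # S = 2 stand-in scales

l_mse = mse_loss(outputs, targets)
l_mp = multi_scale_perceptual_loss(outputs, targets, extractors)
l_total = compound_loss(outputs, targets, extractors, w_p=0.01)
```

With a small $w_p$ the MSE term dominates the value of the loss, while the perceptual term contributes gradients that penalize differences in coarser, feature-level structure.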

4. Experiments and Results

1) Dataset

The dataset comes from the 2016 NIH AAPM-Mayo Clinic Low-Dose CT Grand Challenge.

It contains 512x512 CT images acquired from 10 patients.

It consists of pairs of NDCT, the original image, and LDCT, a synthetic image with added noise.

2) Experimental Setup

Weights are initialized randomly. The Sobel factor $\alpha$ is initialized to 1.

The compound-loss hyperparameter $w_p$ is set to 0.01.

The optimizer is AdamW with a learning rate of 0.001, and training runs for 200 epochs.
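For quick reference, the settings above can be collected in a small configuration object. The class name `TrainConfig` and its field names are hypothetical; only the values come from the setup described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training settings reported in the experimental setup."""
    sobel_alpha_init: float = 1.0    # initial value of the Sobel factor alpha
    perceptual_weight: float = 0.01  # w_p in the compound loss
    optimizer: str = "AdamW"
    learning_rate: float = 1e-3
    epochs: int = 200


cfg = TrainConfig()
print(cfg)
```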

3) Results

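EDCNN is compared against the existing models REDCNN, WGAN, and CPCE.

The table above compares the quantitative results. Red marks the best score and blue the second best.

The table above compares image quality.

The figure above shows the PSNR curve for each model. BCNN is a basic CNN with none of the proposed techniques. BCNN+DC adds dense connections to the basic CNN. BCNN+DC+EM adds both dense connections and the edge enhancement module.

The table above is a quantitative analysis by model structure.

Together, the graph and table above demonstrate the benefit of dense connections and the edge enhancement module.

These are the results for different models used to compute the perceptual loss. The ResNet-based loss gives somewhat better results.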
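Since the ablation is reported as PSNR curves, it helps to recall how PSNR is computed: $\mathrm{PSNR} = 10\log_{10}(\mathrm{MAX}^2/\mathrm{MSE})$. The sketch below works on images flattened to lists of floats. The `data_range` default of 1.0 is an assumption; the intensity range used in the paper's evaluation is not stated in this review.

```python
import math


def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    data_range is the maximum possible intensity span (assumed 1.0 here).
    """
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


reference = [0.0, 0.5, 1.0, 0.5]   # toy "NDCT" pixels
denoised = [0.1, 0.5, 0.9, 0.5]    # toy network output

score = psnr(reference, denoised)
```

A higher PSNR means the denoised image is closer to the reference in the per-pixel sense, which is why it is usually reported alongside structure-aware or subjective measures.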

5. Conclusion

This paper proposed a new model, EDCNN, which introduces an edge enhancement module based on a trainable Sobel operator together with a compound loss.

As future work, the authors plan to apply a multi-model structure to the proposed EDCNN.
