Following Along with SRGAN

Hyunwoo Lee · May 11, 2022

SRGAN, a well-known model for super resolution

Let's try using SRGAN.

0. Environment setup

Ubuntu 18.04 LTS
An environment with git, conda, pip3, and PyTorch installed.
Installation guides for each are easy to find online, so refer to those if needed.
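
Before going further, it can help to confirm that the prerequisites are actually present. The following is a minimal sketch, assuming the tools are expected on PATH and that PyTorch is importable as torch; the function name check_environment is made up for this post.

```python
import importlib.util
import shutil


def check_environment(tools=("git", "conda", "pip3"), modules=("torch",)):
    """Return a dict mapping each tool or module name to True if it was found."""
    status = {}
    for tool in tools:
        # shutil.which returns the executable's path, or None if it is not on PATH
        status[tool] = shutil.which(tool) is not None
    for module in modules:
        # find_spec locates a top-level module without importing it
        status[module] = importlib.util.find_spec(module) is not None
    return status


for name, found in check_environment().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

Anything reported as MISSING should be installed before moving on to the next step.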

1. Download the code

Download the code from GitHub.
git clone https://github.com/leftthomas/SRGAN.git SRGAN

root@7310e7ec2d72:~# git clone https://github.com/leftthomas/SRGAN.git SRGAN
Cloning into 'SRGAN'...
remote: Enumerating objects: 1064, done.
remote: Total 1064 (delta 0), reused 0 (delta 0), pack-reused 1064
Receiving objects: 100% (1064/1064), 32.17 MiB | 8.22 MiB/s, done.
Resolving deltas: 100% (675/675), done.
root@7310e7ec2d72:~#

Move into the SRGAN folder.
cd SRGAN
Check the folder contents.
ls

root@7310e7ec2d72:~/SRGAN# ls
LICENSE    benchmark_results  data_utils.py  images   model.py      statistics         test_image.py  train.py
README.md  data               epochs         loss.py  pytorch_ssim  test_benchmark.py  test_video.py  training_results

README.md should contain the usage instructions.

Print the contents of the text file.
cat README.md

# SRGAN
A PyTorch implementation of SRGAN based on CVPR 2017 paper
[Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network](https://arxiv.org/abs/1609.04802).

## Requirements
- [Anaconda](https://www.anaconda.com/download/)
- PyTorch
...

The file is long and appears to be written in Markdown.
Let's read it directly on the GitHub page instead.

https://github.com/leftthomas/SRGAN
Open it in a web browser.
The usage instructions appear toward the bottom of the page.

2. Train

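Like any machine learning model, SRGAN has to be trained before it can be used.
As described in the Usage section of README.md, run the following from the SRGAN folder: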
python train.py

root@7310e7ec2d72:~/SRGAN# python train.py
Traceback (most recent call last):
  File "train.py", line 32, in <module>
    train_set = TrainDatasetFromFolder('data/DIV2K_train_HR', crop_size=CROP_SIZE, upscale_factor=UPSCALE_FACTOR)
  File "/opt/ml/SRGAN/data_utils.py", line 44, in __init__
    self.image_filenames = [join(dataset_dir, x) for x in listdir(dataset_dir) if is_image_file(x)]
FileNotFoundError: [Errno 2] No such file or directory: 'data/DIV2K_train_HR'

It fails with an error instead of running.

The last line gives the cause of the error.

FileNotFoundError: [Errno 2] No such file or directory: 'data/DIV2K_train_HR'

It says the path 'data/DIV2K_train_HR' cannot be found.
According to README.md, the dataset has to be downloaded separately.
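
A quick check like the one below can catch this kind of missing-folder error before training starts. It is only a sketch: the helper name missing_dataset_dirs is hypothetical, and the validation folder name DIV2K_valid_HR is an assumption based on the DIV2K archive names rather than something shown in the traceback.

```python
from pathlib import Path


def missing_dataset_dirs(data_root="data", names=("DIV2K_train_HR", "DIV2K_valid_HR")):
    """Return the expected dataset folders that do not exist under data_root."""
    root = Path(data_root)
    # Folder names are assumptions; adjust them to whatever the training script expects
    return [str(root / name) for name in names if not (root / name).is_dir()]


missing = missing_dataset_dirs()
if missing:
    print("Missing dataset folders:", ", ".join(missing))
else:
    print("All dataset folders are present.")
```

Run it from the SRGAN folder so that the relative path data/ resolves the same way it does for train.py.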

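Following the link labeled "here" leads to Baidu cloud storage.
Downloading from there is cumbersome, so let's use a different dataset.

README.md says the VOC2012 dataset is used, but the code actually reads the DIV2K dataset.

I will cover commonly used image datasets in a separate post; this time, let's use the DIV2K dataset.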

https://data.vision.ee.ethz.ch/cvl/DIV2K/

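It can be downloaded from the link above.

The download links are at the very bottom of the page.

Training and validation require a train dataset and a validation dataset.
Super-resolution training also needs both low-resolution and high-resolution images.

However, the code already includes a step that converts the resolution, so we only need to download the two High Resolution images files at the very bottom.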
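
To make the "high resolution only" point concrete: a low-resolution input can be derived from each high-resolution crop by shrinking it by the upscale factor. The sketch below only works out the sizes involved. The crop size of 88 and the upscale factor of 4 are assumed example values, and the function names are hypothetical, not quoted from the repository.

```python
def valid_crop_size(crop_size, upscale_factor):
    """Largest size <= crop_size that divides evenly by upscale_factor."""
    return crop_size - (crop_size % upscale_factor)


def lr_size(crop_size, upscale_factor):
    """Side length of the low-resolution patch derived from an HR crop."""
    return valid_crop_size(crop_size, upscale_factor) // upscale_factor


# Assumed example values, not read from train.py
hr = valid_crop_size(88, 4)
lr = lr_size(88, 4)
print(f"HR crop: {hr}x{hr}, LR input: {lr}x{lr}")
```

Rounding the crop down to a multiple of the upscale factor keeps the low-resolution and high-resolution patches aligned to whole pixels.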

They are roughly 3.3 GB and 0.4 GB, respectively.

The files can also be downloaded from the terminal.
First, move to the SRGAN/data folder.
cd data
Then download them with the following commands.
wget http://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_train_HR.zip
wget http://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_valid_HR.zip

root@7310e7ec2d72:~/SRGAN/data# wget http://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_train_HR.zip
--2022-05-11 12:50:22--  http://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_train_HR.zip
Resolving data.vision.ee.ethz.ch (data.vision.ee.ethz.ch)... 129.132.52.178, 3001:67a:10ec:37b2::108
Connecting to data.vision.ee.ethz.ch (data.vision.ee.ethz.ch)|110.172.52.228|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_train_HR.zip [following]
--2022-05-11 12:50:32--  https://data.vision.ee.ethz.ch/cvl/DIV2K/DIV2K_train_HR.zip
Connecting to data.vision.ee.ethz.ch (data.vision.ee.ethz.ch)|110.172.52.228|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3530603713 (3.3G) [application/zip]
Saving to: ‘DIV2K_train_HR.zip’

DIV2K_train_HR.zip             37%[=================>                                ]   1.24G  11.2MB/s    eta 3m 18s

If wget is not installed,
run sudo apt install wget first.

root@7310e7ec2d72:~/SRGAN/data# apt install wget
Reading package lists... Done
Building dependency tree
Reading state information... Done
wget is already the newest version (1.19.4-1ubuntu2.2).
0 upgraded, 0 newly installed, 0 to remove and 36 not upgraded.
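
If installing wget is not an option, the same files can be fetched with Python's standard library. This is a rough sketch rather than a replacement for wget: the helper name download_if_missing is hypothetical, it shows no progress bar, and it does not resume partial downloads.

```python
import shutil
import urllib.request
from pathlib import Path

# Same base URL as the wget commands above
BASE_URL = "http://data.vision.ee.ethz.ch/cvl/DIV2K/"


def download_if_missing(filename, dest_dir=".", base_url=BASE_URL):
    """Download base_url + filename into dest_dir unless the file already exists."""
    dest = Path(dest_dir) / filename
    if dest.exists():
        return dest  # skip files that are already present
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".part")
    # Stream to a temporary file so an interrupted download
    # is not mistaken for a complete archive
    with urllib.request.urlopen(base_url + filename) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    tmp.rename(dest)
    return dest
```

From the SRGAN/data folder, calling download_if_missing("DIV2K_train_HR.zip") and download_if_missing("DIV2K_valid_HR.zip") would mirror the two wget commands.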

Once both files are downloaded, they need to be extracted.
Check the downloaded files with ls.

root@7310e7ec2d72:~/SRGAN/data# ls
DIV2K_train_HR.zip  DIV2K_valid_HR.zip

Extract them with unzip.
If it is not installed, run sudo apt install unzip.

unzip DIV2K_train_HR.zip
unzip DIV2K_valid_HR.zip

  inflating: DIV2K_train_HR/0544.png
  inflating: DIV2K_train_HR/0416.png
  inflating: DIV2K_train_HR/0295.png
  inflating: DIV2K_train_HR/0538.png
...

Each image file is listed as it is extracted.
Check the result with ls.

root@7310e7ec2d72:~/SRGAN/data# ls
DIV2K_train_HR  DIV2K_train_HR.zip  DIV2K_valid_HR  DIV2K_valid_HR.zip

The archives have been extracted.
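
The extraction step can also be done without the unzip command, using Python's zipfile module. A minimal sketch, assuming the archives sit in the current folder; extract_archive is a hypothetical helper name.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir="."):
    """Extract zip_path into dest_dir and return the sorted top-level entries."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        names = archive.namelist()
    # Member names inside a zip always use "/" as the separator
    return sorted({name.split("/")[0] for name in names if name})


for zip_name in ("DIV2K_train_HR.zip", "DIV2K_valid_HR.zip"):
    if Path(zip_name).exists():
        print(zip_name, "->", extract_archive(zip_name))
    else:
        print(zip_name, "not found, skipping")
```

The returned list plays the same role as the ls check above: it shows which top-level folders the archive produced.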

Go back to the SRGAN folder.
cd ..
Now that the dataset is in place, training should run normally.
python train.py

root@7310e7ec2d72:~/SRGAN# python train.py
# generator parameters: 734219
# discriminator parameters: 5215425
Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /opt/ml/.cache/torch/hub/checkpoints/vgg16-397923af.pth
100%|████████████████████████████████████████████████████████████████████████████████| 528M/528M [00:08<00:00, 63.1MB/s]
[1/100] Loss_D: 0.8443 Loss_G: 0.0450 D(x): 0.5024 D(G(z)): 0.3312: 100%|███████████████| 13/13 [00:26<00:00,  2.06s/it]
[converting LR images to SR images] PSNR: 15.3047 dB SSIM: 0.3970: 100%|██████████████| 100/100 [00:33<00:00,  3.03it/s]
[saving training results]: 100%|████████████████████████████████████████████████████████| 20/20 [00:20<00:00,  1.01s/it]
[2/100] Loss_D: 0.8718 Loss_G: 0.0213 D(x): 0.4618 D(G(z)): 0.2530: 100%|███████████████| 13/13 [00:26<00:00,  2.04s/it]
[converting LR images to SR images] PSNR: 14.3030 dB SSIM: 0.3909: 100%|██████████████| 100/100 [00:32<00:00,  3.03it/s]
[saving training results]: 100%|████████████████████████████████████████████████████████| 20/20 [00:20<00:00,  1.05s/it]
[3/100] Loss_D: 0.8175 Loss_G: 0.0186 D(x): 0.4169 D(G(z)): 0.2086: 100%|███████████████| 13/13 [00:26<00:00,  2.04s/it]
[converting LR images to SR images] PSNR: 18.2039 dB SSIM: 0.5148: 100%|██████████████| 100/100 [00:34<00:00,  2.91it/s]
[saving training results]: 100%|████████████████████████████████████████████████████████| 20/20 [00:20<00:00,  1.03s/it]
[4/100] Loss_D: 0.4785 Loss_G: 0.0176 D(x): 0.6804 D(G(z)): 0.1720:  31%|████▉           | 4/13 [00:08<00:25,  2.88s/it]
[4/100] Loss_D: 0.5887 Loss_G: 0.0185 D(x): 0.6745 D(G(z)): 0.2311: 100%|███████████████| 13/13 [00:26<00:00,  2.04s/it]
[converting LR images to SR images] PSNR: 18.2836 dB SSIM: 0.5135:  39%|█████▊         | 39/100 [00:13<00:22,  2.70it/s]
[converting LR images to SR images] PSNR: 18.4853 dB SSIM: 0.5229:  75%|███████████▎   | 75/100 [00:25<00:07,  3.26it/s]

Training now runs normally.
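
To follow how PSNR and SSIM change from epoch to epoch, the validation lines in this output can be parsed with a regular expression. The sketch below assumes the log format shown above and uses a hypothetical helper name. Because progress-bar redraws can leave more than one update on a captured line, it takes the last entry on the line.

```python
import re

# Matches e.g. "PSNR: 15.3047 dB SSIM: 0.3970"
METRIC_PATTERN = re.compile(r"PSNR:\s*([\d.]+)\s*dB\s+SSIM:\s*([\d.]+)")


def parse_validation_metrics(line):
    """Return (psnr, ssim) from the last metrics entry on a log line, or None."""
    matches = METRIC_PATTERN.findall(line)
    if not matches:
        return None
    psnr, ssim = matches[-1]
    return float(psnr), float(ssim)


sample = "[converting LR images to SR images] PSNR: 15.3047 dB SSIM: 0.3970: 100%|...| 100/100"
print(parse_validation_metrics(sample))
```

Lines without a PSNR entry, such as the Loss_D and Loss_G lines, return None and can simply be skipped.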