[Boostcamp] Thoughts on augmentation (09/08)

이영훈 · September 9, 2021

Naver Boostcamp AI-TECH

Intro

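Augmentation sometimes improves results and sometimes makes them worse. We need our own knack for working out which augmentations fit the characteristics of the data or the model.

Today I want to think a little about data augmentation.

This post contains a lot of personal opinion, so I would appreciate it if you pointed out anything I have misinterpreted or gotten wrong.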

1. albumentations

Until now I have mostly used torchvision.transforms for augmentation. However, there is an open source library that is faster and supports a wider range of features: albumentations.

One thing to watch out for when using albumentations is that its data pipeline differs quite a bit from that of torchvision.transforms.

import time

from PIL import Image
from torch.utils.data import Dataset


class TorchvisionDataset(Dataset):
    def __init__(self, file_paths, labels, transform=None):
        self.file_paths = file_paths
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.file_paths)

    def __getitem__(self, idx):
        label = self.labels[idx]
        file_path = self.file_paths[idx]

        # Read an image with PIL. Image.open is lazy, so convert("RGB")
        # forces decoding here and keeps it out of the timing below.
        image = Image.open(file_path).convert("RGB")

        # Time only the augmentation step
        start_t = time.time()
        if self.transform:
            image = self.transform(image)
        total_time = time.time() - start_t

        return image, label, total_time

The code above is a basic dataset for use with torchvision.transforms.

import time

import cv2
from torch.utils.data import Dataset


class AlbumentationsDataset(Dataset):
    """__init__ and __len__ functions are the same as in TorchvisionDataset"""
    def __init__(self, file_paths, labels, transform=None):
        self.file_paths = file_paths
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.file_paths)

    def __getitem__(self, idx):
        label = self.labels[idx]
        file_path = self.file_paths[idx]

        # Read an image with OpenCV
        image = cv2.imread(file_path)

        # By default OpenCV uses BGR color space for color images,
        # so we need to convert the image to RGB color space.
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

        # Time only the augmentation step
        start_t = time.time()
        if self.transform:
            # The transform takes keyword arguments and returns a dict
            augmented = self.transform(image=image)
            image = augmented['image']
        total_time = time.time() - start_t

        return image, label, total_time

In __getitem__, self.transform is called with a keyword argument (image=image) and returns a dictionary, so the augmented image has to be pulled out with augmented['image']. This is the part to be careful about.

2. blur augmentation

Blur augmentation sometimes helps model performance. Why would making an image blurrier help? Blur softens the lines in an image, but at the same time it spreads them out. So when an image is made up of thin lines, applying blur thickens those lines, which reduces the empty space (noise) in the image and can raise the chance of extracting meaningful features.
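The "thin lines get thicker" effect can be checked on a toy image. The sketch below uses a hand-written 3x3 mean filter in NumPy as a stand-in for a real blur augmentation; box_blur and stroke_width are hypothetical helper names introduced only for this illustration.

```python
import numpy as np

def box_blur(image, k=3):
    """Mean filter over a k x k window with zero padding (stand-in for a real blur)."""
    pad = k // 2
    padded = np.pad(image, pad, mode="constant")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Sum the k*k shifted copies of the image, then divide to get the mean.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def stroke_width(image):
    """Number of columns that contain at least one nonzero pixel."""
    return int(np.count_nonzero(image.any(axis=0)))

# A 9x9 image with a single vertical line that is one pixel wide.
line = np.zeros((9, 9))
line[:, 4] = 1.0

blurred = box_blur(line)

print("width before blur:", stroke_width(line))
print("width after blur: ", stroke_width(blurred))
print("peak before / after:", line.max(), blurred.max())
```

The line covers more columns after the blur, but its peak intensity drops, so the stroke becomes wider and fainter at the same time.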

Now consider this together with Resize augmentation. Applying blur to a small image can help performance more than applying it to a large image. Also, applying blur before resize showed better performance than the reverse order. My guess is that resizing tends to make the lines sparse, and blur works better on the original image than on an already sparse one.
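A toy version of the ordering argument is sketched below. It rests on a strong assumption: the resize is modeled as naive decimation (keeping every second pixel), which is not what a real Resize with interpolation does, so this only illustrates the intuition and does not show that the order matters in the same way in practice. box_blur and decimate are hypothetical helpers.

```python
import numpy as np

def box_blur(image, k=3):
    """Mean filter over a k x k window with zero padding (stand-in for a real blur)."""
    pad = k // 2
    padded = np.pad(image, pad, mode="constant")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def decimate(image, factor=2):
    """Naive downscale: keep every `factor`-th pixel (an assumption, not a real Resize)."""
    return image[::factor, ::factor]

# A 16x16 image with a one-pixel-wide vertical line on an odd column.
image = np.zeros((16, 16))
image[:, 5] = 1.0

resize_then_blur = box_blur(decimate(image))
blur_then_resize = decimate(box_blur(image))

print("resize -> blur, total intensity:", resize_then_blur.sum())
print("blur -> resize, total intensity:", blur_then_resize.sum())
```

With decimation, the line on column 5 falls between the sampled columns and is lost before the blur ever sees it, whereas blurring first spreads it onto neighboring columns that survive the downscale. Interpolating resizes already average neighboring pixels, so the gap between the two orders would likely be smaller there.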

3. fine-tuning & augmentation

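Let's think about the relationship between fine-tuning and augmentation. In a way this may be intuitively obvious: when fine-tuning is applied, augmentation can become less effective. Fine-tuning, as used here, means freezing the convolution layers in the feature extraction part and training only the classifier layers. Applying augmentation is an attempt to extract a wider variety of features from the images, but because the convolution layers are frozen, the benefit of that training may not show up clearly.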
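The freezing described above can be sketched in PyTorch as follows. TinyNet is a hypothetical toy model with random inputs, not the network used in the course; the point is only that after one optimizer step the convolution weights stay exactly the same while the classifier is the only trainable part.

```python
import torch
from torch import nn

class TinyNet(nn.Module):
    """Hypothetical toy model: a conv feature extractor plus a linear classifier."""
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.classifier = nn.Linear(8, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x))

model = TinyNet()

# Freeze the feature extractor: no gradients are computed for these weights.
for param in model.features.parameters():
    param.requires_grad = False

# Give the optimizer only the parameters that are still trainable.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1
)

conv_before = model.features[0].weight.clone()

# One training step on random data (stand-in for an augmented batch).
x = torch.randn(4, 3, 16, 16)
y = torch.tensor([0, 1, 2, 0])
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print("trainable parameters:", trainable)
print("conv weights unchanged:", torch.equal(conv_before, model.features[0].weight))
```

However varied the augmented inputs are, the frozen convolution filters cannot adapt to them; only the classifier on top of the fixed features is updated.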

Outro

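Data augmentation offers a great many options, and it has to be applied with many other factors (hyperparameters) in mind. It also seems very important to look at the images after augmentation has been applied. Only by checking that the images are transformed as intended before being fed into training can the research proceed efficiently.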
