Image Segmentation (3) - Data Compression Algorithm: RLE Encoding/Decoding

JINNI · July 9, 2023

Satellite Image Building Segmentation

RLE (Run-Length Encoding)

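RLE is a data compression algorithm that represents runs of consecutive repeated values compactly, storing each run once together with its length instead of storing every value.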
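
As a minimal sketch of the general idea, the snippet below collapses consecutive repeats into (value, count) pairs. The helper name `run_lengths` is hypothetical, and this textbook (value, count) layout differs from the (start, length) layout used for masks in the functions that follow.

```python
from itertools import groupby


def run_lengths(values):
    """Collapse consecutive repeated values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


print(run_lengths("AAAABBBCCD"))
# [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
```

The longer the runs, the fewer pairs are needed, which is why the approach suits binary masks with large uniform regions.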

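import numpy as np

# RLE decoding function
def rle_decode(mask_rle, shape):
    # mask_rle: "start length start length ..." with 1-based start positions
    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1  # convert to 0-based indices
    ends = starts + lengths
    img = np.zeros(shape[0]*shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1  # mark each run as foreground
    return img.reshape(shape)

# RLE encoding function
def rle_encode(mask):
    pixels = mask.flatten()
    pixels = np.concatenate([[0], pixels, [0]])  # pad so runs touching either end are detected
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1  # 1-based positions where the value changes
    runs[1::2] -= runs[::2]  # turn each run-end position into a run length
    return ' '.join(str(x) for x in runs)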

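The RLE decoding function rle_decode takes a mask compressed in RLE format and returns the image mask decoded back to its original shape.
- Split the RLE string on whitespace to obtain the start positions (starts) and lengths (lengths)
- Subtract 1 from the start positions to convert them to 0-based indices
- Compute the end positions (ends) by adding the lengths (lengths) to the start positions (starts)
- Create a flat array of zeros with shape[0] * shape[1] elements to hold the image mask
- Use the start positions (starts) and end positions (ends) to set the corresponding regions of the mask to 1
- Finally, reshape the 1-D array to the given shape (shape) and return it as the image mask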
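
The decoding steps above can be traced on a tiny input. This is only an illustrative sketch: the RLE string "2 2 6 3" and the 2 x 4 shape are made-up values, not taken from the dataset.

```python
import numpy as np

mask_rle = "2 2 6 3"  # hypothetical input: (start, length) pairs, 1-based starts
shape = (2, 4)

tokens = mask_rle.split()
starts = np.asarray(tokens[0::2], dtype=int)   # [2, 6]
lengths = np.asarray(tokens[1::2], dtype=int)  # [2, 3]
starts -= 1                                    # 0-based: [1, 5]
ends = starts + lengths                        # exclusive ends: [3, 8]

flat = np.zeros(shape[0] * shape[1], dtype=np.uint8)
for lo, hi in zip(starts, ends):
    flat[lo:hi] = 1                            # fill each run with 1
mask = flat.reshape(shape)
print(mask)
# [[0 1 1 0]
#  [0 1 1 1]]
```
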
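The RLE encoding function rle_encode takes an image mask and returns it compressed in RLE format.
- Flatten the image mask into a 1-D array
- Pad the array with a 0 at the start and at the end
- Find the positions where the value changes; these alternate between the start of a run and the position just past its end
- Compute the length of each run
- Convert the result to RLE format: start positions and lengths are recorded alternately, and each length is the difference between the position just past the run's end and the run's start
- Return the RLE information as a string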
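
The encoding steps above can be traced the same way. This is a sketch on a made-up 2 x 4 mask; the variable names `padded` and `changes` are introduced here for illustration only.

```python
import numpy as np

mask = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 1]], dtype=np.uint8)  # hypothetical example mask

pixels = mask.flatten()                          # [0 1 1 0 0 1 1 1]
padded = np.concatenate([[0], pixels, [0]])      # zero at both ends
# Positions (1-based in the unpadded array) where the value changes:
changes = np.where(padded[1:] != padded[:-1])[0] + 1   # [2, 4, 6, 9]
runs = changes.copy()
runs[1::2] -= runs[::2]                          # end - start -> [2, 2, 6, 3]
rle = ' '.join(str(x) for x in runs)
print(rle)  # 2 2 6 3
```

Because of the zero padding, the change positions always come in (start, end) pairs, even when a run touches the first or last pixel.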

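➡ RLE encoding and decoding make it possible to represent and store image data compactly. They are used for compressed representations of image masks and in tasks such as object detection.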

* backbone network: the part that transforms the input image into a feature map. Pretrained ResNet-50 and VGG16 are commonly used.

Dataset Info.

train_img
TRAIN_0000.png ~ TRAIN_7139.png
1024 x 1024

test_img
TEST_00000.png ~ TEST_60639.png
224 x 224

train.csv
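img_id : training satellite image sample ID
img_path : training satellite image path (relative path)
mask_rle : RLE-encoded binary mask (0 : background, 1 : building)
Training satellite images always contain buildings.
However, inference satellite images may contain no buildings.
The training satellite images have a resolution of 0.5 m/pixel; the resolution of the inference satellite images is not disclosed.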

test.csv
img_id : inference satellite image sample ID
img_path : inference satellite image path (relative path)

sample_submission.csv - 제출 양식
img_id : inference satellite image sample ID
mask_rle : RLE-encoded predicted binary mask (0 : background, 1 : building)
Note: if the prediction contains no buildings, the value must be set to -1
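
One way to respect the -1 rule when building a submission is sketched below. The helper `encode_for_submission` is hypothetical (it is not part of the competition code); it relies on the fact that the encoder shown earlier returns an empty string for an all-zero mask.

```python
import numpy as np


def rle_encode(mask):
    """Encode a binary mask as 'start length ...' with 1-based starts."""
    pixels = mask.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)


def encode_for_submission(mask):
    """Hypothetical helper: RLE string, or -1 when the mask has no building pixels."""
    rle = rle_encode(mask)
    return rle if rle else -1


empty = np.zeros((224, 224), dtype=np.uint8)
with_building = empty.copy()
with_building[0, :3] = 1  # three building pixels at the start of the first row

print(encode_for_submission(empty))          # -1
print(encode_for_submission(with_building))  # 1 3
```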

Dice coefficient

When the Ground Truth has no buildings and the Prediction also correctly contains no buildings, the sample is excluded from the average Dice Coefficient calculation.
Conversely, when the Ground Truth has no buildings but the Prediction contains buildings, the Dice Coefficient for that sample is 0.
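
For reference, the standard definition of the Dice coefficient for a predicted mask P and a ground-truth mask G (each viewed as a set of building pixels) is shown below. The second line is the smoothed form that the dice_score function in the following code appears to compute, with the smoothing term written here as ε (1e-7 by default in that code).

```latex
\mathrm{Dice}(P, G) = \frac{2\,|P \cap G|}{|P| + |G|}
\qquad
\mathrm{Dice}_{\varepsilon}(P, G) = \frac{2\,|P \cap G| + \varepsilon}{|P| + |G| + \varepsilon}
```

If G is empty and P is not, the intersection is empty and the score is effectively 0, which matches the second rule above. If both are empty, the unsmoothed ratio is 0/0 and undefined, which is presumably why such samples are left out of the average.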

import numpy as np
import pandas as pd
from typing import List, Union
from joblib import Parallel, delayed


def rle_decode(mask_rle: Union[str, int], shape=(224, 224)) -> np.ndarray:
    '''
    mask_rle: run-length as string formatted (start length)
    shape: (height,width) of array to return 
    Returns numpy array, 1 - mask, 0 - background
    '''
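    # -1 (as a number or as the string "-1") marks a mask with no building pixels
    if mask_rle == -1 or str(mask_rle).strip() == '-1':
        return np.zeros(shape, dtype=np.uint8)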
    
    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1
    ends = starts + lengths
    img = np.zeros(shape[0]*shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1
    return img.reshape(shape)


def dice_score(prediction: np.ndarray, ground_truth: np.ndarray, smooth=1e-7) -> float:
    '''
    Calculate Dice Score between two binary masks.
    '''
    intersection = np.sum(prediction * ground_truth)
    return (2.0 * intersection + smooth) / (np.sum(prediction) + np.sum(ground_truth) + smooth)


def calculate_dice_scores(ground_truth_df, prediction_df, img_shape=(224, 224)) -> float:
    '''
    Calculate the mean Dice score for a dataset.
    '''
    # Keep only the rows in the prediction dataframe that have matching img_ids in the ground truth dataframe
    prediction_df = prediction_df[prediction_df.iloc[:, 0].isin(ground_truth_df.iloc[:, 0])]
    prediction_df.index = range(prediction_df.shape[0])

    # Extract the mask_rle columns
    # Rows are paired by position below, so both dataframes are assumed to list img_ids in the same order
    pred_mask_rle = prediction_df.iloc[:, 1]
    gt_mask_rle = ground_truth_df.iloc[:, 1]

    def calculate_dice(pred_rle, gt_rle):
        pred_mask = rle_decode(pred_rle, img_shape)
        gt_mask = rle_decode(gt_rle, img_shape)

        if np.sum(gt_mask) > 0 or np.sum(pred_mask) > 0:
            return dice_score(pred_mask, gt_mask)
        else:
            return None  # Both masks are empty: exclude this sample from the average

    dice_scores = Parallel(n_jobs=-1)(
        delayed(calculate_dice)(pred_rle, gt_rle) for pred_rle, gt_rle in zip(pred_mask_rle, gt_mask_rle)
    )

    dice_scores = [score for score in dice_scores if score is not None]  # Exclude None values

    return np.mean(dice_scores)
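
To see how the exclusion and zero-score rules affect the average, here is a small serial sketch that applies the same logic to hand-made masks. The helper `mean_dice` and the 2 x 2 masks are hypothetical, and joblib and pandas are left out to keep the example self-contained.

```python
import numpy as np


def dice_score(prediction, ground_truth, smooth=1e-7):
    """Smoothed Dice score between two binary masks."""
    intersection = np.sum(prediction * ground_truth)
    return (2.0 * intersection + smooth) / (np.sum(prediction) + np.sum(ground_truth) + smooth)


def mean_dice(pairs):
    """Hypothetical helper: average Dice, skipping pairs where both masks are empty."""
    scores = [dice_score(pred, gt) for pred, gt in pairs
              if pred.sum() > 0 or gt.sum() > 0]
    return float(np.mean(scores))


empty = np.zeros((2, 2), dtype=np.uint8)
full = np.ones((2, 2), dtype=np.uint8)
half = np.array([[1, 1], [0, 0]], dtype=np.uint8)

pairs = [            # (prediction, ground truth)
    (empty, empty),  # both empty: excluded from the average
    (full, empty),   # building predicted where there is none: score is ~0
    (half, full),    # 2*2 / (2 + 4) = 0.667
    (full, full),    # perfect match: 1.0
]
print(round(mean_dice(pairs), 3))  # 0.556
```

Only three of the four samples contribute to the mean, so a correct "no building" prediction neither helps nor hurts, while a false alarm on an empty image pulls the average down.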
