Semantic Segmentation Labeling

BSH · July 7, 2023

When the masks in a dataset are compressed with RLE (run-length encoding), you need to turn them into label images before you can train with mmsegmentation.

RLE (Run-Length Encoding)

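In short, RLE is a compression scheme that stores each run of consecutive identical pixels as a value together with its length. The Dacon data has only two classes, so only the runs of 1s are stored, each as a start position and a length. Positions are 1-based and count over the flattened image. For example, 1 3 10 3 means that pixels 1, 2, 3 and 10, 11, 12 are set to 1.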
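
To make the format concrete, here is a minimal pure-Python sketch that expands an RLE string into the 1-based pixel positions it covers. The helper name `rle_to_indices` is hypothetical and only for illustration; it is not part of the Dacon baseline or of mmsegmentation.

```python
def rle_to_indices(mask_rle):
    """Expand 'start length start length ...' into 1-based pixel positions."""
    values = [int(v) for v in mask_rle.split()]
    starts, lengths = values[0::2], values[1::2]
    positions = []
    for start, length in zip(starts, lengths):
        # Each run covers start, start + 1, ..., start + length - 1.
        positions.extend(range(start, start + length))
    return positions


print(rle_to_indices("1 3 10 3"))  # [1, 2, 3, 10, 11, 12]
```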

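import numpy as np

def rle_decode(mask_rle, shape):
    # mask_rle: "start length start length ..." with 1-based starts; shape: (height, width)
    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0::2], s[1::2])]
    starts -= 1  # convert 1-based starts to 0-based indices
    ends = starts + lengths
    img = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1  # mark every pixel in the run as class 1
    return img.reshape(shape)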

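def rle_encode(mask):
    # mask: 2D array of 0/1 values; returns "start length ..." with 1-based starts
    pixels = mask.flatten()
    pixels = np.concatenate([[0], pixels, [0]])  # pad so runs touching either end are detected
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1  # 1-based positions where the value changes
    runs[1::2] -= runs[::2]  # turn each run's end position into its length
    return ' '.join(str(x) for x in runs)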
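
A quick way to check the two functions is a round trip: encode a small mask, decode the result, and compare it with the original. The sketch below restates both functions in compact form only so that it runs on its own; the 3x4 mask is a made-up example, not Dacon data.

```python
import numpy as np


def rle_decode(mask_rle, shape):
    s = mask_rle.split()
    starts = np.asarray(s[0::2], dtype=int) - 1  # 1-based -> 0-based
    lengths = np.asarray(s[1::2], dtype=int)
    img = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, length in zip(starts, lengths):
        img[lo:lo + length] = 1
    return img.reshape(shape)


def rle_encode(mask):
    pixels = np.concatenate([[0], mask.flatten(), [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return " ".join(str(x) for x in runs)


# Hypothetical binary mask used only for this check.
mask = np.array([[0, 1, 1, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=np.uint8)

encoded = rle_encode(mask)
decoded = rle_decode(encoded, mask.shape)

print(encoded)                         # 2 2 5 2 12 1
print(np.array_equal(decoded, mask))   # True
```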

Make label

You can choose the folders where the images and their labels are stored.

import os
import os.path as osp
import shutil

import cv2
from PIL import Image

data_root = 'data'
img_dir = "images"
ann_dir = "annotations"

os.makedirs(osp.join(data_root, img_dir, "train"), exist_ok=True)
os.makedirs(osp.join(data_root, ann_dir, "train"), exist_ok=True)

# RGB triplets per class index: 0 -> (128, 0, 0), 1 -> (0, 128, 0)
palette = [128, 0, 0, 0, 128, 0]
# x_train, y_train: image paths and RLE strings split with sklearn's train_test_split
for x, y in zip(x_train, y_train):
    img_name = x.split("/")[-1]
    img_path = osp.join(data_root, x)
    shutil.copy(img_path, osp.join(data_root, img_dir, "train", img_name))
    img = cv2.imread(img_path)  # BGR; only the shape is used
    h, w, _ = img.shape
    ann_img = rle_decode(y, (h, w))
    # Convert the decoded mask to a palette image and save it
    png = Image.fromarray(ann_img).convert('P')
    png.putpalette(palette)
    png.save(osp.join(data_root, ann_dir, "train", img_name))

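If you save the label with cv2, it is stored as a 3-channel image, which takes up more memory. mmsegmentation also fails to read such an image properly and emits a warning. That is why the approach above writes each pixel's ground-truth class index into the pixel itself and then assigns a color to each index through the palette.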
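
To see what the palette approach produces, the following sketch saves a tiny label map as a palette PNG and reads it back. It is only an illustration under stated assumptions: the 2x3 label array is made up, and the image is written to an in-memory buffer rather than to the annotation folder.

```python
import io

import numpy as np
from PIL import Image

# Hypothetical 2x3 label map: 0 = background, 1 = foreground.
label = np.array([[0, 1, 1],
                  [0, 0, 1]], dtype=np.uint8)

# Same palette as in the post: index 0 -> (128, 0, 0), index 1 -> (0, 128, 0).
palette = [128, 0, 0, 0, 128, 0]

png = Image.fromarray(label).convert("P")
png.putpalette(palette)

# Write to an in-memory buffer so the sketch leaves no files behind.
buffer = io.BytesIO()
png.save(buffer, format="PNG")
buffer.seek(0)

reloaded = Image.open(buffer)
restored = np.array(reloaded)

print(reloaded.mode)   # image mode of the saved label
print(restored.shape)  # one channel: (height, width)
print(restored)        # pixel values are the class indices
```

The reloaded array has a single channel whose values are the class indices, while the colors live only in the palette and are used for display.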
