Issues from the CV Assignments

한량 · September 6, 2021

[U-stage] Computer Vision

Required Assignment 1. VGGNet-11

Error: RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same

https://stackoverflow.com/questions/59013109/runtimeerror-input-type-torch-floattensor-and-weight-type-torch-cuda-floatte
This happens when some of the input, label, and model have not been moved to CUDA, so they end up on different devices.

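import torch

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

model.to(device)  # move the model's weights to the device
for step, (img, label) in enumerate(qd_train_dataloader):  # `step` avoids shadowing the built-in iter
  optimizer.zero_grad()
  
  # move each batch to the same device as the model
  img = img.to(device)
  label = label.to(device)
  
  ...

Moving the model and every batch to the same device as above resolved the error.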
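
One way to catch this mismatch before the forward pass is to compare the device of the model's parameters with the device of each batch tensor. The following is a minimal sketch, not the assignment code: the `nn.Linear` toy model and the helper `same_device` are hypothetical stand-ins introduced only for illustration.

```python
import torch
import torch.nn as nn


def same_device(model: nn.Module, *tensors: torch.Tensor) -> bool:
    """Hypothetical helper: True if every tensor is on the device of the model's first parameter."""
    model_device = next(model.parameters()).device
    return all(t.device == model_device for t in tensors)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)          # toy model standing in for VGGNet-11
img = torch.randn(8, 4)                     # tensors are created on the CPU by default
label = torch.zeros(8, dtype=torch.long)

# Move the batch to the model's device, then verify before the forward pass.
img, label = img.to(device), label.to(device)
assert same_device(model, img, label)
out = model(img)
```

On a CPU-only machine the check passes trivially; on a GPU machine it fails if either `.to(device)` call is forgotten, which is the situation that triggers the RuntimeError above.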

model.named_parameters()

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
--> It is itself a generator; it yields (name, param) pairs, i.e. each parameter's name together with the parameter itself

Source: https://comlini8-8.tistory.com/50 [컴린이 탈출기 ver 2.0 ٩( ᐛ )و]
Source: https://pytorch.org/docs/stable/generated/torch.nn.Module.html

# named_parameters() yields parameters (not layers), so the loop variable is named param
for name, param in model_finetune.named_parameters():
  print("=" * 40)
  print("%-20s ==> Train : %s" % (name, param.requires_grad))
  
# ========================================
# features.0.weight    ==> Train : False
# ========================================
# features.0.bias      ==> Train : False
# ========================================
# features.3.weight    ==> Train : False
# ========================================
# features.3.bias      ==> Train : False
# ========================================
# ...
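
Output like the above (Train : False for the `features` parameters) is what you would see after freezing part of a model. The sketch below shows one common way to do that, assuming a hypothetical `TinyNet` with a `features` / `classifier` split; it is not the assignment's VGG model.

```python
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical small model with the same features/classifier split as VGG."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.classifier = nn.Linear(8, 10)

    def forward(self, x):
        x = self.features(x).mean(dim=(2, 3))  # global average pooling over H and W
        return self.classifier(x)


model = TinyNet()

# Freeze the feature extractor: these parameters will not receive gradient updates.
for param in model.features.parameters():
    param.requires_grad = False

# Collect the names of the parameters that are still trainable.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

Only the classifier parameters remain trainable here, so an optimizer can be given just those via `filter(lambda p: p.requires_grad, model.parameters())`.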

Required Assignment 2. Data augmentation for Quickdraw dataset

next(iter(dataset))

iter: calls the object's __iter__ method
next: calls the object's __next__ method

>>> it = iter(range(3))
>>> next(it)
0
>>> next(it)
1
>>> next(it)
2
>>> next(it)
Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    next(it)
StopIteration

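This is the same as calling __iter__ on the iterable and then calling __next__ on the resulting iterator
In other words, iter returns an iterator from an iterable, and next pulls values out of that iterator one at a time

Source: https://dojang.io/mod/page/view.php?id=2408

Applying this to a dataset retrieves the first sample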

img, label = next(iter(qd_train_dataset))

>>> img.shape
torch.Size([3, 224, 224])

>>> type(img)
torch.Tensor
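
A class does not need to define __iter__ for this to work: when only __getitem__ is defined, iter() falls back to indexing with 0, 1, 2, ... until IndexError is raised. As far as I can tell, this is the path taken by a map-style dataset. The sketch below uses a hypothetical `ToyDataset` in plain Python to show the behaviour.

```python
class ToyDataset:
    """Hypothetical map-style dataset: defines __getitem__ and __len__ but no __iter__."""

    def __init__(self, items):
        self.items = items

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        # Indexing past the end raises IndexError, which ends the iteration.
        return self.items[index]


dataset = ToyDataset([("img0", 0), ("img1", 1)])

first = next(iter(dataset))   # same pattern as next(iter(qd_train_dataset))
all_items = list(dataset)     # iterates via __getitem__ until IndexError
```

Because `iter(dataset)` creates a fresh iterator each time, `next(iter(dataset))` always returns the first sample, which makes it a handy one-liner for inspecting shapes and types.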

image.permute()

cv2 handles images in (height, width, channel) order, whereas torch's conv2d layer takes a tensor in (batch, channel, height, width) order

So it is common to reorder the channel axis with a function like permute(), or to add/remove a dummy batch dimension with torch.squeeze(), torch.unsqueeze(), or np.newaxis (an index object, not a function)

Reference: https://stackoverflow.com/questions/53623472/how-do-i-display-a-single-image-in-pytorch

>>> type(img)
torch.Tensor

# move the channel axis to the end: (C, H, W) -> (H, W, C)
visualized_img = img.permute(1, 2, 0)

# TO DO (2-1) ends here
plt.imshow(visualized_img)
plt.show()
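
The axis shuffling described in this section can be sketched as follows, using a random tensor in place of a real Quickdraw image (an assumption made only so the snippet runs on its own).

```python
import torch

img = torch.rand(3, 224, 224)        # (channel, height, width), random stand-in image

hwc = img.permute(1, 2, 0)           # (height, width, channel), the layout plt.imshow expects
batch = img.unsqueeze(0)             # add a dummy batch dimension: (1, 3, 224, 224)
restored = batch.squeeze(0)          # remove it again: (3, 224, 224)

# Indexing with None does the same as unsqueeze(0); np.newaxis is an alias for None.
batch_via_none = img[None, ...]

print(hwc.shape, batch.shape, restored.shape, batch_via_none.shape)
```

permute() only changes how the data is viewed, so no pixel values are modified; `restored` holds exactly the same values as `img`.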

Morphological image processing

https://swprog.tistory.com/entry/Mathematical-morphology-%EB%AA%A8%ED%8F%B4%EB%A1%9C%EC%A7%80-%EC%97%B0%EC%82%B0

Morphological operations such as dilation and erosion can be used in place of blur-like effects
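
To make the two operations concrete, here is a minimal NumPy sketch with a square k x k structuring element. The helper names `dilate`, `erode`, and `_neighborhood_reduce` are hypothetical and written for illustration; the post itself does not say which implementation was used in the assignment.

```python
import numpy as np


def _neighborhood_reduce(img, k, reduce_fn, pad_value):
    """Apply reduce_fn over every k x k neighborhood of a 2D image."""
    pad = k // 2
    padded = np.pad(img, pad, mode="constant", constant_values=pad_value)
    h, w = img.shape
    # One shifted copy of the image per position in the k x k window.
    shifted = [padded[dy:dy + h, dx:dx + w] for dy in range(k) for dx in range(k)]
    return reduce_fn(np.stack(shifted), axis=0)


def dilate(img, k=3):
    """Dilation: each pixel becomes the maximum of its neighborhood (strokes get thicker)."""
    return _neighborhood_reduce(img, k, np.max, pad_value=0)


def erode(img, k=3):
    """Erosion: each pixel becomes the minimum of its neighborhood (strokes get thinner)."""
    return _neighborhood_reduce(img, k, np.min, pad_value=255)


img = np.zeros((5, 5), dtype=np.uint8)
img[2, 2] = 255                      # a single white pixel on a black background

thick = dilate(img)                  # the pixel grows into a 3 x 3 white block
thin = erode(thick)                  # eroding the block shrinks it back to one pixel
```

For sketch-like data such as Quickdraw, thickening or thinning strokes in this way changes the drawing's appearance without smearing edges the way a blur does.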

Required Assignment 3. Classification to Segmentation

Output size of a convolution layer

the output value of the layer with input size $(N, C_{in}, H, W)$ and output $(N, C_{out}, H_{out}, W_{out})$

$N$ : a batch size
$C$ : a number of channels,
$H$ : a height of input planes in pixels
$W$ : width in pixels

Source: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html?highlight=nn%20conv2d#torch.nn.Conv2d

Required Assignment 4. cGAN

How to use argparse

https://greeksharifa.github.io/references/2019/02/12/argparse-usage/

argparse error
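
The same documentation page gives the output size along one spatial axis as floor((size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1). The helper below, `conv2d_out_size`, is a hypothetical name for a small pure-Python version of that formula, handy for checking shapes when converting a classifier to a segmentation network.

```python
def conv2d_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis (H or W) of a 2D convolution."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# A 3x3 convolution with padding 1 keeps the spatial size.
same = conv2d_out_size(224, kernel_size=3, stride=1, padding=1)

# A 7x7 convolution with stride 2 and padding 3 halves it.
halved = conv2d_out_size(224, kernel_size=7, stride=2, padding=3)

# A 1x1 convolution never changes the spatial size.
pointwise = conv2d_out_size(7, kernel_size=1)

print(same, halved, pointwise)
```

The 1x1 case is the one that matters when a fully connected layer is replaced by a convolution: the spatial map passes through unchanged while only the channel count is altered.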

SystemExit: 2
/Users/username/lib/python3.6/site-packages/IPython/core/interactiveshell.py:2971: UserWarning: To exit: use 'exit', 'quit', or Ctrl-D.
  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)

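In a Jupyter notebook, parser.parse_args() fails as-is because it tries to parse the kernel's own command-line arguments --> use easydict instead!

Source: https://worthpreading.tistory.com/56 [Worth spreading]

Selective Search - import cv2 error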
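
As an alternative to easydict (not what the post used), argparse itself can be kept in a notebook by passing an explicit argument list, so that it never reads sys.argv. A minimal sketch; the option names `--batch_size` and `--lr` are hypothetical examples, not the assignment's actual options.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--batch_size", type=int, default=64)   # hypothetical option
parser.add_argument("--lr", type=float, default=2e-4)       # hypothetical option

# An empty list means "no command-line arguments": every option takes its default.
opt = parser.parse_args(args=[])

# Overrides can be supplied the same way, as a list of strings.
opt_override = parser.parse_args(args=["--lr", "0.01"])

print(opt.batch_size, opt.lr, opt_override.lr)
```

This keeps the same script usable both from the command line (with a plain `parse_args()`) and from a notebook cell.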

https://stackoverflow.com/questions/55313610/importerror-libgl-so-1-cannot-open-shared-object-file-no-such-file-or-directo

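If you get the error

ImportError: libGL.so.1: cannot open shared object file: No such file or directory

then, as described in the link above, run

$ sudo apt-get install libgl1-mesa-glx

Performance improvement

MSE loss tends to settle for the most complacent (safest) choice --> how about switching to BCE loss?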

Train Epoch: [39/40] Step: [400/844] G loss: 5.83592  D loss: 0.00269  
Train Epoch: [39/40] Step: [500/844] G loss: 7.50872  D loss: 0.00075  
Train Epoch: [39/40] Step: [600/844] G loss: 4.62987  D loss: 0.00537  
Train Epoch: [39/40] Step: [700/844] G loss: 4.29126  D loss: 0.00728  
Train Epoch: [39/40] Step: [800/844] G loss: 4.29845  D loss: 0.01423

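D loss went down a bit, but the results still cannot be told apart

The input data has 1 channel, so might changing the output from 3 channels to 1 channel help??
Why 3 channels were used: reading with cv2 gives 3 channels automatically, so it was probably just used as-is
Trained using only the first channel, data[:, :, 0], as input

Optional Assignment 1. CNN Visualization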
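
The channel slicing can be sketched with NumPy as below. The array is a random stand-in for an image loaded by cv2 (an assumption so the snippet runs without any image file), stored as three identical channels in (H, W, C) order.

```python
import numpy as np

rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Stand-in for a grayscale image loaded as 3 channels: (H, W, 3), all channels equal.
data = np.stack([gray, gray, gray], axis=-1)

single = data[:, :, 0]               # keep only the first channel: (28, 28)
chw = single[np.newaxis, :, :]       # add a channel axis for the network: (1, 28, 28)

print(data.shape, single.shape, chw.shape)
```

Since the three channels carry the same values, nothing is lost by dropping two of them. Reading the file with `cv2.imread(path, cv2.IMREAD_GRAYSCALE)` should be another way to get a single-channel array directly.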

param.numel()

The number of elements in each parameter returned by module.named_parameters() or module.parameters() can be checked with parameter.numel(); summing these gives the total parameter count

for _, param in module.named_parameters():
    param_size = param.numel()
    print(param.size(), param_size)
# Number of parameters in customed-VGG11: 9229575
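
To get a single total like the one in the comment above, the per-parameter counts can be summed. A minimal sketch on a tiny hypothetical model (not the customized VGG11, whose definition is not shown in this post):

```python
import torch.nn as nn

# Toy model: Conv2d has 8*3*3*3 weights + 8 biases, Linear has 8*10 weights + 10 biases.
model = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.Linear(8, 10))

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)

print("Number of parameters:", total)
print("Number of trainable parameters:", trainable)
```

Filtering on `requires_grad` is useful after freezing layers, since the trainable count then differs from the total.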

Optional Assignment 2. Body Landmark Localization using Hourglass Network
