[Week 2] Pytorch

ํ˜œ ์ฝฉ · September 27, 2022

🚩 PyTorch functions


  • numpy์˜ np.array์™€ torch์˜ tensor๋Š” ๋น„์Šทํ•˜๋‹ค.
  • numpy์˜ ๋ฌธ๋ฒ•์ด pytorch์—์„œ ๊ทธ๋Œ€๋กœ ์ ์šฉ๋œ๋‹ค.
  • pytorch์˜ tensor๋Š” GPU์— ์˜ฌ๋ ค์„œ ์‚ฌ์šฉ ๊ฐ€๋Šฅ
  • x_data.device : ํ˜„์žฌ ๋ฐ์ดํ„ฐ๊ฐ€ ์–ด๋””์— ์˜ฌ๋ผ์™€์žˆ๋Š”์ง€ ํ™•์ธ
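A small sketch of these points (the GPU branch only runs on a CUDA-capable machine):

```python
import numpy as np
import torch

# A tensor can be built straight from a numpy array.
arr = np.arange(6).reshape(2, 3)
x_data = torch.from_numpy(arr)

print(x_data.device)        # cpu (the default device)

# Move the tensor onto the GPU only when one is available.
if torch.cuda.is_available():
    x_data = x_data.to("cuda")
    print(x_data.device)    # e.g. cuda:0
```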


๐Ÿ Tensor handling

  • view: changes a tensor's shape like reshape(), but returns a view sharing the same memory (it requires a contiguous tensor).
  • squeeze: removes dimensions of size 1 (compression).
  • unsqueeze: adds a dimension of size 1.
  • unsqueeze(0): insert a size-1 dim at index 0 // 2d tensor (2x2) → 3d tensor (1x2x2)
  • unsqueeze(1): insert a size-1 dim at index 1 // 2d tensor (2x2) → 3d tensor (2x1x2)
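A quick shape-checking example of the functions above:

```python
import torch

t = torch.arange(4).reshape(2, 2)       # 2d tensor (2x2)

print(t.view(4).shape)                  # torch.Size([4])
print(t.unsqueeze(0).shape)             # torch.Size([1, 2, 2])
print(t.unsqueeze(1).shape)             # torch.Size([2, 1, 2])
print(t.unsqueeze(0).squeeze().shape)   # torch.Size([2, 2])
```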


๐Ÿ Tensor operation

  • Matrix multiplication: use the mm or matmul function (operation between matrices).
  • Dot product: use the dot function (operation between vectors).
  • When the shapes of the two matrices do not line up, running mm raises an error.
  • In that case, matmul still works because it broadcasts automatically: a vector b, for instance, is automatically treated as a (3, 1) column so the computation goes through.


๐Ÿ Tensor operation for ML/DL formula

  • The nn.functional module provides a variety of mathematical transforms.
import torch
import torch.nn.functional as F

tensor = torch.FloatTensor([0.5, 0.7, 0.1])
h_tensor = F.softmax(tensor, dim=0)

Google Colab .ipynb file: https://drive.google.com/open?id=1nlT2Fq-vURe5aywOZ0VHuqa5D0tydL2u



๐Ÿ index_select

x = torch.randn(3, 4)
# tensor([[ 0.1427,  0.0231, -0.5414, -1.0009],
#         [-0.4664,  0.2647, -0.1228, -1.1068],
#         [-1.1734, -0.6571,  0.7230, -0.6004]])

indices = torch.tensor([0, 2])
torch.index_select(x, 0, indices)
#  tensor([[ 0.1427,  0.0231, -0.5414, -1.0009],
#          [-1.1734, -0.6571,  0.7230, -0.6004]])

๐Ÿ gather

  • tensor์—์„œ ์ธ๋ฑ์Šค๋ฅผ ๊ธฐ์ค€์œผ๋กœ ํŠน์ • ๊ฐ’๋“ค์„ ์ถ”์ถœํ•˜๊ธฐ ์œ„ํ•ด ์‚ฌ์šฉ
  • torch.gather([tensor],[dim],[์ถ”์ถœํ•  tensor์˜ ๋ชจ์–‘ ๋ฐ ์ธ๋ฑ์Šค])
import torch

A = torch.Tensor([[1, 2],
                  [3, 4]])

indices = torch.tensor([[0],    # picks A's row 0, column 0
                        [1]])   # picks A's row 1, column 1
output = torch.gather(A, 1, indices).flatten()

>>> tensor([1., 4.])
  • Reading indices along dim 1 (the column dimension within each row):
  • row 0, index 0 = 1    //    row 1, index 1 = 4


๐Ÿ other functions

  • torch.numel: returns the total number of elements in the input tensor.
  • torch.chunk: splits a tensor into the requested number of chunks.
    Note that it may return fewer chunks than requested (it can never return more!).

๐Ÿ dim์ด ๋ญ์—์š”?

  • axis and dim refer to the same concept.


    axis = 0 (rows): cut along the row axis (↓ direction)
    axis = 1 (columns): cut along the column axis (→ direction)
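A quick illustration of the two directions using sum:

```python
import torch

t = torch.tensor([[1, 2],
                  [3, 4]])

print(t.sum(dim=0))   # tensor([4, 6])  - collapse the row axis (↓ direction)
print(t.sum(dim=1))   # tensor([3, 7])  - collapse the column axis (→ direction)
```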



๐Ÿ torch.swapdims




๐Ÿ torch.Tensor.scatter_

  • scatter ์ดํ•ดํ•˜๊ธฐ



๐Ÿ torch์˜ math operations




🚩 Autograd & optimizer

x = torch.randn(5, 7)     # a (5, 7) tensor of random values
# i.e., 5 data points with 7 features each

layer = MyLiner(7, 12)      # the output size becomes (5, 12)
  • What if we want to map a (3, 7) input matrix to a (3, 5) output matrix?
    • Multiply the input by a (7, 5) matrix (= the weight)!
      in_features: 7
      out_features: 5
  • The return x @ self.weights + self.bias at the bottom is ŷ = xW + b:
    think of it as the prediction ŷ.



๐Ÿ backward

  • layer์— ์žˆ๋Š” Parameter๋“ค์˜ ๋ฏธ๋ถ„์„ ์ˆ˜ํ–‰ํ•œ๋‹ค.
  • forward์˜ ๊ฒฐ๊ณผ๊ฐ’ (model์˜ output = ์˜ˆ์ธก๊ฐ’)๊ณผ ์‹ค์ œ๊ฐ’ ๊ฐ„์˜ ์ฐจ์ด(Loss)์— ๋Œ€ํ•ด ๋ฏธ๋ถ„์„ ์ˆ˜ํ–‰
  • ํ•ด๋‹น ๊ฐ’์œผ๋กœ parameter ์—…๋ฐ์ดํŠธ
for epoch in range(epochs):
    # clear gradient buffers so no gradient from the previous epoch carries forward
    optimizer.zero_grad()

    # get output from the model, given the inputs
    outputs = model(inputs)

    # get loss for the predicted output
    loss = criterion(outputs, labels)   # loss between prediction and ground truth
    print(loss)

    # compute gradients of the loss w.r.t. (with respect to) the parameters
    loss.backward()                     # differentiate loss w.r.t. each weight, bias, ...

    # update parameters
    optimizer.step()




โœ๐Ÿป ํšŒ๊ณ 

์–ด๋ ค์› ๋˜ ๋ถ€๋ถ„:

  • dimension (์ฐจ์›) ์— ๋Œ€ํ•œ ๊ฐœ๋…์ด ํ—ท๊ฐˆ๋ ธ๊ณ  ์–ด๋ ต๊ฒŒ ๋‹ค๊ฐ€์™”๋‹ค.
  • torch.swapdims์™€ gather ํ•จ์ˆ˜, scatter_ ํ•จ์ˆ˜๊ฐ€ ์‰ฝ๊ฒŒ ์ดํ•ด๋˜์ง€ ์•Š์•˜๋‹ค.

๋‚˜์˜ ๋…ธ๋ ฅ!

  • ์—ฌ๋Ÿฌ ๋ธ”๋กœ๊ทธ ์ •๋ฆฌ๋ฅผ ๋ณด๋ฉด์„œ ๋‚ด ๋‚˜๋ฆ„๋Œ€๋กœ์˜ ์ดํ•ด๋ฅผ ํ•˜๊ณ  ๊ทธ๋ฆผ ๊ทธ๋ ค๊ฐ€๋ฉฐ ์ •๋ฆฌ๋ฅผ ํ–ˆ๋‹ค. ์‹œ๊ฐ„์ด ๊ฝค ์˜ค๋ž˜ ๊ฑธ๋ ธ์ง€๋งŒ ์ดํ•ด๋Š” ํ™•์‹คํ•˜๊ฒŒ ํ•œ ๊ฒƒ ๊ฐ™๋‹ค.