๐ŸŽฒ [AI] Playing with Tensors in PyTorch

mandu · April 20, 2025


1. ํŒŒ์ดํ† ์น˜(PyTorch)


1.1 What is PyTorch?

๋”ฅ๋Ÿฌ๋‹์„ ์œ„ํ•œ ์˜คํ”ˆ์†Œ์Šค ๋จธ์‹ ๋Ÿฌ๋‹ ํ”„๋ ˆ์ž„์›Œํฌ by Meta(facebook)

  • ๋„ˆ๋ฌด๋‚˜๋„ ํ›Œ๋ฅญํ•œ Document
  • Pythonicํ•œ ๋ฌธ๋ฒ•๊ณผ ๋™์  ๊ณ„์‚ฐ ๊ทธ๋ž˜ํ”„(Dynamic Computational Graph) ์ง€์›์œผ๋กœ ์ง๊ด€์ ์ธ ์ฝ”๋“œ ์ž‘์„ฑ์ด ๊ฐ€๋Šฅ
  • PyTorch์˜ Tensor๋Š” NumPy ๋ฐฐ์—ด๊ณผ ๋งค์šฐ ์œ ์‚ฌ, GPU ์—ฐ์‚ฐ์„ ์ง€์›
  • GPU ์—ฐ๋™์„ ํ†ตํ•ด ๋Œ€๊ทœ๋ชจ ๋ชจ๋ธ๋„ ๋น ๋ฅด๊ฒŒ ํ•™์Šตํ•  ์ˆ˜ ์žˆ์Œ
  • ๋™์ ์œผ๋กœ back-propagation ๊ฒฝ๋กœ ์ƒ์„ฑ ๊ฐ€๋Šฅ
  • pytorch.org/docs์˜ documnet๋ฅผ ์ ๊ทน ํ™œ์šฉํ•˜์ž!

1.2 GPU Basics and Using a GPU in Colab

See the separate summary post: GPU basics and using a GPU in Colab

  • PyTorch์—์„œ๋Š” ์—ฐ์‚ฐ ๋Œ€์ƒ์ด ๋˜๋Š” ๋ชจ๋“  ํ…์„œ๊ฐ€ ๋™์ผํ•œ ์žฅ์น˜(device) ์— ์žˆ์–ด์•ผ ํ•จ!
  • GPU๋„ ์—ฌ๋Ÿฌ ๊ฐœ๊ฐ€ ์žˆ์„ ์ˆ˜ ์žˆ๋Š”๋ฐ, ๋™์ผํ•œ ์žฅ์น˜์— ์žˆ์–ด์•ผ ํ•จ
๊ธฐ๋Šฅ์ฝ”๋“œ ์˜ˆ์‹œ์„ค๋ช…
GPU ์‚ฌ์šฉ ๊ฐ€๋Šฅ ์—ฌ๋ถ€ ํ™•์ธtorch.cuda.is_available()GPU ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•œ ํ™˜๊ฒฝ์ธ์ง€ ํ™•์ธ
ํ…์„œ๋ฅผ GPU๋กœ ์ด๋™x.cuda() ๋˜๋Š” x.to('cuda')CPU โ†’ GPU
ํ…์„œ๋ฅผ CPU๋กœ ์ด๋™x.cpu() ๋˜๋Š” x.to('cpu')GPU โ†’ CPU
์žฅ์น˜ ํ™•์ธx.deviceํ…์„œ์˜ ํ˜„์žฌ ์žฅ์น˜ ์ •๋ณด ์ถœ๋ ฅ
์žฅ์น˜ ์ผ์น˜ ํ•„์ˆ˜x + y ์—ฐ์‚ฐ ์‹œx์™€ y๋Š” ๋™์ผ device์— ์žˆ์–ด์•ผ ํ•จ

2. Tensor Attributes and Creation Methods


2.1 What is a Tensor?

A data structure for multi-dimensional arrays

  • ์Šค์นผ๋ผ(0์ฐจ์›), ๋ฒกํ„ฐ(1์ฐจ์›), ํ–‰๋ ฌ(2์ฐจ์›) ๋“ฑ ๋ชจ๋“  ์ˆ˜ํ•™์  ๋ฐ์ดํ„ฐ๋ฅผ ์ผ๋ฐ˜ํ™”ํ•œ ๊ตฌ์กฐ
  • PyTorch์—์„œ์˜ ํ…์„œ(tensor)๋Š” ๊ธฐ๋Šฅ์ ์œผ๋กœ ๋„˜ํŒŒ์ด(NumPy)์™€ ๋งค์šฐ ์œ ์‚ฌ
  • GPU ์—ฐ์‚ฐ(๋ณ‘๋ ฌ ์—ฐ์‚ฐ)์ด ๊ฐ€๋Šฅํ•˜๋‹ค๋Š” ์ ์ด NumPy ๋ฐฐ์—ด๊ณผ์˜ ํฐ ์ฐจ์ด์ 
  • PyTorch์—์„œ๋Š” ํ…์„œ๋ฅผ ์‚ฌ์šฉํ•ด ๋”ฅ๋Ÿฌ๋‹ ๋ชจ๋ธ์˜ ์ž…๋ ฅ, ์ถœ๋ ฅ, ๊ฐ€์ค‘์น˜ ๋“ฑ์„ ํ‘œํ˜„
  • PyTorch์˜ ํ…์„œ๋Š” ์ž๋™ ๋ฏธ๋ถ„(autograd) ๊ธฐ๋Šฅ์„ ์ œ๊ณต
  • ์˜ˆ์‹œ 1: ๊ฐ์ •๋ถ„์„ ๋ฐ์ดํ„ฐ,
    • |x| = (#s, #w, #f) / (#img, h, w)
    • |xi| = (#w, #f) / (h, w)
    • |xi,j| = (#f) / (w)
  • ์˜ˆ์‹œ 2: ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ(RGB) (4์ฐจ์›)
    • |x| = (#img, #ch h, w)
    • |xi| = (#ch, h, w) (ํ•œ ์žฅ์˜ ์ด๋ฏธ์ง€, 3์ฐจ์›)
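
To make those shape examples concrete, a small sketch with dummy data (the sizes are made up):

import torch

x = torch.rand(16, 3, 32, 32)  # a batch of 16 RGB images: (#img, #ch, h, w)
print(x.shape)        # torch.Size([16, 3, 32, 32])
print(x[0].shape)     # one image, (#ch, h, w): torch.Size([3, 32, 32])
print(x[0, 0].shape)  # one channel, (h, w): torch.Size([32, 32])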

2.2 ํ…์„œ์˜ ์†์„ฑ

  • ํ…์„œ์˜ ๊ธฐ๋ณธ ์†์„ฑ์œผ๋กœ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์€ ๊ฒƒ๋“ค์ด ์žˆ๋‹ค.

    • ๋ชจ์–‘ (shape)
    • ๋ฐ์ดํ„ฐ ํ˜•์‹ (data type)
    • ์ €์žฅ๋œ ์žฅ์น˜ (device)
  • Pandas Dataframe๊ณผ ๋งค์šฐ ์œ ์‚ฌ.

import torch

tensor = torch.rand(3, 4)  # 3 by 4

print(tensor)
print(f"Shape: {tensor.shape}")  # count dimensions from the outermost bracket inward
print(f"Data type: {tensor.dtype}")  # float32 by default
print(f"Device: {tensor.device}")    # CPU by default

x = torch.FloatTensor([[[1, 2, 3],
                        [3, 4, 5]],
                       [[5, 6, 7],
                        [7, 8, 9]],
                       [[9, 10, 11],
                        [11, 12, 13]]])
print(x.shape)  # torch.Size([3, 2, 3]) → the size of the outermost bracket comes first

2.3 Tensor Initialization

  • A tensor can be initialized directly from list data.
data = [
  [1, 2],
  [3, 4]
]

x = torch.tensor(data)
print(x)
ft = torch.FloatTensor(data)
print(ft)
lt = torch.LongTensor(data)
print(lt)
bt = torch.ByteTensor(data) # values range from 0 to 255 → saves memory
print(bt)
bool_tensor = torch.tensor([True, False, True, False], dtype=torch.bool)
print(bool_tensor)
  • A tensor can also be initialized from (and converted to) a NumPy array.
a = torch.tensor([5])
b = torch.tensor([7])

c = (a + b).numpy()
print(c)
print(type(c))

result = c * 10
tensor = torch.from_numpy(result)
print(tensor)
print(type(tensor))

2.4 Initializing a Tensor from Another Tensor

  • A tensor can be initialized based on another tensor's information.
  • Attributes such as shape and dtype can be copied over
x = torch.tensor([
    [5, 7],
    [1, 2]
])

# x์™€ ๊ฐ™์€ ๋ชจ์–‘, ๊ฐ’์ด 0์œผ๋กœ ์ฑ„์›Œ์ง„ ํ…์„œ
x_zeros = torch.zeros_like(x)
print(x_zeros)

# x์™€ ๊ฐ™์€ ๋ชจ์–‘, ๊ฐ’์ด 1์ธ ํ…์„œ ์ƒ์„ฑ
x_ones = torch.ones_like(x)
print(x_ones)

# x์™€ ๊ฐ™์€ ๋ชจ์–‘, ์ดˆ๊ธฐํ™”๋œ ๊ฐ’์€ ๋ฌด์ž‘์œ„ (๋ฉ”๋ชจ๋ฆฌ ์ƒํƒœ์— ๋”ฐ๋ผ ๋‹ฌ๋ผ์ง)
x_empty = torch.empty_like(x)
print(x_empty)

# x์™€ ๊ฐ™์€ ๋ชจ์–‘, ๋ชจ๋“  ๊ฐ’์„ ์ง€์ •๋œ ๊ฐ’(์˜ˆ: 42)์œผ๋กœ ์ฑ„์›€
x_full = torch.full_like(x, fill_value=42)
print(x_full)

# x์™€ ๊ฐ™์€ ๋ชจ์–‘, ์ž๋ฃŒํ˜•์€ float์œผ๋กœ ๋ณ€๊ฒฝํ•˜๊ณ  ๋žœ๋ค ๊ฐ’ ์ƒ์„ฑ
x_rand = torch.rand_like(x, dtype=torch.float32)  # uniform distribution [0, 1)
print(x_rand)

2.5 randperm: Random Permutation

x = torch.randperm(10)

print(x) # e.g., tensor([8, 1, 0, 5, 6, 4, 3, 7, 2, 9])

3. ํ…์„œ์˜ ํ˜•๋ณ€ํ™˜ ๋ฐ ์ฐจ์› ์กฐ์ž‘

  • Tensors can be manipulated much like NumPy arrays.
  • Also quite similar to Pandas

3.1 Accessing Parts of a Tensor (Indexing, Slicing)

  • ํ…์„œ์˜ ์›ํ•˜๋Š” ์ฐจ์›์— ์ ‘๊ทผํ•  ์ˆ˜ ์žˆ๋‹ค.
  • ํŒŒ์ด์ฌ, Pandas์—์„œ ์‚ฌ์šฉํ•˜๋Š” ์ธ๋ฑ์‹ฑ, ์Šฌ๋ผ์ด์‹ฑ ๊ธฐ๋ฒ• ๊ทธ๋Œ€๋กœ ์‚ฌ์šฉ ๊ฐ€๋Šฅ
tensor = torch.tensor([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])

print(tensor[0])         # first row; tensor[0, :] also works
print(tensor[:, 0])      # first column
print(tensor[..., -1])   # last column; ... (Ellipsis) stands for all leading dimensions, a slicing idiom used constantly for high-dimensional arrays in PyTorch


x = torch.FloatTensor([[[1, 2],
                        [3, 4]],
                       [[5, 6],
                        [7, 8]],
                       [[9, 10],
                        [11, 12]]])
print(x.size())

print(x[0])
print(x[0, :])
print(x[0, :, :])

print(x[1:3, :, :].size()) # slicing (unlike indexing) causes no dimension reduction
print(x[:, :1, :].size())
print(x[:, :-1, :].size())

3.2 ํ…์„œ ์ž๋ฅด๊ธฐ (Split)

x = torch.FloatTensor(10, 4)
splits = x.split(4, dim=0)
for s in splits:
    print(s.size())
# torch.Size([4, 4])
# torch.Size([4, 4])
# torch.Size([2, 4])    

x = torch.FloatTensor(8, 4)
chunks = x.chunk(3, dim=0)
for c in chunks:
    print(c.size())
# torch.Size([3, 4])
# torch.Size([3, 4])
# torch.Size([2, 4])    

3.3 Concatenating Tensors (Concatenate)

  • Two tensors can be joined together to make a new tensor.
  • Similar to pandas concat (axis ≈ dim)
tensor = torch.tensor([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]
])

result = torch.cat([tensor, tensor, tensor], dim=0)  # dim counts dimensions in order == starting from the outermost bracket
print(result)  # concatenated along rows

result = torch.cat([tensor, tensor, tensor], dim=1)
print(result)  # concatenated along columns

3.4 Stacking Tensors (Stack)

  • Under the hood: unsqueeze once on the given dimension, then concat
  • stack gets used a lot together with for loops in practice (like pd.concat)

    unsqueeze: a function that creates an axis of size 1

    • Adds a size-1 dimension at the specified position (dim)

    squeeze: a function that removes axes of size 1

    • If dim is not specified, it removes every size-1 dimension.
x = torch.FloatTensor([[1, 2, 3],
                       [4, 5, 6],
                       [7, 8, 9]])
y = torch.FloatTensor([[10, 11, 12],
                       [13, 14, 15],
                       [16, 17, 18]])

print(x.size(), y.size())

z = torch.stack([x, y])
print(z.size()) # torch.Size([2, 3, 3])
 # equivalent to z = torch.cat([x.unsqueeze(0), y.unsqueeze(0)], dim=0)

z = torch.stack([x, y], dim=-1)
print(z.size()) # torch.Size([3, 3, 2])

z = torch.stack([x, y], dim=1)
print(z.size()) # torch.Size([3, 2, 3])



3.5 Widening a Tensor by Copying (Expand)

  • Broadcasting works by using this expand under the hood
x = torch.FloatTensor([[[1, 2]],
                       [[3, 4]]])
print(x.size()) # torch.Size([2, 1, 2])

y = x.expand(*[2, 3, 2])

print(y)
print(y.size()) # torch.Size([2, 3, 2])

3.6 Tensor Type Casting

  • Conversion is possible via shorthand methods like .float() and .long()
  • Similar to pandas astype()
a = torch.tensor([2], dtype=torch.int)
b = torch.tensor([5.0])

print(a.dtype)
print(b.dtype)

print(a.float())
print(b.int())

print(a + b)  # automatic type promotion (a is promoted to float)
print(a + b.type(torch.int32))  # cast b to int32 before the operation

3.6 ํ…์„œ์˜ ๋ชจ์–‘ ๋ณ€๊ฒฝ (View, Reshape)

  • view() and reshape() are used to change a tensor's shape.
  • The order of the tensor's elements does not change in the process.
a = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8])
b = a.view(4, 2)
b = a.reshape(4, 2)
print(b.shape)
b = b.reshape(2,-1)
print(b.shape)

a[0] = 7
print(b) # ์–•์€ ๋ณต์‚ฌ ๋ฌธ์ œ ๋ฐœ์ƒ

c = a.clone().view(4, 2)   # clone() is the most commonly used deep copy; pair it with detach() to also cut the tensor out of the autograd graph
a[0] = 9
print(c)

3.7 ํ…์„œ์˜ ์ฐจ์› ๊ตํ™˜ (Permute)

  • ํ•˜๋‚˜์˜ ํ…์„œ์—์„œ ํŠน์ •ํ•œ ์ฐจ์›๋ผ๋ฆฌ ์ˆœ์„œ๋ฅผ ๊ต์ฒดํ•  ์ˆ˜ ์žˆ๋‹ค.
a = torch.rand((64, 32, 3))
print(a.shape)

b = a.permute(2, 1, 0)
print(b.shape)

3.9 Adding / Removing Dimensions (Unsqueeze, Squeeze)

  • unsqueeze() adds one dimension of size 1 (a dimension holding a single element).
    → Commonly used to add a batch dimension.
  • squeeze() removes every dimension of size 1 (dimensions holding a single element).

    batch: the bundle of data fed into the model at once during deep learning training

a = torch.Tensor([[[[1, 2, 3, 4], [5, 6, 7, 8]]]])
print(a.shape) # torch.Size([1, 1, 2, 4])

print(a.unsqueeze(0).shape) # the dimension argument is required
print(a.unsqueeze(3).shape)
print(a.squeeze(0).shape)
print(a.squeeze().shape) # removes all size-1 dimensions

4. ํ…์„œ์˜ ์—ฐ์‚ฐ๊ณผ ํ•จ์ˆ˜

  • ๊ธฐ๋ณธ์ ์œผ๋กœ ์š”์†Œ๋ณ„(element-wise) ์—ฐ์‚ฐ

4.1 ํ…์„œ์˜ ์—ฐ์‚ฐ (์‚ฐ์ˆ ์—ฐ์‚ฐ)

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6], [7, 8]])

# Arithmetic operations
print(a + b)
print(a - b)
print(a * b)
print(a / b)
print(a ** b)

# Logical operation
print(a == b) 

# Inplace Operation
print(a)
print(a.mul(b)) # plain multiplication (returns a new tensor)
print(a)
print(a.mul_(b)) # in-place: overwrites a's memory → saves memory, though PyTorch's memory management is already well optimized
print(a) # a has been overwritten

# Broadcast in Operations
x = torch.FloatTensor([[1, 2],
                       [4, 8]])
y = torch.FloatTensor([3,
                       5])

print(x.size()) # torch.Size([2, 2])
print(y.size()) # torch.Size([2])

z = x + y
print(z)
print(z.size()) # torch.Size([2, 2])

x = torch.FloatTensor([[1, 2]])
y = torch.FloatTensor([[3],
                       [5]])

print(x.size()) # torch.Size([1, 2])
print(y.size()) # torch.Size([2, 1])

z = x + y
print(z)
print(z.size())  # torch.Size([2, 2])
# Even adding the scalar 1 works this way: 1 gets its shape matched and is added to the whole tensor
# The absence of an error is not always good news → watch out for unintended broadcasting!
# If broadcasting is impossible, an error is raised

4.2 Matrix Multiplication

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6], [7, 8]])
print(a.matmul(b))          # matrix product, not element-wise
print(torch.matmul(a, b))   # same thing as a function

4.3 Mean / Sum Functions (Dimension-Reducing Operations)

a = torch.Tensor([[1, 2, 3, 4], [5, 6, 7, 8]])

print(a.mean())        # mean of all elements
print(a.mean(dim=0))   # reduce over dim 0 → one mean per column
print(a.mean(dim=1))   # reduce over dim 1 → one mean per row

print(a.sum())
print(a.sum(dim=0))
print(a.sum(dim=1))

4.4 Sorting (Sort)

x = torch.randperm(3**3).reshape(3, 3, -1)
values, indices = torch.sort(x, dim=-1, descending=True)
print(values)
print(indices)

4.5 Max / Argmax

  • max() returns the largest element.
  • argmax() returns the index of the largest element (the max).
  • For minimums, use min() and argmin()
print(a.max())
print(a.max(dim=0)) # dim chooses which dimension to reduce over
print(a.max(dim=1))

print(a.argmax())
print(a.argmax(dim=0))
print(a.argmax(dim=1))

# print(x)
x = torch.tensor([
       [[18,  9, 25],
        [ 0, 16,  8],
        [24, 20, 14]],

       [[ 1,  4, 17],
        [ 2, 22,  7],
        [ 5, 10, 12]],

       [[15, 13, 23],
        [ 3, 21, 19],
        [26,  6, 11]]])
print(x.size())
# torch.Size([3, 3, 3])


y = x.argmax(dim=-1)

print(y)
# tensor([[2, 1, 0],
#         [2, 1, 2],
#         [2, 1, 0]])

print(y.size())
# torch.Size([3, 3])

4.6 Getting Top Values and Their Indices (topk)

values, indices = torch.topk(x, k=1, dim=-1)

print(values.size()) # torch.Size([3, 3, 1])  # the dimension survives even when k=1
print(indices.size()) # torch.Size([3, 3, 1])

_, indices = torch.topk(x, k=2, dim=-1)
print(indices.size())  # torch.Size([3, 3, 2])

print(x.argmax(dim=-1) == indices[:, :, 0])
# tensor([[True, True, True],
#         [True, True, True],
#         [True, True, True]])

# Sorting via Top-K
target_dim = -1
values, indices = torch.topk(x,
                             k=x.size(target_dim),
                             largest=True)

print(values)

4.7 Masking and Filling Values (where or masked_fill)

x = torch.FloatTensor([i for i in range(3**2)]).reshape(3, -1)


# Method 1: torch.where(condition, value if true, value if false)
y = torch.where(x > 4, torch.tensor(-1.0), x)
print(y)

# Method 2: use the tensor class's built-in masked_fill method
mask = x > 4
print(mask)
y = x.masked_fill(mask, value=-1)

print(y)

5. Gradients and Automatic Differentiation (Autograd)


5.1 Why Do We Need the Concept of a Gradient?

  • A deep learning model's goal is ultimately to adjust an enormous number of parameters (weights) to make its predictions more accurate
  • In other words, the model's parameters (weights) must be updated to reduce the loss function's value
  • That is where the gradient comes in → in which direction, and by how much, should we change things?

    ● What is a gradient?
    It tells you how much the output (y) changes when you nudge some value (x): a direction plus a sensitivity (see the tiny worked example below)
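
A tiny worked example of that idea, assuming f(x) = x², whose derivative at x = 3 is 2 · 3 = 6:

import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 2     # f(x) = x^2
y.backward()   # autograd computes dy/dx
print(x.grad)  # tensor(6.) → around x = 3, y changes about 6 times as fast as x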


5.2 The Deep Learning Training Flow

  1. Forward pass

    • Feed the input data in and produce a prediction (output = model(input))
  2. Loss computation

    • Compute the difference between the prediction and the label
  3. Backward pass ← ★ this is where autograd comes in!

    • Calling .backward() makes PyTorch automatically compute the gradient for every parameter
  4. Weight update

    • The optimizer looks at the gradients and updates the parameters (a sketch of the full loop follows this list)
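
Putting the four steps together, a minimal sketch of one training loop (the toy linear model, made-up data, loss function, and learning rate below are all illustrative):

import torch
import torch.nn as nn

model = nn.Linear(1, 1)                                   # toy model: y = wx + b
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # will update the weights
loss_fn = nn.MSELoss()

x = torch.tensor([[1.0], [2.0], [3.0]])  # made-up inputs
y = torch.tensor([[2.0], [4.0], [6.0]])  # made-up labels (y = 2x)

for epoch in range(100):
    output = model(x)           # 1. forward pass
    loss = loss_fn(output, y)   # 2. loss computation
    optimizer.zero_grad()       # clear gradients from the previous step
    loss.backward()             # 3. backward pass: autograd fills each .grad
    optimizer.step()            # 4. weight update based on the gradients

print(loss.item())  # the loss should have decreased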

5.3 ์ž๋™ ๋ฏธ๋ถ„(Autograd)

  • No need to differentiate the formulas by hand! (even tens of millions of parameters are handled automatically)
  • It builds a computational graph that tracks every operation, so a single .backward() call does it all
  • PyTorch is based on dynamic graphs, which makes it intuitive and flexible
    • Dynamic graph approach: rather than building the graph ahead of time, the computation graph is created on the fly as each operation runs
  • leaf variable: a user-defined tensor created with requires_grad=True, not produced by an operation (typically inputs/parameters)
x = torch.tensor([3.0, 4.0], requires_grad=True)
y = torch.tensor([1.0, 2.0], requires_grad=True)
z = x + y

print(z)
print(z.grad_fn)

out = z.mean()
print(out)
print(out.grad_fn)

out.backward() # possible for scalar outputs; for a non-scalar output you must pass a gradient argument (with the same shape as out)
print(x.grad) # how much out changes when x changes
print(y.grad) # how much out changes when y changes
print(z.grad) # gradients are only tracked for leaf variables (initial tensors), so this is None.
  • During training, track gradients with requires_grad=True
  • During inference, prevent unnecessary tracking → with torch.no_grad()
temp = torch.tensor([3.0, 4.0], requires_grad=True)
print(temp.requires_grad)
print((temp ** 2).requires_grad)

# computation is faster because gradients are not being tracked.
with torch.no_grad():
    temp = torch.tensor([3.0, 4.0], requires_grad=True)
    print(temp.requires_grad)
    print((temp ** 2).requires_grad)