🎲[AI] GPU Basics and Using a GPU in Colab

mandu · April 20, 2025


1. GPU

1.1 What is a GPU (Graphics Processing Unit)?

A processor originally introduced for graphics workloads; it packs thousands of cores, which makes it extremely powerful for parallel computation.

GPU vs CPU

| Item | CPU (Central Processing Unit) | GPU (Graphics Processing Unit) |
|---|---|---|
| Primary purpose | General-purpose computation, system control | Massively parallel processing of large data |
| Core count | Few but powerful (a handful to a few dozen) | Thousands of simple cores |
| Processing style | Sequential processing | Parallel processing |
| Single-task performance | Strong | Relatively weak |
| Large-scale computation | Weak | Very strong |
| Typical uses | Running the OS, control logic, applications | Deep learning, graphics, matrix operations |
| Well-suited work | Complex conditionals, branching | Large volumes of repetitive, similar operations |

In short: the CPU is a small team of smart workers; the GPU is a huge crowd of simple workers.

TPU (Tensor Processing Unit)

  • Custom hardware developed by Google, designed specifically to accelerate deep learning and machine learning workloads
  • Fast computation: TPUs excel at mathematical work such as matrix operations, giving high performance for both training and inference of deep learning models
  • Purpose-built design: more specialized for deep learning than a CPU or GPU, optimized for tensor operations
  • Google does not release the TPU hardware design; it is offered only as a service, e.g. through its cloud

1.2 What is GPU memory (VRAM)?

Space that stores tensors, models, and intermediate results of computation (located on the GPU itself)

  • When VRAM runs out, an OOM (Out of Memory) error occurs
  • The CPU stores data in RAM, while the GPU stores data in VRAM
  • The CPU cannot directly access VRAM, and the GPU cannot directly access RAM

2. Using a GPU in Google Colab

  • Google Colab == an environment that lets you run Python code in a web browser on free cloud CPUs/GPUs/TPUs

2.1 How to enable the GPU

  1. In Colab, open the top menu and click [Runtime] → [Change runtime type]
  2. Under Hardware accelerator, select GPU
  3. Then verify GPU availability with the code below
import torch

print(torch.cuda.is_available())  # True means a GPU is available

# Mainstream deep learning frameworks such as PyTorch and TensorFlow were built
# around NVIDIA CUDA, which became the de facto industry standard early on

2.2 GPU-related functions

In PyTorch, note that every tensor involved in an operation must live on the same device (e.g., the CPU's RAM or the GPU's VRAM).
Performing an operation on tensors that sit on different devices raises an error.

  • This applies not only to data but to models as well!

| Task | Example code | Description |
|---|---|---|
| Check GPU availability | torch.cuda.is_available() | Whether a GPU is available in this environment |
| Move a tensor to the GPU | x.cuda() or x.to('cuda') | CPU → GPU |
| Move a tensor to the CPU | x.cpu() or x.to('cpu') | GPU → CPU |
| Check device | x.device | Prints the tensor's current device |
| Devices must match | operations like x + y | x and y must be on the same device |
| Move a model to the GPU | linear.cuda() | CPU → GPU |
| Move a model to the CPU | linear.cpu() | GPU → CPU |
| Check device | linear.device ❌ | Models do not expose a current-device attribute ❌ |
import torch

data = [
    [1, 2],
    [3, 4]
]

# Create a tensor on the CPU
x = torch.tensor(data)
print("Initial state:", x.is_cuda)  # False

# Move to the GPU
if torch.cuda.is_available():
    x = x.cuda()  # note: this makes a copy
    # x = x.to('cuda')
    print("After moving to GPU:", x.is_cuda)  # True

    # Move back to the CPU
    x = x.cpu()
    # x = x.to('cpu')
    print("Back on CPU:", x.is_cuda)  # False

# Tensor on the GPU
a = torch.tensor([
    [1, 1],
    [2, 2]
]).cuda()

# Tensor on the CPU
b = torch.tensor([
    [5, 6],
    [7, 8]
])

# print(torch.matmul(a, b))  # raises an error:
# RuntimeError: Expected all tensors to be on the same device, but found at least two devices, ~~
print(torch.matmul(a.cpu(), b))
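The same rule applies to models. As the table above notes, nn.Module has no .device attribute, but you can check where a model lives by inspecting one of its parameters; a minimal sketch, assuming PyTorch:

```python
import torch
import torch.nn as nn

linear = nn.Linear(4, 2)

# nn.Module has no .device attribute; inspect a parameter instead
device = next(linear.parameters()).device
print(device)  # cpu

if torch.cuda.is_available():
    linear = linear.cuda()  # moves every parameter to the GPU
    print(next(linear.parameters()).device)  # cuda:0
```

One design difference worth noting: for modules, .cuda()/.cpu()/.to() move the parameters in place (and also return the module), whereas for tensors these calls return a copy.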

Note that cpu() and cuda() allocate new memory when they actually move data, so keep an eye on memory usage

import torch

x = torch.randn(1000, 1000, device="cuda:0")  # created in VRAM
y = x.cpu()   # a copy is created in RAM
z = y.cuda()  # yet another copy is created in VRAM
# Tip 1: reusing the same variable name lets the old tensor be garbage-collected
# Tip 2: free tensors you no longer need with del
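A related subtlety (a sketch, assuming PyTorch): when a tensor is already on the target device, cpu()/cuda()/to() return the original object instead of a copy, so only genuine cross-device moves cost extra memory:

```python
import torch

x = torch.tensor([1, 2, 3])  # already on the CPU

# Moving to the device the tensor is already on is a no-op:
# the very same object comes back, and no new memory is allocated.
print(x.cpu() is x)      # True
print(x.to('cpu') is x)  # True

if torch.cuda.is_available():
    g = x.cuda()          # real move: new storage in VRAM
    print(g.cuda() is g)  # True (already on the GPU)
```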

2.3 Tip: specify the device (CPU or GPU) right at tensor creation

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.tensor([[1, 2], [3, 4]], device=device)
print(x.device)  # cuda:0 or cpu
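Building on this, a common idiom (a minimal sketch, assuming PyTorch) is to pick the device once and send both the model and the data to it, which guarantees the devices-must-match rule is satisfied everywhere:

```python
import torch
import torch.nn as nn

# Choose the device once, up front
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Send the model and the input to the same device
model = nn.Linear(2, 1).to(device)
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], device=device)

y = model(x)     # works: parameters and input share a device
print(y.device)  # cuda:0 on a GPU runtime, cpu otherwise
```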

2.4 Checking current CPU and GPU memory usage

| Item | CPU memory (RAM) | GPU memory (VRAM) |
|---|---|---|
| Location | System-wide memory | Dedicated memory on the graphics card |
| Role | Running ordinary programs | Graphics processing, deep learning computation |
| Examples | Excel, a web browser, etc. | Tensors, model parameters, etc. |

Checking CPU memory usage

import psutil

# CPU utilization (%)
print(f"CPU usage: {psutil.cpu_percent(interval=1)}%")

# Memory usage
mem = psutil.virtual_memory()
print(f"Memory in use: {mem.used / 1024**3:.2f} GB / total {mem.total / 1024**3:.2f} GB")

Checking GPU memory usage

import torch

# Number of available GPUs
print(torch.cuda.device_count())

# ID of the currently active GPU
print(torch.cuda.current_device())

# GPU name
print(torch.cuda.get_device_name(0))

# Memory actually in use by live tensors right now
print(f"Allocated: {torch.cuda.memory_allocated() / 1024**2:.2f} MB") 

# Total GPU memory PyTorch has reserved in advance (including its cache)
print(f"Reserved : {torch.cuda.memory_reserved() / 1024**2:.2f} MB")

Or, using GPUtil:

# !pip install gputil
import GPUtil

gpus = GPUtil.getGPUs()
for gpu in gpus:
    print(f"GPU: {gpu.name}")
    print(f"Load: {gpu.load * 100:.1f}%")
    print(f"Memory used: {gpu.memoryUsed}MB / {gpu.memoryTotal}MB")

3. What are CUDA, the NVIDIA Driver, and cuDNN?

3.1 CUDA (Compute Unified Device Architecture)

  • A computing platform and programming model developed by NVIDIA that enables parallel computation on the GPU
  • CUDA lets you process large amounts of data in parallel across the GPU's thousands of cores.
  • Deep learning frameworks such as PyTorch and TensorFlow perform GPU computation through CUDA.
  • ✅ In short: the GPU parallel-computing platform (and programming model)

3.2 NVIDIA Driver

  • The device driver that lets the operating system (OS) recognize and control the GPU
  • It must be installed before libraries such as CUDA and cuDNN can be used.
  • It handles the basic communication between the OS and the GPU.
  • ✅ In short: the driver required to operate the GPU at all

3.3 cuDNN (CUDA Deep Neural Network library)

  • A deep-learning computation library provided by NVIDIA, optimized for speed
  • It lets operations such as CNNs, RNNs, and BatchNorm run at high speed on the GPU.
  • Frameworks like PyTorch and TensorFlow use it automatically; it is essential for good performance.
  • ✅ In short: the library that makes deep learning operations run fast on the GPU
