๐Ÿ–ฅ๏ธ Datasets & DataLoaders

Dataset

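import torch
import matplotlib.pyplot as plt
from torchvision import datasets
from torchvision.transforms import ToTensor

# Load FashionMNIST; root="data" is an assumed download directory.
# training_data / test_data are the splits used throughout this section.
training_data = datasets.FashionMNIST(root="data", train=True, download=True, transform=ToTensor())
test_data = datasets.FashionMNIST(root="data", train=False, download=True, transform=ToTensor())

# Map class indices to human-readable names.
labels_map = {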
    0: "T-Shirt",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle Boot",
}
figure = plt.figure(figsize=(8, 8))
cols, rows = 3, 3
for i in range(1, cols * rows + 1):
    sample_idx = torch.randint(len(training_data), size=(1,)).item()
    img, label = training_data[sample_idx]
    figure.add_subplot(rows, cols, i)
    plt.title(labels_map[label])
    plt.axis("off")
    plt.imshow(img.squeeze(), cmap="gray")
plt.show()

์ˆ˜๋™์œผ๋กœ Datasets์— ๋ฆฌ์ŠคํŠธ ํ˜•์‹์œผ๋กœ ์ธ๋ฑ์‹ฑ ํ•  ์ˆ˜ ์žˆ๋‹ค.

Custom Dataset

import os
import pandas as pd
from torch.utils.data import Dataset
from torchvision.io import read_image

class CustomImageDataset(Dataset):
    def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):
        self.img_labels = pd.read_csv(annotations_file)
        self.img_dir = img_dir
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.img_labels)

    def __getitem__(self, idx):
        img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
        image = read_image(img_path)
        label = self.img_labels.iloc[idx, 1]
        if self.transform:
            image = self.transform(image)
        if self.target_transform:
            label = self.target_transform(label)
        return image, label

To build a custom Dataset, implement three methods: __init__, __len__, and __getitem__.
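The same three methods can be tried without image files. The sketch below is a hypothetical in-memory dataset (SquaresDataset is not part of PyTorch or the tutorial) where sample i is the pair (i, i**2).

```python
import torch
from torch.utils.data import Dataset

class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is (tensor([i]), i**2)."""

    def __init__(self, n, transform=None):
        # Store whatever is needed to produce samples later.
        self.n = n
        self.transform = transform

    def __len__(self):
        # Number of samples in the dataset.
        return self.n

    def __getitem__(self, idx):
        # Build and return the sample at position idx.
        if idx < 0 or idx >= self.n:
            raise IndexError(idx)
        x = torch.tensor([float(idx)])
        y = idx * idx
        if self.transform:
            x = self.transform(x)
        return x, y

squares = SquaresDataset(4, transform=lambda x: x * 2)
sample_x, sample_y = squares[3]
print(len(squares), sample_x, sample_y)
```

As in CustomImageDataset, the transform is applied inside __getitem__, so it runs lazily each time a sample is requested.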

DataLoader

from torch.utils.data import DataLoader

train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True)

# Display image and label.
train_features, train_labels = next(iter(train_dataloader))
print(f"Feature batch shape: {train_features.size()}")
print(f"Labels batch shape: {train_labels.size()}")
img = train_features[0].squeeze()
label = train_labels[0]
plt.imshow(img, cmap="gray")
plt.show()
print(f"Label: {label}")

Dataset์€ ๋ฐ์ดํ„ฐ์…‹์˜ ํŠน์ง•(feature)์„ ๊ฐ€์ ธ์˜ค๊ณ  ํ•˜๋‚˜์˜ ์ƒ˜ํ”Œ์— ์ •๋‹ต(label)์„ ์ง€์ •ํ•˜๋Š” ์ผ์„ ํ•œ ๋ฒˆ์— ์ง„ํ–‰
๋ชจ๋ธ์„ ํ•™์Šตํ•  ๋•Œ, ์ผ๋ฐ˜์ ์œผ๋กœ ์ƒ˜ํ”Œ๋“ค์„ ใ€Š๋ฏธ๋‹ˆ๋ฐฐ์น˜(minibatch)ใ€‹๋กœ ์ „๋‹ฌํ•˜๊ณ , ๋งค ์—ํญ(epoch)๋งˆ๋‹ค ๋ฐ์ดํ„ฐ๋ฅผ ๋‹ค์‹œ ์„ž์–ด์„œ ๊ณผ์ ํ•ฉ(overfit)์„ ๋ง‰๊ณ ,Python์˜ multiprocessing์„ ์‚ฌ์šฉํ•˜์—ฌ ๋ฐ์ดํ„ฐ ๊ฒ€์ƒ‰ ์†๋„๋ฅผ ๋†’์ธ๋‹ค.

DataLoader๋Š” ์ด๋Ÿฌํ•œ ๋ณต์žกํ•œ ๊ณผ์ •์„ ์•Œ์•„์„œ ์ฒ˜๋ฆฌํ•ด์ค€๋‹ค.
๋ฐ์ดํ„ฐ๋ฅผ ๋ถˆ๋Ÿฌ์˜ค๊ณ  ํ•„์š”์—๋”ฐ๋ผ ์ˆœํšŒ๊ฐ€ ๊ฐ€๋Šฅํ•˜๋‹ค. ๊ฐ ์ˆœํšŒ๋งˆ๋‹ค ํ”ผ์ฒ˜์™€ ๋ ˆ์ด๋ธ”์„ ํฌํ•จํ•˜๋Š” ๋ฐฐ์น˜๋ฅผ ๋ฐ˜ํ™˜ํ•œ๋‹ค.


๐Ÿ–ฅ๏ธ Transforms

ToTensor

from torchvision.transforms import ToTensor
transform = ToTensor()

PIL ์ด๋ฏธ์ง€๋‚˜ Numpy array๋ฅผ Tensor๋กœ ๋ณ€ํ™˜ํ•œ๋‹ค.

Lambda

from torchvision.transforms import Lambda
target_transform = Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))

์‚ฌ์šฉ์ž ์ •์˜ ๋žŒ๋‹คํ•จ์ˆ˜๋ฅผ ํ• ๋‹น

๐Ÿ–ฅ๏ธ Build the Neural Network

import os
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

์‹ ๊ฒฝ๋ง์€ ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•œ ์—ฐ์‚ฐ์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๊ณ„์ธต(layer)/๋ชจ๋“ˆ(module)๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค.
torch.nn ๋„ค์ž„์ŠคํŽ˜์ด์Šค๋Š” ์‹ ๊ฒฝ๋ง์„ ๊ตฌ์„ฑํ•˜๋Š”๋ฐ ํ•„์š”ํ•œ ๋ชจ๋“  ๊ตฌ์„ฑ ์š”์†Œ๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.
PyTorch์˜ ๋ชจ๋“  ๋ชจ๋“ˆ์€ nn.Module ์˜ ํ•˜์œ„ ํด๋ž˜์Šค(subclass).
์‹ ๊ฒฝ๋ง์€ ๋‹ค๋ฅธ ๋ชจ๋“ˆ(๊ณ„์ธต; layer)๋กœ ๊ตฌ์„ฑ๋œ ๋ชจ๋“ˆ์ด๋‹ค.
์ด๋Ÿฌํ•œ ์ค‘์ฒฉ๋œ ๊ตฌ์กฐ๋Š” ๋ณต์žกํ•œ ์•„ํ‚คํ…์ฒ˜๋ฅผ ์‰ฝ๊ฒŒ ๊ตฌ์ถ•ํ•˜๊ณ  ๊ด€๋ฆฌํ•  ์ˆ˜ ์žˆ๋‹ค.

device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {device} device")

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits
        
model = NeuralNetwork().to(device)
print(model)

X = torch.rand(1, 28, 28, device=device)
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")

Passing the raw predictions (logits) through softmax gives the predicted probabilities. Call the model as model(X) rather than calling model.forward() directly; calling the model runs forward together with any registered hooks.
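One way to observe the difference between model(X) and model.forward(X) is a forward hook. This is a minimal sketch on a single nn.Linear layer; the hook function record_shape and the calls list are made up for illustration.

```python
import torch
from torch import nn

layer = nn.Linear(4, 3)
calls = []

def record_shape(module, inputs, output):
    # Forward hooks receive the module, its inputs, and its output.
    calls.append(tuple(output.shape))

handle = layer.register_forward_hook(record_shape)

x = torch.rand(2, 4)
layer(x)          # goes through __call__: the hook fires
layer.forward(x)  # bypasses __call__: the hook is skipped
handle.remove()

# Softmax over dim=1 turns each row of logits into probabilities.
probs = nn.Softmax(dim=1)(layer(x))

print(calls)
print(probs.sum(dim=1))
```

Only one entry is recorded even though the layer computed its output twice, which is why calling forward directly is discouraged.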


Model Layers

input_image = torch.rand(3,28,28)
print(input_image.size())

FashionMNIST ๋ชจ๋ธ์˜ ๊ณ„์ธต์„ ์‚ดํŽด๋ณด๊ธฐ ์œ„ํ•ด 28 * 28 ํฌ๊ธฐ์˜ ์ด๋ฏธ์ง€ 3๊ฐœ๋กœ ๊ตฌ์„ฑ๋œ ๋ฐฐ์น˜๋ฅผ ๋ถˆ๋Ÿฌ์˜จ๋‹ค.

nn.Flatten

flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())

28 * 28 ์˜ 2์ฐจ์› ์ด๋ฏธ์ง€๋ฅผ 784์˜ ์—ฐ์†๋œ ๋ฐฐ์—ด๋กœ ๋ณ€ํ™˜

nn.Linear

layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())

Applies a linear transformation to the input using the layer's stored weights and biases.
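The transformation is y = x W^T + b. The sketch below recomputes it by hand from layer.weight and layer.bias and compares against the layer's own output; the small tolerance allows for floating-point rounding.

```python
import torch
from torch import nn

layer = nn.Linear(in_features=28 * 28, out_features=20)
x = torch.rand(3, 28 * 28)

# weight has shape (out_features, in_features), hence the transpose.
manual = x @ layer.weight.T + layer.bias
matches = torch.allclose(layer(x), manual, atol=1e-5)

print(layer.weight.shape, layer.bias.shape)
print(matches)
```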

nn.ReLU

print(f"Before ReLU: {hidden1}\n\n")
hidden1 = nn.ReLU()(hidden1)
print(f"After ReLU: {hidden1}")

ReLU ์™€ ๊ฐ™์€ ๋น„์„ ํ˜• activation ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ mapping์„ ์ง„ํ–‰

nn.Sequential

seq_modules = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)
)
input_image = torch.rand(3,28,28)
logits = seq_modules(input_image)

๋ฐ์ดํ„ฐ๋ฅผ ์ˆœ์ฐจ์ ์œผ๋กœ ์ „๋‹ฌ์‹œ์ผœ์ฃผ๋Š” ๋ชจ๋“ˆ์˜ ์ปจํ…Œ์ด๋„ˆ

nn.Softmax

softmax = nn.Softmax(dim=1)
pred_probab = softmax(logits)

Raw ๋ฐ์ดํ„ฐ ๊ฐ’์„ [0, 1] ์‚ฌ์ด์˜ ๋ฒ”์œ„๋กœ ๋ณ€ํ™˜ํ•˜์—ฌ ์ถœ๋ ฅํ•ด์ฃผ์–ด ํ™•๋ฅ ์ฒ˜๋Ÿผ ๋งŒ๋“ ๋‹ค

Model Parameters

print(f"Model structure: {model}\n\n")

for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \n")

์‹ ๊ฒฝ๋ง ๋‚ด๋ถ€์˜ ๋ชจ๋ธ๋“ค์€ ์ž๋™์œผ๋กœ ํŒŒ๋ผ๋ฏธํ„ฐํ™” ๋˜๊ณ  ์ตœ์ ํ™”๋œ weights ์™€ bias ์— ์—ฐ๊ด€๋œ๋‹ค.
์ด ํŒŒ๋ผ๋ฏธํ„ฐ๋“ค์€ ์ž๋™์œผ๋กœ ์ถ”์ ๋˜๋ฉฐ parameters ๋˜๋Š” named_parameters๋กœ ์ถ”์  ๊ฐ€๋Šฅํ•˜๋‹ค.

Source: PyTorch Tutorials https://tutorials.pytorch.kr/beginner/basics/data_tutorial.html
https://tutorials.pytorch.kr/beginner/basics/transforms_tutorial.html
https://tutorials.pytorch.kr/beginner/basics/buildmodel_tutorial.html
