PyTorch Data Types & Functions

duckbill413 · August 5, 2024

AI PyTorch

PyTorch Data Types

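Integer Types:

torch.int8: 8-bit signed integer. Uses little memory but has a narrow value range (-128 to 127).
torch.int16: 16-bit signed integer. Wider range than int8 at twice the memory per element.
torch.int32: 32-bit signed integer. Covers most integer values at half the memory of int64.
torch.int64: 64-bit signed integer. This is the dtype PyTorch infers by default for Python integers; use it when a large integer range is needed.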

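Floating Point Types:

torch.float16: 16-bit half-precision floating point. Uses little memory and can be fast on hardware that supports it, but precision is low.
torch.float32: 32-bit single-precision floating point. The default data type for deep learning models.
torch.float64: 64-bit double-precision floating point. High precision, but memory use and compute cost are larger.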

Unsigned Integer Types:

torch.uint8: 8-bit unsigned integer (0 to 255). Mainly used for image data.

Boolean Type:

torch.bool: Boolean type holding True/False values. Used for conditional filtering and masks.

Complex Types:

torch.complex64: 64-bit complex number, made of a 32-bit real part and a 32-bit imaginary part.
torch.complex128: 128-bit complex number, made of a 64-bit real part and a 64-bit imaginary part.
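
The types listed above can be inspected directly. The following is a minimal sketch, assuming a default PyTorch installation where the default floating point dtype has not been changed; the variable names are illustrative only.

```python
import torch

# Dtypes PyTorch infers from plain Python literals
int_default = torch.tensor([1, 2, 3]).dtype        # torch.int64
float_default = torch.tensor([1.0, 2.0]).dtype     # torch.float32 (unless the default was changed)
bool_default = torch.tensor([True, False]).dtype   # torch.bool

# Bytes used per element for a few dtypes
bytes_per_element = {
    dtype: torch.zeros(1, dtype=dtype).element_size()
    for dtype in (torch.int8, torch.float16, torch.float32,
                  torch.int64, torch.complex128)
}

# Representable range of the 8-bit integer dtypes
int8_range = (torch.iinfo(torch.int8).min, torch.iinfo(torch.int8).max)
uint8_range = (torch.iinfo(torch.uint8).min, torch.iinfo(torch.uint8).max)

print(int_default, float_default, bool_default)
print(bytes_per_element)
print(int8_range, uint8_range)   # (-128, 127) (0, 255)
```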

Type Casting

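A tensor's data type can be changed with the torch.Tensor.to() method or the torch.Tensor.type() method. Shorthand methods such as float() and int() are also available.

Using to()

# Type casting
import torch

float_tensor = torch.tensor([1.5, 2.5, 3.5], dtype=torch.float32)

# Casting float to int drops the fractional part
int_tensor = float_tensor.to(dtype=torch.int32)
print(int_tensor) # tensor([1, 2, 3], dtype=torch.int32)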

Using a dtype conversion method

float_tensor = int_tensor.float()
print(float_tensor) # tensor([1., 2., 3.])
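
To make the casting rules more concrete, here is a small sketch comparing several conversion paths on one tensor. The sample values are chosen only for illustration; note that float-to-integer casts truncate toward zero rather than round.

```python
import torch

t = torch.tensor([-1.7, -0.2, 0.0, 1.5, 2.9])

as_int = t.to(torch.int32)          # fractional part dropped, toward zero
as_long = t.long()                  # shorthand for .to(torch.int64)
as_bool = t.bool()                  # nonzero -> True, zero -> False
as_double = t.type(torch.float64)   # .type() with a dtype argument

print(as_int.tolist())    # [-1, 0, 0, 1, 2]
print(as_bool.tolist())   # [True, True, False, True, True]
print(as_long.dtype, as_double.dtype)   # torch.int64 torch.float64
print(t.dtype)            # torch.float32 (the original tensor is unchanged)
```

Every conversion returns a new tensor, so the source tensor keeps its dtype.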

Tensor Functions

Tensor Creation

torch.tensor(data, dtype=None, device=None): creates a tensor from the given data; dtype and device can be specified
```
x = torch.tensor([1.0, 2.0, 3.0])  # create a 1-D tensor
```
torch.zeros(size, dtype=None, device=None): creates a tensor of the given size with every value set to 0
```
zeros_tensor = torch.zeros((2, 3))  # 2x3 tensor filled with 0
```
torch.ones(size, dtype=None, device=None): creates a tensor of the given size with every value set to 1
```
ones_tensor = torch.ones((3, 4))  # 3x4 tensor filled with 1
```

torch.zeros_like(a), torch.ones_like(a): create a tensor filled with 0 or 1 that has the same shape and dtype as the input tensor a

a = torch.zeros([2, 3])
b = torch.ones_like(a)
print(a, b)

# a: tensor([[0., 0., 0.],
#            [0., 0., 0.]])
# b: tensor([[1., 1., 1.],
#            [1., 1., 1.]])

torch.rand(size): creates a tensor filled with random numbers drawn uniformly from [0, 1)

i = torch.rand([2, 3])
print(i)

# tensor([[0.7526, 0.4461, 0.6692],
#         [0.3348, 0.0447, 0.9901]])

torch.randn(size): creates a tensor of random numbers drawn from the standard normal distribution (mean 0, variance 1)

j = torch.randn(4)
print(j) # tensor([-1.6679,  0.7817, -2.1413, -0.5142])
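
Random outputs like the ones above differ on every run. The sketch below, using an arbitrary seed value of 0 as an assumption, shows how torch.manual_seed makes them repeatable and checks the stated distributions on a larger sample.

```python
import torch

torch.manual_seed(0)               # fix the seed so runs are repeatable
u = torch.rand(10_000)             # uniform on [0, 1)
n = torch.randn(10_000)            # standard normal

print(u.min().item() >= 0.0, u.max().item() < 1.0)   # True True
print(round(u.mean().item(), 1))                      # 0.5
print(n.mean().item(), n.std().item())                # near 0 and near 1

torch.manual_seed(0)               # same seed again
u_again = torch.rand(10_000)
print(torch.equal(u, u_again))     # True
```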

torch.arange(start, end, step): creates a 1-D tensor of values from start up to, but not including, end
```
arange_tensor = torch.arange(0, 10, 2)  # tensor([0, 2, 4, 6, 8]); the end value 10 is excluded
```

torch.empty(size): creates an uninitialized tensor (its values are whatever happened to be in memory)

k = torch.empty(3)
print(k) # tensor([-1.7280e+34,  4.3925e-41,  4.5398e-32])

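torch.linspace(start, end, steps): creates a 1-D tensor of evenly spaced values over the given range, including both endpoints

linspace_tensor = torch.linspace(0, 1, steps=5)  # 5 values from 0 to 1: tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])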
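
The difference between arange and linspace is easy to mix up, so here is a short side-by-side sketch: arange takes a step and excludes the end value, while linspace takes a count and includes it.

```python
import torch

a = torch.arange(0, 10, 2)           # step of 2, end is exclusive
l = torch.linspace(0, 1, steps=5)    # 5 points, both endpoints included

print(a.tolist())         # [0, 2, 4, 6, 8]
print(l.tolist())         # [0.0, 0.25, 0.5, 0.75, 1.0]
print(a.dtype, l.dtype)   # torch.int64 torch.float32
```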

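torch.IntTensor([1, 2, 3]): creates a CPU tensor of dtype torch.int32 (a legacy constructor; torch.tensor(data, dtype=torch.int32) is the preferred form)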

c = torch.IntTensor([1, 2, 3])
print(c) # tensor([1, 2, 3], dtype=torch.int32)

torch.tensor(data).cuda(): creates a CUDA (GPU) tensor
```
a = torch.tensor([1, 2, 3]).cuda()  # requires a CUDA-capable GPU; raises an error otherwise
```
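Because .cuda() fails on machines without a GPU, a common pattern is to choose the device at runtime. The following is a sketch of that pattern, not the only way to do it; the variable names are illustrative.

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.tensor([1, 2, 3], device=device)        # created directly on the device
b = torch.ones(3, dtype=torch.int64).to(device)   # created on the CPU, then moved

c = a + b                  # both operands must live on the same device
print(c.device.type)       # "cuda" or "cpu", depending on the machine
print(c.cpu().tolist())    # [2, 3, 4]
```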

Tensor Manipulation

slicing: returns part of a tensor's data selected by index

s = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])
s[:, 1:]

# tensor([[2, 3],
#         [5, 6]])

s = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])
s[1:, ...]

# tensor([[4, 5, 6]])
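
A few more indexing forms, as a hedged sketch on the same 2x3 tensor. It also shows that basic slicing returns a view, so writing through a slice changes the original tensor.

```python
import torch

s = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])

print(s[0].tolist())         # [1, 2, 3]  first row
print(s[:, 0].tolist())      # [1, 4]     first column
print(s[..., -1].tolist())   # [3, 6]     last column; ... stands for all leading dims
print(s[s > 3].tolist())     # [4, 5, 6]  boolean mask

# Basic slicing returns a view: writing through it changes s
col = s[:, 1:]
col[0, 0] = 20
print(s.tolist())            # [[1, 20, 3], [4, 5, 6]]
```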

flatten: flattens a tensor's dimensions into one; a start dimension can be given to keep the leading dimensions

f = torch.tensor([[[1, 2], [3, 4]],
                  [[5, 6], [7, 8]],
                  [[9, 10], [11, 12]]])
print(f)
print(f.flatten())
print(torch.flatten(f, 1))

Result:
tensor([[[ 1,  2],
         [ 3,  4]],

        [[ 5,  6],
         [ 7,  8]],

        [[ 9, 10],
         [11, 12]]])
tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])
tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])

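view: changes the shape of a tensor without copying data; this requires the tensor's memory to be contiguous (or otherwise compatible with the new shape)

v = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8])
print(v.is_contiguous()) # whether the memory is contiguous
print(not v.is_contiguous()) # whether the memory is non-contiguous
print(v.view(2, 2, -1)) # -1 lets PyTorch infer the last dimension (8 / (2 * 2) = 2)
print(v.view(2, 2, 2)) # prints the same tensor as the line above
print(v.view(2, 2, -1).shape)

# True
# False
# tensor([[[1, 2],
#          [3, 4]],
#         [[5, 6],
#          [7, 8]]])
# (the same tensor is printed a second time for v.view(2, 2, 2))
# torch.Size([2, 2, 2])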

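reshape: unlike view, works even when the memory is not contiguous; it returns a view when possible and otherwise copies the data, so it can be slower in the copy case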

f = torch.tensor([[[1, 2], [3, 4]],
                  [[5, 6], [7, 8]],
                  [[9, 10], [11, 12]]])
print(f.reshape(4, 3))

Result:
tensor([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]])
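
The difference between view and reshape only shows up on non-contiguous tensors. The sketch below builds one with a transpose (an illustrative choice) and compares the two calls.

```python
import torch

m = torch.arange(6).view(2, 3)
t = m.t()                       # transpose: same storage, non-contiguous strides
print(t.is_contiguous())        # False

try:
    t.view(6)                   # view needs a compatible memory layout
except RuntimeError as err:
    print("view failed:", type(err).__name__)

r = t.reshape(6)                # falls back to a copy here
print(r.tolist())               # [0, 3, 1, 4, 2, 5]

# On a contiguous tensor, reshape returns a view that shares memory
shared = m.reshape(3, 2)
print(shared.data_ptr() == m.data_ptr())   # True
```

Calling t.contiguous().view(6) is another option; it makes the copy explicit before taking the view.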

tensor.shape: returns the size of each dimension
```
shape = ones_tensor.shape  # torch.Size([3, 4])
```
tensor.dim(): returns the number of dimensions
```
dim = ones_tensor.dim()  # 2
```
tensor.size(dim): returns the size of a specific dimension
```
size_dim0 = ones_tensor.size(0)  # 3
```
tensor.numel(): returns the total number of elements
```
numel = float_tensor.numel() # 3
```

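tensor.view(new_shape): changes the shape of the tensor (the total number of elements must stay the same)

reshaped_tensor = ones_tensor.view(2, 6)  # reshape 3x4 into 2x6

tensor.transpose(dim0, dim1): swaps two dimensions

transposed_tensor = ones_tensor.transpose(0, 1)  # swap rows and columns: 3x4 becomes 4x3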

torch.squeeze(input, dim=None): removes dimensions of size 1

s = torch.rand(3, 2, 1)
print(s)
print(torch.squeeze(s))  # shape (3, 2, 1) becomes (3, 2)

Result:
tensor([[[0.8857],
         [0.2621]],

        [[0.9879],
         [0.3326]],

        [[0.5443],
         [0.3445]]])
tensor([[0.8857, 0.2621],
        [0.9879, 0.3326],
        [0.5443, 0.3445]])
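
The optional dim argument restricts which dimension is squeezed, and unsqueeze does the reverse. A small sketch on shapes only, using a zero tensor as a placeholder:

```python
import torch

s = torch.zeros(1, 3, 1, 2)

print(torch.squeeze(s).shape)          # torch.Size([3, 2])        all size-1 dims removed
print(torch.squeeze(s, dim=0).shape)   # torch.Size([3, 1, 2])     only dim 0 removed
print(torch.squeeze(s, dim=1).shape)   # torch.Size([1, 3, 1, 2])  dim 1 has size 3, so nothing changes

u = torch.unsqueeze(torch.zeros(3, 2), dim=0)   # insert a new size-1 dim at position 0
print(u.shape)                         # torch.Size([1, 3, 2])
```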

torch.stack(tensors, dim=0): joins a sequence of same-shaped tensors along a new dimension

red = torch.tensor([[255, 0],
                    [0, 255]])
green = torch.tensor([[0, 255],
                      [0, 255]])
blue = torch.tensor([[0, 0], [255, 0]])
result = torch.stack([red, green, blue])
print(result)
result = torch.stack([red, green, blue], dim=1)
print(result)

Result:
tensor([[[255,   0],
         [  0, 255]],

        [[  0, 255],
         [  0, 255]],

        [[  0,   0],
         [255,   0]]])
tensor([[[255,   0],
         [  0, 255],
         [  0,   0]],

        [[  0, 255],
         [  0, 255],
         [255,   0]]])
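
The effect of dim is easiest to see from the output shapes. The sketch below also contrasts stack with torch.cat, which joins along an existing dimension instead of creating a new one; the constant-filled tensors are placeholders.

```python
import torch

red = torch.zeros(2, 2)
green = torch.ones(2, 2)
blue = torch.full((2, 2), 2.0)

print(torch.stack([red, green, blue]).shape)          # torch.Size([3, 2, 2])
print(torch.stack([red, green, blue], dim=1).shape)   # torch.Size([2, 3, 2])
print(torch.stack([red, green, blue], dim=-1).shape)  # torch.Size([2, 2, 3])
print(torch.cat([red, green, blue], dim=0).shape)     # torch.Size([6, 2])
```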

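tensor.flatten(start_dim=0, end_dim=-1): flattens the dimensions from start_dim to end_dim into one; with the defaults the result is a 1-D tensor

flattened_tensor = ones_tensor.flatten()  # flatten all dimensions: shape (3, 4) becomes (12,)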

Tensor Calculations

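tensor + other_tensor: element-wise addition

sum_tensor = x + torch.tensor([1.0, 1.0, 1.0])  # tensor([2., 3., 4.])

tensor - other_tensor: element-wise subtraction

diff_tensor = x - torch.tensor([0.5, 0.5, 0.5])  # tensor([0.5000, 1.5000, 2.5000])

tensor * other_tensor: element-wise multiplication

product_tensor = x * torch.tensor([2.0, 2.0, 2.0])  # tensor([2., 4., 6.])

tensor.matmul(other_tensor): matrix multiplication (also available as torch.matmul)

matmul_result = torch.matmul(torch.tensor([[1, 2], [3, 4]]), torch.tensor([[5, 6], [7, 8]]))  # tensor([[19, 22], [43, 50]])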
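
Element-wise multiplication and matrix multiplication are often confused, so here is a short sketch comparing them on the same two matrices. The @ operator is shown as an equivalent spelling of matmul.

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])
b = torch.tensor([[5, 6], [7, 8]])

elementwise = a * b              # multiplies matching positions
matrix = torch.matmul(a, b)      # rows of a times columns of b
operator = a @ b                 # @ is the matmul operator

print(elementwise.tolist())      # [[5, 12], [21, 32]]
print(matrix.tolist())           # [[19, 22], [43, 50]]
print(torch.equal(matrix, operator))   # True
```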

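tensor.mean(): mean of the tensor's elements
```
mean_value = x.mean()  # tensor(2.)
```
tensor.sum(): sum of the tensor's elements
```
sum_value = x.sum()  # tensor(6.)
```
tensor.max(): maximum value in the tensor
```
max_value = x.max()  # tensor(3.)
```
tensor.min(): minimum value in the tensor
```
min_value = x.min()  # tensor(1.)
```
tensor.prod(): product of the tensor's elements
```
prod = float_tensor.prod()  # tensor(6.)
```
tensor.var(): sample variance of the tensor (divides by n - 1 by default)
```
var = float_tensor.var()  # tensor(1.)
```
tensor.std(): sample standard deviation of the tensor (divides by n - 1 by default)
```
std = float_tensor.std()  # tensor(1.)
```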
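The reductions above collapse the whole tensor to one value. They also accept a dim argument to reduce along one axis only. The following is a sketch on an illustrative 2x3 matrix; the unbiased=False argument is used here to request the population variance.

```python
import torch

m = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

print(m.sum().item())                     # 21.0  over all elements
print(m.sum(dim=0).tolist())              # [5.0, 7.0, 9.0]  column sums
print(m.mean(dim=1).tolist())             # [2.0, 5.0]       row means
print(m.sum(dim=1, keepdim=True).shape)   # torch.Size([2, 1])

values, indices = m.max(dim=1)            # max along a dim also returns positions
print(values.tolist(), indices.tolist())  # [3.0, 6.0] [2, 2]

v = torch.tensor([1.0, 2.0, 3.0])
sample_var = v.var()                      # divides by n - 1
population_var = v.var(unbiased=False)    # divides by n
print(sample_var.item())                  # 1.0
print(population_var.item())              # roughly 0.667
```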

Tensor In-Place Operations

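tensor.add_(value): adds a value to the tensor in place (methods ending in an underscore modify the tensor itself; add() without the underscore returns a new tensor)
```
x.add_(1.0)  # adds 1.0 to each element of x, in place
```
tensor.mul_(value): multiplies the tensor by a value in place
```
x.mul_(2.0)  # multiplies each element of x by 2.0, in place
```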
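In PyTorch, the in-place variant of a method carries a trailing underscore. The sketch below contrasts the out-of-place add with the in-place add_ and mul_ on a small example tensor.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])

y = x.add(1.0)        # out-of-place: returns a new tensor
print(x.tolist())     # [1.0, 2.0, 3.0]  x is unchanged
print(y.tolist())     # [2.0, 3.0, 4.0]

x.add_(1.0)           # in-place: the trailing underscore modifies x itself
x.mul_(2.0)
print(x.tolist())     # [4.0, 6.0, 8.0]

x += 1.0              # augmented assignment is also in-place for tensors
print(x.tolist())     # [5.0, 7.0, 9.0]
```

In-place operations save memory, but they overwrite values that autograd may need, so they are best used with care on tensors that require gradients.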
