[PyTorch] A summary of PyTorch operations

lijm1358 · March 14, 2023

(Reference: https://pytorch.org/docs/stable/torch.html)

This post may be extended or revised as needed.

Tensor

numel : returns the number of elements in a tensor

import torch

x = torch.tensor([1, 2, 3])
torch.numel(x)

>>>
3

Creation

from_numpy : converts an ndarray into a tensor

import numpy

a = numpy.array([1, 2, 3])
t = torch.from_numpy(a)
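One detail worth keeping in mind is that the tensor returned by from_numpy shares memory with the source ndarray, whereas torch.tensor() makes a copy. A minimal sketch of the difference (the variable names are only for illustration):

```python
import numpy as np
import torch

a = np.array([1, 2, 3])
t = torch.from_numpy(a)   # shares memory with `a`, no copy is made
c = torch.tensor(a)       # copies the data instead

a[0] = 100                # modify the ndarray in place

print(t)                  # the shared tensor sees the write
print(c)                  # the copy keeps the original values
```

If the tensor must not change when the array does, copying (for example with torch.tensor(a) or t.clone()) is the safer choice.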

zeros, ones : return a tensor filled entirely with 0s or 1s

torch.zeros(2, 3)
torch.ones(5)

Indexing, Slicing, Joining, Mutating

index_select : returns the parts of a tensor selected by the given indices

x = torch.randn(3, 4)
indices = torch.tensor([0, 2])

torch.index_select(input=x, dim=0, index=indices)

>>>
tensor([[ 0.1427,  0.0231, -0.5414, -1.0009],
        [-1.1734, -0.6571,  0.7230, -0.6004]])

torch.index_select(x, 1, indices)

>>>
tensor([[ 0.1427, -0.5414],
        [-0.4664, -0.1228],
        [-1.1734,  0.7230]])

For the input tensor, it returns the slices along dimension dim at the positions listed in the 1-D tensor index.
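For comparison, the same selection can be written with ordinary indexing. The sketch below uses a small integer tensor instead of random values so the result is easy to check by eye:

```python
import torch

x = torch.arange(12).reshape(3, 4)   # rows: [0..3], [4..7], [8..11]
indices = torch.tensor([0, 2])

rows = torch.index_select(x, 0, indices)   # rows 0 and 2
cols = torch.index_select(x, 1, indices)   # columns 0 and 2

# Equivalent forms using tensor indexing
rows_idx = x[indices]
cols_idx = x[:, indices]

print(rows)  # tensor([[ 0,  1,  2,  3], [ 8,  9, 10, 11]])
print(cols)  # tensor([[ 0,  2], [ 4,  6], [ 8, 10]])
```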

gather : collects values from a tensor

t = torch.tensor([[1, 2], [3, 4]])
torch.gather(input=t, dim=1, index=torch.tensor([[0, 0], [1, 0]]))

>>>
tensor([[ 1,  1],
        [ 4,  3]])

It collects values from the input tensor at the positions given in index.
If input is a 3-D tensor, it follows the rule below.

out[i][j][k] = input[index[i][j][k]][j][k] # when dim=0
out[i][j][k] = input[i][index[i][j][k]][k] # when dim=1
out[i][j][k] = input[i][j][index[i][j][k]] # when dim=2

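input and index must have the same number of dimensions, and in every dimension other than dim the size of index must not exceed the size of input (index.size(d) <= input.size(d)). The result of gather has the same shape as index.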
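To make the indexing rule concrete, here is a small check of the dim=1 case against an explicit triple loop. The shapes and the seed are arbitrary choices for illustration:

```python
import torch

torch.manual_seed(0)
inp = torch.randn(2, 3, 4)
# Values must be valid positions along dim=1 (size 3).
# The index may be shorter than the input along dim itself.
index = torch.randint(0, 3, (2, 2, 4))

out = torch.gather(inp, 1, index)

# Reference implementation: out[i][j][k] = inp[i][index[i][j][k]][k]
expected = torch.empty_like(out)
for i in range(index.shape[0]):
    for j in range(index.shape[1]):
        for k in range(index.shape[2]):
            expected[i, j, k] = inp[i, index[i, j, k].item(), k]

print(torch.equal(out, expected))  # True
print(out.shape == index.shape)    # True
```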

Gathering the diagonal elements of a 3-D tensor

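def get_diag_in_3d(A):
    # A has shape (batch, rows, cols); each diagonal has min(rows, cols) entries
    dim_less = min(A.shape[1:])
    # index[b][0][k] = k, so gather(1, index) picks A[b][k][k]
    index = torch.arange(dim_less, device=A.device).expand((A.shape[0], 1, -1))
    view_shape = (A.shape[0], dim_less)
    output = A.gather(1, index).view(view_shape)

    return output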
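As a quick sanity check, the result of this helper can be compared with torch.diagonal, which extracts the same values when given dim1=1 and dim2=2. The function is repeated here in a compact form only so that the snippet runs on its own:

```python
import torch

def get_diag_in_3d(A):
    # Compact copy of the helper above: pick A[b][k][k] via gather
    dim_less = min(A.shape[1:])
    index = torch.arange(dim_less, device=A.device).expand((A.shape[0], 1, -1))
    return A.gather(1, index).view(A.shape[0], dim_less)

A = torch.arange(24).reshape(2, 3, 4)

diag_gather = get_diag_in_3d(A)
diag_builtin = torch.diagonal(A, dim1=1, dim2=2)

print(diag_gather)                             # tensor([[ 0,  5, 10], [12, 17, 22]])
print(torch.equal(diag_gather, diag_builtin))  # True
```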

scatter_ : writes the values of a source tensor into the target tensor at the positions given by index.

src = torch.arange(1, 11).reshape((2, 5))
index = torch.tensor([[0, 1, 2, 0]])

torch.zeros(3, 5, dtype=src.dtype).scatter_(dim=0, index=index, src=src)
>>>
tensor([[1, 0, 0, 4, 0],
        [0, 2, 0, 0, 0],
        [0, 0, 3, 0, 0]])

index = torch.tensor([[0, 1, 2], [0, 1, 4]])
torch.zeros(3, 5, dtype=src.dtype).scatter_(1, index, src)
>>>
tensor([[1, 2, 3, 0, 0],
        [6, 7, 0, 0, 8],
        [0, 0, 0, 0, 0]])

It can be seen as the reverse of gather above.
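The "reverse" relationship can be illustrated with a round trip: scatter the values out, then gather them back with the same index. This sketch assumes the indices within each row are unique, since otherwise later writes would overwrite earlier ones:

```python
import torch

src = torch.tensor([[10, 20], [30, 40]])
index = torch.tensor([[3, 0], [1, 2]])   # unique positions within each row

# scatter_ along dim=1: target[i][index[i][j]] = src[i][j]
scattered = torch.zeros(2, 4, dtype=src.dtype).scatter_(1, index, src)

# gather along dim=1: out[i][j] = scattered[i][index[i][j]]
recovered = scattered.gather(1, index)

print(scattered)  # tensor([[20,  0,  0, 10], [ 0, 30, 40,  0]])
print(recovered)  # tensor([[10, 20], [30, 40]])
```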

chunk, tensor_split : split a tensor into the given number of chunks

torch.arange(11).chunk(chunks=6)

>>>
(tensor([0, 1]),
 tensor([2, 3]),
 tensor([4, 5]),
 tensor([6, 7]),
 tensor([8, 9]),
 tensor([10]))

chunk may return fewer chunks than the requested number. To split into exactly the requested number of pieces, use torch.tensor_split().
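A length of 13 shows the difference: chunk uses a fixed chunk size of ceil(13 / 6) = 3 and runs out of elements after five pieces, while tensor_split always returns six.

```python
import torch

x = torch.arange(13)

chunks = x.chunk(6)               # chunk size is ceil(13 / 6) = 3
splits = torch.tensor_split(x, 6) # always exactly 6 pieces

chunk_sizes = [c.numel() for c in chunks]
split_sizes = [s.numel() for s in splits]

print(len(chunks), chunk_sizes)   # 5 [3, 3, 3, 3, 1]
print(len(splits), split_sizes)   # 6 [3, 2, 2, 2, 2, 2]
```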

swapdims : swaps two dimensions (dims) of a tensor.

x = torch.tensor([[[0,1],[2,3]],[[4,5],[6,7]]])

torch.swapdims(input=x, dim0=0, dim1=1)
>>> 
tensor([[[0, 1],
        [4, 5]],

        [[2, 3],
        [6, 7]]])

torch.swapdims(input=x, dim0=0, dim1=2)
>>> 
tensor([[[0, 4],
        [2, 6]],

        [[1, 5],
        [3, 7]]])

It can be used in the same way as torch.transpose(input, dim0, dim1).
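A short check of that equivalence, using a non-cubic shape so the change is visible in the result's shape. The permute call is included only as a third way to express the same swap:

```python
import torch

x = torch.arange(24).reshape(2, 3, 4)

a = torch.swapdims(x, 0, 2)
b = torch.transpose(x, 0, 2)
c = x.permute(2, 1, 0)      # same swap written as a full permutation

print(a.shape)              # torch.Size([4, 3, 2])
print(torch.equal(a, b))    # True
print(torch.equal(a, c))    # True
```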

Random Sampling

randn : returns a tensor filled with values sampled from the standard normal distribution.

torch.randn(2, 3)
>>>
tensor([[ 1.5954,  2.8929, -1.0923],
        [ 1.1719, -0.4709, -0.1996]])

Math operation

clamp : restricts the elements of a tensor to a given range.

a = torch.randn(4) # e.g. tensor([-1.7120,  0.1734, -0.0478, -0.0922])
torch.clamp(a, min=-0.5, max=0.5)
>>>
tensor([-0.5000,  0.1734, -0.0478, -0.0922])

# only supported in torch >= 1.13
lower = torch.linspace(-1, 1, steps=4)  # named `lower` to avoid shadowing the built-in min
torch.clamp(a, min=lower)
>>> 
tensor([-1.0000,  0.1734,  0.3333,  1.0000])

If min or max is a tensor, it specifies the lower or upper bound separately for each element.
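With tensor bounds, clamp behaves like an element-wise torch.maximum against the lower bound followed by torch.minimum against the upper bound. A minimal sketch, assuming a PyTorch version that accepts tensor bounds; the values and the names lower and upper are chosen only for illustration:

```python
import torch

a = torch.tensor([-2.0, 0.0, 0.2, 3.0])
lower = torch.linspace(-1, 1, steps=4)   # per-element lower bounds
upper = lower + 1                        # per-element upper bounds

clamped_low = torch.clamp(a, min=lower)
clamped_both = torch.clamp(a, min=lower, max=upper)

# Same results written with element-wise maximum / minimum
manual_low = torch.maximum(a, lower)
manual_both = torch.minimum(torch.maximum(a, lower), upper)

print(torch.equal(clamped_low, manual_low))    # True
print(torch.equal(clamped_both, manual_both))  # True
```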

prod : returns the product of all elements in a tensor.

a = torch.randn(1, 3) # tensor([[-0.8020,  0.5428, -1.5854]])
torch.prod(a) # tensor(0.6902)

a = torch.randn(4, 2)
torch.prod(a, dim=1) # tensor([-0.2018, -0.2962, -0.0821, -1.1831])

argmax : returns the indices of the maximum values of a tensor.

a = torch.randn(4, 4)
a
>>>
tensor([[ 1.3398,  0.2663, -0.2686,  0.2450],
        [-0.7401, -0.8805, -0.3402, -1.1936],
        [ 0.4907, -1.3948, -1.0691, -0.3132],
        [-1.6092,  0.5419, -0.2993,  0.3195]])

torch.argmax(a, dim=1)
>>>
tensor([ 0,  2,  0,  1])

allclose : checks whether two input tensors are element-wise equal within a tolerance, i.e. whether they are approximately the same.

torch.allclose(torch.tensor([10000., 1e-07]), torch.tensor([10000.1, 1e-08])) # False
torch.allclose(torch.tensor([10000., 1e-08]), torch.tensor([10000.1, 1e-09])) # True
torch.allclose(torch.tensor([1.0, float('nan')]), torch.tensor([1.0, float('nan')])) # False
torch.allclose(torch.tensor([1.0, float('nan')]), torch.tensor([1.0, float('nan')]), equal_nan=True) # True

With the signature allclose(input, other, rtol=1e-05, atol=1e-08), it returns True if $|\text{input}-\text{other}|\leq\texttt{atol}+\texttt{rtol}\times|\text{other}|$ holds for every element.
As far as I know, it is mostly used for things like testing.
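The condition can be written out by hand to see why the first two examples differ. The helper manual_allclose below is a hypothetical name for this sketch, not part of PyTorch, and it ignores the equal_nan option:

```python
import torch

def manual_allclose(inp, other, rtol=1e-05, atol=1e-08):
    # |input - other| <= atol + rtol * |other|, required for every element
    return bool((torch.abs(inp - other) <= atol + rtol * torch.abs(other)).all())

x1, y1 = torch.tensor([10000., 1e-07]), torch.tensor([10000.1, 1e-08])
x2, y2 = torch.tensor([10000., 1e-08]), torch.tensor([10000.1, 1e-09])

print(manual_allclose(x1, y1), torch.allclose(x1, y1))  # False False
print(manual_allclose(x2, y2), torch.allclose(x2, y2))  # True True
```

In the first pair the large elements pass thanks to the relative term, but the small elements differ by about 9e-08, which exceeds atol, so the whole comparison fails.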

triu : returns the upper triangular part of a matrix.

b = torch.randn(4, 6)
b
>>>
tensor([[ 0.5876, -0.0794, -1.8373,  0.6654,  0.2604,  1.5235],
        [-0.2447,  0.9556, -1.2919,  1.3378, -0.1768, -1.0857],
        [ 0.4333,  0.3146,  0.6576, -1.0432,  0.9348, -0.4410],
        [-0.9888,  1.0679, -1.3337, -1.6556,  0.4798,  0.2830]])

torch.triu(b, diagonal=1)
>>>
tensor([[ 0.0000, -0.0794, -1.8373,  0.6654,  0.2604,  1.5235],
        [ 0.0000,  0.0000, -1.2919,  1.3378, -0.1768, -1.0857],
        [ 0.0000,  0.0000,  0.0000, -1.0432,  0.9348, -0.4410],
        [ 0.0000,  0.0000,  0.0000,  0.0000,  0.4798,  0.2830]])
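The diagonal argument controls which diagonal the triangle starts from. A small integer matrix makes the effect easier to see than random values:

```python
import torch

m = torch.arange(1, 10).reshape(3, 3)   # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

upper = torch.triu(m)                   # keeps the main diagonal (diagonal=0)
strict = torch.triu(m, diagonal=1)      # starts one above the main diagonal
wide = torch.triu(m, diagonal=-1)       # also keeps the first sub-diagonal

print(upper)   # tensor([[1, 2, 3], [0, 5, 6], [0, 0, 9]])
print(strict)  # tensor([[0, 2, 3], [0, 0, 6], [0, 0, 0]])
print(wide)    # tensor([[1, 2, 3], [4, 5, 6], [0, 8, 9]])
```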

einsum : computes sums of products of the operands' elements, based on the Einstein summation convention.
- References
- https://en.wikipedia.org/wiki/Einstein_notation
- https://baekyeongmin.github.io/dev/einsum/
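A few common patterns, each checked against the equivalent dedicated operation. This is only a small sample of what the subscript notation can express, and the shapes are arbitrary:

```python
import torch

A = torch.arange(6.).reshape(2, 3)
B = torch.arange(12.).reshape(3, 4)
S = torch.arange(9.).reshape(3, 3)

matmul = torch.einsum("ik,kj->ij", A, B)   # matrix product: sum over k
trace = torch.einsum("ii", S)              # repeated index: sum of the diagonal
transposed = torch.einsum("ij->ji", A)     # reorder axes, no summation

torch.manual_seed(0)
X = torch.randn(5, 2, 3)
Y = torch.randn(5, 3, 4)
batched = torch.einsum("bik,bkj->bij", X, Y)  # batched matrix product

print(torch.allclose(matmul, A @ B))                      # True
print(trace.item())                                       # 12.0
print(torch.equal(transposed, A.T))                       # True
print(torch.allclose(batched, torch.bmm(X, Y), atol=1e-6))  # True
```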

addmm : computes the matrix product of two matrices, then adds the input matrix.

M = torch.randn(2, 3)
mat1 = torch.randn(2, 3)
mat2 = torch.randn(3, 3)
torch.addmm(input=M, mat1=mat1, mat2=mat2, beta=1, alpha=1)
>>>
tensor([[-4.8716,  1.4671, -1.3746],
        [ 0.7573, -3.9555, -2.8681]])

It computes $\text{out}=\beta\,\text{input}+\alpha\left(\text{mat1}@\text{mat2}\right)$.
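The formula can be verified directly with non-default scaling factors. The values of beta and alpha below are arbitrary:

```python
import torch

torch.manual_seed(0)
M = torch.randn(2, 3)
mat1 = torch.randn(2, 3)
mat2 = torch.randn(3, 3)
beta, alpha = 0.5, 2.0

fused = torch.addmm(M, mat1, mat2, beta=beta, alpha=alpha)
manual = beta * M + alpha * (mat1 @ mat2)   # same formula written out

print(torch.allclose(fused, manual, atol=1e-6))  # True
```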

`torch.linalg`

Contains a variety of linear algebra functions.
It supports norm, SVD, matrix inverse, and more.
(https://pytorch.org/docs/stable/linalg.html#)
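A brief sketch of the three functions mentioned above on a small matrix. The matrix itself is an arbitrary invertible example:

```python
import torch

A = torch.tensor([[4., 0.], [3., -5.]])

# Frobenius norm by default for a matrix: sqrt(16 + 0 + 9 + 25)
fro = torch.linalg.norm(A)

# Inverse: A @ A_inv should be (numerically) the identity
A_inv = torch.linalg.inv(A)

# SVD: A = U @ diag(S) @ Vh
U, S, Vh = torch.linalg.svd(A)
reconstructed = U @ torch.diag(S) @ Vh

print(fro)                                                # tensor(7.0711)
print(torch.allclose(A @ A_inv, torch.eye(2), atol=1e-6))  # True
print(torch.allclose(reconstructed, A, atol=1e-5))         # True
```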
