NLLLoss

TEMP · October 12, 2021

As a rule, torch.nn.LOSS is a class and torch.nn.functional.LOSS is a function (LOSS stands for any loss name).

So they plug into a trainer at different points.
The trainer I usually use can either receive the class through params, or I can call the function directly inside the trainer.
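To make the class/function difference concrete, here is a minimal sketch. The tensor sizes and variable names are my own assumptions for illustration; only `nn.NLLLoss` and `F.nll_loss` come from torch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
# NLLLoss expects log-probabilities, so the random scores go through log_softmax first.
log_probs = F.log_softmax(torch.randn(4, 3), dim=1)   # (batch, class), hypothetical sizes
targets = torch.tensor([0, 2, 1, 1])

# Class form: build the loss object once (for example, hand it to a trainer) and call it later.
criterion = nn.NLLLoss(reduction="sum")
loss_from_class = criterion(log_probs, targets)

# Function form: call it directly at the place where the loss is computed.
loss_from_function = F.nll_loss(log_probs, targets, reduction="sum")

print(torch.allclose(loss_from_class, loss_from_function))
```

Both forms compute the same value; the class form just carries its settings (such as `reduction`) around as state.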

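Always start with the official documentation. Read it even if it does not make sense yet. It is official, accurate, and friendly, and it comes with good examples. Practicing reading the official docs is how you get to learn things nobody else can tell you.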

Unsurprisingly, a hand-written loss is generally slower than the built-in one, and it can also produce NaN.
I wrote these only to understand how the loss is computed.
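One way a hand-written loss can end up as NaN, sketched under my own assumptions (the extreme logit value is chosen on purpose): taking `log` of `softmax` separately and then multiplying by a one-hot mask. The softmax underflows to exactly 0, `log(0)` is `-inf`, and `0 * -inf` is NaN.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[1000.0, 0.0]])   # one sample, two classes, deliberately extreme
target = torch.tensor([0])
onehot = F.one_hot(target, num_classes=2).float()

# Naive version: the probability of class 1 underflows to 0, so its log is -inf,
# and masking it with 0 turns it into nan instead of removing it.
naive_log_probs = torch.log(torch.softmax(logits, dim=1))
naive_loss = -(naive_log_probs * onehot).sum()

# Stable version: log_softmax computes the log-probabilities directly.
stable_loss = -(F.log_softmax(logits, dim=1) * onehot).sum()

# Built-in reference.
builtin_loss = F.cross_entropy(logits, target, reduction="sum")

print(torch.isnan(naive_loss).item(), stable_loss.item(), builtin_loss.item())
```

The built-in losses avoid this particular trap, which is one more reason to use a custom version only for understanding.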

Conclusion: NumPy-style image arrays keep the channel last, whereas torch puts it at dim=1.
There turns out to be a reason for this, and it is really convenient.
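A small sketch of why I find dim=1 convenient: the same `dim=1` code works no matter how many extra dimensions follow the class axis. The sizes below are hypothetical.

```python
import torch

torch.manual_seed(0)
shapes_after = []
# Batch 4, 5 classes, then 0, 1, or 2 extra dimensions (hypothetical sizes).
for shape in [(4, 5), (4, 5, 7), (4, 5, 7, 9)]:
    scores = torch.randn(*shape)
    probs = torch.softmax(scores, dim=1)   # the class axis is always dim=1
    total = probs.sum(dim=1)               # summing over it removes that axis
    assert torch.allclose(total, torch.ones_like(total))
    shapes_after.append(tuple(total.shape))

print(shapes_after)
```

With a channel-last layout the class axis would sit at a different index for every input rank, so the code would need `dim=-1` everywhere or a per-case index.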

torch generally provides each loss as a function and also as a class. For a small modification, subclass the class; to change the whole thing, take the function and build a new class around it.
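The two routes could look roughly like this. Both class names (`ScaledNLLLoss`, `LogSoftmaxNLLLoss`) and the `scale` parameter are hypothetical examples of mine, not part of torch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaledNLLLoss(nn.NLLLoss):
    """Small tweak (hypothetical): reuse nn.NLLLoss and only rescale its result."""
    def __init__(self, scale=1.0, **kwargs):
        super().__init__(**kwargs)
        self.scale = scale

    def forward(self, input, target):
        return self.scale * super().forward(input, target)

class LogSoftmaxNLLLoss(nn.Module):
    """Bigger change (hypothetical): a new module built around the functional call."""
    def __init__(self, reduction="mean"):
        super().__init__()
        self.reduction = reduction

    def forward(self, logits, target):
        log_probs = F.log_softmax(logits, dim=1)
        return F.nll_loss(log_probs, target, reduction=self.reduction)

torch.manual_seed(0)
logits = torch.randn(6, 4)
target = torch.tensor([0, 1, 2, 3, 1, 0])
log_probs = F.log_softmax(logits, dim=1)

scaled = ScaledNLLLoss(scale=0.5, reduction="sum")(log_probs, target)
wrapped = LogSoftmaxNLLLoss(reduction="sum")(logits, target)
```

The subclass inherits every option of `nn.NLLLoss` for free, while the wrapper is free to change the whole computation.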

torch.nn.NLLLoss

https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html

torch.nn.functional.nll_loss

https://pytorch.org/docs/stable/generated/torch.nn.functional.nll_loss.html#torch.nn.functional.nll_loss

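When I first looked at the formula in the official documentation, I had no idea what it was saying, and there was a reason for that.
When I first learned DL I learned CEE (cross-entropy error), and because I treated CEE as the default, the other formulas would not sink in.

NLLLoss can be thought of as roughly the baseline from which the CEE above is built.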
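Read this way, the formula just says "take minus the input value at the target class". A minimal sketch with hypothetical sizes, which also shows the link to cross-entropy:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(3, 4)                 # 3 samples, 4 classes (hypothetical sizes)
targets = torch.tensor([2, 0, 3])
log_probs = F.log_softmax(logits, dim=1)

# NLL per sample: minus the input value found at the target class.
picked = -log_probs[torch.arange(3), targets]
per_sample_nll = F.nll_loss(log_probs, targets, reduction="none")

# Cross-entropy on the raw logits gives the same numbers,
# because it applies log_softmax and then the NLL step.
per_sample_ce = F.cross_entropy(logits, targets, reduction="none")

print(picked, per_sample_nll, per_sample_ce)
```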

Output shape : tensor( Batch × Class × k-dimension )
Target shape : tensor( Batch × k-dimension )
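A quick shape check for the two lines above, using sizes I made up so that every dimension is different. With `reduction="none"` the result has the same shape as the targets.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# Hypothetical sizes: N=2, C=5, d1=3, d2=4.
outputs = F.log_softmax(torch.randn(2, 5, 3, 4), dim=1)   # (N, C, d1, d2)
targets = torch.randint(0, 5, (2, 3, 4))                  # (N, d1, d2), class indices

per_element = F.nll_loss(outputs, targets, reduction="none")
print(tuple(per_element.shape))   # one loss value per target element
```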

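import torch
import torch.nn as nn

loss = nn.NLLLoss(reduction='sum')
# NLLLoss expects log-probabilities; random values are used here only to check the arithmetic.
outputs = torch.rand(10, 5, requires_grad=True)  # (batch, class)
targets = torch.tensor([0,2,1,3,4,3,2,1,3,1])    # class index per sample
loss(outputs, targets)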

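def myloss(outputs, targets):
    # One-hot mask with an explicit class count, so it matches outputs even if a class never appears in targets.
    onehot = torch.nn.functional.one_hot(targets, num_classes=outputs.shape[1]).float()
    hadamard = outputs * onehot              # keeps only the value at the target class
    per_sample = torch.sum(hadamard, dim=1)  # sum over the class axis
    return -torch.sum(per_sample)            # same as reduction='sum'

myloss(outputs, targets)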

The two values above are equal.
That was the 1D case; let's try the 2D case as well.
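Before moving on, the 1D equality can be checked numerically, including the gradients. This is a self-contained sketch; `onehot_nll_sum` is a hypothetical name of mine that restates the idea of `myloss`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def onehot_nll_sum(outputs, targets):
    # Mask the outputs with a one-hot target and sum: same idea as myloss.
    onehot = F.one_hot(targets, num_classes=outputs.shape[1]).to(outputs.dtype)
    return -(outputs * onehot).sum()

torch.manual_seed(0)
outputs = torch.rand(10, 5, requires_grad=True)
targets = torch.tensor([0, 2, 1, 3, 4, 3, 2, 1, 3, 1])

builtin = nn.NLLLoss(reduction="sum")(outputs, targets)
custom = onehot_nll_sum(outputs, targets)

# The gradients should agree too: -1 at each target position, 0 elsewhere.
grad_builtin, = torch.autograd.grad(builtin, outputs)
grad_custom, = torch.autograd.grad(custom, outputs)

print(torch.allclose(builtin, custom), torch.allclose(grad_builtin, grad_custom))
```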

This is the case I actually needed, and I could not find it worked out this way in the official documentation.

Thinking of it as follows made it click.
pixel-wise <- this seems like a really powerful word. That one word is enough to understand it.
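"Pixel-wise" can be taken literally: treat every pixel as its own sample and the 2D loss becomes the 1D loss. A minimal sketch with hypothetical sizes:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
N, C, H, W = 2, 5, 3, 4                                  # hypothetical sizes
outputs = F.log_softmax(torch.randn(N, C, H, W), dim=1)
targets = torch.randint(0, C, (N, H, W))

loss_2d = F.nll_loss(outputs, targets, reduction="sum")

# Every pixel becomes one row: (N, C, H, W) -> (N*H*W, C) and (N, H, W) -> (N*H*W,)
flat_outputs = outputs.permute(0, 2, 3, 1).reshape(-1, C)
flat_targets = targets.reshape(-1)
loss_flat = F.nll_loss(flat_outputs, flat_targets, reduction="sum")

print(torch.allclose(loss_2d, loss_flat))
```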

loss = nn.NLLLoss(reduction='sum')
outputs = torch.rand(2, 5, 10, 10, requires_grad=True)
targets = torch.tensor([[
                       [4,0,1,2,4,4,2,1,3,4],
                       [0,2,1,3,4,2,0,2,3,1],
                       [2,1,1,2,4,2,2,1,3,0],
                       [0,0,1,3,2,3,3,4,3,1],
                       [4,2,1,3,4,3,2,4,2,1],
                       [4,0,1,2,4,4,2,1,3,4],
                       [0,2,1,3,4,2,0,2,3,1],
                       [2,1,1,2,4,2,2,1,3,0],
                       [0,0,1,3,2,3,3,4,3,1],
                       [4,2,1,3,4,3,2,4,2,1]],
                       
                      [[4,0,1,2,4,4,2,1,3,4],
                       [0,2,1,3,4,2,0,2,3,1],
                       [2,1,1,2,4,2,2,1,3,0],
                       [0,0,1,3,2,3,3,4,3,1],
                       [4,2,1,3,4,3,2,4,2,1],
                       [4,0,1,2,4,4,2,1,3,4],
                       [0,2,1,3,4,2,0,2,3,1],
                       [2,1,1,2,4,2,2,1,3,0],
                       [0,0,1,3,2,3,3,4,3,1],
                       [4,2,1,3,4,3,2,4,2,1]]])


loss(outputs, targets)

Likewise, I wrote a custom version.

This gets really confusing, so it helps to make every dimension a different size and print the shapes.
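For example, with made-up sizes that are all different, printing the shapes shows where the class axis lands after `one_hot` and why it has to be moved:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
N, C, H, W = 2, 5, 3, 4                       # hypothetical, all different on purpose
targets = torch.randint(0, C, (N, H, W))

onehot = F.one_hot(targets, num_classes=C)    # the class axis is appended last
channels_first = onehot.permute(0, 3, 1, 2)   # move it to dim=1 to match the outputs

print(tuple(targets.shape))         # (2, 3, 4)
print(tuple(onehot.shape))          # (2, 3, 4, 5)
print(tuple(channels_first.shape))  # (2, 5, 3, 4)
```

If H, W, and C were all the same size, a wrong permutation would run without any error, which is exactly what makes this confusing.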

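def myloss(outputs, targets):
    onehot = torch.nn.functional.one_hot(targets, num_classes=outputs.shape[1]).float()  # (N, H, W, C)
    reshape = onehot.permute(0, 3, 1, 2)     # (N, H, W, C) -> (N, C, H, W)
    hadamard = outputs * reshape             # keeps only the value at each pixel's target class
    per_pixel = torch.sum(hadamard, dim=1)   # sum over the class axis -> (N, H, W)
    return -torch.sum(per_pixel)
myloss(outputs, targets)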

Clear!

Additionally, putting log + softmax in the middle gives the following (CrossEntropyLoss).

One more thing: naturally, the input does not have to be square.

loss = nn.CrossEntropyLoss(reduction='sum')
outputs = torch.rand(2, 5, 10, 15, requires_grad=True)
targets = torch.tensor([[
                       [4,0,1,2,4,4,2,1,3,4,4,2,1,3,4],
                       [0,2,1,3,4,2,0,2,3,1,4,2,1,3,4],
                       [2,1,1,2,4,2,2,1,3,0,4,2,1,3,4],
                       [0,0,1,3,2,3,3,4,3,1,4,2,1,3,4],
                       [4,2,1,3,4,3,2,4,2,1,4,2,1,3,4],
                       [4,0,1,2,4,4,2,1,3,4,4,2,1,3,4],
                       [0,2,1,3,4,2,0,2,3,1,4,2,1,3,4],
                       [2,1,1,2,4,2,2,1,3,0,4,2,1,3,4],
                       [0,0,1,3,2,3,3,4,3,1,4,2,1,3,4],
                       [4,2,1,3,4,3,2,4,2,1,4,2,1,3,4]],
                       
                      [[4,0,1,2,4,4,2,1,3,4,4,2,1,3,4],
                       [0,2,1,3,4,2,0,2,3,1,4,2,1,3,4],
                       [2,1,1,2,4,2,2,1,3,0,4,2,1,3,4],
                       [0,0,1,3,2,3,3,4,3,1,4,2,1,3,4],
                       [4,2,1,3,4,3,2,4,2,1,4,2,1,3,4],
                       [4,0,1,2,4,4,2,1,3,4,4,2,1,3,4],
                       [0,2,1,3,4,2,0,2,3,1,4,2,1,3,4],
                       [2,1,1,2,4,2,2,1,3,0,4,2,1,3,4],
                       [0,0,1,3,2,3,3,4,3,1,4,2,1,3,4],
                       [4,2,1,3,4,3,2,4,2,1,4,2,1,3,4]]])


loss(outputs, targets)





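def myloss(outputs, targets):
    onehot = torch.nn.functional.one_hot(targets, num_classes=outputs.shape[1]).float()  # (N, H, W, C)
    reshape = onehot.permute(0, 3, 1, 2)           # (N, H, W, C) -> (N, C, H, W)
    logsoft_out = nn.LogSoftmax(dim=1)             # log-softmax over the class axis
    logsoft_out_value = logsoft_out(outputs)
    hadamard = logsoft_out_value * reshape
    per_pixel = torch.sum(hadamard, dim=1)         # sum over the class axis -> (N, H, W)
    return -torch.sum(per_pixel)
myloss(outputs, targets)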
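As an alternative sketch, the one-hot mask and the permute can be skipped by picking the target entries with `gather` along dim=1. `ce_sum_gather` is a hypothetical name of mine; this version should work for any number of trailing dimensions.

```python
import torch
import torch.nn.functional as F

def ce_sum_gather(logits, targets):
    """Summed cross-entropy for logits (N, C, d1, ..., dk) and targets (N, d1, ..., dk)."""
    log_probs = F.log_softmax(logits, dim=1)
    # Pick the log-probability at each target class along dim=1
    # instead of building a one-hot mask.
    picked = log_probs.gather(1, targets.unsqueeze(1))
    return -picked.sum()

torch.manual_seed(0)
logits = torch.randn(2, 5, 10, 15)               # non-square, as in the example above
targets = torch.randint(0, 5, (2, 10, 15))
custom = ce_sum_gather(logits, targets)
builtin = F.cross_entropy(logits, targets, reduction="sum")

# The same function also covers the plain (batch, class) case.
logits_1d = torch.randn(4, 3)
targets_1d = torch.tensor([0, 2, 1, 1])
custom_1d = ce_sum_gather(logits_1d, targets_1d)
builtin_1d = F.cross_entropy(logits_1d, targets_1d, reduction="sum")

print(torch.allclose(custom, builtin), torch.allclose(custom_1d, builtin_1d))
```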
