[PyTorch] nn.LogSoftmax() and nn.NLLLoss()

olxtar · March 28, 2022

LogSoftmax

CLASS torch.nn.LogSoftmax(dim=None)

Applies the $log(Softmax(x))$ function to an n-dimensional input Tensor. The LogSoftmax formulation can be simplified as:

$$\text{LogSoftmax}(x_i)=\log\left(\frac{e^{x_i}}{\sum_n e^{x_n}}\right)$$

Hmm... it is simply the log of the Softmax function!
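As a quick sanity check of the formula above, here is a minimal sketch comparing `nn.LogSoftmax` with taking `log` of `softmax` by hand, and with the equivalent form $x_i - \log\sum_n e^{x_n}$. The input values are made up for illustration.

```python
import torch

# Hypothetical scores for one sample with 3 classes
x = torch.tensor([[1.0, 2.0, 3.0]])

log_softmax = torch.nn.LogSoftmax(dim=1)
a = log_softmax(x)                               # the module
b = torch.log(torch.softmax(x, dim=1))           # log applied to Softmax by hand
c = x - torch.logsumexp(x, dim=1, keepdim=True)  # x_i - log(sum_n exp(x_n))

print(torch.allclose(a, b), torch.allclose(a, c))
print(a.exp().sum(dim=1))  # exponentiating gives probabilities that sum to 1
```

All three agree up to floating-point error. The third form is the one that stays numerically stable for large scores, which is one practical reason to prefer LogSoftmax over a separate Softmax followed by log.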

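So why take the log?

The first graph is $y=\log x$.
The second graph is $y=-\log x$.
Here $x$ is the input, i.e. the predicted probability,
and $y$ is the output, which we can loosely think of as the loss.
In the second graph, the closer the predicted probability $x$ is to 100% (i.e. to 1), the closer the loss gets to 0,
and the closer $x$ is to 0% (i.e. to 0), the larger the loss becomes, growing without bound.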
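To put numbers on the $y=-\log x$ curve, the sketch below evaluates it at a few predicted probabilities. The probabilities are arbitrary values picked for illustration.

```python
import math

# Hypothetical predicted probabilities for the correct class
probs = [0.99, 0.9, 0.5, 0.1, 0.01]
losses = [-math.log(p) for p in probs]

for p, loss in zip(probs, losses):
    print(f"p = {p:<5} -log(p) = {loss:.4f}")
```

The loss is close to 0 when the prediction is confident and correct, and it grows quickly as the probability assigned to the correct class approaches 0.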

NLLLoss (Negative Log Likelihood Loss)

CLASS torch.nn.NLLLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')

The negative log likelihood loss. It is useful to train a classification problem with C classes.

If provided, the optional argument weight should be a 1D Tensor assigning weight to each of the classes. This is particularly useful when you have an unbalanced training set.
Hmm... so a weight argument can be passed as well: when training on an imbalanced training set, you can supply a 1D Tensor holding one weight per class.
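A minimal sketch of the `weight` argument, assuming a 3-class problem where class 2 is rare and therefore up-weighted. The scores, targets, and weights are all made up. With `reduction='mean'`, the weighted loss is divided by the sum of the weights of the target classes, which the manual computation reproduces.

```python
import torch

# Hypothetical log-probabilities for 2 samples and 3 classes
log_probs = torch.log_softmax(
    torch.tensor([[2.0, 1.0, 0.1],
                  [0.5, 2.5, 0.3]]), dim=1)
target = torch.tensor([0, 2])

# Hypothetical per-class weights: class 2 counts 5x as much
class_weight = torch.tensor([1.0, 1.0, 5.0])

unweighted = torch.nn.NLLLoss()(log_probs, target)
weighted = torch.nn.NLLLoss(weight=class_weight)(log_probs, target)

# Manual weighted mean: sum(w_{y_n} * -x_{n,y_n}) / sum(w_{y_n})
picked = -log_probs[torch.arange(2), target]
w = class_weight[target]
manual = (w * picked).sum() / w.sum()

print(unweighted, weighted, manual)
```

Here the second sample (the rare class, predicted poorly) dominates the weighted loss, so `weighted` comes out larger than `unweighted`.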

The input given through a forward call is expected to contain log-probabilities of each class. input has to be a Tensor of size either ( minibatch, C )
$\dots$ (omitted) $\dots$
NLLLoss expects log-probabilities as its input.
That is why it is paired with the Log-Softmax function.

Quick aside!
A loss function is a function that returns how far the prediction is from the actual value (the label).
Gradient descent steps are then taken in order to reduce that loss.

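In deep learning, where a huge amount of input data is processed, computing the loss for a single input and taking a step each time requires too much computation, so the loss is usually computed over a minibatch or batch of inputs. With the default reduction='mean', the per-sample losses are averaged into a single value.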

$\therefore$ As the quoted documentation above says, the input tensor of the loss function has size ( minibatch, C ).
e.g. classifying handwritten digits 0~9 with batch_size = 64 $\rightarrow$ an input tensor of size (64,10)
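The shapes in the digit example can be checked with a short sketch. Random scores stand in for real model outputs here, so the loss value itself is meaningless; only the shapes matter.

```python
import torch

batch_size, num_classes = 64, 10

# Random stand-in for the model output (assumption: no real model here)
logits = torch.randn(batch_size, num_classes)
log_probs = torch.nn.LogSoftmax(dim=1)(logits)         # shape (64, 10)
target = torch.randint(0, num_classes, (batch_size,))  # shape (64,), class indices

loss = torch.nn.NLLLoss()(log_probs, target)           # scalar (mean over the batch)
print(log_probs.shape, target.shape, loss.shape)
```

Note that the target is a 1D tensor of class indices with one entry per sample, not a one-hot (64,10) tensor.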

The NLLLoss (Negative Log Likelihood Loss) function is mainly used for problems that classify inputs into C classes. The formula is as follows.

$$loss = -w_{y_n}\cdot x_{n,y_n}$$

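$x_n$ : the n-th input (a row of log-probabilities); $x_{n,y_n}$ is its entry at the correct class
$y_n$ : the label (correct class) of the n-th input
$w$ : not the familiar network weight, but a per-class weight used by this loss function; $w_{y_n}$ is the weight of class $y_n$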

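So, thinking about it simply (with no class weights), it comes down to this.

$$loss = -x$$

Wait... it just negates the value? -> Yep.
In addition, with the default reduction='mean', it averages the per-sample losses over the batch.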
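To see that NLLLoss really is just "pick the entry at the correct class, negate, then average", here is a small sketch with made-up scores comparing `reduction='none'` and `reduction='mean'` against manual indexing.

```python
import torch

# Hypothetical log-probabilities for 2 samples and 3 classes
log_probs = torch.log_softmax(
    torch.tensor([[1.0, 2.0, 3.0],
                  [1.0, 2.0, 0.5]]), dim=1)
target = torch.tensor([2, 0])

per_sample = torch.nn.NLLLoss(reduction="none")(log_probs, target)  # one loss per sample
mean_loss = torch.nn.NLLLoss(reduction="mean")(log_probs, target)   # default behaviour

# Manual version: -x[n, y_n] for each sample n
manual = -log_probs[torch.arange(2), target]

print(per_sample, manual)
print(mean_loss, manual.mean())
```

No log is taken anywhere inside NLLLoss; the values were already log-probabilities when they came in.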

Code

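import torch
import torch.nn.functional as F

# One sample with 10 class scores (logits); the correct class is index 1
x = torch.tensor([[0.8982, 0.805, 0.6393, 0.9983, 0.5731,
                   0.0469, 0.556, 0.1476, 0.8404, 0.5544]])
y = torch.tensor([1])

# Case 1: CrossEntropyLoss takes the raw scores directly
cross_entropy_loss = torch.nn.CrossEntropyLoss()
print(cross_entropy_loss(x, y)) # tensor(2.1438)

# Case 2: LogSoftmax first, then the functional NLL loss
log_softmax = torch.nn.LogSoftmax(dim=1)
x_log = log_softmax(x)
print(F.nll_loss(x_log, y)) # tensor(2.1438)

# Case 3: the same log-probabilities through the nn.NLLLoss module
nll_loss = torch.nn.NLLLoss()
print(nll_loss(x_log, y)) # tensor(2.1438)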

Keypoint
1. The $NLLLoss$ function does not take the $log$ itself! (It expects $log$ probabilities as its input.)
2. $CrossEntropyLoss$ = $Softmax$ + $log$ + $NLLLoss$
i.e. $CrossEntropyLoss$ = $LogSoftmax$ + $NLLLoss$
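Written out for a single sample with scores $x$ and correct class $y$ (no class weights), the identity above is a one-line derivation. The numbers plugged in at the end are the scores from the Code section, rounded to four decimals by hand, so treat them as approximate.

$$
\begin{aligned}
CrossEntropyLoss(x, y) &= NLLLoss(LogSoftmax(x), y) \\
&= -\log\left(\frac{e^{x_y}}{\sum_n e^{x_n}}\right) \\
&= -x_y + \log\sum_n e^{x_n} \\
&\approx -0.8050 + 2.9488 = 2.1438
\end{aligned}
$$

This matches the tensor(2.1438) printed by all three cases.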
