Logit, Soft Probability, and Hard Probability in Deep Learning: Differences and When to Use Each

Bean · September 29, 2025

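1. What is a logit?

A logit is the raw output value (raw score) produced by the model's final layer, typically a linear layer, before any softmax or sigmoid is applied.
It is not a probability: it is unbounded and can be negative or positive.

Example: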

logits = model(x)  # raw score
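
To make this concrete, here is a minimal sketch. It assumes a hypothetical 3-class classification head built from `torch.nn.Linear`; the name `head` and the layer sizes are made up for illustration and are not from the original post. The point is only that the raw outputs are unbounded and do not sum to 1.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical classification head (assumption): 4 input features -> 3 classes.
head = nn.Linear(in_features=4, out_features=3)

x = torch.randn(2, 4)   # a batch of 2 samples
logits = head(x)        # raw scores, shape (2, 3)

print(logits.shape)        # torch.Size([2, 3])
print(logits)              # values may be negative or positive
print(logits.sum(dim=1))   # generally not 1: logits are not probabilities
```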

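2. Soft Probability (Soft Prob)

A soft probability is the result of applying softmax (multi-class) or sigmoid (multi-label) to the logits.
With softmax, the outputs form a probability distribution over the classes and sum to 1. With sigmoid, each class gets an independent probability in [0, 1], so the values need not sum to 1. The softmax is defined as: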

p_i = \frac{e^{z_i}}{\sum_j e^{z_j}}

Characteristics
- Differentiable → used for training (loss computation)
- Can reflect the model's uncertainty
- Example: [0.1, 0.7, 0.2]
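
The difference between the softmax and sigmoid cases can be sketched as follows. The logit values here are arbitrary and chosen only for illustration.

```python
import torch

logits = torch.tensor([[1.0, 3.0, 2.0]])  # one sample, three classes (arbitrary values)

# Multi-class: softmax normalizes across the class dimension, so each row sums to 1.
soft_multiclass = torch.softmax(logits, dim=1)

# Multi-label: sigmoid scores each class independently, so a row need not sum to 1.
soft_multilabel = torch.sigmoid(logits)

print(soft_multiclass, soft_multiclass.sum(dim=1))
print(soft_multilabel, soft_multilabel.sum(dim=1))
```

Both results stay in [0, 1], but only the softmax row is a distribution over mutually exclusive classes.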

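3. Hard Probability (Hard Prob)

A hard probability is the result of selecting the class with the largest soft probability and discarding the rest.
It is usually obtained with argmax, optionally followed by one-hot encoding.

Example:
Soft Prob: [0.1, 0.7, 0.2] → Hard Prob: [0, 1, 0]

Characteristics
- Not differentiable (argmax provides no useful gradient) → not suitable for training
- Intuitive for people to interpret
- Needed at evaluation and deployment time, when a single final class must be reported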
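
A minimal sketch of the conversion, using the same example values. It also checks two facts that follow from the definitions: softmax preserves ordering, so argmax over the logits picks the same class, and the output of argmax carries no gradient.

```python
import torch
import torch.nn.functional as F

soft_prob = torch.tensor([[0.1, 0.7, 0.2]])

pred_class = torch.argmax(soft_prob, dim=1)        # class index
hard_prob = F.one_hot(pred_class, num_classes=3)   # one-hot form

print(pred_class)  # tensor([1])
print(hard_prob)   # tensor([[0, 1, 0]])

# Softmax is monotonic, so argmax over logits and over soft probs agree.
logits = torch.tensor([[1.0, 3.0, 2.0]], requires_grad=True)
same = torch.equal(
    torch.argmax(logits, dim=1),
    torch.argmax(torch.softmax(logits, dim=1), dim=1),
)
print(same)  # True

# argmax returns integer indices, so no gradient can flow back through it.
print(torch.argmax(logits, dim=1).requires_grad)  # False
```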

4. Summary: when to use each

Situation	Logit	Soft Prob	Hard Prob
Training	Model output (raw)	✅ Loss computation	❌
Validation	Possible	Optional	✅ Commonly used
Test/Deployment (Inference)	❌	❌	✅ Final prediction
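
The training and inference rows can be sketched as below. One PyTorch detail to keep in mind: `F.cross_entropy` takes raw logits and applies log-softmax internally, so the soft probabilities are computed inside the loss rather than by the caller. The random `logits` tensor is a stand-in for `model(x)` and the `target` values are made up for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for model(x) (assumption): 4 samples, 3 classes.
logits = torch.randn(4, 3, requires_grad=True)
target = torch.tensor([0, 2, 1, 2])

# Training: the loss is computed from soft probabilities.
# F.cross_entropy expects logits and applies log-softmax itself.
loss = F.cross_entropy(logits, target)
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), target)
loss.backward()  # gradients flow back to the logits

# Validation / inference: no gradients needed, report hard predictions.
with torch.no_grad():
    pred = logits.argmax(dim=1)
    accuracy = (pred == target).float().mean()

print(loss.item(), loss_manual.item())  # the two losses match
print(pred, accuracy.item())
```

Passing already-softmaxed values to `F.cross_entropy` would apply softmax twice, which is a common source of silently degraded training.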

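5. Code examples

Training (using soft probabilities)

import torch.nn.functional as F

def dice_loss(pred, target, smooth=1e-6):
    # pred: logits of shape (N, C, L); target: integer class indices of shape (N, L)
    pred_soft = F.softmax(pred, dim=1)  # soft probabilities over the class dimension
    # One-hot encode the target and move the class axis to dim 1: (N, L, C) -> (N, C, L)
    target_one_hot = F.one_hot(target, num_classes=pred.size(1)).permute(0, 2, 1).float()

    intersection = (pred_soft * target_one_hot).sum(dim=2)  # per-sample, per-class overlap
    dice_score = (2 * intersection + smooth) / (pred_soft.sum(dim=2) + target_one_hot.sum(dim=2) + smooth)
    return 1 - dice_score.mean()  # differentiable, so usable as a training loss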

Evaluation (using hard probabilities)

import torch
import torch.nn.functional as F

def dice_score(pred_logits, target, smooth=1e-6):
    # pred_logits: (N, C, L); target: integer class indices of shape (N, L)
    # Softmax is monotonic, so argmax over the logits gives the same hard prediction.
    pred_classes = torch.argmax(pred_logits, dim=1)  # hard prediction, shape (N, L)
    pred_one_hot = F.one_hot(pred_classes, num_classes=pred_logits.size(1)).permute(0, 2, 1).float()
    target_one_hot = F.one_hot(target, num_classes=pred_logits.size(1)).permute(0, 2, 1).float()

    intersection = (pred_one_hot * target_one_hot).sum(dim=2)
    dice_scores = (2 * intersection + smooth) / (pred_one_hot.sum(dim=2) + target_one_hot.sum(dim=2) + smooth)
    return dice_scores.mean()  # not differentiable: a metric, not a loss
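
To see the two variants side by side, the sketch below runs a soft and a hard Dice computation on a tiny hand-made input. The helper names `soft_dice` and `hard_dice` and the toy tensors are illustrative assumptions; the formulas mirror the functions above, with inputs laid out as (N, C, L).

```python
import torch
import torch.nn.functional as F

def soft_dice(logits, target, smooth=1e-6):
    # Dice computed from soft probabilities (differentiable).
    probs = F.softmax(logits, dim=1)
    onehot = F.one_hot(target, num_classes=logits.size(1)).permute(0, 2, 1).float()
    inter = (probs * onehot).sum(dim=2)
    return ((2 * inter + smooth) / (probs.sum(dim=2) + onehot.sum(dim=2) + smooth)).mean()

def hard_dice(logits, target, smooth=1e-6):
    # Dice computed from argmax predictions (not differentiable).
    n_cls = logits.size(1)
    pred = F.one_hot(logits.argmax(dim=1), num_classes=n_cls).permute(0, 2, 1).float()
    onehot = F.one_hot(target, num_classes=n_cls).permute(0, 2, 1).float()
    inter = (pred * onehot).sum(dim=2)
    return ((2 * inter + smooth) / (pred.sum(dim=2) + onehot.sum(dim=2) + smooth)).mean()

# Toy input: 1 sample, 2 classes, 4 positions.
logits = torch.tensor([[[2.0, 0.5, -1.0, 0.1],
                        [-1.0, 0.0, 1.5, 0.0]]], requires_grad=True)
target = torch.tensor([[0, 0, 1, 1]])  # hard prediction is [0, 0, 1, 0]

soft = soft_dice(logits, target)
hard = hard_dice(logits, target)

print(soft.requires_grad)     # True: can be used as 1 - soft for a loss
print(hard.requires_grad)     # False: metric only
print(round(hard.item(), 4))  # 0.7333 = mean of 4/5 (class 0) and 2/3 (class 1)
```

The hard score changes only when a prediction flips class, while the soft score moves with every small change in confidence, which is why the soft form is used for optimization and the hard form for reporting.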