Cross-Entropy gradient

d4r6j · September 24, 2023

math


Cross-Entropy gradient: case (Hard distillation)

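Each case contributes a cross-entropy gradient $\partial C / \partial z_i$ with respect to each logit $z_i$ of the distilled model.

If the cumbersome model has logits $v_i$ which produce soft target probabilities $p_i$, and the transfer training is done at a temperature of $T$, this gradient is

$$\frac{\partial C}{\partial z_i} = \frac{1}{T}(q_i - p_i)$$

where $q_i$ is the distilled model's softmax output for logit $z_i$ at temperature $T$.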
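The gradient formula above can be sanity-checked numerically. The following is a minimal sketch, not code from the paper or the referenced blog: it assumes $C = -\sum_i p_i \log q_i$ with $q = \mathrm{softmax}(z/T)$ and $p = \mathrm{softmax}(v/T)$, and the logit values, the temperature, and all function names are made up for illustration. It compares the analytic gradient $(q_i - p_i)/T$ with a central finite difference.

```python
import math


def softmax(logits, T=1.0):
    """Softmax of logits / T, shifted by the max for numerical stability."""
    scaled = [x / T for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(z, p, T):
    """C = -sum_i p_i * log(q_i), with q = softmax(z / T)."""
    q = softmax(z, T)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


def analytic_grad(z, p, T):
    """Closed form from the text: dC/dz_i = (q_i - p_i) / T."""
    q = softmax(z, T)
    return [(qi - pi) / T for qi, pi in zip(q, p)]


def numeric_grad(z, p, T, eps=1e-6):
    """Central finite difference of C with respect to each logit z_i."""
    grad = []
    for i in range(len(z)):
        z_plus = list(z)
        z_minus = list(z)
        z_plus[i] += eps
        z_minus[i] -= eps
        diff = cross_entropy(z_plus, p, T) - cross_entropy(z_minus, p, T)
        grad.append(diff / (2 * eps))
    return grad


# Illustrative values only (assumptions, not taken from the source).
T = 2.0
v = [2.0, 1.0, 0.1]    # cumbersome (teacher) logits
z = [0.5, 1.5, -0.3]   # distilled (student) logits

p = softmax(v, T)      # soft targets, held fixed while differentiating
ga = analytic_grad(z, p, T)
gn = numeric_grad(z, p, T)

for a, n in zip(ga, gn):
    print(f"analytic={a:+.8f}  numeric={n:+.8f}")
```

The two columns should agree to within the finite-difference error. The components of the gradient also sum to zero, since both $q$ and $p$ sum to one.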

[REF]
paper : https://arxiv.org/pdf/1503.02531.pdf
blog : https://jmlb.github.io/ml/2017/12/26/Calculate_Gradient_Softmax/


Softmax, Cross-Entropy formula
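As a sketch of the formulas this section presumably relies on, here are the temperature softmax and cross-entropy as defined in the referenced paper, followed by the one differentiation step that leads to the gradient stated earlier. The Kronecker delta $\delta_{ij}$ is notation introduced here for the derivation, and the soft targets $p_j$ are assumed to sum to one.

$$
\begin{aligned}
q_i &= \frac{\exp(z_i/T)}{\sum_j \exp(z_j/T)}, \qquad
p_i = \frac{\exp(v_i/T)}{\sum_j \exp(v_j/T)}, \qquad
C = -\sum_j p_j \log q_j \\
\frac{\partial q_j}{\partial z_i} &= \frac{1}{T}\, q_j\,(\delta_{ij} - q_i) \\
\frac{\partial C}{\partial z_i}
&= -\sum_j \frac{p_j}{q_j}\,\frac{\partial q_j}{\partial z_i}
= -\frac{1}{T}\sum_j p_j\,(\delta_{ij} - q_i)
= \frac{1}{T}\,(q_i - p_i)
\end{aligned}
$$

The last equality uses $\sum_j p_j \delta_{ij} = p_i$ and $\sum_j p_j = 1$.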

Simple neural networks

Process of solving gradient

Solve gradient

General neural networks

Process of solving gradient

Solve gradient
