Basic : Softmax Layer

Austin Jiuk Kim · March 26, 2022

Softmax Layer

Softmax is used for multinomial (multi-class) classification.

The number of classes equals the number of neurons in the last layer, and this is also the number of outputs, interpreted as probabilities.

In TensorFlow, Softmax is treated as a kind of activation function.
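
As an illustration, a last layer with a softmax activation might look like the following minimal Keras sketch; the 10 classes, 20 input features, and hidden width of 64 are assumed values for the example.

```python
import tensorflow as tf

# Minimal sketch of a softmax output layer (all sizes are assumptions).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),                   # 20 input features
    tf.keras.layers.Dense(64, activation='relu'),  # hidden layer
    # Ten neurons in the last layer -> ten class probabilities;
    # softmax is passed as the layer's activation function.
    tf.keras.layers.Dense(10, activation='softmax'),
])
```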

Each output of a neuron in the last layer can be regarded as a logit.

$$S_i\left((\overrightarrow{l})^{T}\right) = p_i = \frac{e^{l_i}}{\sum_{k=1}^{K} e^{l_k}}$$

Here $(\overrightarrow{l})^{T}$ is the logit vector and $(\overrightarrow{p})^{T}$ is the resulting probability vector.

In softmax, the logit vector is converted to the probability vector, and the sum of the elements of the probability vector equals 1.
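
A quick NumPy sketch of the formula, with assumed logit values, shows the conversion and the sum-to-1 property:

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])  # assumed logit vector

# Subtracting the max logit is a standard numerical-stability trick;
# it cancels in the ratio and does not change the probabilities.
exps = np.exp(logits - logits.max())
probs = exps / exps.sum()

print(probs)        # [0.659 0.242 0.099] (rounded)
print(probs.sum())  # 1.0
```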

For binary classification, there are two options: use Sigmoid with a single neuron in the last layer, or use Softmax with two neurons.
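
Both options sketched in Keras (the input size of 20 is an assumption):

```python
import tensorflow as tf

# Option 1: Sigmoid with one neuron; the output is P(class 1).
sigmoid_model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# Option 2: Softmax with two neurons; the output is [P(class 0), P(class 1)].
softmax_model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(2, activation='softmax'),
])
```

The first is typically trained with binary cross-entropy, the second with categorical cross-entropy.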

Strictly speaking, there is only an affine function and no activation function at the output layer. Instead, Softmax converts the vector $\overrightarrow{z}$ to the vector $\overrightarrow{p}$:

$$(\overrightarrow{a})^{[O-1]} \xrightarrow{\text{affine}} (\overrightarrow{z})^{[O]} \xrightarrow{\text{softmax}} (\overrightarrow{p})^{T}$$
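
This is how Keras models are often written in practice: the last Dense layer is purely affine, and softmax is applied either inside the loss (with `from_logits=True`) or explicitly at inference time. A sketch, with assumed sizes:

```python
import tensorflow as tf

# Output layer is affine only: the model emits logits z, not probabilities.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),   # assumed input size
    tf.keras.layers.Dense(10),     # affine only -> logit vector z
])

# With from_logits=True, the loss applies softmax internally during training.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# At inference time, softmax converts the logits z to the probabilities p.
z = model(tf.random.normal((1, 20)))
p = tf.nn.softmax(z)               # each row sums to 1
```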

In conclusion, Softmax is a converter from logits to probabilities, while Sigmoid is a converter from a single logit to a probability.
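
The two converters agree: with two logits, softmax reduces to a sigmoid of their difference, which is why a single sigmoid neuron suffices in the binary case.

$$p_1 = \frac{e^{l_1}}{e^{l_1} + e^{l_2}} = \frac{1}{1 + e^{-(l_1 - l_2)}} = \sigma(l_1 - l_2)$$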

Softmax Layer

$$X^{T} \rightarrow L^{[I]} \rightarrow L^{[O]} \rightarrow \text{Softmax} \rightarrow \hat{Y}^{T}$$
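
The whole diagram can be written as one Keras model; every layer size here is an assumption for illustration:

```python
import tensorflow as tf

# X^T -> L^[I] -> L^[O] -> Softmax -> Y_hat^T
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),                   # X^T
    tf.keras.layers.Dense(64, activation='relu'),  # L^[I]
    tf.keras.layers.Dense(10),                     # L^[O] (affine only -> logits)
    tf.keras.layers.Softmax(),                     # logits -> probabilities
])

X = tf.random.normal((4, 20))
Y_hat = model(X)  # shape (4, 10); each row sums to 1
```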