Basic : The First Dense Layer

Austin Jiuk Kim · March 24, 2022
Deep Learning series, post 6/10

Parameters of Dense Layer

The input row vector

(\overrightarrow{x})^T = (x_1 \dots x_{l_x})

goes through the layers

\dots \: {L}_i \: \dots

each of which is composed of neurons

\dots \: {\nu}_{i}^{\:[i_L]} \: \dots

and each neuron is composed of a weight column vector and a scalar bias:

\dots \: \overrightarrow{w}_{i,\: i_{\nu}}^{[i_L]} \: \dots \in \R^{l_w \times 1}
\dots \: {b}_{\: \: i_{\nu}}^{[i_L]} \: \dots \in \R^{1 \times 1}

There are two reasons why the weights are arranged as column vectors: first, linear algebra treats column vectors as the default; second, a dense layer in particular reads each neuron's weights as a column.
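As a minimal NumPy sketch of a single neuron, assuming the notation above (the names `l_x`, `x_T`, `w`, `b` are illustrative, not from any library):

```python
import numpy as np

# One neuron nu_i of a dense layer: its parameters are a weight column
# vector w in R^{l_w x 1} and a scalar bias b in R^{1 x 1}.
rng = np.random.default_rng(0)

l_x = 4                               # input length (here l_w = l_x)
x_T = rng.standard_normal((1, l_x))   # row vector (x)^T in R^{1 x l_x}
w = rng.standard_normal((l_x, 1))     # weight column vector in R^{l_w x 1}
b = rng.standard_normal((1, 1))       # scalar bias in R^{1 x 1}

a = x_T @ w + b                       # affine output of one neuron
print(a.shape)                        # (1, 1)
```

Because `w` is a column, the row vector `x_T` multiplies it directly, which is why the column layout is the natural one for a dense layer.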

Weighted Matrix and Bias Vector

Combine the column vectors of the weights into a matrix.

The shape of the weight matrix is the input length times the output length.

{W}^{[i_L]} \in \R^{l_w \times l_{\nu}}
\overrightarrow{b}^{[i_L]} \in \R^{1 \times l_{\nu}}
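This stacking can be sketched in NumPy; the names below (`cols`, `biases`, `l_w`, `l_nu`) are illustrative only:

```python
import numpy as np

# Stack the per-neuron weight column vectors side by side to build the
# weight matrix W in R^{l_w x l_nu}; collect the scalar biases into a
# row vector b in R^{1 x l_nu}.
rng = np.random.default_rng(0)

l_w, l_nu = 4, 3
cols = [rng.standard_normal((l_w, 1)) for _ in range(l_nu)]   # one column per neuron
biases = [rng.standard_normal((1, 1)) for _ in range(l_nu)]   # one scalar per neuron

W = np.hstack(cols)       # shape (l_w, l_nu): input length x output length
b_vec = np.hstack(biases) # shape (1, l_nu)
print(W.shape, b_vec.shape)   # (4, 3) (1, 3)
```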

Forward Propagation of Dense Layer

{a}_i^{[i_L]} = \nu_i^{\:[i_L]}((\overrightarrow{x})^T; \: \overrightarrow{w}_{i,\: i_{\nu}}^{[i_L]}, \: b_{\: \: i_{\nu}}^{[i_L]})
(\overrightarrow{a}^{[i_L]})^T = (\overrightarrow{x})^T \cdot {W}^{[i_L]} + \overrightarrow{b}^{[i_L]}



(\overrightarrow{x})^T \in \R^{1 \times l_x}

becomes

(\overrightarrow{a})^T \in \R^{1 \times l_{\nu}}
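The shape change above can be verified with a small NumPy sketch, which also checks that the matrix form agrees with the per-neuron form (all names are illustrative):

```python
import numpy as np

# Forward propagation (a)^T = (x)^T W + b, with a column-wise check that
# each entry a_i matches the per-neuron computation x^T w_i + b_i.
rng = np.random.default_rng(0)

l_x, l_nu = 4, 3
x_T = rng.standard_normal((1, l_x))    # (x)^T in R^{1 x l_x}
W = rng.standard_normal((l_x, l_nu))   # W in R^{l_w x l_nu}
b = rng.standard_normal((1, l_nu))     # b in R^{1 x l_nu}

a_T = x_T @ W + b                      # (a)^T in R^{1 x l_nu}

# Column-wise view: neuron i_nu uses column W[:, i_nu] and bias b[0, i_nu].
for i in range(l_nu):
    assert np.isclose(a_T[0, i], (x_T @ W[:, i]).item() + b[0, i])

print(a_T.shape)                       # (1, 3)
```

The multiplication consumes the `l_x` axis and leaves the `l_nu` axis, which is exactly the shape change stated above.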