4. Backpropagation and Neural Networks

Speedwell 🍀 · March 27, 2022

cs231n

๋ชฉ๋ก ๋ณด๊ธฐ
5/8

📌 Presentation Key Points 📌

  • Brief review (Optimization, Gradient Descent)
  • Computational graph flow
    • loss function
    • inputs, parameters
  • Brief explanation of gradients
    • what the gradient means for a matrix (partial derivatives)
  • Backpropagation (scalar)
    • Show how local gradients flow backward via the Chain Rule (2 lecture examples: follow slides 23-28)
    • Process: please walk through every example calculation carefully!
    • Backward-pass flow per gate (add, max, mul, copy gates -> look these up!)
    • Show and explain the forward pass and gradient computation in code
  • Backpropagation (vector)
    • Explain the Jacobian matrix
    • Answer the question on the vectorized-operations slide in the lecture
    • Bring 2-3 vector/Jacobian matrix examples (excluding the ones on the cs231n lecture slides) and compute/explain them step by step (please run it as a work-through-together exercise during the seminar!)
    • forward() / backward() API
  • Neural Networks
    • concepts (architecture, etc.)
    • activation functions (explain each function, its drawbacks, and the fixes; e.g. ReLU's drawback -> fix -> Leaky ReLU)
    • Architectures: what a Fully-Connected Layer means (a comparison with a 1x1 conv layer would also be nice)
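
For the scalar backpropagation item, a minimal sketch of the kind of walk-through the outline asks for, using the small graph f(x, y, z) = (x + y) * z with x = -2, y = 5, z = -4 (values assumed to match the lecture example):

```python
# Forward pass through the computational graph f = (x + y) * z
x, y, z = -2.0, 5.0, -4.0
q = x + y            # add gate: q = 3.0
f = q * z            # mul gate: f = -12.0

# Backward pass (Chain Rule), starting from df/df = 1
df_df = 1.0
df_dq = z * df_df    # mul gate: gradient w.r.t. one input is the OTHER input
df_dz = q * df_df    # -> 3.0
df_dx = 1.0 * df_dq  # add gate: distributes the upstream gradient unchanged
df_dy = 1.0 * df_dq  # -> -4.0

print(f, df_dx, df_dy, df_dz)  # -12.0 -4.0 -4.0 3.0
```

This also illustrates the gate behaviors listed above: the add gate copies the upstream gradient to both inputs, and the mul gate swaps the inputs as local gradients.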
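
For the forward() / backward() API item, a minimal sketch of a gate object; the class name and structure are my own illustration, not the cs231n staff code:

```python
class MultiplyGate:
    def forward(self, x, y):
        # cache the inputs: the backward pass needs them as local gradients
        self.x, self.y = x, y
        return x * y

    def backward(self, dout):
        # mul gate: each input's gradient is the OTHER input times the
        # upstream gradient dout (Chain Rule)
        return self.y * dout, self.x * dout

gate = MultiplyGate()
out = gate.forward(3.0, -4.0)  # -12.0
dx, dy = gate.backward(1.0)    # dx = -4.0, dy = 3.0
```

Caching inputs during forward() is the key design point: backward() can then be called later, after the rest of the graph has run.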
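
For the Jacobian item, a small sketch of the usual point made about vectorized operations: for an elementwise op like max(0, x) the Jacobian is diagonal, so in practice the backward pass is an elementwise mask and the full matrix is never materialized (the input values here are my own example):

```python
import numpy as np

x = np.array([-1.0, 2.0, -3.0, 4.0])
y = np.maximum(0.0, x)               # forward: [0., 2., 0., 4.]

dy = np.array([0.1, 0.2, 0.3, 0.4])  # some upstream gradient dL/dy
dx = dy * (x > 0)                    # implicit Jacobian-vector product

# the explicit (wasteful) diagonal Jacobian gives the same answer
J = np.diag((x > 0).astype(float))
print(np.allclose(dx, J @ dy))       # True
```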
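
For the activation-function item, a sketch of the ReLU -> Leaky ReLU point: ReLU's gradient is exactly zero for negative inputs (the "dying ReLU" drawback), while Leaky ReLU keeps a small slope there (alpha = 0.01 is an assumed default; it is a hyperparameter):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # small slope alpha for x <= 0 instead of a hard zero
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 1.5])

# gradients of each activation w.r.t. its input
d_relu = (x > 0).astype(float)        # [0., 0., 1.]   -> no signal for x < 0
d_leaky = np.where(x > 0, 1.0, 0.01)  # [0.01, 0.01, 1.] -> small signal survives
```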

References:
Lecture: https://www.youtube.com/watch?v=d14TUNcbn1k&list=PLC1qU-LWwrF64f4QKQT-Vg5Wr4qEE1Zxk&index=4&ab_channel=StanfordUniversitySchoolofEngineering

Slides: http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture4.pdf

0๊ฐœ์˜ ๋Œ“๊ธ€