나만봄 - 역전파가 뭐야

esc247·2023년 3월 9일

AI 나만봄 딥러닝

AI

목록 보기

8/22

이전 포스팅이 너무 길어질 것 같아서
나눠서 포스팅을 한다.

chatGPT의 답변

이번에도 말이 많아 요약하면,

in order to minimize the difference between the predicted output and the actual output.
the error between the predicted output and the actual output is propagated back through the network, and the weights are adjusted based on the magnitude of the error
using the chain rule

Input Layer에서 Output Layer 방향으로 계산하는 것이 순전파(Forward propagation)라면
Output Layer에서 Input Layer 방향으로
각 층에 사용된 가중치를 오차를 줄여나가는 방향으로 갱신하는 알고리즘.
연쇄 법칙(chain-rule)기반 자동 미분(auto-differentiation)을 사용한다.

질문 1. Chain-Rule이 무엇인가?

The chain rule is used to calculate the derivative of a composite function. The chain rule formula states that dy/dx = dy/du × du/dx. In words, differentiate the outer function while keeping the inner function the same then multiply this by the derivative of the inner function. - https://mathsathome.com/chain-rule-differentiation/

사용된 단어 그대로 줄줄이 연산을 하는 것인데,
예시를 먼저 보자.