RL

1. Bellman equation

2. Policy gradient

3. A2C (Advantage Actor-Critic) algorithm
