Reinforcement Learning - hyukppenheim YouTube

1. 1-1. Q-learning

2. 2-1. Markov Decision Process (MDP)

3. 2.2 State value function, Action value function & Optimal policy

4. 2.3 Bellman equation

5. 3.1 Optimal policy - more details

6. 3.2 Monte Carlo (MC)

7. 3.3 Temporal difference (TD) & SARSA

8. 3.4 MC vs TD

9. 4.1 On-policy vs Off-policy

10. 4.2 Q-learning (advanced)

11. 4.3 SARSA vs Q-learning

12. 4.4 n-step TD vs n-step Q-learning

13. 5.1 2013 DQN paper review
