Series

reinforcement learning

1. Reinforcement Learning Study Notes

These are my notes from studying Sutton's Reinforcement Learning (2nd edition) with people I met in the RL.start() open KakaoTalk chat room; the study runs from 20.10 to 20.12.

November 15, 2020

2. [Chapter 4] Dynamic Programming

Dynamic Programming (hereafter DP) is a method for finding the optimal policy when the model of a Markov Decision Process (hereafter MDP) is given perfectly. It comes with two conditions, or constraints, however: the assumption of a per

November 15, 2020

3. [Chapter 8] Planning and Learning with Tabular Methods

A unified view of reinforcement learning methods that require a model of the environment (dynamic programming, heuristic search) and methods that can be used without a model

February 21, 2021