Are you crazy, human?

Series

Reinforcement Learning

1.Reinforcement Learning

A close look at reinforcement learning

August 2, 2022

2.Markov Decision Process

The premise of reinforcement learning

August 10, 2022

3.Dynamic Programming

Finding the optimal solution by repeating sub-methods

August 11, 2022

4.Policy and Value function

state value and action value

August 11, 2022

5.Policy Iteration and Value Iteration

In search of the optimal policy and optimal value function

August 12, 2022

6.Sync & Async DP

Asynchronous Dynamic Programming

August 17, 2022

7.Monte Carlo Method

Estimating the answer through randomness

August 12, 2022

8.Monte Carlo Prediction

Visits in Monte Carlo

August 12, 2022

9.Monte Carlo Control

Proof of convergence for Monte Carlo

August 12, 2022

10.Temporal Difference: Intro

Don't wait, use an estimate

August 17, 2022

11.Temporal Difference: Pred

TD: Prediction

October 31, 2022

12.Temporal Difference: Ctrl

TD: Control

November 2, 2022

13.n-step Bootstrapping

Between MC and TD(0)

November 3, 2022

14.Dyna Q

Plan and Train based on table

November 9, 2022

15.Function Approximation: Intro

Estimating Value Functions as Supervised Learning

November 9, 2022

16.Function Approximation: Pred

The Objective for On-policy Prediction

November 9, 2022

17.Linear TD Update

Semi-Gradient & State Aggregation

November 10, 2022

18.Feature Construction for Linear Methods

Coarse coding & State Aggregation

November 14, 2022

19.Policy Gradient

Softmax

December 16, 2022

20.Actor-Critic

For Continuing Tasks

December 16, 2022

21.Policy Parameterization

Softmax

January 12, 2023

22.Information Theory - Entropy

Information theory is a way to share ideas. Each medium differs in the time it takes to consume and in the size of its content. For example, sending by letter takes

January 30, 2023