A. A common approach: the state consists of (1) the position (x, y), (2) the orientation θ, (3) the velocity (ẋ, ẏ), and (4) the angular velocity θ̇. This gives a 6-dimensional state representation.
Last example: The inverted pendulum.
→ The state is (x, θ, ẋ, θ̇).
In this lecture, we focus on problems where the state space is S = ℝⁿ.
Discretization
… the most straightforward way to work with a continuous state space.
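A minimal sketch of discretization for the inverted pendulum state (x, θ, ẋ, θ̇) might look like the following. The value ranges, the number of bins `K`, and the function names `make_bins` and `discretize` are all assumptions made for illustration; they are not taken from the lecture.

```python
from bisect import bisect_right

def make_bins(low, high, k):
    """Return the k-1 interior edges that split [low, high] into k equal bins."""
    width = (high - low) / k
    return [low + i * width for i in range(1, k)]

def discretize(state, edges_per_dim):
    """Map a continuous state (sequence of floats) to one integer cell index.

    Values outside the assumed range fall into the first or last bin.
    """
    index = 0
    for value, edges in zip(state, edges_per_dim):
        bin_id = bisect_right(edges, value)       # integer in 0 .. len(edges)
        index = index * (len(edges) + 1) + bin_id  # mixed-radix encoding
    return index

# Hypothetical ranges for (x, theta, x_dot, theta_dot); assumed, not from the notes.
RANGES = [(-2.4, 2.4), (-0.21, 0.21), (-3.0, 3.0), (-3.5, 3.5)]
K = 10  # bins per dimension (assumed)
EDGES = [make_bins(lo, hi, K) for lo, hi in RANGES]

n_cells = K ** len(RANGES)  # number of discrete states
cell = discretize((0.1, 0.01, 0.1, 0.1), EDGES)
```

With K bins per dimension and an n-dimensional state, the grid has Kⁿ cells (here 10⁴), so the table grows quickly as the state dimension increases.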
The action space is typically much lower-dimensional than the state space. E.g., for a car, s is 6-dimensional and a is 2-dimensional (steering and braking). For a helicopter, s is 12-dimensional and a is 4-dimensional (two control sticks). For an inverted pendulum, s is 4-dimensional and a is 1-dimensional.
Andrew Ng comment: I think model-based RL has been taking off faster. A lot of the most promising approaches are model-based RL because if you have a physical robot, you just can’t afford to have a reinforcement learning algorithm bash your robot around for too long. Or how many helicopters do you want to crash before your learning algorithm figures it out?
Model-free RL works fine if you want to play video games, or if you’re trying to get a computer to play chess, Othello, or Go, because you have perfect simulators: for video games, the simulator is the video game itself. So your RL algorithm can blow up hundreds of millions of times in a video game.
Although, again, the field is evolving quickly so there’s very interesting work at the intersection of model-based and model-free that gets more complicated.
Q. How do you model the distribution of the noise ϵ_t?
A. One thing you could do is estimate it from data. But as a practical matter, a lot of reinforcement learning algorithms will learn a very brittle model that works in your simulator but doesn’t really work when you put it into your real robot.
If you have a deterministic simulator, it’s not that hard to use these methods to generate a cool-looking video of your reinforcement learning algorithm supposedly controlling a five-legged thing or something. But it turns out that policies learned with a deterministic simulator are more likely to work in the simulator and then fail on a real robot.
It is very important to add some noise to your simulator if you want what you learn in your model-based simulator to actually work on a physical robot.
The exact form of the noise distribution matters less than the fact that some noise is added.
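As a rough illustration of adding noise to a simulator, the sketch below assumes a learned linear model of the form s_{t+1} = A s_t + B a_t + ϵ_t with independent zero-mean Gaussian noise in each state dimension. The linear form, the Gaussian choice, and the names `simulate_step` and `noise_std` are assumptions for this example; this section does not specify them.

```python
import random

def simulate_step(state, action, A, B, noise_std, rng):
    """One simulated step of s_next = A s + B a + eps.

    eps is drawn independently per dimension from N(0, noise_std**2).
    Setting noise_std = 0 recovers a deterministic simulator.
    """
    n = len(state)
    next_state = []
    for i in range(n):
        mean = sum(A[i][j] * state[j] for j in range(n))
        mean += sum(B[i][k] * action[k] for k in range(len(action)))
        next_state.append(mean + rng.gauss(0.0, noise_std))
    return next_state

# Toy 2-dimensional example with made-up matrices (assumed, for illustration only).
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.0], [1.0]]

deterministic = simulate_step([1.0, 2.0], [0.5], A, B, 0.0, random.Random(0))
noisy = simulate_step([1.0, 2.0], [0.5], A, B, 0.1, random.Random(0))
```

In line with the note above, the particular value of `noise_std` here is arbitrary; the point is that the policy is trained against a simulator that is not perfectly deterministic.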