
Model-free reinforcement learning methods learn a policy or value function directly from experience without explicitly learning the environment’s transition dynamics or reward model.
Model-based reinforcement learning methods explicitly learn or use a model of the environment dynamics (and possibly rewards), i.e., and/or , to improve decision making.
A world model is a learned model that predicts future states (and optionally rewards) from current states and actions.
