I'm studying for my exam in "modern artificial intelligence in games". I'm a bit confused about some of the many types of reinforcement learning. Perhaps someone knows a good way to tell them all apart? I got some holes in my knowledge - can someone help me fill them?
Q-learning
Q-learning looks at the next state s(t+1) and updates the current state-action value like this (the standard tabular update, with α the learning rate and γ the discount factor):

Q(s, a) ← Q(s, a) + α * (r + γ * max_a' Q(s', a') − Q(s, a))
Q-learning uses bootstrapping:
Bootstrapping: estimating how good a state is based on how good we think the next state is.
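Here's how I currently understand the update in code (a toy 4-state chain I made up just to have something concrete; the α and γ values are arbitrary). Please correct me if I got it wrong:

```python
# Tabular Q-learning update on a made-up 4-state chain.
ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor (arbitrary)

def q_learning_update(Q, s, a, r, s_next):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))"""
    best_next = max(Q[s_next].values())  # bootstrap: value of the BEST next action
    Q[s][a] += ALPHA * (r + GAMMA * best_next - Q[s][a])

# States 0..3, actions "left"/"right", all values start at 0.
Q = {s: {"left": 0.0, "right": 0.0} for s in range(4)}

# Say the agent was in state 2, went right, got reward 1, landed in state 3.
q_learning_update(Q, s=2, a="right", r=1.0, s_next=3)
print(Q[2]["right"])  # 0.1, i.e. alpha * reward on the very first update
```

The max over the next state's actions is the bootstrapping part: the update trusts the current estimate of the next state instead of waiting for the episode to finish.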
TD(λ)
Uses the same idea of a TD error, but λ controls how far back along the trajectory the error is propagated: each step's error also updates earlier states, weighted by eligibility traces that decay by γλ. TD(0) is the one-step special case where only the most recent state is updated, which is the same kind of update Q-learning and SARSA do. (TD(λ) itself estimates state values; Q(λ) and SARSA(λ) are the control versions.)
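My attempt at the eligibility-trace version, for state values (the environment and the one episode below are made up; α, γ, λ are arbitrary):

```python
# TD(lambda) state-value prediction with accumulating eligibility traces.
ALPHA, GAMMA, LAM = 0.1, 0.9, 0.5  # arbitrary values for illustration

def td_lambda_episode(V, transitions):
    """transitions: list of (s, r, s_next) steps; updates V in place."""
    e = {s: 0.0 for s in V}                    # eligibility traces, all zero
    for s, r, s_next in transitions:
        delta = r + GAMMA * V[s_next] - V[s]   # one-step TD error
        e[s] += 1.0                            # mark this state as eligible
        for state in V:
            V[state] += ALPHA * delta * e[state]  # recent states share the error
            e[state] *= GAMMA * LAM               # traces fade with time
    return V

V = {s: 0.0 for s in range(4)}
# One short episode 0 -> 1 -> 2 -> 3, reward 1 only on the final step.
td_lambda_episode(V, [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, 3)])
```

With λ = 0 the traces die immediately, so only the most recent state gets updated, which is exactly the one-step TD(0) case.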
SARSA
Looks at State(t), Action(t), Reward(t+1), State(t+1), Action(t+1), which is where the name comes from.
(What's the difference between SARSA and Q-learning? They look very alike.)
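Putting the two updates side by side is the clearest way I've found to see the difference (toy values made up by me): SARSA bootstraps from the action the agent actually takes next (on-policy), while Q-learning always bootstraps from the greedy max (off-policy).

```python
ALPHA, GAMMA = 0.1, 0.9  # arbitrary values

def sarsa_update(Q, s, a, r, s_next, a_next):
    """On-policy: uses the action a_next the agent actually chose."""
    Q[s][a] += ALPHA * (r + GAMMA * Q[s_next][a_next] - Q[s][a])

def q_learning_update(Q, s, a, r, s_next):
    """Off-policy: uses the greedy max, regardless of what is actually taken."""
    Q[s][a] += ALPHA * (r + GAMMA * max(Q[s_next].values()) - Q[s][a])

# In state 1, "right" looks best (1.0), but the exploring agent chose "left".
Q = {0: {"left": 0.0, "right": 0.0}, 1: {"left": 0.2, "right": 1.0}}
Q_sarsa = {s: dict(av) for s, av in Q.items()}

q_learning_update(Q, 0, "right", 0.0, 1)           # backs up gamma * 1.0
sarsa_update(Q_sarsa, 0, "right", 0.0, 1, "left")  # backs up gamma * 0.2
print(Q[0]["right"], Q_sarsa[0]["right"])  # roughly 0.09 vs 0.018
```

Same experience, different updates: Q-learning ignores the exploratory "left" and learns about the greedy policy; SARSA learns about the policy it is actually following.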
MC
Monte Carlo methods use no bootstrapping.
A state is updated purely from the actual returns observed after visiting it, once the episode has finished.
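This is how I picture it in code (every-visit MC on two made-up episodes): wait for the full return, then average, with no peeking at the next state's estimate.

```python
# Every-visit Monte Carlo value estimation (toy episodes made up by me).
GAMMA = 0.9  # arbitrary discount factor

def mc_value_estimate(episodes):
    """episodes: list of [(state, reward), ...] trajectories."""
    returns = {}  # state -> list of full returns observed after visiting it
    for episode in episodes:
        G = 0.0
        # Walk backwards so G accumulates the discounted future reward.
        for state, reward in reversed(episode):
            G = reward + GAMMA * G
            returns.setdefault(state, []).append(G)
    # No bootstrapping: a state's value is just the average observed return.
    return {s: sum(gs) / len(gs) for s, gs in returns.items()}

# Two episodes; reward 1 only at the end of the first one.
V = mc_value_estimate([
    [("A", 0.0), ("B", 1.0)],
    [("A", 0.0), ("C", 0.0)],
])
print(V["A"])  # 0.45: average of the returns 0.9 and 0.0
```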
Dynamic Programming
It's a bit out of scope, but I have no idea how it works.
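From what I could piece together, the classic example is value iteration: unlike the learning methods above, it assumes the full model (transitions and rewards) is known, and sweeps over all states instead of sampling experience. A sketch on a deterministic toy MDP I made up:

```python
# Value iteration (dynamic programming) on a made-up 3-state MDP.
GAMMA = 0.9  # arbitrary discount factor

# transitions[s][action] = (next_state, reward); deterministic for simplicity.
transitions = {
    0: {"stay": (0, 0.0), "go": (1, 0.0)},
    1: {"stay": (1, 0.0), "go": (2, 1.0)},
    2: {"stay": (2, 0.0)},  # absorbing state
}

V = {s: 0.0 for s in transitions}
for _ in range(100):  # sweep all states repeatedly until values settle
    V = {
        s: max(r + GAMMA * V[s_next] for (s_next, r) in acts.values())
        for s, acts in transitions.items()
    }
print(V)
```

The Bellman backup inside the loop is the same max-over-actions expression Q-learning samples; DP just applies it to every state using the known model.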
Any input on these subjects is appreciated; many of the papers on this are poorly explained (well, I think so at least).
Thanks!