CS224R: Deep Reinforcement Learning
Stanford CS224R course notes - Deep RL fundamentals, policy gradients, Q-learning, model-based RL
1. Lec 01 - Introduction to RL
   Core concepts, MDP vs POMDP, policy and value functions, RL algorithm types
2. Lec 02 - Imitation Learning
   Behavioral cloning, DAgger, HG-DAgger, and addressing compounding errors
3. Lec 03 - Policy Gradients
   Policy gradient derivation, REINFORCE, variance reduction, off-policy methods
4. Lec 04 - Actor-Critic
   Value functions, advantage estimation, actor-critic algorithm, bootstrapping
References
- CS224R Lectures, Stanford University (2025)