Reinforcement Learning

[Figure: RL taxonomy]

Fundamentals

Introduction, MDP and Bandits

Sequential Decision Making

MC Methods and TD(0)

Advanced TD Methods

Prediction with Approximation

Control with Approximation

[Figure: model-free RL]

Model-Free Reinforcement Learning

Policy Gradient - REINFORCE and Approximations

Policy Gradient - PGT and GAE

Deterministic PG and Evaluation

  • Deterministic Policy Gradient
  • Deep Deterministic Policy Gradient (DDPG)
  • Evaluating RL algorithms
  • Soft Actor-Critic

Planning and Learning

Model-based reinforcement learning methods can be categorized as:

Observations-predicting:

Value-predicting:

Partial Observability

Pure Exploration

  • Best Arm Identification

Other

Focus on understanding the methods and the relationships between them rather than on memorizing details such as update equations.

Especially important: know the advantages, disadvantages, and limitations of each method, and the situations in which a given method should be preferred.
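
As one illustration of relating methods rather than memorizing them, the sketch below contrasts the Monte Carlo and TD(0) value updates from the Fundamentals part. Both move V(s) toward a target by a step size; they differ only in the target. The function names, the dictionary-based value table, and the toy numbers are assumptions made for this example and are not taken from the course material.

```python
def mc_update(v, state, ret, alpha):
    """Monte Carlo: move V(s) toward the full observed return G."""
    return v[state] + alpha * (ret - v[state])


def td0_update(v, state, reward, next_state, alpha, gamma):
    """TD(0): move V(s) toward the bootstrapped target r + gamma * V(s')."""
    target = reward + gamma * v[next_state]
    return v[state] + alpha * (target - v[state])


# Toy value table (hypothetical states "A" and "B").
v = {"A": 0.0, "B": 1.0}

# Monte Carlo needs the complete return of the episode from "A".
mc_new = mc_update(v, "A", ret=2.0, alpha=0.5)

# TD(0) needs only one transition and the current estimate of V("B").
td_new = td0_update(v, "A", reward=0.0, next_state="B", alpha=0.5, gamma=0.9)

print(mc_new, td_new)
```

Seen this way, the trade-off is easier to reason about: the Monte Carlo target does not depend on the current value estimates but requires waiting until the episode ends, while the TD(0) target is available after a single step but inherits any error in V(s').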

Resources

  1. Exercise solutions for Sutton's Reinforcement Learning: An Introduction (RL:AI), 2nd Edition: https://github.com/LyWangPX/Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions/