Inverse Reinforcement Learning

Created January 22, 2023 · Updated March 4, 2026

Given the $$(s,a, R)$$ triple, where $$R$$ is the reward from expert annotations, learn the function $f(s, a) \rightarrow R$ .

Presupposition: Reward function provides the most succinct and transferable definition of a task.

References