Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2018 (MIT Press) - A classic and comprehensive textbook that provides a foundational understanding of reinforcement learning, including a detailed explanation of actor-critic methods.
Asynchronous Methods for Deep Reinforcement Learning, Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu, 2016 (ICML 2016), DOI: 10.48550/arXiv.1602.01783 - Introduces the Asynchronous Advantage Actor-Critic (A3C) algorithm, a highly influential deep reinforcement learning method that effectively applies the actor-critic paradigm with advantage estimation.
Lecture Notes: Actor-Critic, Emma Brunskill, 2024 (CS234: Reinforcement Learning, Stanford University) - Comprehensive lecture notes from a leading university course, providing a clear and concise explanation of actor-critic fundamentals, including their derivation and benefits.
High-Dimensional Continuous Control Using Generalized Advantage Estimation, John Schulman, Philipp Moritz, Sergey Levine, Michael I. Jordan, Pieter Abbeel, 2016 (International Conference on Learning Representations, ICLR), DOI: 10.48550/arXiv.1506.02438 - Introduces Generalized Advantage Estimation (GAE), a widely adopted technique for reducing the variance of policy gradient estimates at the cost of some bias, which is crucial for advanced actor-critic algorithms.