Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2018 (MIT Press) - Comprehensive textbook covering fundamental reinforcement learning concepts, including value functions, policy gradients, and Actor-Critic methods.
Policy Gradient Methods for Reinforcement Learning with Function Approximation, Richard S. Sutton, David A. McAllester, Satinder P. Singh, and Yishay Mansour, 2000, Advances in Neural Information Processing Systems, Vol. 13 - Introduces the policy gradient theorem, which provides the theoretical foundation for updating the Actor's policy parameters in Actor-Critic algorithms.
Asynchronous Methods for Deep Reinforcement Learning, Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu, 2016, International Conference on Machine Learning (ICML), DOI: 10.48550/arXiv.1602.01783 - Presents A3C, a significant deep Actor-Critic algorithm that effectively implements the architecture described, demonstrating its efficiency in various environments.