Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2018 (The MIT Press) - Comprehensive textbook covering foundational reinforcement learning algorithms, including Monte Carlo methods, TD learning, policy gradients, and actor-critic methods.
Asynchronous Methods for Deep Reinforcement Learning, Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Timothy Harley, David Silver, Koray Kavukcuoglu, 2016. Proceedings of the 33rd International Conference on Machine Learning (ICML), Vol. 48 (PMLR). DOI: 10.48550/arXiv.1602.01783 - Introduces Asynchronous Advantage Actor-Critic (A3C), the method from which the synchronous variant A2C was later derived, and demonstrates its effectiveness on deep reinforcement learning tasks.