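Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2018 (The MIT Press) - A comprehensive textbook covering the foundational concepts of reinforcement learning, including tabular methods, function approximation (linear and nonlinear), policy gradient methods, and challenges such as the "deadly triad".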
Human-level control through deep reinforcement learning, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg and Demis Hassabis, 2015, Nature, Vol. 518 (Nature Publishing Group), DOI: 10.1038/nature14236 - Introduces the Deep Q-Network (DQN), a pioneering work that successfully combined deep neural networks with reinforcement learning, using techniques such as experience replay and target networks to stabilize learning under function approximation.