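Human-level control through deep reinforcement learning, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis, 2015. Nature, Vol. 518. DOI: 10.1038/nature14236 - This foundational paper introduces the deep Q-network (DQN) algorithm, including its experience replay mechanism, which addresses problems such as correlated training data and unstable learning in deep reinforcement learning.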