Human-level control through deep reinforcement learning, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis, 2015, Nature, Vol. 518, DOI: 10.1038/nature14236 - This original paper introduced Deep Q-Networks (DQN) and details the experience replay mechanism and its role in stabilizing training.
Prioritized Experience Replay, Tom Schaul, John Quan, Ioannis Antonoglou, and David Silver, 2016, International Conference on Learning Representations (ICLR), DOI: 10.48550/arXiv.1511.05952 - This paper presents Prioritized Experience Replay, an advanced sampling strategy that improves upon uniform sampling by prioritizing experiences with higher temporal difference errors.
Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2018 (MIT Press) - A standard textbook on reinforcement learning, providing a detailed explanation of Q-learning, Deep Q-Networks, and the challenges they address, including the necessity of experience replay.