Human-level control through deep reinforcement learning, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis, 2015, Nature, Vol. 518 (Springer Nature), DOI: 10.1038/nature14236 - The original paper introducing Deep Q-Networks (DQN), which first proposed using a separate target network to stabilize training.
Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2018 (The MIT Press) - A comprehensive textbook covering the theoretical foundations of reinforcement learning, including Q-learning, Temporal Difference (TD) learning, and function approximation, the ideas on which DQN builds.