Human-level control through deep reinforcement learning, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Andras Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis, 2015, Nature, Vol. 518, DOI: 10.1038/nature14236 - The original paper that introduced Deep Q-Networks (DQN), detailing the use of target networks and experience replay to stabilize deep reinforcement learning.
Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2018 (MIT Press) - A widely recognized textbook offering a comprehensive review of reinforcement learning, including Q-learning and considerations for function approximation.
Lecture 6: Value Function Approximation, David Silver, 2015, UCL Course on Reinforcement Learning (UCL) - A lecture from a leading academic series that explains value function approximation in reinforcement learning, including the role of target networks in DQN.