Tree-Based Batch Mode Reinforcement Learning, Damien Ernst, Pierre Geurts, and Louis Wehenkel, 2005, Journal of Machine Learning Research, Vol. 6 - Introduces Fitted Q-Iteration (FQI) as an offline reinforcement learning algorithm, particularly effective with tree-based function approximators.
Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2018 (MIT Press) - A classic textbook covering the theoretical foundations of Q-learning, Value Iteration, and function approximation, which FQI extends to the batch setting.
Deep Neural Networks for Learning Control Policies, Sascha Lange, Martin Riedmiller, and Alexander van der Smagt, 2012, Autonomous Robots, Vol. 33 (Springer), DOI: 10.1007/s10514-012-9294-8 - Describes the application of deep neural networks as function approximators within the FQI framework, leading to Neural Fitted Q-Iteration (NFQ).