Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2018 (The MIT Press) - The canonical reinforcement learning textbook, detailing the theoretical underpinnings of optimal policies, value functions, and the Bellman optimality equations for MDPs.
Lecture Notes: Markov Decision Processes and Bellman Equations, Emma Brunskill, 2025 (Stanford University) - Lecture notes giving a clear and accessible introduction to Markov Decision Processes, optimal policies, and the derivation of the Bellman optimality equations.
Markov Decision Processes: Discrete Stochastic Dynamic Programming, Martin L. Puterman, 1994 (John Wiley & Sons) - A classic and mathematically rigorous text on Markov Decision Processes, providing a deep look into optimal policies, value functions, and the Bellman optimality equations.