Reinforcement Learning: An Introduction, Richard S. Sutton and Andrew G. Barto, 2018, MIT Press - Provides fundamental definitions of states, observations, and actions, and a comprehensive overview of classic and modern RL algorithms, including discussions of function approximation for large state and action spaces.
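As a concrete illustration of the tabular methods the book covers, a minimal sketch of the one-step Q-learning update; the state/action counts and hyperparameter values below are illustrative assumptions, not taken from the book's pseudocode:

```python
# Tabular Q-learning (Sutton & Barto, Section 6.5), sketched with
# assumed problem sizes and hyperparameters for illustration.
import numpy as np

n_states, n_actions = 16, 4          # assumed sizes
alpha, gamma, epsilon = 0.1, 0.99, 0.1
Q = np.zeros((n_states, n_actions))  # action-value table

def q_learning_step(s, a, r, s_next, done):
    """One-step update: Q(s,a) += alpha * (TD target - Q(s,a))."""
    target = r if done else r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])

def epsilon_greedy(s):
    """Behavior policy: explore with probability epsilon, else exploit."""
    if np.random.rand() < epsilon:
        return np.random.randint(n_actions)
    return int(np.argmax(Q[s]))
```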
Human-level control through deep reinforcement learning, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller, 2015, Nature, Vol. 518. DOI: 10.1038/nature14236 - Introduces Deep Q-Networks (DQN), demonstrating the use of CNNs for image observations and preprocessing techniques such as resizing, grayscaling, and frame stacking, and addresses discrete action spaces.
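A minimal sketch of the preprocessing pipeline the paper describes; the transformations (grayscale, downsampling to 84x84, stacking 4 frames) come from the paper, while the use of OpenCV and the helper names here are assumptions for illustration:

```python
# DQN-style observation preprocessing, assuming OpenCV for the image ops.
from collections import deque
import numpy as np
import cv2

STACK = 4
frames = deque(maxlen=STACK)  # call frames.clear() at each episode start

def preprocess(rgb_frame: np.ndarray) -> np.ndarray:
    """Convert an RGB frame to a single 84x84 grayscale image."""
    gray = cv2.cvtColor(rgb_frame, cv2.COLOR_RGB2GRAY)
    return cv2.resize(gray, (84, 84), interpolation=cv2.INTER_AREA)

def stacked_observation(rgb_frame: np.ndarray) -> np.ndarray:
    """Append the newest frame and return an (84, 84, STACK) array."""
    frames.append(preprocess(rgb_frame))
    while len(frames) < STACK:       # pad with copies at episode start
        frames.append(frames[-1])
    return np.stack(frames, axis=-1)
```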
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, and Sergey Levine, 2018, International Conference on Machine Learning (ICML). DOI: 10.48550/arXiv.1801.01290 - A prominent algorithm for continuous action spaces; discusses stochastic policies and details the use of the tanh squashing function for bounding actions, a key implementation detail.
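A minimal sketch of the tanh squashing trick, assuming NumPy and stand-in network outputs; the change-of-variables log-probability correction follows Appendix C of the paper:

```python
import numpy as np

def squashed_action(mu, log_std, low, high, rng=np.random.default_rng()):
    """Sample a bounded action and its log-probability; mu and log_std
    stand in for the policy network's outputs."""
    mu, log_std = np.asarray(mu), np.asarray(log_std)
    std = np.exp(log_std)
    u = mu + std * rng.standard_normal(mu.shape)  # unbounded Gaussian sample
    a = np.tanh(u)                                # squash into (-1, 1)
    # Gaussian log-density of u, summed over action dimensions
    log_prob = (-0.5 * ((u - mu) / std) ** 2 - log_std
                - 0.5 * np.log(2.0 * np.pi)).sum()
    # tanh change-of-variables correction (Appendix C of the paper)
    log_prob -= np.log(1.0 - a ** 2 + 1e-6).sum()
    # rescale from (-1, 1) to the environment's [low, high] bounds
    return low + 0.5 * (a + 1.0) * (high - low), log_prob

action, logp = squashed_action(np.zeros(2), np.full(2, -1.0), -2.0, 2.0)
```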
Decision Transformer: Reinforcement Learning via Sequence Modeling, Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, and Igor Mordatch, 2021, Advances in Neural Information Processing Systems (NeurIPS), Vol. 34. DOI: 10.48550/arXiv.2106.01345 - Introduces the Decision Transformer architecture, demonstrating how reinforcement learning can be framed as a sequence modeling problem, with a Transformer network handling temporal dependencies.
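A minimal sketch of the returns-to-go sequence the architecture conditions on, where each timestep is represented by a (return-to-go, state, action) triple; the undiscounted suffix sum matches the paper, while the array values are illustrative:

```python
import numpy as np

def returns_to_go(rewards: np.ndarray) -> np.ndarray:
    """R_t = undiscounted sum of rewards from t to the end of the trajectory;
    these values prefix each (state, action) pair in the token sequence."""
    return np.cumsum(rewards[::-1])[::-1]

rewards = np.array([1.0, 0.0, 2.0, 1.0])
print(returns_to_go(rewards))  # [4. 3. 3. 1.]
```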
Gymnasium Documentation, The Gymnasium Contributors, 2024 - Official documentation for standardized environment interfaces in reinforcement learning, detailing observation_space and action_space objects.
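A minimal sketch of querying these space objects through the Gymnasium API; "CartPole-v1" is just a convenient built-in environment:

```python
import gymnasium as gym

env = gym.make("CartPole-v1")
print(env.observation_space)        # Box of 4 floats: cart/pole state
print(env.action_space)             # Discrete(2): push left or right

obs, info = env.reset(seed=0)       # reset returns (observation, info)
action = env.action_space.sample()  # random action valid for this space
obs, reward, terminated, truncated, info = env.step(action)
env.close()
```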