Rainbow: Combining Improvements in Deep Reinforcement Learning, Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver, 2018. Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32 (Association for the Advancement of Artificial Intelligence). DOI: 10.1609/aaai.v32i1.11796 - Introduces the full Rainbow DQN agent, detailing how its component techniques are integrated and the performance gains that result.
A Distributional Perspective on Reinforcement Learning, Marc G. Bellemare, Will Dabney, Rémi Munos, 2017. Proceedings of the 34th International Conference on Machine Learning, Vol. 70 (PMLR) - Introduces distributional reinforcement learning, the key component Rainbow uses to model the distribution of returns.
Dueling Network Architectures for Deep Reinforcement Learning, Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas, 2016. Proceedings of The 33rd International Conference on Machine Learning, Vol. 48 (PMLR) - Introduces the dueling network architecture, which improves policy evaluation and can be adapted to the distributional setting.