Stochastic Games, Lloyd S. Shapley, 1953. Proceedings of the National Academy of Sciences of the United States of America, Vol. 39 (National Academy of Sciences). DOI: 10.1073/pnas.39.10.1095 - The first formal definition of stochastic games, establishing a theoretical framework for multi-agent sequential decision-making.
Multi-Agent Reinforcement Learning: A Comprehensive Survey, Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D’Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, Justin Hsu, Martin Jaggi, Tara Javidi, Gauri Joshi, Mikhail Khodak, Jakub Konecný, Aleksandra Korolova, Farinaz Koushanfar, Sanmi Koyejo, Tancrède Lepoint, Yang Liu, Prateek Mittal, Mehryar Mohri, Richard Nock, Ayfer Özgür, Rasmus Pagh, Hang Qi, Daniel Ramage, Ramesh Raskar, Mariana Raykova, Dawn Song, Weikang Song, Sebastian U. Stich, Ziteng Sun, Ananda Theertha Suresh, Florian Tramèr, Praneeth Vepakomma, Jianyu Wang, Li Xiong, Zheng Xu, Qiang Yang, Felix X. Yu, Han Yu and Sen Zhao, 2021. Foundations and Trends® in Machine Learning, Vol. 14 (now publishers). DOI: 10.1561/2200000083 - This broad survey offers an up-to-date perspective on multi-agent reinforcement learning, covering stochastic games, non-stationarity, and current solution approaches.