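Learning to Communicate with Deep Reinforcement Learning, Jakob Foerster, Yannis Assael, Nando de Freitas, Shimon Whiteson, 2016. Advances in Neural Information Processing Systems 29 (NIPS 2016) - Proposes DIAL, a framework for learning discrete communication protocols in multi-agent reinforcement learning that backpropagates gradients through a differentiable communication channel during training.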
TarMAC: Targeted Multi-Agent Communication, Abhishek Das, Théophile Gervet, Joshua Romoff, Dhruv Batra, Devi Parikh, Michael Rabbat, Joelle Pineau, 2019. Proceedings of the 36th International Conference on Machine Learning (ICML), Proceedings of Machine Learning Research, Vol. 97. DOI: 10.48550/arXiv.1810.11187 - Introduces an attention mechanism for multi-agent communication that lets agents selectively attend to relevant messages and partners to improve coordination.
Multi-Agent Reinforcement Learning: A Review of Foundational Concepts and Recent Trends, Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D’Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, Justin Hsu, Martin Jaggi, Tara Javidi, Gauri Joshi, Mikhail Khodak, Jakub Konecný, Aleksandra Korolova, Farinaz Koushanfar, Sanmi Koyejo, Tancrède Lepoint, Yang Liu, Prateek Mittal, Mehryar Mohri, Richard Nock, Ayfer Özgür, Rasmus Pagh, Hang Qi, Daniel Ramage, Ramesh Raskar, Mariana Raykova, Dawn Song, Weikang Song, Sebastian U. Stich, Ziteng Sun, Ananda Theertha Suresh, Florian Tramèr, Praneeth Vepakomma, Jianyu Wang, Li Xiong, Zheng Xu, Qiang Yang, Felix X. Yu, Han Yu and Sen Zhao, 2021. Foundations and Trends® in Machine Learning, Vol. 14 (Now Publishers). DOI: 10.1561/2200000083 - A comprehensive review of multi-agent reinforcement learning covering foundational concepts, communication challenges, and a range of solution approaches; useful for broader context.