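Value-Decomposition Networks For Cooperative Multi-Agent Reinforcement Learning, Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel, 2017, International Conference on Learning Representations (ICLR), DOI: 10.48550/arXiv.1706.05296 - This work introduces Value-Decomposition Networks (VDN), an influential early centralized-training, decentralized-execution (CTDE) method that decomposes the team's total Q-value into a sum of individual agent Q-values for cooperative multi-agent tasks, laying the groundwork for later methods such as QMIX.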