Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments, Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, Igor Mordatch, 2017, Advances in Neural Information Processing Systems, Vol. 30 (Curran Associates, Inc.) - Introduces MADDPG, a significant algorithm addressing non-stationarity in multi-agent systems through centralized training and decentralized execution.
Reinforcement Learning: An Introduction (2nd Edition), Richard S. Sutton and Andrew G. Barto, 2018 (MIT Press) - Chapter 15 offers foundational context on multi-agent learning and game theory, explaining how agent interactions cause issues like non-stationarity.