Many sequential decision-making problems involve multiple interacting agents, ranging from robotic teams and autonomous vehicles to economic modeling and game playing. Single-agent reinforcement learning methods often fall short in these scenarios because each agent's optimal strategy depends on the actions of others. This chapter introduces Multi-Agent Reinforcement Learning (MARL), extending RL principles to settings with more than one agent.
We will start by formalizing multi-agent problems, most often through the framework of Stochastic Games (also called Markov Games). A key difficulty in MARL is non-stationarity: from any single agent's perspective, the environment dynamics appear to change as the other agents adapt their policies, violating the assumptions behind standard single-agent methods. We will examine approaches to this and other challenges, contrasting fully centralized and fully decentralized methods with the centralized training with decentralized execution (CTDE) paradigm.
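To make the formalism concrete before the full treatment, here is a minimal sketch of the Markov (stochastic) game tuple (S, A_1, ..., A_n, P, R_1, ..., R_n, γ) as a Python container. The class and field names are illustrative assumptions, not part of any particular MARL library.

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class MarkovGame:
    """A finite n-agent Markov (stochastic) game: the tuple
    (S, A_1..A_n, P, R_1..R_n, gamma). Names are illustrative."""
    n_states: int                # |S|, number of environment states
    n_actions: tuple[int, ...]   # |A_i| for each agent i
    transition: np.ndarray       # P[s, a_1, ..., a_n, s'] = Pr(s' | s, joint action)
    rewards: np.ndarray          # R[i, s, a_1, ..., a_n] = agent i's reward
    gamma: float                 # shared discount factor

# A tiny two-agent, two-state instance with random dynamics,
# shown only to make the tensor shapes explicit.
rng = np.random.default_rng(0)
P = rng.random((2, 2, 2, 2))                 # (s, a_1, a_2, s')
P /= P.sum(axis=-1, keepdims=True)           # normalize over next states
game = MarkovGame(
    n_states=2,
    n_actions=(2, 2),
    transition=P,
    rewards=rng.random((2, 2, 2, 2)),        # (i, s, a_1, a_2)
    gamma=0.95,
)
```

Note the defining difference from a single-agent MDP: both the transition function and each agent's reward depend on the joint action, so no agent can evaluate its policy in isolation.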
You will learn about:

- The Stochastic Game (Markov Game) framework for formalizing multi-agent problems
- The non-stationarity problem and why it undermines independent single-agent learning (illustrated in the sketch after this list)
- Centralized training with decentralized execution (CTDE) and how it compares with fully centralized and fully decentralized approaches
- Standard algorithms for training multiple interacting agents
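To preview the non-stationarity issue, the following sketch uses a hypothetical stateless 2x2 coordination game: agent 1's expected rewards, and therefore its best response, shift whenever agent 2 updates its policy, even though the underlying game never changes.

```python
import numpy as np

# Agent 1's payoff matrix for a 2x2 coordination game (values are
# illustrative): reward_1[a1, a2] is agent 1's reward when agent 1
# plays a1 and agent 2 plays a2.
reward_1 = np.array([[3.0, 0.0],
                     [0.0, 3.0]])

def best_response(opponent_policy):
    """Agent 1's expected reward per action against agent 2's mixed
    policy. From agent 1's viewpoint this looks like a one-state MDP,
    but its reward function moves whenever agent 2 adapts."""
    expected = reward_1 @ opponent_policy
    return expected, int(np.argmax(expected))

# As agent 2 shifts its policy, agent 1's induced "environment"
# changes, and so does agent 1's optimal action.
for policy_2 in (np.array([0.9, 0.1]), np.array([0.1, 0.9])):
    expected, action = best_response(policy_2)
    print(f"agent 2 plays {policy_2}: agent 1 expects {expected}, "
          f"best response = action {action}")
```

This is precisely why naive independent learning can fail to converge: each agent is chasing a target that moves as the others learn.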
By the end of this chapter, you will understand the core issues in MARL and be familiar with several standard algorithms for training multiple interacting agents.