Reinforcement Learning: An Introduction (2nd edition), Richard S. Sutton and Andrew G. Barto, 2018 (MIT Press) - The standard textbook on reinforcement learning. The second edition covers policy gradient methods, the use of baselines for variance reduction, and the principles of actor-critic algorithms.
Actor-Critic Algorithms, Vijay R. Konda and John N. Tsitsiklis, 1999, Advances in Neural Information Processing Systems 12 (MIT Press), DOI: 10.1137/S0363012901385691 - This paper proposes and analyzes a class of actor-critic algorithms, in which a learned critic guides the actor's policy gradient update, an approach closely related to using baselines for policy gradient variance reduction.
Lecture 9: Policy Gradients & Actor-Critic, Emma Brunskill, 2023, CS234: Reinforcement Learning (Stanford University) - Lecture slides from Stanford's CS234 course covering policy gradients, the benefits of baselines for variance reduction, and an introduction to actor-critic methods.