Latent Dirichlet Allocation, David M. Blei, Andrew Y. Ng, Michael I. Jordan, 2003. Journal of Machine Learning Research, Vol. 3 (MIT Press). DOI: 10.1162/jmlr.2003.3.blei03a - Introduces Latent Dirichlet Allocation (LDA) and the foundational Variational Bayes inference algorithm for it, including the mean-field approximation and coordinate ascent updates.
Variational Inference: A Review for Statisticians, David M. Blei, Alp Kucukelbir, Jon D. McAuliffe, 2017. Journal of the American Statistical Association, Vol. 112 (Taylor & Francis on behalf of the American Statistical Association). DOI: 10.1080/01621459.2017.1285773 - Provides a comprehensive, modern review of variational inference, detailing the theoretical foundations, mean-field approximation, ELBO maximization, and connections to various models, including LDA.
Stochastic Variational Inference, Matthew D. Hoffman, David M. Blei, Chong Wang, John Paisley, 2013. Journal of Machine Learning Research, Vol. 14. DOI: 10.5555/2503926.2503947 - Introduces Stochastic Variational Inference (SVI), an algorithm that extends Variational Bayes to large-scale datasets by using mini-batches, relevant to the scaling discussion in the section.
Pattern Recognition and Machine Learning, Christopher M. Bishop, 2006 (Springer) - Chapter 10 offers a clear introduction to the fundamentals of variational inference, including its theoretical basis, the mean-field approximation, and its application to various probabilistic models.