Bayesian Data Analysis, Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin, 2013 (Chapman and Hall/CRC). DOI: 10.1201/b16018 - Definitive textbook covering theoretical and practical aspects of Bayesian modeling, with extensive treatment of MCMC algorithms, including Metropolis-Hastings and its advanced variants.
Optimal Scaling of Discrete Approximations to Langevin Diffusions, Gareth O. Roberts, Jeffrey S. Rosenthal, 1998. Journal of the Royal Statistical Society: Series B (Statistical Methodology), Vol. 60 (John Wiley & Sons). DOI: 10.1111/1467-9868.00123 - A seminal paper that rigorously analyzes the Metropolis-Adjusted Langevin Algorithm (MALA), deriving its optimal acceptance rate and scaling behavior in high dimensions.
Monte Carlo Statistical Methods, Christian P. Robert and George Casella, 2004 (Springer). DOI: 10.1007/978-1-4757-4145-1 - This textbook provides a comprehensive theoretical foundation for Monte Carlo methods, with dedicated chapters on Markov Chain Monte Carlo algorithms, including detailed explanations of Metropolis-Hastings and its variations.
Examples of adaptive MCMC, Gareth O. Roberts, Jeffrey S. Rosenthal, 2009. Journal of Computational and Graphical Statistics, Vol. 18 (Taylor & Francis). DOI: 10.1198/jcgs.2009.06134 - Discusses the design and properties of adaptive Metropolis-Hastings algorithms, which are crucial for automatically tuning proposal distributions for improved efficiency.