Speech and Language Processing, Daniel Jurafsky and James H. Martin, 2025 (Pearson) - A comprehensive textbook providing a detailed explanation of n-gram language models, their estimation, the Markov assumption, and various smoothing techniques.
Improved backing-off for M-gram language modeling, Reinhard Kneser and Hermann Ney, 1995 (Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP), DOI: 10.1109/ICASSP.1995.488975 - The original paper introducing Kneser-Ney smoothing, a highly effective and widely used method for probability estimation in n-gram models that addresses the zero-frequency problem.
KenLM: Faster and smaller language model queries, Kenneth Heafield, 2011 (Proceedings of the Sixth Workshop on Statistical Machine Translation, Association for Computational Linguistics) - Describes the KenLM toolkit, a high-performance library for building and querying n-gram language models, which is frequently used in speech recognition systems.