ROUGE: A Package for Automatic Evaluation of Summaries, Chin-Yew Lin, 2004, Text Summarization Branches Out (Association for Computational Linguistics) - Introduces the ROUGE metric, widely used for evaluating automatic summarization.
BLEU: a Method for Automatic Evaluation of Machine Translation, Kishore Papineni, Salim Roukos, Todd Ward, Wei-Jing Zhu, 2002, Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (Association for Computational Linguistics), DOI: 10.3115/1073083.1073135 - Presents the BLEU score, a standard metric for assessing machine translation quality.
Speech and Language Processing (3rd Edition Draft), Daniel Jurafsky, James H. Martin, 2025 - Foundational textbook covering natural language processing concepts, including the definition and calculation of perplexity in Chapter 3.