BLEU: a Method for Automatic Evaluation of Machine Translation, Kishore Papineni, Salim Roukos, Todd Ward, Wei-Jing Zhu, 2002. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (Association for Computational Linguistics). DOI: 10.3115/1073083.1073135 - This foundational paper introduces BLEU, a widely used automatic metric for evaluating the quality of machine-translated text by comparing it to human reference translations.
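For intuition, the sketch below shows the core BLEU computation described in the paper: clipped (modified) n-gram precisions for n = 1..4, combined by a geometric mean and scaled by a brevity penalty. It is a minimal illustration written for this note, not the paper's reference implementation, and omits details such as corpus-level aggregation and smoothing.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Counter of n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU sketch: geometric mean of clipped n-gram
    precisions (n = 1..max_n) multiplied by a brevity penalty."""
    cand = candidate.split()
    refs = [r.split() for r in references]

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref_counts = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref_counts[gram] = max(max_ref_counts[gram], count)
        clipped = sum(min(count, max_ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        if clipped == 0:
            return 0.0  # any zero precision drives the unsmoothed score to zero
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the closest-length reference.
    c = len(cand)
    r = min((abs(len(ref) - c), len(ref)) for ref in refs)[1]
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)

# Toy example (hypothetical sentences): shares most but not all n-grams with the reference.
print(bleu("the cat sat on the mat", ["the cat sat on the red mat"]))
```

In practice, standardized implementations (e.g., sacreBLEU) are preferred over ad hoc re-implementations, since tokenization and smoothing choices change the score.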
ROUGE: A Package for Automatic Evaluation of Summaries, Chin-Yew Lin, 2004. Text Summarization Branches Out (Association for Computational Linguistics). DOI: 10.3115/1621300.1621316 - This paper introduces ROUGE, a set of metrics widely used for automatic evaluation of summaries and other text generation tasks, based on n-gram overlap with reference summaries.
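As a companion sketch, here is a minimal ROUGE-N computation: the recall of reference n-grams recovered by the candidate summary, with matches clipped at the reference counts. This illustrates only one member of the ROUGE family (the package also includes variants such as the LCS-based ROUGE-L) and is not Lin's official toolkit.

```python
from collections import Counter

def ngrams(tokens, n):
    """Counter of n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n=2):
    """ROUGE-N sketch: fraction of reference n-grams that also appear
    in the candidate (recall), with counts clipped at the reference side."""
    cand_counts = ngrams(candidate.split(), n)
    ref_counts = ngrams(reference.split(), n)
    overlap = sum(min(count, cand_counts[gram])
                  for gram, count in ref_counts.items())
    return overlap / max(sum(ref_counts.values()), 1)

# Toy example (hypothetical summaries): 4 of the 5 reference bigrams are recovered.
print(rouge_n("the cat was found under the bed",
              "the cat was under the bed", n=2))
```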