Quantitative Analysis of Synthetic Text Properties
BLEU: a Method for Automatic Evaluation of Machine Translation. Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu, 2002. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. DOI: 10.3115/1073083.1073135 - Introduces the widely adopted BLEU metric for automatic evaluation of machine translation quality, built on clipped n-gram precision.
ROUGE: A Package for Automatic Evaluation of Summaries. Chin-Yew Lin, 2004. Text Summarization Branches Out: Proceedings of the ACL-04 Workshop (Association for Computational Linguistics). DOI: 10.3115/1621375.1621382 - Presents the ROUGE metric suite for automatic evaluation of summaries, focusing on n-gram recall and longest common subsequence matching.
BERTScore: Evaluating Text Generation with BERT. Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav Artzi, 2020. International Conference on Learning Representations (ICLR). DOI: 10.48550/arXiv.1904.09675 - Introduces BERTScore, an embedding-based metric that scores generated text by semantic similarity computed over contextual embeddings.
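The clipped (modified) n-gram precision at the heart of BLEU can be sketched in a few lines of plain Python. This is a minimal illustration of the core idea from the Papineni et al. paper, not a full BLEU implementation (it omits the brevity penalty and the geometric mean over n-gram orders); the function name and example sentences are chosen for illustration.

```python
from collections import Counter

def modified_ngram_precision(candidate, references, n):
    """Clipped n-gram precision, the core quantity behind BLEU.

    Each candidate n-gram's count is clipped to its maximum count in any
    single reference, so a candidate cannot inflate precision by
    repeating a common word.
    """
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand_counts = ngrams(candidate, n)

    # For each n-gram, take its maximum count across the references.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)

    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0

# Degenerate candidate that only repeats a common word:
candidate = "the the the the the the the".split()
references = ["the cat is on the mat".split(),
              "there is a cat on the mat".split()]
print(modified_ngram_precision(candidate, references, 1))  # 2/7 ≈ 0.2857
```

Unclipped precision would score this candidate 7/7; clipping caps "the" at its highest reference count (2), yielding 2/7, which is why repeated-token degeneracies in synthetic text are penalized.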