Bag of Tricks for Efficient Text Classification, Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov, 2017. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers (Association for Computational Linguistics). DOI: 10.18653/v1/E17-2068 - The original research paper behind the fastText library. It describes the model architecture and the efficient text classification approach that fastText's language identification models also build on.
N-gram-based Text Categorization, William B. Cavnar, John M. Trenkle, 1994. Proceedings of SDAIR-94, 3rd Annual Symposium on Document Analysis and Information Retrieval - A foundational paper on text categorization (including language identification) using character N-grams, which influenced many later methods.