CS224N: Natural Language Processing with Deep Learning, Christopher Manning, Richard Socher, Abolfazl Asudeh, John Hewitt, and Chenhao Tan, 2023 (Stanford University) - An excellent university course covering deep learning methods applied to natural language processing; essential for understanding modern language models.
Attention Is All You Need, Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, 2017, Advances in Neural Information Processing Systems, Vol. 30, DOI: 10.48550/arXiv.1706.03762 - This seminal paper introduced the Transformer architecture, which forms the foundation of most large language models and the advances built on them.