Scaling Laws for Neural Language Models, Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei, 2020arXiv preprint arXiv:2001.08361DOI: 10.48550/arXiv.2001.08361 - 建立了描述LLM性能与模型大小、数据集大小和计算预算之间关系的经验缩放定律。