torch.compile: Explaining PyTorch's Newest Speedup, Horace He, Michael Lazos, Jeremy Howard, Susan Sun, Edward Yang, Geeta Chauhan, Elias Ellison, Quentin Gallouédec, Daniel Hess, Christian Sarofeen, Brandon Pyper, Natalia Gimelshein, Zachary DeVito, Mike Ruberry, Peter Bell, Roman Ring, 2022 (PyTorch) - An official PyTorch blog post introducing torch.compile and the underlying compilation mechanisms it uses to speed up PyTorch models with minimal code changes.
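NVIDIA CUDA C++ Programming Guide, NVIDIA Corporation, 2024 (NVIDIA Corporation) - The authoritative guide to parallel programming on NVIDIA GPUs using the CUDA C++ language and runtime; essential for understanding custom kernel development.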