NVIDIA TensorRT Documentation, NVIDIA Corporation, 2023 (NVIDIA Corporation) - The official guide for using TensorRT to optimize and deploy deep learning models for high-performance inference on NVIDIA GPUs.
torch.compile: Explaining PyTorch's Newest Speedup, Horace He, Michael Lazos, Jeremy Howard, Susan Sun, Edward Yang, Geeta Chauhan, Elias Ellison, Quentin Gallouédec, Daniel Hess, Christian Sarofeen, Brandon Pyper, Natalia Gimelshein, Zachary DeVito, Mike Ruberry, Peter Bell, Roman Ring, 2022 (PyTorch) - An official PyTorch blog post introducing torch.compile and its underlying compilation mechanisms for accelerating PyTorch models with minimal code changes.
NVIDIA CUDA C++ Programming Guide, NVIDIA Corporation, 2024 (NVIDIA Corporation) - The definitive guide to parallel programming on NVIDIA GPUs using the CUDA C++ language and runtime, essential for understanding custom kernel development.