NVIDIA Hopper Architecture In-Depth, NVIDIA Corporation, 2022 (NVIDIA Corporation) - Describes the architecture of the NVIDIA Hopper H100 GPU, including detailed information on NVLink 4.0 and its role in GPU-to-GPU communication.
NVIDIA Collective Communications Library (NCCL) Developer Guide, NVIDIA Corporation, 2023 (NVIDIA Corporation) - The official guide for NCCL, detailing its architecture, supported collective operations, and how it leverages high-speed interconnects for optimized GPU communication.
Deep Learning Systems: Algorithms, Architectures, and Frameworks, Chen-Yi Lee, Chang-Won Jin, Ru-Han Jhou, Hsin-Fu Lin, Wei Li, Ya-Ping Hsieh, Yu-Syuan Jhou, 2020 (Springer Nature Singapore) DOI: 10.1007/978-981-15-4682-1 - A textbook that provides an overview of deep learning systems, including discussions on distributed training and the role of high-performance interconnects.