NVIDIA Data Center GPU Manager (DCGM) User Guide, NVIDIA Corporation, N.D. (NVIDIA Corporation) - Official usage instructions for monitoring the GPU metrics that inform right-sizing.
Multi-Instance GPU (MIG) User Guide, NVIDIA Corporation, N.D. (NVIDIA Developer Documentation) - Describes how to partition GPUs into isolated instances to increase utilization and share resources.
NVIDIA Triton Inference Server, NVIDIA Corporation, N.D. (NVIDIA Developer Documentation) - Official portal for the Triton Inference Server, covering perf_analyzer and strategies like model co-location for inference optimization.