Accelerate Documentation - Device Map, Hugging Face, 2024 - Official guide on using device_map for distributing large models across available devices, including CPU offloading.
FastAPI Documentation, Sebastián Ramírez, 2024 - Official documentation for FastAPI, the framework used to build the web API serving the optimized model.