Designing Machine Learning Systems: An Iterative Process for Production-Ready Applications, Chip Huyen, 2022 (O'Reilly Media) - Covers fundamental concepts of designing and deploying ML systems, including model serving strategies, API design for inference, and operational considerations.
Building Microservices: Designing Fine-Grained Systems, Sam Newman, 2021 (O'Reilly Media) - Provides architectural guidance for building microservices, applicable to containerized inference services, covering API contracts, communication patterns, and independent deployability.
RESTful Web Services, Leonard Richardson and Sam Ruby, 2007 (O'Reilly Media) - A foundational text on designing RESTful web APIs, essential for defining service interfaces, HTTP methods, and error handling for prediction endpoints.