Site Reliability Engineering: How Google Runs Production Systems, Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy (eds.), 2016 (O'Reilly Media) - Provides foundational principles for designing and operating reliable, scalable systems, including Service Level Objectives (SLOs), capacity planning, and testing strategies applicable to feature stores.
MLOps Engineering at Scale, Carl Osipov, 2022 (Manning Publications) - Offers practical guidance on implementing MLOps, including considerations for performance, scalability, monitoring, and operational best practices for machine learning systems like feature stores.
AWS Well-Architected Framework - Performance Efficiency Pillar, Amazon Web Services, 2022 (Amazon Web Services) - Provides architectural guidance for achieving performance efficiency in cloud environments, covering resource selection, monitoring, and scaling strategies relevant to capacity planning for cloud-native feature stores.