Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela, 2020 (Advances in Neural Information Processing Systems, NeurIPS), DOI: 10.48550/arXiv.2005.11401 - This paper introduced the Retrieval-Augmented Generation (RAG) framework, which relies on efficient text embedding and retrieval, making embedding caching an important optimization.
OpenAI Embeddings, OpenAI, 2024 (OpenAI) - Official documentation providing practical guidance on using OpenAI's embedding models, detailing their function, available models, and API integration, relevant to managing embedding API calls.
Building Machine Learning Powered Applications: Going from Idea to Product, Emmanuel Ameisen, 2020 (O'Reilly Media) - This resource covers practical considerations for building and deploying machine learning applications, including strategies for performance improvement and cost management in production environments.