While caching is a powerful tool for improving performance and reducing costs, it introduces a new challenge: data staleness. A cache that serves outdated information can lead to incorrect application behavior, which is often worse than having no cache at all. Cache invalidation is the process of removing or updating entries in the cache when the underlying data changes, ensuring your application remains accurate and up to date.

Choosing the right invalidation strategy depends on how your data changes over time. We will explore several common and effective strategies you can implement.

## Time-to-Live (TTL) Invalidation

The simplest invalidation strategy is Time-to-Live (TTL). With TTL, you assign an expiration time to each cache entry. Once the time expires, the entry is considered invalid and will be removed from the cache, either automatically on the next access or during a periodic cleanup.

This strategy is ideal for data that becomes stale after a predictable period. Weather forecasts, stock prices, and news headlines are all good candidates for TTL-based caching. If you're building an application that reports the current weather, you might cache the result for 15 minutes.

You can set a TTL when adding an entry to the cache using the `ttl` parameter of the `set` method. The time is specified in seconds.

```python
from kerb.cache import create_memory_cache, generate_prompt_key

# Create a cache for time-sensitive data
price_cache = create_memory_cache()

prompt = "What is the current stock price for AAPL?"
key = generate_prompt_key(prompt, model="gpt-4")

# Simulate an API call to get a stock price
response = {"stock": "AAPL", "price": 180.25, "timestamp": "2024-10-26T10:00:00Z"}

# Cache the response with a 5-minute (300 seconds) TTL
price_cache.set(key, response, ttl=300)

print(f"Cached stock price for '{prompt}' with a 5-minute TTL.")
```

After 300 seconds, a `get` call for this key will return `None`, triggering a fresh API call to fetch the latest stock price.
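To take advantage of the TTL, wrap the lookup in a simple read-through check: on a miss (the entry has expired or was never cached), fetch fresh data and re-cache it. The sketch below is a minimal version of that pattern; `fetch_stock_price` is a hypothetical stand-in for your real market-data call, while the cache calls mirror the example above.

```python
from kerb.cache import create_memory_cache, generate_prompt_key

price_cache = create_memory_cache()


def fetch_stock_price(symbol: str) -> dict:
    # Hypothetical stand-in for a real market-data API call.
    return {"stock": symbol, "price": 180.25}


def get_price(symbol: str) -> dict:
    prompt = f"What is the current stock price for {symbol}?"
    key = generate_prompt_key(prompt, model="gpt-4")

    cached = price_cache.get(key)
    if cached is not None:
        return cached  # Still within the TTL: served from the cache.

    # Expired or never cached: fetch fresh data and re-cache with a 5-minute TTL.
    response = fetch_stock_price(symbol)
    price_cache.set(key, response, ttl=300)
    return response


print(get_price("AAPL"))  # First call fetches and caches.
print(get_price("AAPL"))  # Within 5 minutes, this is a cache hit.
```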
## Manual Invalidation

For data that changes at unpredictable times, manual invalidation is necessary. This approach gives you explicit control over when to remove a cache entry. It's useful for content that is updated by user actions or external events, such as editing a document in a knowledge base, changing a product description, or updating a user's profile.

When the underlying data changes, your application logic should explicitly delete the corresponding entry from the cache. You can do this using the `delete` method.

```python
from kerb.cache import create_memory_cache, generate_prompt_key

# Cache for product descriptions
product_cache = create_memory_cache()

# A prompt to summarize a product
product_id = "prod_123"
product_description = "A durable, high-performance laptop for professionals."
prompt = f"Summarize this product: {product_description}"
key = generate_prompt_key(prompt, model="gpt-4", product_id=product_id)

# Cache the initial summary
summary = "A powerful and reliable laptop for professional use."
product_cache.set(key, summary)
print(f"Initial summary for {product_id} is cached.")

# Later, the product description is updated...
updated_description = "An ultra-light, high-performance laptop for creative professionals."

# Your application should now invalidate the old cache entry.
# We generate the same key used before to delete it.
was_deleted = product_cache.delete(key)

if was_deleted:
    print(f"Cache for {product_id} was invalidated due to an update.")
else:
    print(f"Cache entry for {product_id} not found.")

# The next time a summary is requested, it will be a cache miss,
# forcing a new generation based on the updated description.
```

This event-driven approach ensures that your cache is always synchronized with your source of truth, such as a database or content management system.

## Version-Based Keys

A clean way to handle invalidation is to incorporate a version identifier into your cache keys. When your prompt, model, or underlying logic changes, you simply increment the version number. This automatically creates new cache keys, effectively invalidating all old entries without needing to delete them. The old, unreferenced entries will eventually be evicted by the cache's replacement policy (such as LRU).

This strategy is excellent for managing updates to prompts or changes in your application's configuration. It avoids complex deletion logic and provides a clear history of changes.

The `generate_prompt_key` function is designed to handle arbitrary keyword arguments, making it easy to add a version.

```python
from kerb.cache import generate_prompt_key

prompt = "Analyze customer sentiment"
model = "gpt-4o-mini"

# Generate a key for version 1.0 of our prompt logic
key_v1 = generate_prompt_key(prompt, model=model, version="1.0")
print(f"For v1.0: {key_v1[:24]}...")

# After improving the prompt, we update the version to 1.1
key_v2 = generate_prompt_key(prompt, model=model, version="1.1")
print(f"For v1.1: {key_v2[:24]}...")

print(f"\nKeys are different: {key_v1 != key_v2}")
```

Because `key_v1` and `key_v2` are different, a request made after the version bump will result in a cache miss, forcing a new LLM call with the updated logic. The old v1.0 cached responses remain but will no longer be accessed.
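One common way to apply this strategy is to keep the version in a single application-level constant, so a one-line change invalidates every cached response at once. The following sketch assumes a hypothetical `PROMPT_VERSION` constant and a stand-in for the LLM call; the key and cache helpers are the same ones used throughout this section.

```python
from kerb.cache import create_memory_cache, generate_prompt_key

# Hypothetical application-level constant: bump this whenever the prompt
# template or model configuration changes.
PROMPT_VERSION = "1.1"

response_cache = create_memory_cache()


def sentiment_key(text: str) -> str:
    # Every key embeds the current version, so a version bump changes all
    # keys and old entries are simply never looked up again.
    return generate_prompt_key(
        f"Analyze customer sentiment: {text}",
        model="gpt-4o-mini",
        version=PROMPT_VERSION,
    )


key = sentiment_key("Great product, slow shipping.")
cached = response_cache.get(key)
if cached is None:
    result = "mixed"  # Stand-in for a real LLM call.
    response_cache.set(key, result)
```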
## Bulk Invalidation with Prefixes

Sometimes you need to invalidate a whole group of related entries at once. For instance, if you update an entire document in a RAG system, you'll want to clear all cached chunks and embeddings associated with that document. Manually deleting each entry would be inefficient.

The `LLMCache` class provides `invalidate_by_prefix`, which removes all entries whose keys start with a specific string. This requires a disciplined approach to naming, where you include a common prefix for related items.

```python
from kerb.cache import create_llm_cache, create_memory_cache

# We use LLMCache for its advanced invalidation features. Keeping a handle
# on the backend lets us store entries under explicit, prefix-structured keys.
backend = create_memory_cache()
llm_cache = create_llm_cache(backend=backend)

# Cache embeddings for different chunks of a document
document_id = "doc_abc"
chunks = ["First part of the text.", "Second part of the text."]
models = ["text-embedding-3-small", "text-embedding-3-large"]

# Create keys with a common prefix: doc_abc:embedding:
for i, chunk in enumerate(chunks):
    for model in models:
        # A simple keying strategy for demonstration: the document ID leads
        # every key, so related entries share a common prefix.
        key = f"{document_id}:embedding:{model}:{i}"
        backend.set(key, {"text": chunk, "embedding": [0.1, 0.2]})  # mock embedding
        print(f"Cached entry with key: {key}")

# Later, the document 'doc_abc' is updated. We need to invalidate all its embeddings.
prefix_to_invalidate = f"{document_id}:embedding:"
invalidated_count = llm_cache.invalidate_by_prefix(prefix_to_invalidate)

print(f"\nDocument '{document_id}' updated. Invalidating all related embeddings...")
print(f"Invalidated {invalidated_count} entries with prefix '{prefix_to_invalidate}'.")
```

This pattern is highly effective for managing cache consistency in systems with structured, interconnected data.

## Choosing an Invalidation Strategy

Selecting the right strategy is a matter of understanding your data's lifecycle. Here are some guidelines:

- Use TTL for data that expires on a predictable schedule, like news feeds or temporary session data.
- Use Manual Invalidation when data changes unpredictably due to external events, like a user updating their profile.
- Use Version-Based Keys when you update your application's logic, prompts, or models. This is one of the cleanest and safest methods.
- Use Bulk Invalidation for managing groups of related data, such as all chunks belonging to a single document in a RAG system.

In practice, an application will often use a combination of these strategies to handle different types of cached data effectively.
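As a brief illustration of combining strategies, the sketch below pairs a versioned key with a TTL: bumping the hypothetical `PROMPT_VERSION` constant handles prompt or model changes, while the TTL bounds staleness within a version. The `summarize_headlines` helper is a stand-in for a real LLM call; the cache calls are those shown earlier.

```python
from kerb.cache import create_memory_cache, generate_prompt_key

PROMPT_VERSION = "2.0"  # Hypothetical constant: bump on prompt/model changes.
news_cache = create_memory_cache()


def summarize_headlines(headlines: list[str]) -> str:
    # Stand-in for a real LLM call.
    return "Summary of today's headlines."


def get_news_summary(headlines: list[str]) -> str:
    # Version-based key: a version bump silently invalidates old entries.
    key = generate_prompt_key(
        "Summarize these headlines: " + " | ".join(headlines),
        model="gpt-4o-mini",
        version=PROMPT_VERSION,
    )

    cached = news_cache.get(key)
    if cached is not None:
        return cached

    summary = summarize_headlines(headlines)
    # TTL: even within one version, headlines go stale after 15 minutes.
    news_cache.set(key, summary, ttl=900)
    return summary
```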