Buffer and summary memory excel at maintaining a general sense of a conversation, but they often struggle to retain specific, structured facts. Details such as a user's name, a product model number, or a shipping address can easily get lost or garbled in a long conversational history or summary. To address this limitation, entity memory provides a specialized technique for accurately tracking and retrieving these important entities throughout a conversation.
Entity memory acts like a structured knowledge base for the current conversation. It extracts and tracks specific named entities, such as people, places, organizations, or products, and stores them in an organized way. This gives your application a reliable "fact sheet" to reference, ensuring important details are never forgotten.
Instead of just storing raw text, entity memory maintains a list of objects, where each object represents a unique entity mentioned in the dialogue. When a new message is added to the conversation, an entity extraction process (often using an LLM) identifies new entities or updates existing ones.
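To make the extraction-and-update cycle concrete, here is a minimal, self-contained sketch of the idea. It substitutes a naive capitalized-phrase regex for the LLM or NER model a real pipeline would use, and the `EntityRecord` dataclass is a hypothetical stand-in for the library's entity objects:

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class EntityRecord:
    """Simplified stand-in for a tracked entity record."""
    name: str
    type: str
    mentions: int = 0
    first_seen: datetime = None
    last_seen: datetime = None

def extract_entities(text: str) -> list[str]:
    # Naive heuristic: treat runs of capitalized words as candidate
    # entities. A real pipeline would use an LLM or NER model here.
    return re.findall(r"\b[A-Z][\w-]*(?:\s+[A-Z][\w-]*)*", text)

def update_registry(registry: dict, message: str) -> None:
    """Add new entities or update mention counts for existing ones."""
    now = datetime.now(timezone.utc)
    for name in extract_entities(message):
        record = registry.get(name)
        if record is None:
            registry[name] = EntityRecord(
                name=name, type="UNKNOWN", mentions=1,
                first_seen=now, last_seen=now,
            )
        else:
            record.mentions += 1
            record.last_seen = now

registry: dict[str, EntityRecord] = {}
update_registry(registry, "Alex works at Innovate Corp.")
update_registry(registry, "Alex called about a router.")
print(registry["Alex"].mentions)  # 2
```

The important point is the shape of the loop, not the extraction heuristic: each incoming message either creates a new record or increments the mention count and refreshes the `last_seen` timestamp of an existing one.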
The ConversationBuffer provides this capability out of the box. By enabling entity tracking, it automatically analyzes new messages and maintains an internal list of entities. You can enable this with a simple flag during initialization:
from kerb.memory import ConversationBuffer
# Enable entity tracking when creating the buffer
entity_buffer = ConversationBuffer(enable_entity_tracking=True)
With enable_entity_tracking set to True, every call to add_message will trigger a background process to identify and track entities within the message content.
Each tracked item is stored as an Entity object, which contains useful information:
- name: The name of the entity (e.g., "Alex", "San Francisco").
- type: The category of the entity (e.g., "PERSON", "LOCATION").
- mentions: A count of how many times the entity has been mentioned.
- first_seen: The timestamp of the first mention.
- last_seen: The timestamp of the most recent mention.

Let's see this in action. We will simulate a customer support conversation and then inspect the entities the buffer has automatically tracked.
# Initialize a buffer with entity tracking enabled
entity_buffer = ConversationBuffer(
    max_messages=100,
    enable_entity_tracking=True
)
# Simulate a conversation with several entities
entity_buffer.add_message(
    "system",
    "You are a support agent for a network hardware company."
)
entity_buffer.add_message(
    "user",
    "Hi, I'm Alex from Innovate Corp. We're having issues with our Kerb-Router-X500 at our San Francisco office."
)
entity_buffer.add_message(
    "assistant",
    "Hello Alex, I can certainly help with the Kerb-Router-X500. Could you confirm the device's serial number?"
)
entity_buffer.add_message(
    "user",
    "Sure, it's KRX500-12345. Innovate Corp purchased it last year."
)

# Retrieve the tracked entities from the buffer
tracked_entities = entity_buffer.get_entities()

print(f"Tracked {len(tracked_entities)} unique entities:")
for entity in tracked_entities:
    print(f"\n- Name: {entity.name}")
    print(f"  Type: {entity.type}")
    print(f"  Mentions: {entity.mentions}")
When you run this code, the buffer processes each message and identifies entities. The get_entities() method returns a consolidated list. You'll notice that "Alex", "Innovate Corp", and "Kerb-Router-X500" are mentioned multiple times, and the mentions count reflects this, giving you a sense of the entity's importance in the conversation.
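Because each entity carries a mention count, you can rank the consolidated list to surface the most important entities first. A minimal sketch, using plain dictionaries in place of the buffer's Entity objects (the values mirror the conversation above):

```python
# Hypothetical consolidated output, mirroring what get_entities() returns
tracked = [
    {"name": "Alex", "type": "PERSON", "mentions": 2},
    {"name": "Innovate Corp", "type": "ORGANIZATION", "mentions": 2},
    {"name": "Kerb-Router-X500", "type": "PRODUCT", "mentions": 2},
    {"name": "San Francisco", "type": "LOCATION", "mentions": 1},
]

# Rank by mention count to surface the most-discussed entities first
by_importance = sorted(tracked, key=lambda e: e["mentions"], reverse=True)

# Keep only entities mentioned more than once
top = [e["name"] for e in by_importance if e["mentions"] > 1]
print(top)  # ['Alex', 'Innovate Corp', 'Kerb-Router-X500']
```

Ranking like this is useful when the entity list grows long and you only want to include the most relevant entries in a prompt.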
Entity memory is particularly useful in applications where specific details, such as names, account identifiers, or product models, must be recalled accurately for the application to function correctly.
The primary advantage is token efficiency and reliability. Instead of forcing the LLM to find a specific detail in a long conversation history, you can provide a concise, structured list of known entities. This can be included in the system prompt for every turn, giving the model an always-up-to-date summary of important facts. For example, you could format the entities and prepend them to your prompt:
Known Entities:
- PERSON: Alex (mentioned 2 times)
- ORGANIZATION: Innovate Corp (mentioned 2 times)
- PRODUCT: Kerb-Router-X500 (mentioned 2 times)
- LOCATION: San Francisco (mentioned 1 time)
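A small helper can render this block automatically. The sketch below assumes entity objects exposing the name, type, and mentions fields described earlier; the `Entity` dataclass here is a hypothetical stand-in for the library's own type:

```python
from dataclasses import dataclass

@dataclass
class Entity:
    # Hypothetical stand-in with the fields used for formatting
    name: str
    type: str
    mentions: int

def format_known_entities(entities: list[Entity]) -> str:
    """Render tracked entities as a compact block for the system prompt."""
    lines = ["Known Entities:"]
    # Most-mentioned entities first, so the model sees them early
    for e in sorted(entities, key=lambda e: e.mentions, reverse=True):
        plural = "time" if e.mentions == 1 else "times"
        lines.append(f"- {e.type}: {e.name} (mentioned {e.mentions} {plural})")
    return "\n".join(lines)

entities = [
    Entity("Alex", "PERSON", 2),
    Entity("Innovate Corp", "ORGANIZATION", 2),
    Entity("Kerb-Router-X500", "PRODUCT", 2),
    Entity("San Francisco", "LOCATION", 1),
]
print(format_known_entities(entities))
```

The returned string can be prepended to the system prompt on every turn, so the model always starts from an up-to-date fact sheet.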
This compact format provides the LLM with immediate access to important information, improving its ability to generate relevant and accurate responses while consuming fewer tokens than including the full conversation history. By combining entity memory with buffer or summary memory, you create a powerful, multi-layered context management system for building sophisticated conversational applications.