As we've seen, an agent's short-term memory, primarily managed within its context window, is a finite resource. When dealing with extensive interactions, lengthy documents, or detailed tool outputs, the raw information can quickly overwhelm this limited space. This is where information condensation techniques become indispensable. By intelligently reducing the volume of information while preserving its essential meaning, you can enable your agent to handle more complex scenarios, maintain longer dialogues, and operate more efficiently. This section focuses on how to use prompts to guide an LLM in condensing information effectively for an agent's memory.
The core idea is to transform verbose information into a more compact form that the agent can readily use or store. This isn't just about making things shorter; it's about retaining the most relevant details for the agent's current objectives and future actions.
One of the most direct ways to condense information is through summarization. You can instruct the LLM to create a concise version of a given text. Your prompts can guide the style, length, and focus of these summaries.
While you don't typically specify "abstractive" or "extractive" in a prompt, your wording can lead the LLM towards one or the other:
To encourage abstractive summarization (where the LLM generates new phrasing to capture the essence), use prompts like "Summarize the following text in your own words, in no more than three sentences."
To encourage extractive summarization (where the LLM pulls key sentences or phrases directly from the text), you might ask: "Select the five sentences from this text that carry the most important information, quoting them verbatim."
In practice, LLMs often produce a hybrid. The important part is that your prompt clearly defines the desired outcome and the information to prioritize.
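As a concrete sketch, a small helper can assemble either style of summarization prompt. The function name, parameters, and prompt wording below are illustrative choices, not part of any particular library:

```python
from typing import Optional

def build_summary_prompt(text: str, max_sentences: int = 3,
                         focus: Optional[str] = None,
                         style: str = "abstractive") -> str:
    """Assemble a summarization prompt for an LLM call.

    style="abstractive" asks for new phrasing; style="extractive"
    asks for key sentences copied verbatim from the source text.
    """
    if style == "extractive":
        instruction = (f"Extract the {max_sentences} most important "
                       "sentences from the text below, verbatim.")
    else:
        instruction = (f"Summarize the text below in at most "
                       f"{max_sentences} sentences, in your own words.")
    # Steer the summary toward the agent's current objective.
    if focus:
        instruction += f" Prioritize details relevant to: {focus}."
    return f"{instruction}\n\nText:\n{text}"
```

The returned string is then sent through whatever chat-completion client the agent already uses; only the prompt construction is shown here.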
Sometimes, a full summary isn't needed, but rather a set of significant terms that represent the core topics of a piece of text. Prompting for keywords or keyphrases can be a very efficient condensation method.
Keywords can be used as tags, for quick relevance checks, or as input for further information retrieval from a knowledge base.
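A minimal sketch of this pattern: one function builds the keyword-extraction prompt, and a second parses the model's comma-separated reply into a list the agent can store or match against. Both helpers are hypothetical names introduced for illustration:

```python
def build_keyword_prompt(text: str, max_keywords: int = 8) -> str:
    """Ask the LLM for a compact, machine-parsable keyword list."""
    return (f"List up to {max_keywords} keywords or keyphrases that "
            "capture the core topics of the text below. Return them "
            "as a single comma-separated line, nothing else.\n\n"
            f"Text:\n{text}")

def parse_keywords(llm_response: str) -> list[str]:
    """Split the model's comma-separated reply into clean keywords."""
    return [k.strip() for k in llm_response.split(",") if k.strip()]
```

Requesting a strict output format ("a single comma-separated line, nothing else") is what makes the cheap string parsing on the response reliable enough in practice.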
Often, information spread across natural language can be more effectively stored and processed if converted into a structured format like JSON. This is a powerful condensation technique because it not only reduces verbosity but also organizes the information for easier programmatic access by the agent or other systems.
This method is particularly useful for extracting specific pieces of information from unstructured text and making them directly usable by the agent's logic or tools.
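The sketch below shows one way to prompt for a JSON object and parse the reply, assuming the model may wrap its answer in a markdown code fence (a common failure mode worth handling). The helper names and prompt wording are illustrative:

```python
import json

def build_extraction_prompt(text: str, fields: list[str]) -> str:
    """Ask the LLM to return ONLY a JSON object with the given keys."""
    field_list = ", ".join(f'"{f}"' for f in fields)
    return (f"Extract the following fields from the text and return "
            f"ONLY a JSON object with keys {field_list}. Use null for "
            f"any value not present in the text.\n\nText:\n{text}")

def parse_json_response(raw: str) -> dict:
    """Parse the model's reply, tolerating a ```json ... ``` fence."""
    cleaned = raw.strip()
    if cleaned.startswith("```"):
        cleaned = cleaned.strip("`")       # drop the fence backticks
        if cleaned.startswith("json"):     # drop the language tag
            cleaned = cleaned[4:]
    return json.loads(cleaned)
```

Because the result is a plain dictionary, the agent's downstream logic and tools can read individual fields directly instead of re-parsing prose.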
Effective condensation relies on clear instructions within your prompts.
Length and Detail: Be explicit about the desired conciseness.
Focus and Relevance: Direct the LLM on what information is important to retain, especially in relation to the agent's current task or overall goal.
Maintaining Integrity: While brevity is good, ensure critical information isn't lost.
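These three guidelines can be encoded directly in one prompt template: an explicit length cap, a task-focused relevance filter, and an instruction to preserve critical details. The template text below is one plausible phrasing, not a canonical recipe:

```python
# One template covering all three guidelines: explicit length,
# task-focused relevance, and preservation of critical details.
CONDENSE_TEMPLATE = (
    "Condense the following tool output.\n"
    "- Keep the result under {max_words} words.\n"
    "- Retain only facts relevant to the task: {task}.\n"
    "- Do not drop numbers, identifiers, dates, or error messages.\n\n"
    "Tool output:\n{text}"
)

prompt = CONDENSE_TEMPLATE.format(
    max_words=50,
    task="diagnose the deployment failure",
    text="<raw tool output here>",
)
```

Keeping the template as a named constant also makes it easy to tune the length cap or the integrity rules in one place as the agent evolves.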
For ongoing interactions or tasks that evolve over time, progressive condensation is a valuable strategy. Instead of re-processing all historical data, the agent maintains a running summary or a condensed state that is updated with new information.
1. Provide the current condensed summary or state.
2. Provide the new piece of information (e.g., the latest user message or a new tool output).
3. Instruct the LLM to integrate the new information into the existing summary, producing an updated, still condensed version.
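The steps above can be sketched as a single prompt-building function; `build_update_prompt` and the `call_llm` placeholder in the usage comment are hypothetical names, standing in for your own client:

```python
def build_update_prompt(current_summary: str, new_info: str,
                        max_words: int = 150) -> str:
    """Prompt the LLM to fold new information into a running summary."""
    return (
        "You maintain a running summary of an ongoing interaction.\n\n"
        f"Current summary:\n{current_summary}\n\n"
        f"New information:\n{new_info}\n\n"
        "Rewrite the summary to integrate the new information. "
        f"Keep it under {max_words} words and drop details that are "
        "no longer relevant."
    )

# Hypothetical agent loop (call_llm stands in for your LLM client):
# summary = ""
# for event in event_stream:
#     summary = call_llm(build_update_prompt(summary, event))
```

Each iteration replaces the old summary with the model's output, so the agent carries a bounded-size state forward instead of the full history.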
This approach helps manage context effectively in long-running agentic workflows.
The following diagram illustrates the general flow of information condensation guided by prompts:
An agent uses a specific prompt to instruct the LLM to process raw information and produce a condensed version, which is then used to update the agent's working memory or inform subsequent prompts.
The primary trade-off in information condensation is between conciseness and information fidelity. Overly aggressive condensation can lead to the loss of important details, hindering the agent's ability to perform its tasks accurately. Conversely, insufficient condensation fails to alleviate context window pressure.
By mastering these prompt-driven condensation techniques, you equip your AI agents to handle larger volumes of information more effectively, leading to improved coherence, better decision-making over extended interactions, and more efficient use of the LLM's capabilities. This condensed information is not only useful for short-term recall but also serves as an excellent candidate for storage in long-term memory systems, which we will explore further.
© 2025 ApX Machine Learning