Now that we've discussed why memory is essential for an agent to maintain a coherent interaction, and we've looked at the idea of short-term memory, it's time to put this into practice. In this hands-on section, we'll modify a basic agent to include a simple form of conversational memory. This will allow our agent to "remember" previous turns in the conversation, leading to more natural and context-aware responses.
The goal here is to store the history of interactions (both user inputs and agent responses) and feed this history back to the Large Language Model (LLM) with each new request. This way, the LLM has the context of what has already been said.
Imagine you have a basic agent whose primary job is to greet users and perhaps ask how their day is going. Without memory, every interaction is like meeting for the first time: if a user named Alex introduces themselves and then asks a follow-up question, the agent has no idea who Alex is. With memory, we want the agent to remember Alex.
For this exercise, we'll assume you have a way to interact with an LLM. This could be through an API (like OpenAI's, Anthropic's, or Cohere's) or a locally running model. The core idea of adding memory is independent of the specific LLM you use.
We'll focus on Python for our examples, as it's commonly used in AI development. You'll need a Python environment and a way to call your LLM (for example, the requests library for a custom API, or an official client library like openai).

The simplest way to implement short-term memory is to use a list. Each element in the list represents a part of the conversation: we store the history of user messages and agent responses.
Let's start by initializing an empty list to hold our conversation history:
```python
conversation_history = []
```
We need a consistent way to add messages to our history. A good approach is to store dictionaries, where each dictionary represents a single message and indicates who sent it (the "user" or the "assistant"/"agent") and the content of the message.
For example:
{"role": "user", "content": "Hello, my name is Alex."}
{"role": "assistant", "content": "Nice to meet you, Alex! How are you today?"}
This structure is common, especially if you're using APIs like OpenAI's chat completions.
Every time the user says something, we add it to conversation_history. Similarly, every time the agent responds, we add its response to the history.
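For example, after the single exchange shown above, the history would be built up like this:

```python
# Append each turn in order; the LLM later reads this list
# top-to-bottom as the dialogue transcript.
conversation_history.append({"role": "user", "content": "Hello, my name is Alex."})
conversation_history.append({"role": "assistant", "content": "Nice to meet you, Alex! How are you today?"})
```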
Let's imagine a function get_llm_response(prompt_with_history) that takes the current state of our conversation and returns the LLM's next message.
```python
# (get_llm_response is defined below)
def chat_with_agent():
    conversation_history = []
    print("Agent: Hello! I'm your friendly assistant. Type 'quit' to exit.")

    while True:
        user_input = input("You: ")
        if user_input.lower() == 'quit':
            print("Agent: Goodbye!")
            break

        # Add user's message to history
        conversation_history.append({"role": "user", "content": user_input})

        # Prepare the input for the LLM, including the history.
        # For some LLMs, you pass the whole history directly; for others,
        # you might need to format it into a single string. In a real
        # scenario, the prompt would combine a system message (defining the
        # agent's persona/task) with conversation_history. For simplicity,
        # we assume get_llm_response just needs the history.

        # The core idea: the LLM sees the past conversation
        agent_response_content = get_llm_response(conversation_history)
        print(f"Agent: {agent_response_content}")

        # Add agent's response to history
        conversation_history.append({"role": "assistant", "content": agent_response_content})


# A placeholder for your LLM interaction function.
# In a real application, this would call an LLM API.
def get_llm_response(current_history):
    # This is a very simplified mock. A real LLM would use the history to
    # generate a contextually relevant response, for instance reusing a
    # name like "Alex" if it was mentioned earlier.
    last_user_message = ""
    if current_history and current_history[-1]["role"] == "user":
        last_user_message = current_history[-1]["content"].lower()

    if "my name is" in last_user_message:
        # Extract the name, stripping trailing punctuation
        name = last_user_message.split("my name is")[-1].strip(" .,!?").title()
        return f"Nice to meet you, {name}! How can I help you further?"
    elif "how are you" in last_user_message:
        return "I'm doing well, thank you for asking!"
    elif "weather" in last_user_message:
        # Check if a location was mentioned previously
        location = None
        for message in reversed(current_history[:-1]):  # look in past messages
            content = message["content"].lower()
            if "i live in" in content:
                location = content.split("i live in")[-1].strip(" .,!?").title()
                break
            if "near" in content:  # simple check
                location_parts = content.split("near")
                if len(location_parts) > 1:
                    location = location_parts[-1].strip(" .,!?").title()
                    break
        if location:
            return (f"I don't have real-time weather access, but I remember "
                    f"you mentioned {location}. I hope it's nice there!")
        else:
            return "I don't have real-time weather access. Where are you located?"
    else:
        return "I'm here to chat. What's on your mind?"


# To run the chat (in a Python environment):
# chat_with_agent()
```
The critical part is how conversation_history is used when calling the LLM. Most modern LLMs designed for chat can accept a list of messages, often with roles like "system", "user", and "assistant".
A typical call to an LLM API might look something like this (conceptual example):
```python
# This is a simplified, illustrative example of what happens inside get_llm_response
# client = YourLLMProviderClient()
# response = client.chat.completions.create(
#     model="some-llm-model-name",
#     messages=[
#         {"role": "system", "content": "You are a helpful assistant."},
#         *conversation_history  # unpacking the list of user/assistant messages
#     ]
# )
# agent_response_content = response.choices[0].message.content
```
The "system" message helps set the agent's persona or overall instructions. Then, the entire conversation_history
is passed. The LLM uses this sequence of messages to understand the ongoing dialogue and generate a relevant next response.
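If you happen to be using OpenAI's official Python client, a concrete get_llm_response could look like the sketch below. This is one possible implementation, not the only one: the model name is a placeholder, and the client assumes an OPENAI_API_KEY environment variable is set.

```python
# A sketch of get_llm_response using the openai package (pip install openai).
# Assumes OPENAI_API_KEY is set in the environment; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

def get_llm_response(current_history):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # substitute your provider's model name
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            *current_history,  # the running user/assistant transcript
        ],
    )
    return response.choices[0].message.content
```

Swapping this in for the mock above turns the chat loop into a working agent without changing any other code, since both versions share the same signature.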
If you implement the chat_with_agent() function (and a real get_llm_response function that calls an actual LLM), you can test it:
Interaction 1:
Agent: Hello! I'm your friendly assistant. Type 'quit' to exit.
You: Hi, my name is Sarah.
Agent: Nice to meet you, Sarah! How can I help you further?
(At this point, conversation_history would contain your message and the agent's reply.)
Interaction 2:
You: What did I tell you my name was?
Agent: You told me your name is Sarah.
(The agent "remembers" because "Hi, my name is Sarah" was part of the history sent to the LLM for this second query.)
Interaction 3:
You: I live in London.
Agent: I'm here to chat. What's on your mind? (Or a more specific acknowledgement, depending on the LLM.)
(Memory updated: {"role": "user", "content": "I live in London."})
Interaction 4:
You: What's the weather like today?
Agent: I don't have real-time weather access, but I remember you mentioned London. I hope it's nice there!
(The agent uses the previously mentioned location "London" from the history to provide a more contextual, albeit still limited, response.)
This example uses a very basic mock for get_llm_response. A real LLM, when provided with the history, would naturally pick up on these contextual cues much more effectively.
By adding this simple list-based conversation_history, our agent can now:

- Remember details the user has shared, such as their name.
- Refer back to earlier statements (like a mentioned location) in later turns.
- Maintain a coherent, context-aware dialogue across multiple exchanges.
However, this basic form of short-term memory has limitations:

- Context window limits: LLMs can only process a limited amount of text at once, and a long conversation_history might exceed this limit. When this happens, you'd need strategies like truncating the history (e.g., keeping only the last N turns) or summarizing older parts of the conversation.
- No persistence: when the program ends, the conversation_history list is reset. For persistent memory across sessions, you'd need to save the history to a file or a database. (A minimal sketch of both truncation and persistence follows below.)

This hands-on exercise demonstrates the fundamental principle of providing conversational context to an LLM agent. Even this simple implementation significantly enhances the agent's ability to engage in coherent dialogues. As you explore more complex agents, you'll encounter more sophisticated memory management techniques, but the core idea of feeding relevant past information back to the LLM remains a building block.
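As a starting point for addressing both limitations, here is a minimal sketch: a truncation helper that keeps only the most recent messages, and save/load helpers that persist the history as JSON. The message cap and file name are illustrative choices, not fixed requirements.

```python
import json

MAX_MESSAGES = 20  # illustrative cap; tune to your model's context window

def truncate_history(history, max_messages=MAX_MESSAGES):
    # Keep only the most recent messages; older turns are simply dropped.
    return history[-max_messages:]

def save_history(history, path="conversation_history.json"):
    # Persist the history so a future session can pick up where this one ended.
    with open(path, "w") as f:
        json.dump(history, f)

def load_history(path="conversation_history.json"):
    # Restore a previous session's history, or start fresh if none exists.
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return []
```

In the chat loop above, you might call truncate_history(conversation_history) just before passing the list to the LLM, load_history() in place of the empty-list initialization, and save_history(conversation_history) when the user types 'quit'.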