With your agent's instructions prepared and its initial goal specified, it's time to write the code that enables your agent to take its first meaningful step. This step involves translating the instructions and goal into a request for the Large Language Model (LLM), which then processes this information and generates a response. For our first simple agent, this LLM response itself can be considered the outcome of the agent's "action."
At its core, an LLM agent relies on the LLM to understand, reason, and generate text. When we talk about the agent's "first action," we're referring to the process of:

1. Combining the agent's instructions and its current goal into a single prompt.
2. Sending that prompt to the LLM.
3. Receiving the text the LLM generates in response.

This output might be a direct answer to a question, a rephrased instruction, a piece of a plan, or, as we'll see later in the chapter, an acknowledgment of a task like adding an item to a to-do list. The nature of the LLM's response is shaped by how you design the prompt.
Interacting with most powerful LLMs typically involves making a request to an Application Programming Interface (API). This is like sending a message to a service and getting a reply. While the specifics can vary depending on the LLM provider (like OpenAI, Cohere, or AI21 Labs, or even a locally run model), the general idea is consistent.
For our purposes, we'll use Python to demonstrate this. We'll start by defining a simple function that represents this interaction. In a real-world application, this function would contain the code to make an HTTP request to the LLM's API endpoint, handle authentication (usually with an API key), and parse the response. For this introductory stage, we'll create a mock version of this function. This allows us to focus on the agent's logic without needing to set up API keys or manage network requests just yet.
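To make that concrete, here is a rough sketch of what such a real call might look like. It assumes an OpenAI-style chat completions endpoint and an API key read from a hypothetical LLM_API_KEY environment variable; the endpoint URL, request payload, and response structure vary by provider, so treat it as an illustration rather than a drop-in implementation. We won't use this function in the rest of the chapter.

import os
import requests

def call_llm_api(prompt_text):
    """
    Illustrative sketch of a real LLM API call (not used in this chapter).
    Assumes an OpenAI-style chat completions endpoint; adjust the URL,
    payload, and response parsing for your provider.
    """
    api_key = os.environ["LLM_API_KEY"]  # authentication usually relies on an API key
    response = requests.post(
        "https://api.openai.com/v1/chat/completions",  # provider-specific endpoint
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": "gpt-4o-mini",  # model names are provider-specific
            "messages": [{"role": "user", "content": prompt_text}],
        },
        timeout=30,
    )
    response.raise_for_status()
    # Parse the provider's JSON response to extract the generated text
    return response.json()["choices"][0]["message"]["content"]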
Let's define our mock LLM call function:
def call_llm_mock(prompt_text):
    """
    A simplified mock function to simulate an LLM API call.
    In a real application, this would interact with an actual LLM service.
    """
    print(f"\n[LLM Mock] Processing prompt snippet: \"{prompt_text[:120]}...\"")  # Shows a part of the prompt sent

    # Example responses based on keywords in the prompt
    if "capital of france" in prompt_text.lower():
        return "The capital of France is Paris, of course!"
    elif "add 'buy milk' to my to-do list" in prompt_text.lower():
        return "Okay, I've noted 'buy milk' for your to-do list."
    elif "summarize the following text" in prompt_text.lower() and "llms are ai models" in prompt_text.lower():
        return "LLMs are advanced AI models capable of understanding and generating human-like text."
    else:
        # A generic response if no specific keywords match
        return "This is a generic response from the mock LLM. I understood your request."
This call_llm_mock function takes the prompt_text as input, prints a part of it to simulate sending it, and then returns a pre-programmed string based on some simple keyword matching. This is a useful technique during early development, as it lets you test your agent's flow without incurring API costs or dealing with network latency.
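Before wiring this into the agent, you can call the mock directly as a quick sanity check. The reply comes from the keyword branches above; note that the mock also prints its own "[LLM Mock] ..." line first.

# Direct call to verify the mock's keyword matching
reply = call_llm_mock("What is the capital of France?")
print(reply)  # The capital of France is Paris, of course!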
The effectiveness of an LLM heavily depends on the quality of the prompt it receives. As discussed in "Instructing Your Agent for a Task" and "Specifying the Agent's Goal," we need to combine the general instructions (which define the agent's persona or overall behavior) with the specific goal (the immediate task).
Here’s how you can combine them in Python using an f-string:
agent_instructions = "You are a helpful assistant. Your role is to understand user requests and provide clear, concise answers or confirmations."
current_goal = "What is the capital of France?"
# Combining instructions and goal into a single prompt for the LLM
full_prompt = f"{agent_instructions}\n\nUser's Request: {current_goal}\n\nAssistant's Response:"
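For illustration, printing full_prompt shows the exact text the LLM would receive; the \n\n sequences become blank lines.

print(full_prompt)
# You are a helpful assistant. Your role is to understand user requests and provide clear, concise answers or confirmations.
#
# User's Request: What is the capital of France?
#
# Assistant's Response: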
Notice the structure:

- agent_instructions set the context for the LLM.
- The "\n\nUser's Request:" marker helps the LLM distinguish the user's specific query.
- The "\n\nAssistant's Response:" marker cues the LLM to generate the part that comes next, effectively role-playing as the assistant.

Now, let's create a Python function that encapsulates the agent's first action: constructing the prompt and getting the LLM's (mocked) response.
def perform_agent_core_action(instructions, goal):
    """
    Constructs a prompt from the agent's instructions and current goal,
    then queries the (mock) LLM to get a response.
    This represents the agent's primary "thinking" or "acting" step.
    """
    # 1. Combine instructions and the goal to form a comprehensive prompt
    full_prompt = f"{instructions}\n\nUser's Request: {goal}\n\nAssistant's Response:"

    print(f"Agent is preparing to act on the goal: \"{goal}\"")

    # 2. Call the LLM (using our mock function for this example)
    llm_response = call_llm_mock(full_prompt)

    return llm_response
This function, perform_agent_core_action, takes the instructions and goal as input. It then:

1. Combines them into the full_prompt.
2. Calls the call_llm_mock function with this prompt.
3. Returns the llm_response. This response is the result of the agent's action in this simple setup.

Let's put all the pieces together and see our agent perform its first action with a couple of different goals.
# (Assuming call_llm_mock function from above is already defined)
# (Assuming perform_agent_core_action function from above is already defined)
# Define the agent's general instructions
agent_instructions = "You are a helpful assistant. Your role is to understand user requests and provide clear, concise answers or confirmations."
# --- Example 1: A general question ---
goal_1 = "What is the capital of France?"
print(f"\n--- Agent tackling Goal 1: General Question ---")
action_result_1 = perform_agent_core_action(agent_instructions, goal_1)
print(f"Agent's Output (from LLM): {action_result_1}")
# --- Example 2: A task related to a to-do list ---
# This previews the kind of task our chapter project will handle
goal_2 = "Add 'buy milk' to my to-do list."
print(f"\n--- Agent tackling Goal 2: To-Do List Command ---")
action_result_2 = perform_agent_core_action(agent_instructions, goal_2)
print(f"Agent's Output (from LLM): {action_result_2}")
# --- Example 3: A summarization task ---
goal_3 = "Summarize the following text: LLMs are AI models that process and generate text."
print(f"\n--- Agent tackling Goal 3: Summarization Task ---")
action_result_3 = perform_agent_core_action(agent_instructions, goal_3)
print(f"Agent's Output (from LLM): {action_result_3}")
If you run this Python code, you'll see output similar to this:
--- Agent tackling Goal 1: General Question ---
Agent is preparing to act on the goal: "What is the capital of France?"
[LLM Mock] Processing prompt snippet: "You are a helpful assistant. Your role is to understand user requests and provide clear, concise answers or confirmations..."
Agent's Output (from LLM): The capital of France is Paris, of course!
--- Agent tackling Goal 2: To-Do List Command ---
Agent is preparing to act on the goal: "Add 'buy milk' to my to-do list."
[LLM Mock] Processing prompt snippet: "You are a helpful assistant. Your role is to understand user requests and provide clear, concise answers or confirmations..."
Agent's Output (from LLM): Okay, I've noted 'buy milk' for your to-do list.
--- Agent tackling Goal 3: Summarization Task ---
Agent is preparing to act on the goal: "Summarize the following text: LLMs are AI models that process and generate text."
[LLM Mock] Processing prompt snippet: "You are a helpful assistant. Your role is to understand user requests and provide clear, concise answers or confirmations..."
Agent's Output (from LLM): LLMs are advanced AI models capable of understanding and generating human-like text.
Each "Agent's Output (from LLM)" is the direct result of the agent's first coded action. It has taken its instructions and goal, used the (mocked) LLM to process them, and produced a response.
This first action is foundational. The agent has successfully taken input, "thought" about it using the LLM's simulated capabilities, and produced an output. Right now, that output is just text that we print. However, this is the building block for more complex behaviors. In more advanced agents, this output might be parsed to decide on a follow-up step, used to select and invoke an external tool, or stored as part of the agent's memory of the conversation.
For the to-do list agent we'll build at the end of this chapter, the LLM's response to a command like "add 'buy milk'" (as seen in goal_2) would be the first step. The agent "understands" the command, and its action is to confirm this understanding. Subsequent steps would involve actually managing the to-do list data structure, which we'll explore.
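As a rough preview of that idea, and not the implementation we'll build later, the sketch below adds one hypothetical step after the LLM call: it inspects the goal for an "add '<item>' to my to-do list" pattern and, if found, appends the item to a plain Python list. The regular expression and the to_do_list variable are illustrative choices, not part of the chapter's final design.

import re

to_do_list = []  # a plain list standing in for the agent's to-do storage

def handle_todo_goal(goal, llm_response):
    """
    Hypothetical follow-up step: if the goal looks like an
    "add '<item>' to my to-do list" command, store the item.
    """
    match = re.search(r"add '(.+)' to my to-do list", goal, re.IGNORECASE)
    if match:
        to_do_list.append(match.group(1))
    return llm_response

result = perform_agent_core_action(agent_instructions, "Add 'buy milk' to my to-do list.")
handle_todo_goal("Add 'buy milk' to my to-do list.", result)
print(to_do_list)  # ['buy milk']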
You've now coded the very first, simple action for your LLM agent. This involves orchestrating the flow of information to and from the LLM, which is a fundamental skill in agent development. In the next section, "Executing and Monitoring Your Agent," we'll look at how to run this code and observe its behavior more formally.