Having conceptualized the ReAct framework as a synergy between reasoning (Thought) and environment interaction (Action), let's move to the practical aspects of implementing such an agent. Implementing ReAct involves careful prompt engineering, parsing the LLM's structured output, executing actions, and managing the iterative loop.
At its core, a ReAct agent operates in a loop, processing information step-by-step. Each step typically involves generating a thought, proposing an action based on that thought, executing the action, and incorporating the resulting observation.
Figure: The iterative process within a ReAct agent, showing the flow from prompting the LLM to parsing its output, executing actions, observing results, and updating the context for the next cycle.
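In code, this cycle reduces to a short loop. The following is a minimal sketch, assuming hypothetical helpers (build_prompt, llm.generate with a stop parameter, parse_output, execute_action) that the rest of this section develops:

# Minimal sketch of the ReAct loop; build_prompt, parse_output, and
# execute_action are hypothetical placeholders fleshed out below
history = []
done = False
while not done:
    prompt = build_prompt(system_prompt, question, history)
    llm_output = llm.generate(prompt, stop=["Observation:"])  # pause before the observation
    thought, action_name, action_input = parse_output(llm_output)
    if action_name == "Final Answer":
        final_response = action_input
        done = True
    else:
        observation = execute_action(action_name, action_input)
        history.append({"thought": thought,
                        "action": {"action": action_name, "action_input": action_input},
                        "observation": observation})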
The effectiveness of ReAct heavily relies on designing prompts that guide the LLM to produce output in the desired Thought: ... Action: ... format. This is typically achieved through few-shot prompting, where the prompt includes examples of successful reasoning trajectories.
A typical ReAct prompt structure might look like this:
You are an expert assistant designed to solve complex tasks by reasoning step-by-step and interacting with available tools.
Available Tools:
[
  {
    "name": "Search",
    "description": "Performs a web search to find up-to-date information.",
    "arguments": {"query": "The search query string."}
  },
  {
    "name": "Calculator",
    "description": "Computes mathematical expressions.",
    "arguments": {"expression": "The mathematical expression to evaluate."}
  }
  # ... potentially more tools
]
Follow this format strictly:
Question: The user's input question.
Thought: Your reasoning about the current state, the goal, and what action to take next.
Action: The action to take. Choose one tool from the Available Tools list or use 'Final Answer'. Format as a JSON object: {"action": "ToolName", "action_input": {"arg1": "value1", ...}} or {"action": "Final Answer", "action_input": "Your final answer here."}
--- Previous Steps ---
Thought: Previous thought 1...
Action: {"action": "PreviousAction1", "action_input": {...}}
Observation: Previous observation 1...
Thought: Previous thought 2...
Action: {"action": "PreviousAction2", "action_input": {...}}
Observation: Previous observation 2...
--- Current Step ---
Question: {current_user_question}
Thought:
Key Prompting Considerations:

- Few-Shot Examples: Including one or more complete Thought -> Action -> Observation sequences within the initial prompt significantly improves the LLM's ability to follow the format and reasoning pattern.
- History Management: The --- Previous Steps --- section is crucial. It provides the context of the ongoing reasoning process. As the interaction progresses, this history grows. Strategies for managing context length, like summarizing earlier steps or using sliding windows, become important for long tasks (covered further in Chapter 3).
- Stop Sequences: Specifying a stop sequence (such as Observation:) for the LLM generation can help ensure the model pauses after generating the Action, allowing your code to execute it before prompting for the next step (see the sketch after this list).
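For illustration, a build_prompt helper might interleave prior steps into the history section; the stop=["Observation:"] argument (assuming your LLM client supports stop sequences) makes the model pause after emitting its Action. Names and the client API here are illustrative:

# Sketch: assembling the prompt with history; names and client API are illustrative
import json

def build_prompt(system_prompt, question, history):
    lines = [system_prompt, "--- Previous Steps ---"]
    for step in history:
        lines.append(f"Thought: {step['thought']}")
        lines.append("Action: " + json.dumps(step["action"]))
        lines.append(f"Observation: {step['observation']}")
    lines.append("--- Current Step ---")
    lines.append(f"Question: {question}")
    lines.append("Thought:")  # the model continues from here
    return "\n".join(lines)

llm_output = llm.generate(build_prompt(system_prompt, question, history),
                          stop=["Observation:"])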
Once the LLM generates its response, robust parsing is required to extract the Thought and Action components. Regular expressions or string manipulation techniques are often used. For actions formatted as JSON (as in the example prompt), a JSON parser is employed.
# Simplified Python-like pseudocode for parsing
import json
import re

llm_output = llm.generate(prompt)  # Assume llm_output contains "Thought: ...\nAction: ..."

# Extract the text after "Thought:" (up to "Action:") and the JSON after "Action:"
thought_match = re.search(r"Thought:\s*(.*?)\s*Action:", llm_output, re.DOTALL)
action_match = re.search(r"Action:\s*(\{.*\})", llm_output, re.DOTALL)
thought = thought_match.group(1) if thought_match else ""
action_json_str = action_match.group(1) if action_match else ""

try:
    action_data = json.loads(action_json_str)
    action_name = action_data.get("action")
    action_input = action_data.get("action_input")
except json.JSONDecodeError:
    # Handle cases where the LLM didn't produce valid JSON:
    # potentially ask the LLM to reformat or try a default action
    handle_parsing_error()
Error handling during parsing is important. LLMs might occasionally fail to adhere strictly to the format. Your implementation needs strategies to handle malformed outputs, perhaps by re-prompting with specific error feedback or attempting a default recovery action, as in the sketch below.
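One common recovery strategy is to append the formatting error to the prompt as an observation and let the model retry. This sketch assumes a parse_output helper that raises on malformed output and a small retry budget (both illustrative):

# Sketch: re-prompting with error feedback on malformed output (illustrative)
import json

MAX_FORMAT_RETRIES = 2

for attempt in range(MAX_FORMAT_RETRIES + 1):
    llm_output = llm.generate(prompt, stop=["Observation:"])
    try:
        thought, action_name, action_input = parse_output(llm_output)
        break  # successfully parsed; proceed to execute the action
    except (json.JSONDecodeError, ValueError) as err:
        # Feed the error back so the model can correct its formatting
        prompt += (f"{llm_output}\nObservation: Your Action was not valid JSON ({err}). "
                   "Repeat the step using the required format.\nThought:")
else:
    # Fall back to a default action after exhausting retries
    action_name, action_input = "Final Answer", "Unable to continue: repeated format errors."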
The parsed action_name and action_input determine what happens next:

- Dispatch: Map the action_name to one of the available tools/functions defined in your system.
- Arguments: The action_input contains the necessary arguments for the selected tool.
- Result: The tool's return value (or an error message) becomes the Observation.

# Simplified Python-like pseudocode for execution
available_tools = {
    "Search": search_function,
    "Calculator": calculator_function,
}

if action_name == "Final Answer":
    # Task is complete
    final_response = action_input
    mark_task_done()
elif action_name in available_tools:
    tool_function = available_tools[action_name]
    try:
        # Validate action_input against the tool's expected arguments
        validate_tool_input(action_name, action_input)
        observation = tool_function(**action_input)  # Execute the tool
    except Exception as e:
        # Capture execution errors as observations
        observation = f"Error executing {action_name}: {str(e)}"
else:
    # Handle cases where the LLM hallucinated a tool name
    observation = f"Error: Unknown action '{action_name}'. Please choose from available tools."

# observation is now ready to be added to the history for the next prompt
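The validate_tool_input call above can be as simple as checking the supplied arguments against each tool's declared parameter names. A minimal sketch, with an assumed schema registry:

# Sketch: validating action_input against a tool's expected arguments (illustrative)
TOOL_SCHEMAS = {
    "Search": {"query"},
    "Calculator": {"expression"},
}

def validate_tool_input(action_name, action_input):
    expected = TOOL_SCHEMAS[action_name]
    if not isinstance(action_input, dict):
        raise ValueError(f"{action_name} expects a JSON object of arguments")
    missing = expected - action_input.keys()
    unexpected = action_input.keys() - expected
    if missing or unexpected:
        raise ValueError(
            f"{action_name}: missing {sorted(missing)}, unexpected {sorted(unexpected)}")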
The Observation should be informative. If a search tool returns results, the observation might be the snippets. If a calculator computes a value, that value is the observation. Crucially, if a tool fails, the error message itself becomes the observation, allowing the agent to potentially reason about the failure in its next Thought step (e.g., "Thought: The search failed because the query was malformed. I should try rephrasing the query.").
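Before appending a result to the history, many implementations normalize it into a compact string; long search results, for example, can overwhelm the context window. A sketch, with an assumed length cap:

# Sketch: turning raw tool output into a concise observation string (illustrative)
MAX_OBSERVATION_CHARS = 1000  # assumed cap to keep the prompt from growing too fast

def format_observation(result):
    text = str(result).strip()
    if len(text) > MAX_OBSERVATION_CHARS:
        text = text[:MAX_OBSERVATION_CHARS] + " ...[truncated]"
    return text if text else "The tool returned no output."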
The core loop continues by appending the Thought, Action, and Observation to the history section of the prompt and generating the next step. Termination conditions are essential:

- Final Answer: The LLM signals that the task is complete by outputting {"action": "Final Answer", "action_input": "..."}.
- Step Limit: In practice, a maximum number of iterations is usually also enforced so a confused agent cannot loop indefinitely (see the sketch after this list).
Implementing ReAct requires orchestrating these components: careful prompt design, robust parsing, reliable tool execution, and clear state management through the iterative loop. The next sections explore other architectures, but the principles of structured prompting, parsing, and action execution seen here are fundamental building blocks for many agentic systems. The hands-on practice later in this chapter will involve building a ReAct agent incorporating these elements.