Throughout this chapter, we've explored how prompt engineering allows AI agents to interact with external tools, significantly broadening their capabilities. Now, let's put these principles into practice by guiding an agent to use a web search tool to answer a question requiring up-to-date information. This hands-on exercise will walk you through constructing prompts, simulating an agent's interaction, and understanding how to manage the flow of information.
Our goal is to have an AI agent answer the question: "What is the current market capitalization of ExampleCorp?" This information is dynamic and not typically part of an LLM's static training data, making a web search tool essential.
Imagine we have an AI agent powered by an LLM. We need to equip it with a web_search tool. For the agent to use this tool effectively, we must clearly define it within the prompt.
Tool Specification: The agent needs to understand what the tool does, what input it expects, and what output it will provide. We'll present this information to the agent in a structured way.
You have access to the following tool:
Tool Name: web_search
Description: Use this tool to find current information from the internet, such as stock prices, news, or recent events.
Input: A single string representing the search query.
Output: A string containing a summary of the search results.
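One way to keep such specifications maintainable is to store them as data and render the prompt text from them. Below is a minimal sketch; the dictionary layout and the `render_tool_specs` helper are illustrative choices, not a fixed schema:

```python
# Hypothetical tool registry: each entry holds the fields from the
# specification above. The exact dict layout is an illustrative choice.
TOOLS = {
    "web_search": {
        "description": ("Use this tool to find current information from the "
                        "internet, such as stock prices, news, or recent events."),
        "input": "A single string representing the search query.",
        "output": "A string containing a summary of the search results.",
    }
}

def render_tool_specs(tools):
    """Render the registry into the prompt text format used in this chapter."""
    lines = ["You have access to the following tool:"]
    for name, spec in tools.items():
        lines.append(f"Tool Name: {name}")
        lines.append(f"Description: {spec['description']}")
        lines.append(f"Input: {spec['input']}")
        lines.append(f"Output: {spec['output']}")
    return "\n".join(lines)

prompt_section = render_tool_specs(TOOLS)
```

Adding a second tool then only requires a new registry entry; the prompt text stays in sync automatically.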
Next, we need to instruct the agent on how to request the use of this tool. A common method is to ask the agent to output its reasoning (a "thought") and then a structured request for the tool, often in JSON format. If the agent believes it can answer without a tool, it should provide a final answer directly.
Action Formatting Instructions:
When you need to use a tool, respond with a JSON object with two keys: "thought" and "action".
The "thought" key should contain a string explaining your reasoning for choosing the tool and the query.
The "action" key should be an object containing "tool_name" and "tool_input".
Example of tool usage:
{
"thought": "I need to find the weather in London. I will use the web_search tool.",
"action": {
"tool_name": "web_search",
"tool_input": "weather in London"
}
}
If you have enough information to answer the user's question, respond with a JSON object with two keys: "thought" and "final_answer".
Example of final answer:
{
"thought": "I have found the information and can now answer the question.",
"final_answer": "The answer is..."
}
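Because the agent can reply in either of these two shapes, the system should check which one it received before acting on it. A small sketch of such a check (the function name and return convention are illustrative):

```python
import json

def classify_response(llm_response_str):
    """Return ("action", payload), ("final_answer", payload), or ("error", msg)."""
    try:
        data = json.loads(llm_response_str)
    except json.JSONDecodeError as exc:
        return "error", f"invalid JSON: {exc}"
    if "action" in data:
        action = data["action"]
        # Both keys must be present for the system to dispatch the tool call.
        if not {"tool_name", "tool_input"} <= set(action):
            return "error", "action missing tool_name or tool_input"
        return "action", action
    if "final_answer" in data:
        return "final_answer", data["final_answer"]
    return "error", "response has neither 'action' nor 'final_answer'"

kind, payload = classify_response(
    '{"thought": "Check the weather.", '
    '"action": {"tool_name": "web_search", "tool_input": "weather in London"}}'
)
```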
Now, let's combine these elements with the user's question to form the initial prompt for our agent.
You are a helpful AI assistant. Your task is to answer user questions accurately.
You have access to the following tool:
Tool Name: web_search
Description: Use this tool to find current information from the internet, such as stock prices, news, or recent events.
Input: A single string representing the search query.
Output: A string containing a summary of the search results.
When you need to use a tool, respond with a JSON object with two keys: "thought" and "action".
The "thought" key should contain a string explaining your reasoning for choosing the tool and the query.
The "action" key should be an object containing "tool_name" and "tool_input".
Example of tool usage:
{
"thought": "I need to find the weather in London. I will use the web_search tool.",
"action": {
"tool_name": "web_search",
"tool_input": "weather in London"
}
}
If you have enough information to answer the user's question, respond with a JSON object with two keys: "thought" and "final_answer".
Example of final answer:
{
"thought": "I have found the information and can now answer the question.",
"final_answer": "The answer is..."
}
User Question: What is the current market capitalization of ExampleCorp?
When this prompt is sent to the LLM, we expect it to recognize that the question requires external, current data and decide to use the web_search tool.
Expected LLM Output (Agent's Action):
{
"thought": "The user is asking for the current market capitalization of ExampleCorp. This is financial data that changes frequently, so I should use the web_search tool to get the most up-to-date information.",
"action": {
"tool_name": "web_search",
"tool_input": "current market capitalization of ExampleCorp"
}
}
In a live agentic system, your code would parse this JSON. It would identify web_search as the tool and "current market capitalization of ExampleCorp" as the input. The system would then execute this search (e.g., by calling a search engine API).
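This parse-and-dispatch step can be sketched as a lookup from tool name to a callable. The stub `web_search` function below is a stand-in for a real search API client, and the `dispatch` helper is an illustrative name:

```python
import json

def web_search(query):
    # Stand-in for a real search API call (e.g., an HTTP request to a
    # search provider); returns canned text here for illustration.
    return f"Search results for: {query}"

# Mapping from tool name (as it appears in the prompt) to its implementation.
TOOL_FUNCTIONS = {"web_search": web_search}

def dispatch(llm_response_str):
    """Parse the agent's JSON action and execute the named tool."""
    action = json.loads(llm_response_str)["action"]
    tool = TOOL_FUNCTIONS.get(action["tool_name"])
    if tool is None:
        return f"Error: unknown tool '{action['tool_name']}'"
    return tool(action["tool_input"])

observation = dispatch(json.dumps({
    "thought": "Look up the figure.",
    "action": {"tool_name": "web_search",
               "tool_input": "current market capitalization of ExampleCorp"}
}))
```

Registering tools in a dictionary like this keeps the dispatch logic unchanged as new tools are added.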
Let's assume our web_search tool (or the API it calls) returns the following information:
Simulated Tool Output (Observation):
Observation: According to market data from today, ExampleCorp has a market capitalization of $150 Billion.
This "Observation" is crucial. It's the new piece of information the agent has gained. Now, we need to feed this back to the agent so it can formulate the final answer.
The agent's next prompt will include the original instructions, the history of its previous action, and this new observation.
You are a helpful AI assistant. Your task is to answer user questions accurately.
You have access to the following tool:
Tool Name: web_search
Description: Use this tool to find current information from the internet, such as stock prices, news, or recent events.
Input: A single string representing the search query.
Output: A string containing a summary of the search results.
When you need to use a tool, respond with a JSON object with two keys: "thought" and "action".
The "thought" key should contain a string explaining your reasoning for choosing the tool and the query.
The "action" key should be an object containing "tool_name" and "tool_input".
Example of tool usage:
{
"thought": "I need to find the weather in London. I will use the web_search tool.",
"action": {
"tool_name": "web_search",
"tool_input": "weather in London"
}
}
If you have enough information to answer the user's question, respond with a JSON object with two keys: "thought" and "final_answer".
Example of final answer:
{
"thought": "I have found the information and can now answer the question.",
"final_answer": "The answer is..."
}
User Question: What is the current market capitalization of ExampleCorp?
You previously took the following action:
{
"thought": "The user is asking for the current market capitalization of ExampleCorp. This is financial data that changes frequently, so I should use the web_search tool to get the most up-to-date information.",
"action": {
"tool_name": "web_search",
"tool_input": "current market capitalization of ExampleCorp"
}
}
Observation: According to market data from today, ExampleCorp has a market capitalization of $150 Billion.
Now, provide your response based on this observation.
Expected LLM Output (Agent's Final Answer):
{
"thought": "The web_search tool returned the current market capitalization of ExampleCorp as $150 Billion. I now have the information needed to answer the user's question.",
"final_answer": "The current market capitalization of ExampleCorp is $150 Billion."
}
The system would then parse this JSON and present the final_answer to the user.
To help visualize this process, here's a simplified Python function showing how such an interaction loop might be structured. This is not a complete agent framework but illustrates the flow of prompts and responses.
# Assume an LLM function `call_llm(prompt_text)` exists
# and a `web_search_tool(query)` function exists.
import json

def execute_agent_step(current_prompt):
    # In a real scenario, this calls an LLM API
    # print("Sending to LLM:\n", current_prompt)
    llm_response_str = call_llm(current_prompt)  # Simulated LLM call
    # print("LLM Response:\n", llm_response_str)

    # In a real system, you'd parse the JSON robustly
    try:
        response_json = json.loads(llm_response_str)
    except json.JSONDecodeError:
        print("Error: LLM did not return valid JSON.")
        return {"error": "Invalid JSON response from LLM"}, None

    if "action" in response_json:
        tool_name = response_json["action"].get("tool_name")
        tool_input = response_json["action"].get("tool_input")
        if tool_name == "web_search":
            # Simulate executing the tool
            observation = web_search_tool(tool_input)
            print(f"Tool Executed: {tool_name}, Input: '{tool_input}', Output: '{observation}'")
            return response_json, observation  # Return the agent's action and the observation
        else:
            print(f"Error: Unknown tool '{tool_name}'")
            return response_json, f"Error: Unknown tool '{tool_name}'"
    elif "final_answer" in response_json:
        print("Final Answer:", response_json["final_answer"])
        return response_json, None  # Signifies completion
    else:
        print("Error: LLM response did not contain 'action' or 'final_answer'.")
        return response_json, "Error: LLM response malformed."
# --- Example Usage (Conceptual - replace with actual LLM and tool calls) ---

# Placeholder for the actual LLM call
def call_llm(prompt_text):
    # This is where you would integrate with an actual LLM API (e.g., OpenAI, Anthropic)
    # For this example, we'll simulate responses based on the prompt content.
    if "User Question: What is the current market capitalization of ExampleCorp?" in prompt_text and "Observation:" not in prompt_text:
        return """
{
  "thought": "The user is asking for the current market capitalization of ExampleCorp. This is financial data that changes frequently, so I should use the web_search tool to get the most up-to-date information.",
  "action": {
    "tool_name": "web_search",
    "tool_input": "current market capitalization of ExampleCorp"
  }
}
"""
    elif "Observation: According to market data from today, ExampleCorp has a market capitalization of $150 Billion." in prompt_text:
        return """
{
  "thought": "The web_search tool returned the current market capitalization of ExampleCorp as $150 Billion. I now have the information needed to answer the user's question.",
  "final_answer": "The current market capitalization of ExampleCorp is $150 Billion."
}
"""
    return """{"thought": "I am unsure how to proceed.", "final_answer": "I cannot answer this question."}"""

# Placeholder for the actual web search tool
def web_search_tool(query):
    if query == "current market capitalization of ExampleCorp":
        return "According to market data from today, ExampleCorp has a market capitalization of $150 Billion."
    return "No information found for your query."
# Initial prompt construction (as defined earlier)
initial_prompt = """You are a helpful AI assistant. Your task is to answer user questions accurately.
You have access to the following tool:
Tool Name: web_search
Description: Use this tool to find current information from the internet, such as stock prices, news, or recent events.
Input: A single string representing the search query.
Output: A string containing a summary of the search results.
When you need to use a tool, respond with a JSON object with two keys: "thought" and "action".
The "thought" key should contain a string explaining your reasoning for choosing the tool and the query.
The "action" key should be an object containing "tool_name" and "tool_input".
Example of tool usage:
{
"thought": "I need to find the weather in London. I will use the web_search tool.",
"action": {
"tool_name": "web_search",
"tool_input": "weather in London"
}
}
If you have enough information to answer the user's question, respond with a JSON object with two keys: "thought" and "final_answer".
Example of final answer:
{
"thought": "I have found the information and can now answer the question.",
"final_answer": "The answer is..."
}
User Question: What is the current market capitalization of ExampleCorp?
"""
# First agent step
print("--- Agent Interaction: Step 1 ---")
agent_action_1, observation_1 = execute_agent_step(initial_prompt)

if observation_1 and "final_answer" not in agent_action_1:
    # Construct the next prompt, including the history and observation
    prompt_for_step_2 = f"""{initial_prompt.split('User Question:')[0]}
User Question: What is the current market capitalization of ExampleCorp?
You previously took the following action:
{json.dumps(agent_action_1, indent=2)}
Observation: {observation_1}
Now, provide your response based on this observation.
"""
    print("\n--- Agent Interaction: Step 2 ---")
    agent_action_2, _ = execute_agent_step(prompt_for_step_2)
This snippet demonstrates the cycle: the agent (simulated by call_llm) receives a prompt, decides on an action (or a final answer), the system (your Python code) processes that action (e.g., calls web_search_tool), and then formulates a new prompt with the results for the agent's next step.
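In practice, these two hard-coded steps generalize to a loop that alternates LLM calls and tool executions until a final answer appears, with a step cap as a safety net. A minimal sketch follows; the `run_agent` signature, the toy stand-in functions, and the cap value are all illustrative assumptions:

```python
import json

def run_agent(initial_prompt, call_llm, run_tool, max_steps=5):
    """Alternate LLM calls and tool executions until a final answer or the cap."""
    prompt = initial_prompt
    for _ in range(max_steps):
        response = json.loads(call_llm(prompt))
        if "final_answer" in response:
            return response["final_answer"]
        action = response["action"]
        observation = run_tool(action["tool_name"], action["tool_input"])
        # Append the action and observation so the agent sees its own history.
        prompt += ("\nYou previously took the following action:\n"
                   + json.dumps(response, indent=2)
                   + f"\nObservation: {observation}\n"
                   "Now, provide your response based on this observation.")
    return "Stopped: step limit reached without a final answer."

# Toy stand-ins so the loop can be exercised without a real LLM or search API.
def fake_llm(prompt):
    if "Observation:" in prompt:
        return json.dumps({"thought": "Done.", "final_answer": "$150 Billion"})
    return json.dumps({"thought": "Search.", "action":
                       {"tool_name": "web_search", "tool_input": "ExampleCorp"}})

def fake_tool(name, query):
    return "ExampleCorp market cap is $150 Billion."

answer = run_agent("User Question: market cap of ExampleCorp", fake_llm, fake_tool)
```

The cap matters: without it, a model that never emits a final_answer would keep the loop (and your API bill) running indefinitely.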
While our example worked smoothly, real-world scenarios often require more robust prompting and parsing: handling malformed or non-JSON output from the model, recovering gracefully from tool errors or empty search results, choosing among multiple tools, and capping the number of reasoning steps so the agent cannot loop indefinitely.
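For example, models sometimes wrap their JSON in surrounding prose, and a strict json.loads on the whole reply then fails. A tolerant parser can often recover the object anyway. The sketch below (one common approach, not the only one) extracts the first balanced {...} span; note it is best-effort, since braces inside string values can confuse the depth count:

```python
import json

def extract_json(text):
    """Best-effort extraction of the first JSON object embedded in LLM output."""
    start = text.find("{")
    while start != -1:
        depth = 0
        for i, ch in enumerate(text[start:], start):
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    try:
                        return json.loads(text[start:i + 1])
                    except json.JSONDecodeError:
                        break  # balanced span was not valid JSON; try the next "{"
        start = text.find("{", start + 1)
    return None  # no parseable object found

messy = 'Sure! Here is my answer: {"thought": "t", "final_answer": "42"} Hope that helps.'
parsed = extract_json(messy)
```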
By working through this hands-on example, you've taken a significant step towards building more capable AI agents. Remember that prompt engineering for tool use is an iterative process. Test your prompts, observe the agent's behavior, and refine your instructions to achieve the desired outcomes. As you integrate more tools and tackle more complex tasks, these foundational techniques will serve you well.
© 2025 ApX Machine Learning