Agents gain their power by interacting with the world beyond the LLM's internal knowledge. This interaction happens through Tools. While LangChain provides a suite of pre-built tools for common tasks like web searches or accessing calculators, real-world applications often require bespoke capabilities tailored to specific APIs, databases, or proprietary logic. Creating custom tools is therefore a fundamental skill for building sophisticated, production-ready agents.
A LangChain Tool is essentially a component that encapsulates a specific capability the agent can invoke. It bundles the execution logic with metadata, most importantly a `name` and a `description`, which the agent's LLM uses to decide when and how to use the tool.
BaseTool
At its core, a custom tool in LangChain inherits from the `BaseTool` class. Let's examine the essential components you'll need to define:
- `name` (str): A unique identifier for the tool. This name must be distinct among all tools provided to an agent. It should be descriptive yet concise, often using snake_case (e.g., `weather_reporter`, `database_query_executor`). The agent uses this name internally when deciding to call the tool.
- `description` (str): This is arguably the most significant part of a custom tool. The description tells the agent's LLM what the tool does, what inputs it expects, and what output it produces. Crafting clear, accurate, and informative descriptions is essential for the agent to use the tool effectively. Think of it as writing documentation specifically for the LLM. Poor descriptions lead to incorrect tool usage, or to the agent failing to use the tool when appropriate.
- `_run(self, *args, **kwargs)` (method): Contains the synchronous execution logic for your tool. It receives the input arguments determined by the agent, performs the intended action, and returns the result as a string.
- `_arun(self, *args, **kwargs)` (optional method): If your tool involves I/O-bound operations (like network requests or database queries), implementing the asynchronous `_arun` method is highly recommended for better performance, especially in concurrent applications. This method uses Python's `async`/`await` syntax. If `_arun` is not implemented, LangChain will typically wrap the synchronous `_run` method for asynchronous calls, which might block the event loop.

Here's a basic example of a custom tool using the `BaseTool` class:
```python
import os
from typing import Optional, Type

import requests
from pydantic import BaseModel, Field  # Optional for structured args

from langchain_core.tools import BaseTool


# Define an input schema (optional but recommended)
class WeatherInput(BaseModel):
    location: str = Field(description="The city and state, e.g., San Francisco, CA")


class GetCurrentWeatherTool(BaseTool):
    name: str = "get_current_weather"
    description: str = (
        "Useful for when you need to find out the current weather conditions "
        "in a specific location. Input should be a location string."
    )
    # If using structured input, uncomment the following line:
    # args_schema: Type[BaseModel] = WeatherInput

    # Example: store the API key securely (e.g., in an environment variable).
    # Optional[str] so the class still loads when the variable is unset.
    api_key: Optional[str] = os.environ.get("OPENWEATHERMAP_API_KEY")

    def _run(self, location: str) -> str:
        """Use the tool synchronously."""
        if not self.api_key:
            return "Error: Weather API key not set."
        if not location:
            return "Error: Location must be provided."
        try:
            base_url = "http://api.openweathermap.org/data/2.5/weather"
            params = {"q": location, "appid": self.api_key, "units": "metric"}
            response = requests.get(base_url, params=params, timeout=10)
            response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)
            data = response.json()
            # Extract relevant information
            main_weather = data["weather"][0]["main"]
            description = data["weather"][0]["description"]
            temp = data["main"]["temp"]
            feels_like = data["main"]["feels_like"]
            humidity = data["main"]["humidity"]
            return (
                f"Current weather in {location}: {main_weather} ({description}). "
                f"Temperature: {temp}°C (Feels like: {feels_like}°C). "
                f"Humidity: {humidity}%."
            )
        except requests.exceptions.RequestException as e:
            return f"Error fetching weather data: {e}"
        except KeyError:
            return f"Error: Unexpected response format from weather API for {location}."
        except Exception as e:
            # Catch unexpected errors during processing
            return f"An unexpected error occurred: {e}"

    async def _arun(self, location: str) -> str:
        """Use the tool asynchronously."""
        # For production, use an async HTTP client like aiohttp.
        # Here we offload the synchronous _run to a thread pool so the event
        # loop is not blocked; this keeps the example simple but is NOT a
        # substitute for true async I/O.
        import asyncio

        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self._run, location)


# Example usage (assuming OPENWEATHERMAP_API_KEY is set):
# weather_tool = GetCurrentWeatherTool()
# sync_result = weather_tool.run("London, UK")
# print(sync_result)
# async_result = await weather_tool.arun("Tokyo, JP")
# print(async_result)
```
In this example:

- `GetCurrentWeatherTool.name` is `get_current_weather`.
- The `description` clearly explains the tool's purpose and expected input.
- `_run` implements the logic using the `requests` library to call an external weather API, with basic error handling.
- `_arun` provides an asynchronous interface. It uses `run_in_executor` for simplicity, but a production implementation should use an asynchronous HTTP client like `aiohttp` for true non-blocking I/O.

The @tool Decorator

Manually defining classes that inherit from `BaseTool` can become verbose, especially for simpler tools. LangChain provides a convenient `@tool` decorator that can turn any Python function or coroutine directly into a Tool object. The decorator infers the `name` from the function name and uses the function's docstring as the `description`.
```python
from langchain_core.tools import tool


@tool
def simple_calculator(expression: str) -> str:
    """
    Useful for evaluating simple mathematical expressions involving addition,
    subtraction, multiplication, division, and exponentiation.
    Input MUST be a valid Python numerical expression string.
    Example input: '2 * (3 + 4) / 2**2'
    """
    try:
        # Use a safe evaluation method if possible, or restrict operations.
        # eval() is generally unsafe with arbitrary user input.
        # For production, consider using ast.literal_eval or a dedicated math parser.
        allowed_chars = "0123456789+-*/(). "
        if not all(c in allowed_chars for c in expression):
            # A very basic sanitization check; it rejects all letters,
            # so no names need to be exposed to eval below.
            return "Error: Invalid characters in expression."
        # Using eval here for simplicity, but BEWARE of security risks in production.
        result = eval(expression, {"__builtins__": None}, {})
        return f"The result of '{expression}' is {result}"
    except Exception as e:
        return f"Error evaluating expression '{expression}': {e}"


# The 'simple_calculator' object is now a LangChain Tool
print(f"Tool Name: {simple_calculator.name}")
print(f"Tool Description: {simple_calculator.description}")
# print(simple_calculator.run("5 * (10 - 2)"))
```
The `@tool` decorator automatically handles creating the `BaseTool` subclass structure for you. It examines the function's type hints to determine input arguments. For asynchronous functions (`async def`), it automatically populates the `_arun` method.
While passing simple strings as input works, complex tools often benefit from structured inputs with multiple parameters, type validation, and clear descriptions for each parameter. You can achieve this by defining a Pydantic `BaseModel` and assigning it to the `args_schema` attribute of your tool.

When you provide an `args_schema`, the LLM is instructed to format its input for the tool as a JSON object matching that schema. LangChain handles parsing this JSON and passing the arguments correctly to your `_run` or `_arun` method.
```python
from typing import Optional, Type

from pydantic import BaseModel, Field

from langchain_core.tools import BaseTool


# Define the input schema using Pydantic
class FlightSearchInput(BaseModel):
    departure_city: str = Field(description="The city where the flight departs.")
    arrival_city: str = Field(description="The city where the flight arrives.")
    departure_date: str = Field(description="The desired departure date in YYYY-MM-DD format.")
    max_stops: Optional[int] = Field(None, description="Optional maximum number of stops allowed.")


class FlightSearchTool(BaseTool):
    name: str = "flight_search_engine"
    description: str = (
        "Searches for flight options based on departure city, arrival city, "
        "departure date, and optionally the maximum number of stops. "
        "Returns available flight details."
    )
    args_schema: Type[BaseModel] = FlightSearchInput

    def _run(
        self,
        departure_city: str,
        arrival_city: str,
        departure_date: str,
        max_stops: Optional[int] = None,
    ) -> str:
        """Synchronous execution with structured arguments."""
        # Input validation is partially handled by Pydantic
        print(f"Searching flights from {departure_city} to {arrival_city} on {departure_date}...")
        if max_stops is not None:
            print(f"Constraint: Maximum {max_stops} stops.")
        # --- Placeholder for actual flight search API call ---
        # In a real tool, you would call a flight API here using the arguments.
        # Example dummy response:
        if departure_city.lower() == "london" and arrival_city.lower() == "new york":
            return (
                f"Found flights for {departure_date}: "
                f"Flight BA001 (Direct, $850), Flight UA934 (1 Stop, $720)"
            )
        return f"No flights found for the specified route on {departure_date}."
        # --- End Placeholder ---

    async def _arun(
        self,
        departure_city: str,
        arrival_city: str,
        departure_date: str,
        max_stops: Optional[int] = None,
    ) -> str:
        """Asynchronous execution with structured arguments."""
        # In production, use an async HTTP client (e.g. aiohttp) here
        import asyncio

        print(f"(Async) Searching flights from {departure_city} to {arrival_city} on {departure_date}...")
        if max_stops is not None:
            print(f"(Async) Constraint: Maximum {max_stops} stops.")
        # Simulate async work
        await asyncio.sleep(0.5)
        # --- Placeholder for actual async flight search API call ---
        if departure_city.lower() == "london" and arrival_city.lower() == "new york":
            return (
                f"(Async) Found flights for {departure_date}: "
                f"Flight BA001 (Direct, $850), Flight UA934 (1 Stop, $720)"
            )
        return f"(Async) No flights found for the specified route on {departure_date}."
        # --- End Placeholder ---


# Instantiate the tool
# flight_tool = FlightSearchTool()

# How LangChain might invoke it internally (simplified): structured tool
# input is passed as a dict matching the args_schema.
# result = flight_tool.invoke(
#     {
#         "departure_city": "London",
#         "arrival_city": "New York",
#         "departure_date": "2024-12-25",
#     }
# )
# print(result)
```
Using `args_schema` makes tool interactions more robust and predictable, especially when dealing with multiple parameters or optional arguments. It also helps the LLM understand exactly what information it needs to provide to the tool. The `@tool` decorator can also infer the `args_schema` automatically from type-hinted function arguments, especially if they use Pydantic models.
Two practical points to keep in mind:

- Implement error handling inside your `_run`/`_arun` methods to catch common issues (e.g., API errors, invalid input). Return informative error messages as strings so the agent knows the tool execution failed. More advanced error handling strategies will be discussed later.
- Implement `_arun` using appropriate asynchronous libraries (`aiohttp`, `asyncpg`, etc.) for better performance in concurrent agent setups.

By mastering custom tool development, you unlock the ability to grant LangChain agents specialized skills, allowing them to interact with virtually any system or data source required by your application. Remember that the effectiveness of your agent heavily relies on the quality and clarity of the tools you provide it.
© 2025 ApX Machine Learning