In this hands-on exercise, we'll apply the principles discussed in this chapter to build a practical Python tool: a database query tool. This tool will enable an LLM agent to retrieve information from a relational database, a common requirement for agents needing access to structured data. We'll focus on creating a Python function that connects to an SQLite database, executes a query, and returns the results in an LLM-friendly format. This exercise will touch upon defining tool interfaces, handling inputs and outputs, and interacting with an external data source.
Imagine an LLM agent designed to answer customer questions about products. The product information (names, categories, prices) is stored in a database. Our goal is to create a tool that the agent can use to query this database.
First, we need a simple database to work with. We'll use SQLite, which is built into Python, making setup straightforward. Let's create a small database named product_catalog.db
with a products
table.
Create a Python script (e.g., setup_db.py
) with the following content and run it once to create and populate your database:
import sqlite3
def initialize_database():
"""Creates and populates the product_catalog.db if it doesn't exist."""
db_path = 'product_catalog.db'
conn = sqlite3.connect(db_path)
cursor = conn.cursor()
# Create products table
cursor.execute('''
CREATE TABLE IF NOT EXISTS products (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
category TEXT,
price REAL,
stock_quantity INTEGER
)
''')
# Check if table is empty before inserting sample data
cursor.execute("SELECT COUNT(*) FROM products")
if cursor.fetchone()[0] == 0:
sample_data = [
('Laptop Pro', 'Electronics', 1299.99, 50),
('Wireless Mouse', 'Electronics', 25.50, 200),
('Coffee Maker', 'Appliances', 79.00, 75),
('Office Chair', 'Furniture', 150.75, 30),
('Python Programming Book', 'Books', 45.00, 120)
]
cursor.executemany('''
INSERT INTO products (name, category, price, stock_quantity)
VALUES (?, ?, ?, ?)
''', sample_data)
print("Database initialized and populated with sample data.")
else:
print("Database already exists and contains data.")
conn.commit()
conn.close()
if __name__ == '__main__':
initialize_database()
After running this script, you'll have a product_catalog.db
file in the same directory.
Our tool needs to:
SELECT
statements.product_catalog.db
.Let's implement this tool as a Python function.
import sqlite3
import json
DB_PATH = 'product_catalog.db' # Define database path as a constant
def query_product_database(sql_query: str, parameters: tuple = ()) -> str:
"""
Executes a read-only SQL query (SELECT) against the product_catalog.db database
and returns the results.
Args:
sql_query (str): The SELECT SQL query to execute.
Only SELECT statements are permitted.
parameters (tuple): A tuple of parameters to safely substitute into the query.
Use this for dynamic values in WHERE clauses.
Example: ("Electronics",) for "WHERE category = ?"
Returns:
str: A JSON string representing the query results (list of dictionaries)
or an error message if the query fails or is not allowed.
"""
# Input validation: Ensure it's a SELECT query
if not sql_query.strip().upper().startswith("SELECT"):
return json.dumps({
"error": "Invalid operation. Only SELECT queries are allowed for this tool."
})
try:
conn = sqlite3.connect(DB_PATH)
cursor = conn.cursor()
cursor.execute(sql_query, parameters)
rows = cursor.fetchall()
if not rows:
return json.dumps({
"message": "Query executed successfully. No records found matching your criteria."
})
# Get column names from cursor.description
column_names = [description[0] for description in cursor.description]
# Convert list of tuples to list of dictionaries
results = [dict(zip(column_names, row)) for row in rows]
return json.dumps(results, indent=2) # indent for readability, optional for LLM
except sqlite3.IntegrityError as e: # Example of specific DB error
return json.dumps({
"error": f"Database integrity error: {str(e)}. Check your query parameters and table constraints."
})
except sqlite3.OperationalError as e: # Example for table not found, syntax errors
return json.dumps({
"error": f"Database operational error: {str(e)}. Check your SQL syntax or if the table exists."
})
except sqlite3.Error as e: # Catch other sqlite3 specific errors
return json.dumps({
"error": f"A database error occurred: {str(e)}"
})
except Exception as e: # Catch any other unexpected errors
return json.dumps({
"error": f"An unexpected error occurred during query execution: {str(e)}"
})
finally:
if 'conn' in locals() and conn:
conn.close()
Key aspects of this implementation:
sql_query: str
, parameters: tuple
, -> str
improves code readability and can be used by agent frameworks.SELECT
. This is a simple but important security measure for a tool that executes arbitrary SQL. In more advanced scenarios, you might parse the SQL more deeply or use a query builder.cursor.execute(sql_query, parameters)
is used. This is the standard way to prevent SQL injection vulnerabilities. The LLM would be instructed to provide the query structure and the values separately.try...except
blocks are used to catch various database-related errors and other exceptions, returning a JSON object with an error message. This provides useful feedback to the LLM if something goes wrong.finally
block to ensure it's always released, even if errors occur.Before integrating this tool with an LLM agent, it's good practice to test it directly. You can add the following to your Python script to see it in action:
if __name__ == '__main__':
print("Ensure 'product_catalog.db' exists. Run 'setup_db.py' if not.\n")
# Test Case 1: Successful query for products in 'Electronics'
print("Test Case 1: Products in Electronics category")
query1 = "SELECT name, price FROM products WHERE category = ? ORDER BY price DESC"
params1 = ('Electronics',)
result1 = query_product_database(query1, params1)
print(json.dumps(json.loads(result1), indent=2)) # Pretty print the JSON output
print("-" * 40)
# Test Case 2: Successful query for a specific product
print("Test Case 2: Specific product 'Office Chair'")
query2 = "SELECT name, stock_quantity FROM products WHERE name = ?"
params2 = ('Office Chair',)
result2 = query_product_database(query2, params2)
print(json.dumps(json.loads(result2), indent=2))
print("-" * 40)
# Test Case 3: Query that returns no results
print("Test Case 3: Non-existent category 'Toys'")
query3 = "SELECT name, price FROM products WHERE category = ?"
params3 = ('Toys',)
result3 = query_product_database(query3, params3)
print(json.dumps(json.loads(result3), indent=2))
print("-" * 40)
# Test Case 4: Attempting a non-SELECT query (should be blocked)
print("Test Case 4: Attempting an UPDATE query (security check)")
query4 = "UPDATE products SET price = 10.00 WHERE name = ?"
params4 = ('Wireless Mouse',)
result4 = query_product_database(query4, params4)
print(json.dumps(json.loads(result4), indent=2))
print("-" * 40)
# Test Case 5: Query with a syntax error
print("Test Case 5: Query with SQL syntax error")
query5 = "SELEC name, price FROM products WHERE category = ?" # Deliberate typo in SELECT
params5 = ('Books',)
result5 = query_product_database(query5, params5)
print(json.dumps(json.loads(result5), indent=2))
print("-" * 40)
# Test Case 6: Query a non-existent column
print("Test Case 6: Querying a non-existent column")
query6 = "SELECT non_existent_column FROM products WHERE category = ?"
params6 = ('Books',)
result6 = query_product_database(query6, params6)
print(json.dumps(json.loads(result6), indent=2))
print("-" * 40)
Running these tests will help you verify that the tool behaves as expected in various scenarios, including successful queries, queries with no results, and error conditions.
Once the tool is developed and tested, it can be registered with an LLM agent framework (like LangChain, LlamaIndex, or a custom agent). The agent, when faced with a task requiring product information, would:
query_product_database
is appropriate.sql_query
string and the parameters
tuple. For the example "What electronics do you have under $50?", it might generate:
sql_query
: "SELECT name, price FROM products WHERE category = ? AND price < ?"
parameters
: ('Electronics', 50.0)
query_product_database
function with these arguments.The following diagram illustrates this interaction flow:
Flow of an LLM agent using the
query_product_database
tool to answer a user's question.
This hands-on exercise demonstrates how to build a fundamental type of tool that grants LLM agents access to structured data. By defining clear interfaces, handling inputs and outputs carefully, and ensuring robust error management, you can create powerful and reliable tools that significantly expand the capabilities of your agents. As you build more complex agents, you'll often find yourself creating variations of such database interaction tools tailored to different data sources and requirements.
Was this section helpful?
© 2025 ApX Machine Learning