Let's put the concepts from this chapter into practice by building a functional, albeit simple, question-answering (Q&A) bot. This exercise will solidify your understanding of making API calls, sending prompts, and processing responses programmatically. We'll create a command-line application that takes a user's question, sends it to an LLM via its API, and displays the answer.
Before we begin, ensure you have the following set up:
- The official `openai` Python library, which we use for convenience. Install it using pip:

  `pip install openai`

- An OpenAI API key, made available through the `OPENAI_API_KEY` environment variable. Replace `your_api_key_here` with your actual key:
  - Linux/macOS: `export OPENAI_API_KEY='your_api_key_here'`
  - Windows (Command Prompt): `set OPENAI_API_KEY=your_api_key_here`
  - Windows (PowerShell): `$env:OPENAI_API_KEY='your_api_key_here'`
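If you want to confirm the variable is actually visible to Python before running the bot, a quick check like the following works. This snippet is only a convenience and is not part of the bot itself:

```python
# Optional sanity check: confirm the OPENAI_API_KEY environment variable is visible to Python.
import os

if os.environ.get("OPENAI_API_KEY"):
    print("OPENAI_API_KEY is set.")
else:
    print("OPENAI_API_KEY is missing; set it before running the bot.")
```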
Our bot will follow these steps in a loop:

1. Prompt the user for a question.
2. Send the question to the LLM API, along with a system prompt and request parameters (such as `temperature` or `max_tokens`).
3. Extract the answer from the API response and display it.
4. Repeat until the user types 'quit' or 'exit'.

Let's write the Python code. Create a file named `qa_bot.py`.
```python
import os
import openai
from openai import OpenAI  # Use the updated OpenAI library structure

# --- Configuration ---
# Attempt to load the API key from an environment variable
try:
    client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
except openai.OpenAIError as e:
    print(f"Error initializing OpenAI client: {e}")
    print("Please ensure the OPENAI_API_KEY environment variable is set.")
    exit()

MODEL_NAME = "gpt-3.5-turbo"  # Or choose another suitable model

# --- Core Function ---
def get_llm_response(user_question):
    """
    Sends the user's question to the LLM API and returns the response.
    """
    system_prompt = "You are a helpful assistant. Answer the user's question clearly and concisely."
    try:
        completion = client.chat.completions.create(
            model=MODEL_NAME,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_question}
            ],
            temperature=0.7,  # Adjust for creativity vs. determinism
            max_tokens=150    # Limit the response length
        )
        # Extracting the response text
        # Note: The exact structure might vary slightly based on API version updates
        if completion.choices and len(completion.choices) > 0:
            response_text = completion.choices[0].message.content.strip()
            return response_text
        else:
            return "Error: No response received from the API."
    except openai.APIConnectionError as e:
        return f"API Connection Error: {e}"
    except openai.RateLimitError as e:
        return f"API Rate Limit Exceeded: {e}"
    except openai.AuthenticationError as e:
        return f"API Authentication Error: {e}. Check your API key."
    except openai.APIError as e:
        return f"API Error: {e}"
    except Exception as e:
        # Catch any other unexpected errors
        return f"An unexpected error occurred: {e}"

# --- Main Interaction Loop ---
def main():
    print("Simple Q&A Bot (type 'quit' or 'exit' to stop)")
    print("-" * 30)
    while True:
        user_input = input("You: ")
        if user_input.lower() in ["quit", "exit"]:
            print("Bot: Goodbye!")
            break
        if not user_input:
            continue
        print("Bot: Thinking...")  # Provide feedback while waiting
        llm_answer = get_llm_response(user_input)
        print(f"Bot: {llm_answer}")
        print("-" * 30)

if __name__ == "__main__":
    main()
```
Here is a breakdown of the code:

- Imports: We import `os` to access environment variables and `openai` (along with the `OpenAI` client class) for the API interaction.
- Configuration: We initialize the `OpenAI` client. The API key is fetched from the `OPENAI_API_KEY` environment variable. Basic error handling is included here to catch initialization problems. We also define the `MODEL_NAME`.
- `get_llm_response(user_question)` function:
  - Takes the `user_question` as input.
  - Defines a `system_prompt` to provide context or instructions to the LLM (in this case, setting its role).
  - Uses `client.chat.completions.create` to call the API. This is the standard method for chat-based models like GPT-3.5 Turbo or GPT-4.
  - The `messages` parameter follows the chat format, including roles ("system", "user").
  - `temperature` and `max_tokens` are included as examples of parameters discussed earlier in the chapter. `temperature=0.7` allows for some variability in responses, while `max_tokens=150` limits the length to avoid overly long answers.
  - The response text is extracted from `completion.choices[0].message.content`. The structure of the response object is important here; you might need to inspect the actual response object (`print(completion)`) during development if using different libraries or models. A short inspection sketch follows this list.
  - `try...except` blocks catch specific `openai` exceptions (like connection errors, rate limits, authentication issues) as well as general exceptions, returning informative error messages instead of crashing the bot.
- `main()` function:
  - Runs a `while True` loop for continuous interaction.
  - `input("You: ")` prompts the user and reads their input.
  - Calls `get_llm_response` with the user's input and prints the answer; typing 'quit' or 'exit' ends the loop.
- `if __name__ == "__main__":` This standard Python construct ensures that the `main()` function is called only when the script is executed directly.
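For reference, here is what that inspection might look like with the 1.x `openai` Python client, whose response objects are Pydantic models. These are temporary debugging lines you might drop into `get_llm_response` right after the API call while developing, then remove:

```python
# Temporary debugging aid (assumes the 1.x openai client, whose responses are Pydantic models).
# Place inside get_llm_response, right after the client.chat.completions.create(...) call.
print(completion.model_dump_json(indent=2))     # full response: id, model, usage, choices, ...
print(completion.usage)                         # token counts for this request and response
print(completion.choices[0].finish_reason)      # "stop", or "length" if max_tokens was hit
```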
To run the bot:

1. Save the code above as `qa_bot.py`.
2. Make sure the `OPENAI_API_KEY` environment variable is set in your terminal session.
3. Run `python qa_bot.py`.

A session might look like this:

```text
Simple Q&A Bot (type 'quit' or 'exit' to stop)
------------------------------
You: What is the capital of France?
Bot: Thinking...
Bot: The capital of France is Paris.
------------------------------
You: Explain the concept of temperature in LLM APIs.
Bot: Thinking...
Bot: In LLM APIs, the 'temperature' parameter controls the randomness of the output. A lower temperature (e.g., 0.2) makes the output more focused and deterministic, often choosing the most likely next word. A higher temperature (e.g., 0.8) increases randomness, leading to more creative or diverse responses, but potentially less coherent ones. It essentially adjusts the probability distribution of the next token prediction.
------------------------------
You: quit
Bot: Goodbye!
```

This bot is basic, but it demonstrates the core interaction pattern. Here are some ways you could extend it:

- Conversation history: Modify the `get_llm_response` function to accept and include previous turns of the conversation in the `messages` list. This allows the LLM to have context about what was discussed earlier. (Covered conceptually in later chapters on frameworks.) A minimal sketch appears at the end of this section.
- Parameter experimentation: Try different values for `temperature`, `max_tokens`, `top_p`, etc., to see how they affect the output quality and style for different types of questions.

This hands-on exercise provides a tangible example of how the API interaction concepts covered in this chapter translate into a working application. By successfully building and running this bot, you've taken a significant step towards integrating LLMs into your own software projects.
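To make the conversation-history idea concrete, here is a minimal sketch of one possible approach. It assumes the `client` and `MODEL_NAME` defined earlier; the `history` list and the `ask_with_history` name are illustrative choices, not part of the original `qa_bot.py`. A real version would also keep the error handling from `get_llm_response` and trim or summarize old turns so the prompt stays within the model's context window.

```python
# Illustrative sketch of the conversation-history extension (not part of qa_bot.py).
# Assumes the `client` and MODEL_NAME configured earlier in this section.

history = [
    {"role": "system", "content": "You are a helpful assistant. Answer clearly and concisely."}
]

def ask_with_history(user_question):
    """Send the question together with all previous turns, then record the exchange."""
    history.append({"role": "user", "content": user_question})
    completion = client.chat.completions.create(
        model=MODEL_NAME,
        messages=history,   # previous turns give the model context for follow-up questions
        temperature=0.7,
        max_tokens=150,
    )
    answer = completion.choices[0].message.content.strip()
    history.append({"role": "assistant", "content": answer})
    return answer
```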