Now, let's apply the principles of iterative design and evaluation to a practical scenario. Often, your first attempt at a prompt won't yield the ideal results. The goal of this exercise is to take a suboptimal prompt, analyze its shortcomings based on the concepts covered in this chapter, and systematically refine it to improve the quality, consistency, and structure of the Large Language Model's (LLM) output.
Imagine you have customer feedback emails and need to extract specific pieces of information: the customer's main sentiment (Positive, Negative, Neutral), the product mentioned (if any), and a brief summary of the feedback's core issue or compliment.
Initial Scenario:
You start with a large block of text containing multiple customer emails. Here's a snippet representing one email:
Subject: Loving the new Xylos feature!
Hi team,
Just wanted to say the recent update to the Xylos platform, especially the dashboard customization, is fantastic! It makes my workflow so much smoother. I did notice a small glitch where the date filter sometimes resets unexpectedly, but overall, a huge improvement. Keep up the great work!
Best,
Alex Chen
Suboptimal Prompt (Attempt 1):
Here is some customer feedback:
[Insert Email Text Here]
What is this feedback about?
Typical Output (Attempt 1):
The LLM might respond with something like:
This feedback is about the Xylos platform update. The customer likes the dashboard customization but found a glitch with the date filter. They think it's a big improvement overall.
Analysis of Attempt 1:
While the LLM understood the basic content, the output has several problems for our goal. The sentiment is never labeled explicitly as Positive, Negative, or Neutral; the product name and summary are buried in free-form prose that an application cannot reliably parse; and the phrasing and structure will likely vary from email to email, making consistent downstream processing difficult.
Let's refine the prompt to be more specific about the task and the desired output format. We'll use clear instructions and request a structured format like JSON, which is easier for applications to parse.
Improved Prompt (Attempt 2):
Analyze the following customer feedback email. Extract the main sentiment (Positive, Negative, or Neutral), the specific product mentioned (if any, otherwise use "None"), and a concise summary (1-2 sentences) of the core feedback point. Format the output as a JSON object with keys: "sentiment", "product", and "summary".
Feedback Email:
'''
Subject: Loving the new Xylos feature!
Hi team,
Just wanted to say the recent update to the Xylos platform, especially the dashboard customization, is fantastic! It makes my workflow so much smoother. I did notice a small glitch where the date filter sometimes resets unexpectedly, but overall, a huge improvement. Keep up the great work!
Best,
Alex Chen
'''
Output:
Expected Output (Attempt 2):
{
  "sentiment": "Positive",
  "product": "Xylos",
  "summary": "Customer appreciates the dashboard customization in the Xylos platform update but reported a minor bug with the date filter resetting."
}
Analysis of Attempt 2:
This is significantly better! The sentiment is an explicit label, the product is identified, and the summary is concise. Because the result is a JSON object with predictable keys, an application can consume it directly.
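To make that last point concrete, here is a minimal Python sketch of how an application might parse such a response. The parse_extraction helper and its fallback behavior are illustrative assumptions, not part of any particular SDK:

import json

def parse_extraction(llm_response: str) -> dict:
    """Parse the model's JSON reply, tolerating stray wrapping such as Markdown fences."""
    text = llm_response.strip()
    # Models sometimes wrap JSON in ```json ... ``` fences; strip them if present.
    if text.startswith("```"):
        text = text.strip("`").removeprefix("json").strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to empty fields rather than crashing the pipeline.
        return {"sentiment": None, "product": None, "summary": None}
    # Keep only the keys the prompt asked for.
    return {key: data.get(key) for key in ("sentiment", "product", "summary")}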
What if the feedback was more ambiguous or didn't mention a product? Let's consider feedback like this:
Subject: Problem logging in
I can't seem to access my account this morning. It just keeps spinning. Is there an issue? - Sam
Using Prompt 2 might produce:
{
  "sentiment": "Negative",
  "product": "None",
  "summary": "Customer is unable to log into their account, encountering an indefinite loading issue."
}
This is good, but perhaps we want to ensure the summary always captures the primary problem for negative feedback or the primary highlight for positive feedback. We can refine the instructions slightly.
Further Improved Prompt (Attempt 3):
You are a customer support assistant analyzing feedback. Analyze the following customer feedback email. Determine the main sentiment (classify as strictly "Positive", "Negative", or "Neutral"). Identify the specific product mentioned (use "None" if no specific product is named). Create a concise summary (1-2 sentences) focusing on the core issue if sentiment is Negative/Neutral, or the main compliment if Positive.
Output the result as a JSON object with keys: "sentiment", "product", and "summary".
Feedback Email:
'''
[Insert Email Text Here]
'''
JSON Output:
This version adds a role ("customer support assistant") and slightly refines the instruction for the summary based on the sentiment. This adds robustness, guiding the LLM more precisely for different feedback types.
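In application code, a prompt like this is typically kept as a reusable template so that each incoming email can be slotted in consistently. A short sketch; build_analysis_prompt is a hypothetical helper name, not from any library:

# Attempt 3 prompt as a reusable template with an {email_text} placeholder.
PROMPT_TEMPLATE = """You are a customer support assistant analyzing feedback. \
Analyze the following customer feedback email. Determine the main sentiment \
(classify as strictly "Positive", "Negative", or "Neutral"). Identify the \
specific product mentioned (use "None" if no specific product is named). \
Create a concise summary (1-2 sentences) focusing on the core issue if \
sentiment is Negative/Neutral, or the main compliment if Positive.
Output the result as a JSON object with keys: "sentiment", "product", and "summary".

Feedback Email:
'''
{email_text}
'''
JSON Output:"""

def build_analysis_prompt(email_text: str) -> str:
    """Fill the Attempt 3 template with one email's text."""
    return PROMPT_TEMPLATE.format(email_text=email_text)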
How do we know our prompts are getting better?
You can create a small test suite of diverse emails and run each prompt version against them, comparing the outputs against a manually created "ideal" extraction. This forms the basis for systematic evaluation, which is essential for building reliable LLM applications.
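A minimal sketch of that evaluation loop, reusing the helpers above and assuming a call_llm(prompt) function that returns the model's raw text (any LLM client could fill that role):

# Each test case pairs a raw email with a hand-written "ideal" extraction.
TEST_CASES = [
    {
        "email": "Subject: Loving the new Xylos feature! ...",
        "expected": {"sentiment": "Positive", "product": "Xylos"},
    },
    {
        "email": "Subject: Problem logging in ...",
        "expected": {"sentiment": "Negative", "product": "None"},
    },
]

def evaluate_prompt(call_llm, build_prompt) -> float:
    """Return the fraction of cases where sentiment and product match exactly.
    Summaries vary in wording, so they are better reviewed by eye or scored separately."""
    passed = 0
    for case in TEST_CASES:
        result = parse_extraction(call_llm(build_prompt(case["email"])))
        if all(result.get(k) == v for k, v in case["expected"].items()):
            passed += 1
    return passed / len(TEST_CASES)

Passing each version's prompt builder to this harness gives a simple score for comparing attempts against each other.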
The iterative process of prompt optimization involves generating output, analyzing its weaknesses, refining the prompt based on that analysis, and evaluating the new output.
This hands-on process demonstrates that prompt engineering isn't always about finding a single "magic" prompt immediately. It's often a methodical cycle of crafting, testing, analyzing, and refining to steer the LLM towards generating the precise output your application requires. Keep these principles of iterative refinement and careful evaluation in mind as you build your own prompts.