By George M. on Apr 15, 2025
Prompt engineering isn't just for AI researchers anymore. Whether you're automating tasks with large language models (LLMs), building intelligent assistants, or integrating generative AI into your stack, the quality of your prompts determines the quality of your results. Google's February 2025 whitepaper on Prompt Engineering offers detailed insights grounded in their work with the Gemini model on Vertex AI. Its combination of theoretical depth and actionable clarity sets it apart.
Google's document stands out because it ties prompt strategies to configuration tuning (temperature, top-k, and top-p), distinguishes prompt types (system, role, and contextual), and provides realistic examples across domains such as code generation and sentiment classification. These tips can save engineers hours of trial and error.
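To make the configuration side concrete, here is a minimal sketch of setting those sampling parameters with the google-generativeai Python SDK. The model name, placeholder API key, and parameter values are illustrative assumptions, not recommendations from the whitepaper:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model name
response = model.generate_content(
    'Classify the sentiment of this review: "Great acting but terrible ending."',
    generation_config=genai.GenerationConfig(
        temperature=0.2,  # lower temperature -> more deterministic output
        top_p=0.95,       # nucleus sampling: keep tokens within 95% probability mass
        top_k=40,         # sample only from the 40 most likely tokens
    ),
)
print(response.text)
```

Lower temperature and tighter top-k/top-p suit classification-style tasks; creative tasks generally tolerate looser settings.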
Providing clear examples in your prompt, even just one, dramatically improves the chances that the model outputs what you expect. This approach is especially helpful for structured tasks like data extraction or formatting. It shows the model what pattern to follow without leaving it to guess.
Do:
Parse the following pizza order to JSON:
Example: "Small pizza with cheese and pepperoni" =>
{
"size": "small",
"ingredients": ["cheese", "pepperoni"]
}
Order: "Large pizza with mushroom and olives"
Don't:
Parse this to JSON: "Large pizza with mushroom and olives"
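In code, the one-shot prompt above might be assembled from a template and the model's reply parsed as JSON. A sketch, where the hard-coded response stands in for a real model call:

```python
import json

# Template mirrors the one-shot example; doubled braces are literal JSON braces.
ONE_SHOT_PROMPT = """Parse the following pizza order to JSON:
Example: "Small pizza with cheese and pepperoni" =>
{{
  "size": "small",
  "ingredients": ["cheese", "pepperoni"]
}}
Order: "{order}"
"""

prompt = ONE_SHOT_PROMPT.format(order="Large pizza with mushroom and olives")

# In a real pipeline this would be the model's reply, e.g. model.generate_content(prompt).text
response_text = '{"size": "large", "ingredients": ["mushroom", "olives"]}'
order = json.loads(response_text)
print(order["size"], order["ingredients"])
```

Because the example fixes the output shape, the reply can be fed straight into json.loads without brittle post-processing.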
Overly verbose or indirect prompts confuse the model. Writing like you're issuing a command is better than describing a situation. Simple, clear prompts reduce model hallucination and improve repeatability, especially when integrating prompts into production pipelines.
Do:
Act as a travel guide. Recommend places in New York for toddlers.
Don't:
I'm in New York with two toddlers and want to find fun things to do. Any ideas?
If you leave the output structure open, you risk getting irrelevant, excessive, or inconsistent results. Being specific about output length, style, or structure helps the model focus and makes parsing the result programmatically easier.
Do:
Write a 3-paragraph blog post on the 5 best consoles. Use a conversational tone.
Don't:
Tell me about consoles.
Models respond better to instructions that say what to do than to constraints that say what to avoid. Positive instructions are more intuitive for the model and more likely to yield usable outputs. Over-relying on constraints increases the chance of unclear results or conflicting rules.
Do:
Generate a blog post that only includes company name, release year, and sales numbers.
Don't:
Write a blog post. Don't include video game titles.
Controlling token output matters, whether for cost savings, performance, or format constraints. You can either limit it via model settings or be explicit in the prompt. If you want a summary, say so; if you need a JSON object, define the schema.
Do:
Summarize this article in one tweet-length sentence.
Don't: Leave token length to default if brevity matters.
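If you'd rather enforce the cap in configuration than in wording, the google-generativeai SDK accepts a max_output_tokens setting; the 60-token limit here is an illustrative assumption:

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash")

response = model.generate_content(
    "Summarize the following article in one tweet-length sentence: ...",  # "..." = article text
    generation_config=genai.GenerationConfig(max_output_tokens=60),  # illustrative cap
)
print(response.text)
```

Note that a hard token cap truncates output rather than making the model write concisely, so pairing it with an explicit brevity instruction works best.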
Variables let you generalize your prompts into templates, which is critical when deploying prompts inside code or pipelines. Instead of rewriting prompts for every case, you pass in dynamic values—reducing bugs and improving prompt reusability.
Example:
{city} = "Amsterdam"
Prompt: Give a travel fact about {city}.
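That templating pattern needs nothing beyond Python's built-in string formatting; the city list here is illustrative:

```python
# Reusable prompt template; {city} is filled in per request.
PROMPT_TEMPLATE = "Give a travel fact about {city}."

for city in ["Amsterdam", "Kyoto", "Lagos"]:
    prompt = PROMPT_TEMPLATE.format(city=city)
    print(prompt)  # in a pipeline, send each prompt to the model instead
```

Keeping templates in one place also makes prompts easier to review, test, and version alongside the rest of your code.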
When doing classification tasks with few-shot prompts, mixing up the order of class examples prevents overfitting to one class. The model learns to recognize the structure of the task, rather than blindly favoring whichever class comes first.
Do:
Classify reviews:
"Great acting but terrible ending" => NEUTRAL
"I loved every minute of it" => POSITIVE
"This was painful to watch" => NEGATIVE
The same task can be prompted in multiple ways: as a command, a question, or a statement. Each style nudges the model differently. Testing variations can help you find which one produces more stable, relevant outputs—especially for open-ended tasks.
Try:
Command: "Summarize the article below in one sentence."
Question: "What is the single most important takeaway from the article below?"
Statement: "The article below, summarized in one sentence:"
LLM behavior changes across model versions: prompts that worked on one version might underperform on another. That's why prompt testing should be part of your update workflow. Track changes, run comparisons, and version your prompts just like code.
Action: Regularly test prompts against new model versions and keep a changelog of prompt adjustments.
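A lightweight way to start is a regression script that replays versioned prompts and checks outputs. The sketch below uses a hypothetical run_model stand-in and a deliberately simple substring check, not a full evaluation harness:

```python
# Versioned prompts and their expected behavior; names are illustrative.
PROMPTS = {
    "sentiment_v2": 'Classify reviews:\n"I loved every minute of it" =>',
}
EXPECTED = {
    "sentiment_v2": "POSITIVE",
}

def run_model(prompt: str) -> str:
    # Stand-in for your model client, e.g. model.generate_content(prompt).text
    raise NotImplementedError

def check_prompts() -> None:
    for name, prompt in PROMPTS.items():
        output = run_model(prompt)
        status = "PASS" if EXPECTED[name] in output else "FAIL"
        print(f"{name}: {status}")
```

Run this against each new model version and record failures in your prompt changelog before promoting the upgrade.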
Prompt engineering is not a one-size-fits-all task. Google's tips, drawn from direct experience building with Gemini on Vertex AI, show that small phrasing, structure, or example choice changes can significantly improve LLM outputs. If you're building anything with generative models, adopt these practices early. They'll save time, reduce costs, and enhance output quality, especially as you scale across prompts or integrate with user-facing products.