While Resources provide the raw data context for an application, Prompts define the standard operating procedures for interacting with that data. A Prompt in the Model Context Protocol (MCP) acts as a reusable template that guides the Large Language Model (LLM) to perform specific tasks. By structuring prompts on the server side, you remove the burden of prompt engineering from the user and ensure consistent, high-quality interactions with your exposed tools and resources.
In a typical chat interface, users manually type instructions like "Review this code for errors" or "Summarize the last five database records." This approach relies heavily on the user knowing exactly how to phrase their request to get the best output. MCP changes this dynamic by treating prompts as executable primitives.
When a server exposes a Prompt, it provides a structured definition containing the prompt's name, a description, and a list of arguments. The client (such as Claude Desktop or an IDE) retrieves this list and presents these prompts as available commands, often accessible via UI elements or slash commands. When selected, the client collects the necessary arguments from the user and sends a request to the server. The server then processes these arguments and returns a series of messages, potentially including embedded context from Resources, ready to be sent to the LLM.
This architecture separates the intent (what the user wants to do) from the context construction (how the prompt is assembled).
The communication flow for Prompt discovery and execution demonstrates how the server controls the template logic while the client handles the user interface.
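At the wire level, this flow consists of two JSON-RPC calls: prompts/list for discovery and prompts/get for execution. The sketch below shows the shape of that exchange as Python dictionaries; the method names come from the MCP specification, while the review-code prompt and its language argument are hypothetical:

# Illustrative request/response shapes, written as Python dicts.
# The "review-code" prompt and its "language" argument are hypothetical.

list_request = {"method": "prompts/list", "params": {}}

get_request = {
    "method": "prompts/get",
    "params": {
        "name": "review-code",
        "arguments": {"language": "python"},
    },
}

# The server answers prompts/get with fully assembled messages for the LLM.
get_response = {
    "messages": [
        {
            "role": "user",
            "content": {
                "type": "text",
                "text": "Review the following Python code for errors: ...",
            },
        }
    ]
}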
To implement a prompt, you must define its structure and the logic that populates it. The MCP SDK simplifies this by using decorators to register functions as prompt handlers. The core components of a prompt definition include:
- Name: A unique identifier that clients use to invoke the prompt (for example, review-code).
- Description: A human-readable explanation of what the prompt does, which the client shows to users.
- Arguments: An optional list of parameters the user supplies when invoking the prompt.

Consider a scenario where we want to create a prompt that helps users debug error logs. We need an argument for the error message and, optionally, the log level.
The following Python example demonstrates how to define this using the MCP server instance:
from mcp.server.fastmcp import FastMCP
mcp = FastMCP("LogAssistant")
@mcp.prompt()
def analyze_error(error_message: str, context_level: str = "brief"):
"""
Analyzes an error message and suggests potential fixes.
Args:
error_message: The raw error text from the log.
context_level: Depth of analysis (brief or detailed).
"""
instruction = f"Analyze the following {context_level} error: {error_message}"
return [
{
"role": "user",
"content": {
"type": "text",
"text": instruction
}
}
]
In this implementation, the server automatically inspects the function signature to generate the required JSON schema for the arguments. When the client invokes analyze_error, the function executes and returns a list of message objects.
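To observe this round trip from the client side, you can use the Python SDK's ClientSession to list and invoke the prompt. The following sketch assumes the server code above is saved as log_assistant.py and starts itself with mcp.run() when executed; the launch command is illustrative:

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the server as a subprocess over stdio (assumes log_assistant.py
    # calls mcp.run() when executed directly).
    params = StdioServerParameters(command="python", args=["log_assistant.py"])
    async with stdio_client(params) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()

            # Discovery: fetch the prompt catalog the server advertises.
            listing = await session.list_prompts()
            print([p.name for p in listing.prompts])

            # Execution: supply arguments and receive ready-to-send messages.
            result = await session.get_prompt(
                "analyze_error",
                arguments={"error_message": "KeyError: 'user_id'"},
            )
            for message in result.messages:
                print(message.role, message.content)

asyncio.run(main())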
The return value of a prompt handler is a list of messages. This format mirrors the chat history structure used by most LLM APIs. Each message contains a role (usually user or assistant) and content.
While simple text content is common, MCP prompts derive their utility from the ability to mix text with other content types. The content field can be a simple text string or a more complex object containing image data or embedded resources.
Using the user role allows you to simulate the user typing a perfectly crafted instruction. Alternatively, you can use the assistant role to pre-fill the beginning of the model's response, a technique that effectively guides the LLM to follow a specific output format.
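A minimal sketch of this pre-filling technique follows; the triage_report prompt is hypothetical, but it uses the same message structure as the earlier example. The final assistant message starts a Markdown table, so the model tends to continue by filling it in:

@mcp.prompt()
def triage_report(error_message: str):
    """Request a triage report for an error in a fixed table format."""
    return [
        {
            "role": "user",
            "content": {
                "type": "text",
                "text": f"Triage the following error and complete the table: {error_message}"
            }
        },
        {
            # Pre-filled assistant turn: the model continues from this text.
            "role": "assistant",
            "content": {
                "type": "text",
                "text": "| Severity | Likely Cause | Suggested Fix |\n|---|---|---|\n"
            }
        }
    ]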
One of the most significant capabilities of MCP prompts is the ability to retrieve and embed Resources directly into the prompt context. This bridges the gap between static data (Resources) and active instruction (Prompts).
Instead of asking the user to copy-paste file contents into the chat, a Prompt can accept a file path as an argument, use the internal Resource logic to read that file, and inject its content into the message sent to the LLM.
This integration ensures that the LLM receives the data exactly as the server intends, without truncation or formatting errors introduced by manual copy-pasting.
The internal logic of a Prompt often involves validating arguments and retrieving internal resources before constructing the final message payload.
Here is how you might implement a prompt that reads a specific file resource and asks the LLM to summarize it:
@mcp.prompt()
def summarize_file(filepath: str):
    """Summarize the contents of a specific file."""
    # In a real implementation, you would read the file content here
    # or call an internal function that handles resource reading.
    file_content = read_internal_file(filepath)
    return [
        {
            "role": "user",
            "content": {
                "type": "resource",
                "resource": {
                    "uri": f"file:///{filepath}",
                    "mimeType": "text/plain",
                    "text": file_content
                }
            }
        },
        {
            "role": "user",
            "content": {
                "type": "text",
                "text": "Please provide a concise summary of the file above."
            }
        }
    ]
In this example, the client receives a structured object indicating that a resource is part of the conversation history. The LLM sees the content of the file associated with the URI, followed immediately by the instruction to summarize it.
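Because prompt handlers are ordinary Python functions, they can also validate their arguments before any content is assembled. The sketch below is hypothetical: the ALLOWED_DIR constant and the summarize_allowed_file name are illustrative, and it reuses the read_internal_file placeholder from the previous example:

import os

# Hypothetical allow-list: only files under this directory may be summarized.
ALLOWED_DIR = "/var/data/reports"

@mcp.prompt()
def summarize_allowed_file(filepath: str):
    """Summarize a file, but only if it lives under the allowed directory."""
    full_path = os.path.realpath(os.path.join(ALLOWED_DIR, filepath))
    if not full_path.startswith(ALLOWED_DIR + os.sep):
        # Refuse to build a prompt for files outside the allow-list.
        raise ValueError(f"Access to {filepath} is not permitted.")

    file_content = read_internal_file(full_path)
    return [
        {
            "role": "user",
            "content": {
                "type": "text",
                "text": f"Please provide a concise summary of this file:\n\n{file_content}"
            }
        }
    ]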
When a client connects to your server, it performs a handshake and requests the capabilities of the server. If your server supports prompts, the client will issue a prompts/list request.
The server responds with a JSON structure defining available prompts. It is important to ensure your argument names are descriptive and your docstrings are clear. The MCP SDK uses these docstrings to populate the description fields sent to the client. If an argument is typed as an int or bool in your Python function, the SDK translates this into the appropriate JSON Schema type constraints.
For instance, if you define an argument lines: int, the client knows to enforce numeric input. This type safety reduces the chance of runtime errors when the prompt logic attempts to process the input.
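As an illustration, the hypothetical prompt below declares lines: int with a default value, so clients can present a numeric input and the handler can treat lines as an integer. It again uses the read_internal_file placeholder:

@mcp.prompt()
def inspect_log_tail(filepath: str, lines: int = 50):
    """
    Inspect the last N lines of a log file for anomalies.

    Args:
        filepath: Path to the log file.
        lines: How many lines from the end of the file to include.
    """
    tail = read_internal_file(filepath).splitlines()[-lines:]
    return [
        {
            "role": "user",
            "content": {
                "type": "text",
                "text": "Look for anomalies in these log lines:\n" + "\n".join(tail)
            }
        }
    ]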
When structuring your prompts, consider the following guidelines to maximize reliability and usability:
- Keep names and descriptions specific: the SDK forwards them directly to the client, so clear docstrings become the documentation users see.
- Type your arguments so the generated JSON Schema can catch invalid input before your handler runs.
- Use the system role (if supported by the specific LLM implementation and client) or the user role to clearly demarcate instructions from data.

By adhering to these structures, you ensure that your MCP server acts not just as a passive data store, but as an intelligent partner that guides users toward successful interactions with your data.