This hands-on exercise guides you through defining the scope for a mock LLM red team operation. A well-defined scope establishes clear targets, limitations, and goals, keeps the engagement focused, and ensures that all participants understand its parameters.

## The Scenario: "GenieQuery" - Innovatech Corp's Internal Assistant

Imagine you're part of the newly formed AI red team at Innovatech Corp, a mid-sized technology company. Innovatech has recently deployed "GenieQuery," an LLM-powered internal assistant.

- Purpose: GenieQuery helps employees by answering questions based on a repository of internal company documents, including HR policies, project documentation (past and present), technical guides, and meeting summaries.
- Access: Employees access GenieQuery via a web-based chat interface.
- Underlying Technology: It uses a proprietary LLM fine-tuned by Innovatech's AI team on their internal documents.
The system has an API that the web interface calls.
- Data Sensitivity: The documents GenieQuery can access contain a mix of information, from publicly shareable HR benefits to highly confidential details about unreleased products ("Project Chimera"), internal financial forecasts, and snippets of sensitive employee information that might appear in meeting notes.
- Management's Concerns:
  - Confidentiality Breach: The primary concern is the leakage of sensitive information, especially details about "Project Chimera" or internal financials, to unauthorized employees.
  - Misinformation: GenieQuery providing incorrect or misleading information about HR policies or critical project details.
  - Abuse: Employees attempting to break the system for non-work purposes or to reach information they aren't authorized to see.
  - Reputational Damage (Internal): If the system is easily manipulated or provides harmful outputs, it could erode trust in AI initiatives within the company.

## Your Task: Draft an Initial Scope Document

Your task is to draft an initial scope document for a red team engagement against GenieQuery. This document will serve as a foundational agreement on what will be tested, how, and within what boundaries. Remember the principles discussed in "Setting Objectives and Scope for LLM Red Teaming" and "LLM Vulnerabilities: An Introduction" earlier in this chapter.

## Main Elements for Your Scope Document

Structure your scope document around the following main elements. Think critically about each one in the context of GenieQuery.

### Objectives of the Red Team Engagement

- What are the primary goals of this assessment?
Be specific.
- What questions are you trying to answer for Innovatech management?
- _Example thinking: Given management's concern about "Project Chimera," an important objective might be: "Assess the risk of GenieQuery inadvertently disclosing confidential information related to 'Project Chimera' through targeted prompting techniques."_

### Target System Definition

Clearly define the boundaries of the "GenieQuery" system.

- In-Scope Components: List all parts of the GenieQuery system that ARE part of this engagement.
  - Consider: the web UI, API endpoints, the LLM itself, and any specific databases or document repositories it directly interfaces with for its knowledge.
- Out-of-Scope Components: List specific systems, infrastructure, or areas that ARE NOT part of this engagement.
  - Consider: the general corporate network, employee workstations, physical security of the data center, and the underlying cloud provider's infrastructure (unless specific misconfigurations of Innovatech's services on it are relevant).

### Critical Assets to Protect

Identify the most important assets related to GenieQuery that the red team will try to impact. Consider:

- Confidentiality of specific datasets (e.g., "Project Chimera" documents, PII).
- Integrity of information provided by the LLM (e.g., accuracy of HR policy responses).
- Availability of the GenieQuery service (though disruptive testing is usually limited).
- Reputation of the system and the AI team.

### Threats to Investigate (Attack Vectors)

Based on the LLM vulnerabilities discussed earlier (e.g., prompt injection, jailbreaking, data poisoning, sensitive information extraction), list the types of attacks or threat scenarios that will be explored.

Tailor this to GenieQuery.
For example, poisoning GenieQuery's training data might be out of scope if you're only testing the deployed system, but attempting to influence its behavior through its immediate input (prompt injection) would be in scope.

- Example: "Investigate susceptibility to direct and indirect prompt injection attacks aimed at exfiltrating information about unannounced projects."
- Example: "Test for jailbreaking techniques that bypass safety filters to elicit inappropriate or non-work-related responses."

### Rules of Engagement (Constraints & Limitations)

- Timeframe: Specify a realistic duration for the active testing phase (e.g., "2 weeks, from YYYY-MM-DD to YYYY-MM-DD").
- Allowed Techniques: What methods are permissible? Are there any restrictions? For instance: "No denial-of-service (DoS) attacks that could significantly impact GenieQuery's availability for regular employees." Or: "Social engineering of Innovatech employees is out of scope for this engagement."
- Testing Accounts/Access: Will the red team use standard employee accounts, specially provisioned test accounts, or attempt unauthenticated attacks?
- Incident Handling: If a critical vulnerability is discovered, what is the immediate reporting protocol?
- Data Handling: How will any sensitive data discovered by the red team be handled, stored, and reported?

### Assumptions

List any assumptions made during the scope definition.

- Example: "The red team assumes the provided test environment is a faithful representation of the production GenieQuery system."
- Example: "It is assumed that the core LLM will not be updated during the engagement period."

## Visualizing Scope Boundaries

Everyone involved needs to understand exactly what is in and out of scope.
A simple diagram can often clarify this for all stakeholders.

    digraph GenieQueryScope {
        rankdir=TB;
        graph [fontname="Arial", fontsize=12, bgcolor="transparent"];
        node [shape=box, style="rounded,filled", fontname="Arial", margin="0.2,0.1", color="#495057", fillcolor="#dee2e6"];
        edge [fontname="Arial", fontsize=10, color="#495057"];

        subgraph cluster_in_scope {
            label="In Scope: GenieQuery System";
            style="filled";
            color="#ced4da";   // Light gray border for the subgraph
            bgcolor="#e9ecef"; // Lighter gray background for the subgraph
            node [fillcolor="#a5d8ff"]; // Blue for in-scope components
            "GenieQuery Web UI";
            "GenieQuery API";
            "Proprietary LLM";
            "Internal Document DB";
            "GenieQuery Web UI" -> "GenieQuery API" [label="Calls"];
            "GenieQuery API" -> "Proprietary LLM" [label="Queries"];
            "Proprietary LLM" -> "Internal Document DB" [label="Accesses for RAG"];
        }

        subgraph cluster_out_of_scope {
            label="Out of Scope";
            style="filled";
            color="#ffc9c9";   // Light red border
            bgcolor="#ffe3e3"; // Lighter red background
            node [fillcolor="#ffa8a8"]; // Lighter red for out-of-scope components
            "Employee Laptops";
            "Corporate Network Infrastructure";
            "Physical Data Center";
            "Third-party Software (e.g., OS, Browser)";
        }

        "Innovatech Employee" [shape=oval, fillcolor="#b2f2bb", color="#37b24d"]; // Green for legitimate user
        "Red Team Operator" [shape=oval, fillcolor="#ffd8a8", color="#f76707"];   // Orange for red team
        "Innovatech Employee" -> "GenieQuery Web UI" [label="Interacts via"];
        "Red Team Operator" -> "GenieQuery Web UI" [label="Tests", style=dashed];
        "Red Team Operator" -> "GenieQuery API" [label="Tests", style=dashed];
    }

The diagram shows the main components of the GenieQuery system considered in scope for the red team assessment, such as its web interface, API, the LLM, and the document database it uses. It also delineates elements like employee laptops and general corporate infrastructure as out of scope.
Both regular employees and red team operators interact with the system, typically via its UI or API.

## Putting It All Together: Your Turn

Now, take the scenario details and the elements above and draft your own scope document for the GenieQuery red team engagement. Don't worry about making it perfect; the goal is to practice the thought process. Focus on being clear and specific.

For example, when defining Objectives, you might write:

- Objective 1: Identify and document vulnerabilities in GenieQuery that could lead to the unauthorized disclosure of confidential information pertaining to "Project Chimera."
- Objective 2: Assess GenieQuery's susceptibility to prompt injection attacks aimed at bypassing safety mechanisms or generating responses that violate Innovatech's internal communication policies.
- Objective 3: Determine if GenieQuery can be manipulated to provide verifiably false or misleading information regarding HR policies, and evaluate the potential impact of such misinformation.

Continue this for all sections.

## A Note on Iteration

Remember, a scope document is often a living document, especially in the early stages. It might be drafted, then discussed with stakeholders (like the AI development team, management, and legal/compliance if necessary), and then refined based on feedback or new information gathered during initial, non-intrusive reconnaissance.

## Final Check

Before you consider your mock scope definition complete, review it. Is your defined scope:

- Specific: Are the objectives, targets, and constraints clearly defined?
- Measurable: Can you determine if the objectives have been met?
- Achievable: Is the scope realistic given potential constraints (like time, resources, allowed methods)?
- Relevant: Does the scope address the primary risks and concerns of Innovatech?
- Time-bound: Is there a clear timeframe for the engagement?

This exercise provides a solid foundation for planning LLM red team operations.
As you progress through this course, you'll learn the techniques to execute the activities defined within such a scope.
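
If you want to sanity-check your draft, the scope elements from this exercise can also be captured as structured data, which makes empty sections and out-of-bounds actions easy to spot. The following is a minimal Python sketch under stated assumptions, not a prescribed format: the `ScopeDocument` class, its field names, both helper methods, and the testing dates are hypothetical and chosen only for illustration. The component names come from the scope diagram in this exercise.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class ScopeDocument:
    """Hypothetical container for the scope elements covered in this exercise."""

    objectives: list[str] = field(default_factory=list)
    in_scope: set[str] = field(default_factory=set)
    out_of_scope: set[str] = field(default_factory=set)
    critical_assets: list[str] = field(default_factory=list)
    threats: list[str] = field(default_factory=list)
    prohibited_techniques: set[str] = field(default_factory=set)
    start: Optional[date] = None  # first day of active testing
    end: Optional[date] = None    # last day of active testing
    assumptions: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Return the names of scope elements that are still empty."""
        sections = {
            "objectives": self.objectives,
            "in_scope": self.in_scope,
            "out_of_scope": self.out_of_scope,
            "critical_assets": self.critical_assets,
            "threats": self.threats,
            "timeframe": self.start and self.end,
            "assumptions": self.assumptions,
        }
        return [name for name, value in sections.items() if not value]

    def is_permitted(self, component: str, technique: str, on: date) -> bool:
        """Allow an action only if the component is explicitly in scope, the
        technique is not prohibited, and the date is inside the test window."""
        if self.start is None or self.end is None:
            return False
        return (
            component in self.in_scope
            and technique not in self.prohibited_techniques
            and self.start <= on <= self.end
        )


# Illustrative draft; the dates are placeholders, not part of the scenario.
scope = ScopeDocument(
    objectives=["Assess the risk of GenieQuery disclosing 'Project Chimera' details"],
    in_scope={"GenieQuery Web UI", "GenieQuery API", "Proprietary LLM", "Internal Document DB"},
    out_of_scope={"Employee Laptops", "Corporate Network Infrastructure", "Physical Data Center"},
    threats=["prompt injection", "jailbreaking", "sensitive information extraction"],
    prohibited_techniques={"denial-of-service", "social engineering"},
    start=date(2025, 1, 6),
    end=date(2025, 1, 17),
)

print(scope.missing_sections())  # ['critical_assets', 'assumptions']
print(scope.is_permitted("GenieQuery API", "prompt injection", date(2025, 1, 8)))    # True
print(scope.is_permitted("Employee Laptops", "prompt injection", date(2025, 1, 8)))  # False
```

The check is deliberately allow-list based: anything not named in `in_scope` is rejected, which mirrors the idea that a scope document grants permission explicitly and never by omission.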