Traditional compiler infrastructures often struggle to optimize machine learning workloads because they force an immediate transition from high-level graph representations to low-level instruction sets. This "loss of semantics" occurs when a complex operation, such as a convolution, is lowered directly into a nest of scalar loops. Once this conversion happens, the compiler loses sight of the original intent (that this was a convolution), and optimization becomes a difficult pattern-matching exercise on scalar instructions.

MLIR (Multi-Level Intermediate Representation) solves this by acting not as a single intermediate representation, but as an infrastructure for building intermediate representations. It allows the compiler to retain high-level structure (like tensor dimensions and padding) alongside low-level details (like memory allocation and vector intrinsics) within the same compilation unit.

## The Structure of an Operation

In MLIR, the fundamental unit of execution is the Operation. Unlike LLVM IR, where the set of instructions is fixed by the language specification, MLIR is open: users can define new operations without modifying the core compiler infrastructure.

An operation in MLIR is a generic container. It does not map 1:1 to a dedicated C++ class for every specific instruction. Instead, it uses a lightweight storage mechanism holding the components that define the instruction's behavior.

Every operation consists of these primary components:

- **Operation Name:** A unique string identifier, typically namespaced (e.g., `tosa.matmul` or `scf.for`).
- **Operands:** A list of input values (SSA values) that the operation uses.
- **Results:** A list of output values produced by the operation.
- **Attributes:** Constant data known at compile time, such as integer literals, string names, or tensor shapes.
- **Regions:** Nested blocks of code, allowing operations to contain other operations.
- **Successors:** Links to other blocks for control-flow operations (like branches).

The following diagram illustrates the hierarchy of an MLIR module, showing how operations reside within blocks, which form regions inside other operations or functions.

```dot
digraph MLIR_Hierarchy {
  rankdir=LR;
  compound=true;  // required so lhead can clip the edge at the cluster border
  node [shape=rect, style=filled, fontname="Arial", fontsize=10, margin=0.2];
  edge [color="#adb5bd", penwidth=1.2];

  subgraph cluster_module {
    label="MLIR Module"; style=filled; color="#e9ecef"; fontname="Arial";
    node_func [label="Op: func.func", fillcolor="#4dabf7", fontcolor="white", color="#339af0"];

    subgraph cluster_region {
      label="Region"; style=filled; color="#f8f9fa"; fontcolor="#495057";

      subgraph cluster_block {
        label="Block (Basic Block)"; style=filled; color="#dee2e6";
        node_op1 [label="Op: arith.constant\n(Attribute: value=10)", fillcolor="#69db7c", fontcolor="white", color="#40c057"];
        node_op2 [label="Op: linalg.matmul\n(Operands: %A, %B)", fillcolor="#ff6b6b", fontcolor="white", color="#fa5252"];
        node_op3 [label="Op: func.return\n(Result: %C)", fillcolor="#4dabf7", fontcolor="white", color="#339af0"];

        node_op1 -> node_op2 [label="SSA Value", fontcolor="#868e96", fontsize=9];
        node_op2 -> node_op3 [label="SSA Value", fontcolor="#868e96", fontsize=9];
      }
    }
  }

  node_func -> node_op1 [lhead=cluster_region];
}
```

*Hierarchical structure of an MLIR module. Operations can define regions, which contain blocks of sequential operations, creating a recursive structure suitable for representing loops and functions.*

## Textual Representation

MLIR provides a transparent textual format that mirrors its in-memory structure. Being able to read this IR is essential for debugging compiler passes. Consider the following generic syntax for an operation:

```mlir
%result:2 = "dialect.op_name"(%arg0, %arg1) { attribute = 42 : i32 } : (f32, f32) -> (f32, i1)
```

In this structure:

- `%result:2` indicates that this operation produces two results (referenced later as `%result#0` and `%result#1`).
- `"dialect.op_name"` is the unique identifier.
- `(%arg0, %arg1)` are the operands (inputs).
- `{ ... }` contains a dictionary of attributes (compile-time constants).
- The final segment `(f32, f32) -> (f32, i1)` defines the functional type signature: two float32 inputs yielding a float32 and a 1-bit integer.
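To ground this syntax, here is a minimal, self-contained sketch of a function in textual form. The function name `@scale` and the choice of ops are illustrative, not taken from the discussion above; the snippet is intended to parse with a recent `mlir-opt`, though exact spellings can drift between MLIR versions.

```mlir
func.func @scale(%arg0: tensor<4x4xf32>) -> tensor<4x4xf32> {
  // The splat constant is carried as an attribute: data fixed at compile time.
  %cst = arith.constant dense<2.0> : tensor<4x4xf32>
  // %arg0 and %cst are operands; %0 is the single SSA result.
  %0 = arith.mulf %arg0, %cst : tensor<4x4xf32>
  return %0 : tensor<4x4xf32>
}
```

Note how this pretty-printed form drops the quotes and explicit type lists of the generic syntax; both spellings describe the same in-memory operation.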
## The Dialect Ecosystem

The power of MLIR lies in Dialects. A dialect is a logical grouping of operations, types, and attributes under a unique namespace. If traditional compilers are like a workshop with a fixed set of tools, MLIR is a factory floor where you can bring in specialized machinery for specific tasks.

Dialects allow different levels of abstraction to coexist. In a single MLIR file, you might see:

- `tf` (TensorFlow): High-level graph nodes like `tf.Softmax`.
- `tosa` (Tensor Operator Set Architecture): Standardized tensor algebra suitable for hardware inference.
- `scf` (Structured Control Flow): Loops (`for`, `while`) and conditionals (`if`) that preserve structure, unlike low-level "goto" branches.
- `arith` (Arithmetic): Basic scalar math like integer addition or floating-point multiplication.
- `llvm`: Instructions that map 1:1 to the LLVM IR backend.

This modularity enables progressive lowering. Instead of translating a high-level graph directly to machine code, the compiler lowers it step by step. A `tf.MatMul` might first be lowered to a `linalg.matmul` (a structured loop representation), then to loops using `scf.for` and `arith` operations, and finally to the `llvm` dialect for code generation. One such step is sketched below.
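The following sketch approximates a single lowering step on buffers: the structured op (shown in the comment) is rewritten into explicit loops, roughly what a pass such as `-convert-linalg-to-loops` emits, though the exact output varies across MLIR versions. The function name `@matmul_lowered` is a hypothetical choice for illustration.

```mlir
func.func @matmul_lowered(%A: memref<4x4xf32>, %B: memref<4x4xf32>,
                          %C: memref<4x4xf32>) {
  // Before lowering, the body held one structured op that still
  // "knew" it was a matmul:
  //   linalg.matmul ins(%A, %B : memref<4x4xf32>, memref<4x4xf32>)
  //                 outs(%C : memref<4x4xf32>)
  // After lowering to loops, the structure is explicit but the intent is gone:
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c4 = arith.constant 4 : index
  scf.for %i = %c0 to %c4 step %c1 {
    scf.for %j = %c0 to %c4 step %c1 {
      scf.for %k = %c0 to %c4 step %c1 {
        %a   = memref.load %A[%i, %k] : memref<4x4xf32>
        %b   = memref.load %B[%k, %j] : memref<4x4xf32>
        %acc = memref.load %C[%i, %j] : memref<4x4xf32>
        %mul = arith.mulf %a, %b : f32
        %sum = arith.addf %acc, %mul : f32
        memref.store %sum, %C[%i, %j] : memref<4x4xf32>
      }
    }
  }
  return
}
```

Recovering "this is a matmul" from the loop nest is exactly the pattern-matching problem described in the introduction; keeping the structured form alive as long as possible is what a multi-level IR buys you.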
## Traits and Interfaces

Since MLIR operations are generic, the compiler needs a way to reason about their behavior without knowing exactly what every specific operation does. This is handled through Traits and Interfaces.

Traits describe inherent properties of an operation. For example, if you define a custom matrix multiplication operation, you might attach the `NoSideEffect` trait. This tells the compiler that the operation is pure: it relies only on its inputs and produces outputs without modifying global memory or performing IO. Consequently, the Dead Code Elimination (DCE) pass can automatically remove the operation if its results are unused, even though the DCE pass knows nothing about matrix multiplication specifically.

Interfaces provide a way to query or modify operations generically:

- `InferTypeOpInterface`: Allows an operation to compute its output shapes based on input shapes (shape inference).
- `LoopLikeOpInterface`: Allows optimization passes to identify an operation as a loop (whether it is an `scf.for` or a custom `my_accelerator.repeat`) and apply loop-invariant code motion.

## Type System and Attributes

The type system in MLIR is as extensible as the operation set. While it includes standard types like `i32` (32-bit integer) and `f16` (half-precision float), dialects can define complex domain-specific types.

For machine learning, the `RankedTensorType` is ubiquitous. It represents a tensor with a known rank and element type, denoted as `tensor<4x4xf32>`. MLIR also supports dynamic dimensions, represented by `?`. A type `tensor<?x128xf32>` implies a batch size that is unknown at compile time but a feature dimension that is fixed at 128.

Attributes differ from operands in that they are static. They are heavily used in ML hardware compilers to store configuration data. For example, a convolution operation requires padding, strides, and dilation rates. These are stored as attributes because they must be known during compilation for passes to perform layout transformation or memory tiling effectively.

## Verifiers and Invariants

When defining a dialect, correctness is enforced through Verifiers. A verifier is a C++ hook associated with an operation that checks invariants.

For instance, a Reshape operation might require that the total number of elements in the input tensor equals the total number of elements in the output tensor. If an optimization pass accidentally violates this rule (e.g., by constant-folding a shape incorrectly), the verifier raises an error immediately. This fast-fail mechanism is critical when building complex lowering pipelines, as it catches invalid IR states before they propagate to the backend, where debugging becomes notoriously difficult.

The following diagram visualizes the interaction between the generic operation storage and the specialized dialect logic that defines traits and verifiers.

```dot
digraph MLIR_Architecture {
  rankdir=LR;
  node [shape=rect, style=filled, fontname="Arial", fontsize=10, margin=0.2];
  edge [color="#adb5bd", penwidth=1.2];

  subgraph cluster_storage {
    label="Generic Storage (In-Memory)"; style=filled; color="#e9ecef";
    node_op_state [label="OperationState\n(Generic C++ Object)", fillcolor="#ced4da", fontcolor="#212529"];
  }

  subgraph cluster_dialect {
    label="Dialect Definition (ODS)"; style=filled; color="#e9ecef";
    node_definition [label="Op Definition\n(TableGen)", fillcolor="#9775fa", fontcolor="white", color="#845ef7"];
    node_traits [label="Traits\n(Commutative, NoSideEffect)", fillcolor="#74c0fc", fontcolor="white", color="#4dabf7"];
    node_verifier [label="Verifier Logic\n(C++ Check)", fillcolor="#ff8787", fontcolor="white", color="#fa5252"];
  }

  node_definition -> node_op_state [label="Defines Structure", style=dashed, fontcolor="#868e96", fontsize=9];
  node_traits -> node_op_state [label="Attaches Properties", fontcolor="#868e96", fontsize=9];
  node_verifier -> node_op_state [label="Validates", fontcolor="#868e96", fontsize=9];
}
```

*Interaction between the generic operation storage and dialect definitions. TableGen definitions and traits provide the semantic meaning for the raw underlying storage.*

By decoupling the storage representation from the semantic definition, MLIR allows the compiler to manipulate operations efficiently. Passes can iterate over operations, check for specific traits (like "is this commutative?"), and perform transformations without needing to cast every operation to a specific C++ class. This architecture is what enables MLIR to scale across diverse hardware targets.
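As a closing illustration of the verifier mechanism described above, here is a hedged sketch of IR that a reshape verifier should reject. It assumes the upstream `tensor.reshape` op, whose documented invariant is that statically shaped source and result types contain the same number of elements; the function name `@bad_reshape` is a hypothetical choice, and exact diagnostic wording varies by MLIR version.

```mlir
func.func @bad_reshape(%t: tensor<4x4xf32>,
                       %shape: tensor<2xindex>) -> tensor<3x5xf32> {
  // 4 x 4 = 16 elements in, but 3 x 5 = 15 elements out: the verifier
  // is expected to flag this op as soon as the module is verified,
  // e.g. when the file is run through mlir-opt.
  %r = tensor.reshape %t(%shape)
      : (tensor<4x4xf32>, tensor<2xindex>) -> tensor<3x5xf32>
  return %r : tensor<3x5xf32>
}
```

Because the pass manager can re-verify the module after every pass, a bug such as an incorrectly constant-folded shape surfaces at the pass that introduced it rather than deep in the backend.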