Legacy MPP systems operated on a shared-nothing architecture in which storage capacity and computational power were inextricably linked. Adding storage to accommodate historical logs in those environments required adding compute nodes, even if CPU utilization remained near zero. Conversely, increasing processing power for complex transformations meant provisioning unnecessary storage capacity. Modern data warehousing platforms fundamentally alter this dynamic by physically and logically separating the compute layer from the storage layer.

This architectural shift relies on the elasticity of cloud object storage (such as Amazon S3, Google Cloud Storage, or Azure Blob Storage), which serves as the persistent repository, while the ephemeral compute clusters that process the data can be spun up, resized, or suspended independently.

## Architectural Components of Disaggregated Systems

In a decoupled architecture, the database engine is split into three distinct layers. Understanding the interaction between these layers is essential for optimizing query performance and cost.

- **Storage Layer:** The persistent backing store. Data is typically stored in proprietary, optimized columnar formats (such as micro-partitions in Snowflake or Capacitor files in BigQuery) within cloud object storage. This layer handles data durability and availability.
- **Compute Layer:** Clusters of virtual machines (often called "virtual warehouses" or "slots"). These nodes possess local SSDs and memory but do not hold the source of truth; they are stateless workers that cache data during query execution.
- **Cloud Services / Metadata Layer:** The control plane. It manages transaction logs, metadata, optimization, and security, and maps logical SQL tables to the physical files in the storage layer.

The following diagram illustrates how multiple independent compute clusters interact with a single shared storage layer, mediated by a global metadata service.

```dot
digraph G {
    rankdir=TB;
    node [shape=box style="filled" fontname="Helvetica" fontsize=10 penwidth=0];
    edge [color="#adb5bd" penwidth=1.2];
    splines=ortho;
    bgcolor="transparent";

    subgraph cluster_services {
        label="Global Services Layer (Metadata & Optimization)";
        style=filled; color="#f8f9fa"; fontcolor="#495057";
        Metadata [label="Metadata Manager\n(Transaction Logs)" fillcolor="#748ffc" fontcolor="white"];
        Optimizer [label="Query Optimizer" fillcolor="#748ffc" fontcolor="white"];
    }

    subgraph cluster_compute {
        label="Compute Layer (Stateless Resources)";
        style=filled; color="#f8f9fa"; fontcolor="#495057";

        subgraph cluster_vw1 {
            label="ETL Cluster (Large)";
            style=filled; color="#e9ecef";
            Node1 [label="Node" fillcolor="#4dabf7" fontcolor="white"];
            Node2 [label="Node" fillcolor="#4dabf7" fontcolor="white"];
        }

        subgraph cluster_vw2 {
            label="BI Dashboard Cluster (Small)";
            style=filled; color="#e9ecef";
            Node3 [label="Node" fillcolor="#63e6be" fontcolor="white"];
        }
    }

    subgraph cluster_storage {
        label="Storage Layer (Object Store)";
        style=filled; color="#f8f9fa"; fontcolor="#495057";
        S3 [label="Remote Object Storage\n(S3 / GCS / Azure Blob)" fillcolor="#adb5bd" fontcolor="white" shape=cylinder];
    }

    Metadata -> Node1 [style=dashed label="Plan"];
    Metadata -> Node3 [style=dashed];
    Node1 -> S3 [dir=both label="I/O"];
    Node2 -> S3 [dir=both];
    Node3 -> S3 [dir=both label="Read-Only"];
}
```

*Interaction between the Global Services Layer, distinct Compute Clusters, and the shared Storage Layer.*
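To ground these three layers, the sketch below models the control plane's core job: resolving a logical table name to the set of immutable physical files a compute cluster must fetch. This is a minimal, purely illustrative sketch; the `Catalog` class, its method names, and the file paths are invented for this example and do not correspond to any vendor's actual API.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class DataFile:
    """An immutable columnar file in object storage (e.g., a micro-partition)."""
    path: str      # hypothetical object-store key
    version: int   # transaction version that created this file

@dataclass
class Catalog:
    """Toy control plane: maps logical tables to physical files.

    Real metadata services also handle transactions, statistics, and
    security; this sketch shows only the logical-to-physical mapping.
    """
    tables: dict = field(default_factory=dict)  # table name -> list[DataFile]

    def commit(self, table: str, new_files: list) -> None:
        # Writers append new immutable files; nothing is modified in place.
        self.tables.setdefault(table, []).extend(new_files)

    def plan_scan(self, table: str) -> list:
        # A compute cluster asks the control plane which files to read,
        # then fetches them from object storage (caching them locally).
        return [f.path for f in self.tables.get(table, [])]

catalog = Catalog()
catalog.commit("sales", [DataFile("s3://wh/sales/part-0001.dat", version=1)])
print(catalog.plan_scan("sales"))  # ['s3://wh/sales/part-0001.dat']
```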
## The Latency Challenge and Local Caching

The primary engineering challenge in decoupled architectures is network latency. Reading data from object storage over the network is significantly slower than reading from a local disk in a traditional coupled system. To mitigate this, platforms employ aggressive caching strategies.

When a query executes, the compute nodes fetch the required micro-partitions from the remote storage layer. These files are then cached on each compute node's local SSD (often referred to as the "SSD cache" or "disk cache"). Subsequent queries accessing the same data can read directly from the local SSD, bypassing the network hop.

Performance in this model is a function of the cache hit rate. The effective bandwidth $B_{eff}$ for a query can be approximated as:

$$ B_{eff} = (h \cdot B_{local}) + ((1 - h) \cdot B_{network}) $$

Where:

- $h$ is the cache hit rate ($0 \le h \le 1$).
- $B_{local}$ is the I/O bandwidth of the local SSD.
- $B_{network}$ is the throughput from the object store.

Since $B_{local} \gg B_{network}$, maintaining a high cache hit rate is critical for performance. For illustration, with $h = 0.9$, $B_{local} = 2\,\text{GB/s}$, and $B_{network} = 200\,\text{MB/s}$, the effective bandwidth is roughly $1.82\,\text{GB/s}$, about nine times faster than a fully cold read. This implies that while compute is technically stateless, there is a "warming" period during which a cold cluster must download data before it reaches peak performance.

## Multi-Cluster Concurrency and Isolation

A significant advantage of separating compute from storage is the elimination of resource contention between different workloads. In a coupled system, a heavy ETL job loading terabytes of data competes for the same CPU and I/O resources as a CEO's executive dashboard.

In a decoupled system, you can spin up a dedicated "ETL Cluster" and a separate "Reporting Cluster." Both clusters reference the same underlying storage files. The metadata layer ensures that the Reporting Cluster sees a consistent snapshot of the data, even while the ETL Cluster is writing to it. This relies on Multi-Version Concurrency Control (MVCC), in which write operations create new immutable files rather than modifying existing ones.

## Dynamic Scaling and Cost Implications

Decoupling allows for distinct scaling behaviors. Storage costs grow linearly with data volume, while compute costs fluctuate with query complexity and user concurrency. This elasticity prevents over-provisioning.

Consider a scenario where data volume remains constant, but query load spikes during business hours. A decoupled system allows the compute layer to expand horizontally (adding more clusters) or vertically (resizing clusters) without data redistribution.

The chart below demonstrates the efficiency of decoupled resource utilization compared to a traditional coupled approach over a 24-hour cycle.

```json
{
  "layout": {
    "title": "Resource Utilization: Coupled vs. Decoupled",
    "xaxis": {"title": "Time of Day (24h)", "showgrid": false},
    "yaxis": {"title": "Resource Units (Cost)", "showgrid": true, "gridcolor": "#e9ecef"},
    "plot_bgcolor": "white",
    "width": 700,
    "height": 400,
    "legend": {"x": 0.05, "y": 1}
  },
  "data": [
    {
      "type": "scatter", "mode": "lines", "name": "Coupled (Fixed Capacity)",
      "x": [0, 4, 8, 12, 16, 20, 24],
      "y": [80, 80, 80, 80, 80, 80, 80],
      "line": {"color": "#adb5bd", "width": 3, "dash": "dot"}
    },
    {
      "type": "scatter", "mode": "lines", "name": "Decoupled Compute Usage",
      "x": [0, 2, 4, 6, 8, 9, 12, 14, 17, 19, 22, 24],
      "y": [5, 5, 10, 20, 85, 95, 70, 65, 90, 40, 10, 5],
      "line": {"color": "#339af0", "shape": "spline", "width": 3},
      "fill": "tozeroy",
      "fillcolor": "rgba(51, 154, 240, 0.1)"
    },
    {
      "type": "scatter", "mode": "lines", "name": "Decoupled Storage Usage",
      "x": [0, 24],
      "y": [15, 16],
      "line": {"color": "#fa5252", "width": 3}
    }
  ]
}
```

*Comparison of resource provisioning. The coupled model requires provisioning for peak load at all times, whereas the decoupled model allows compute to align with actual demand.*
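As a rough check on the chart's numbers, the snippet below integrates the decoupled demand curve with the trapezoidal rule and compares it against the coupled model's fixed provisioning (80 units around the clock, as plotted). Per-unit pricing is left abstract; only relative unit-hours are compared, and the result is approximate since the chart renders the demand as a smoothed spline.

```python
# Demand profile sampled from the chart above: (hour, resource units).
hours  = [0, 2, 4, 6, 8, 9, 12, 14, 17, 19, 22, 24]
demand = [5, 5, 10, 20, 85, 95, 70, 65, 90, 40, 10, 5]

def trapezoid(xs, ys):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for x1, x2, y1, y2 in zip(xs, xs[1:], ys, ys[1:]))

# Coupled: capacity is fixed at the provisioned level for all 24 hours.
coupled_unit_hours = 80 * 24

# Decoupled: compute cost tracks the actual demand curve.
decoupled_unit_hours = trapezoid(hours, demand)

print(f"Coupled:   {coupled_unit_hours} unit-hours")
print(f"Decoupled: {decoupled_unit_hours:.0f} unit-hours")
print(f"Reduction: {1 - decoupled_unit_hours / coupled_unit_hours:.0%}")
```

Under these assumptions, the decoupled profile consumes roughly 1,085 unit-hours against 1,920 for the fixed fleet, a reduction of about 43 percent (ignoring the chart's modest storage line, which grows with data volume rather than load in either model).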
## Impact on Data Governance

While separation offers flexibility, it introduces complexity in data governance. Because the storage layer is accessible to any authorized compute cluster, ensuring consistent access controls is the responsibility of the centralized metadata layer. Security policies (such as Row-Level Security) must be enforced by the query engine at runtime, not by the storage layer itself.

Furthermore, the "time-travel" capabilities inherent in this architecture, enabled by immutable storage blocks, allow engineers to query data as it existed at a previous point in time. This is achieved by simply referencing the metadata pointers to older, non-deleted storage files, providing a safety net against accidental data corruption without requiring traditional backup restores.
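As a closing illustration, the sketch below shows how a single metadata lookup over immutable files can serve both MVCC snapshot isolation and time travel. The `FileEntry` record, the transaction ids, and the paths are invented for this example; real platforms track considerably more state.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class FileEntry:
    """One immutable data file plus the transaction interval it is visible in."""
    path: str
    added_at: int                # transaction id that created the file
    removed_at: Optional[int]    # transaction id that logically deleted it (None = live)

# The metadata layer's log for one table: files are only ever added or
# marked removed; the bytes in object storage are never rewritten.
log = [
    FileEntry("s3://wh/sales/part-0001.dat", added_at=1, removed_at=3),
    FileEntry("s3://wh/sales/part-0002.dat", added_at=2, removed_at=None),
    FileEntry("s3://wh/sales/part-0003.dat", added_at=3, removed_at=None),  # rewrote part-0001
]

def snapshot(log, as_of: int) -> list:
    """Files visible at transaction `as_of`. The same lookup serves a
    concurrent reader (MVCC) and a time-travel query over older data."""
    return [f.path for f in log
            if f.added_at <= as_of and (f.removed_at is None or f.removed_at > as_of)]

print(snapshot(log, as_of=2))  # state before the rewrite: part-0001, part-0002
print(snapshot(log, as_of=3))  # current state: part-0002, part-0003
```

Because nothing is deleted physically until a retention window expires, "restoring" corrupted data amounts to re-pointing the metadata at an earlier snapshot rather than replaying a backup.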