Legacy MPP systems operated on a shared-nothing architecture where storage capacity and computational power were inextricably linked. Adding storage to accommodate historical logs in those environments required adding compute nodes, even when CPU utilization remained near zero. Conversely, increasing processing power for complex transformations necessitated provisioning unnecessary storage capacity. Modern data warehousing platforms fundamentally alter this dynamic by physically and logically separating the compute layer from the storage layer.
This architectural shift relies on the elasticity of cloud object storage (such as Amazon S3, Google Cloud Storage, or Azure Blob Storage), which serves as the persistent repository, while the ephemeral compute clusters that process the data can be spun up, resized, or suspended independently.
In a decoupled architecture, the database engine is split into three distinct layers. Understanding the interaction between these layers is essential for optimizing query performance and cost.
The following diagram illustrates how multiple independent compute clusters interact with a single shared storage layer, mediated by a global metadata service.
Diagram: Interaction between the Global Services Layer, distinct Compute Clusters, and the shared Storage Layer.
The primary engineering challenge in decoupled architectures is network latency. Reading data from object storage over the network is significantly slower than reading from a local disk in a traditional coupled system. To mitigate this, platforms utilize aggressive caching strategies.
When a query executes, the compute nodes fetch the required micro-partitions from the remote storage layer. These files are then cached on the compute node's local SSD (often referred to as the "SSD cache" or "disk cache"). Subsequent queries accessing the same data can read directly from the local SSD, bypassing the network hop.
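As a minimal sketch of this behavior, the following Python class models an LRU-managed local cache sitting in front of remote object storage. All names here (MicroPartitionCache, fetch_from_object_store) are hypothetical and do not correspond to any vendor's API; real platforms manage this cache inside the compute node's execution engine.

```python
import collections

class MicroPartitionCache:
    """Illustrative LRU cache of remote micro-partitions on local SSD."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # OrderedDict preserves access order, which drives LRU eviction.
        self.entries = collections.OrderedDict()  # partition_id -> bytes

    def read(self, partition_id, fetch_from_object_store):
        if partition_id in self.entries:
            # Cache hit: serve from local SSD and refresh LRU position.
            self.entries.move_to_end(partition_id)
            return self.entries[partition_id]
        # Cache miss: pay the network hop to remote object storage.
        data = fetch_from_object_store(partition_id)
        self._evict_until_fits(len(data))
        self.entries[partition_id] = data
        self.used_bytes += len(data)
        return data

    def _evict_until_fits(self, incoming_bytes):
        # Evict least-recently-used partitions until the new one fits.
        while self.used_bytes + incoming_bytes > self.capacity_bytes and self.entries:
            _, evicted = self.entries.popitem(last=False)
            self.used_bytes -= len(evicted)
```

A first query pays the remote fetch cost for every partition it touches; a second query over the same partitions is served entirely from the local entries, which is exactly the warming effect discussed below.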
Performance in this model is a function of the cache hit rate. The effective bandwidth for a query can be approximated as:

$$B_{\text{eff}} = h \cdot B_{\text{SSD}} + (1 - h) \cdot B_{\text{remote}}$$

Where:

- $h$ is the cache hit rate: the fraction of requested data served from the local SSD cache,
- $B_{\text{SSD}}$ is the read bandwidth of the local SSD cache,
- $B_{\text{remote}}$ is the effective bandwidth for reads from remote object storage.

Since $B_{\text{SSD}} \gg B_{\text{remote}}$, maintaining a high cache hit rate is critical for performance. This implies that while compute is technically stateless, there is a "warming" period where a cold cluster must download data before it reaches peak performance.
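To make the trade-off concrete, here is a small Python calculation of $B_{\text{eff}}$ under assumed bandwidth figures; the 2.0 GB/s SSD and 0.1 GB/s remote numbers are illustrative assumptions, not vendor measurements.

```python
def effective_bandwidth(hit_rate, ssd_gbps=2.0, remote_gbps=0.1):
    """B_eff = h * B_ssd + (1 - h) * B_remote.
    Bandwidth figures are illustrative assumptions."""
    return hit_rate * ssd_gbps + (1 - hit_rate) * remote_gbps

for h in (0.0, 0.5, 0.9, 0.99):
    print(f"hit rate {h:.2f} -> {effective_bandwidth(h):.3f} GB/s")
```

With these numbers, moving from a cold cache ($h = 0$) to a 99% hit rate improves effective bandwidth by roughly 20x, which is why warm clusters so strongly outperform freshly resumed ones.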
A significant advantage of separating compute from storage is the elimination of resource contention between different workloads. In a coupled system, a heavy ETL job loading terabytes of data competes for the same CPU and I/O resources as a CEO's executive dashboard.
In a decoupled system, you can spin up a dedicated "ETL Cluster" and a separate "Reporting Cluster." Both clusters reference the same underlying storage files. The metadata layer ensures that the Reporting Cluster sees a consistent snapshot of the data, even while the ETL Cluster is writing to it. This relies on Multi-Version Concurrency Control (MVCC), where write operations create new immutable files rather than modifying existing ones.
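The following Python sketch shows one way such versioned metadata could work, assuming a simplified in-memory metadata service; production platforms implement this as a distributed transactional service, and all names below are hypothetical.

```python
import threading

class MetadataService:
    """Sketch of MVCC via immutable file sets: each commit publishes a new
    table version pointing at a new file list; readers pin a version."""

    def __init__(self):
        self._lock = threading.Lock()
        self._versions = [[]]  # version 0: empty table

    def current_version(self):
        # Readers pin this version for the duration of a query.
        return len(self._versions) - 1

    def files_at(self, version):
        return self._versions[version]

    def commit(self, added_files, removed_files=()):
        # Writers never modify existing files; they publish a new version
        # whose file list adds new immutable files and drops replaced ones.
        with self._lock:
            latest = self._versions[-1]
            new_files = [f for f in latest if f not in set(removed_files)]
            new_files.extend(added_files)
            self._versions.append(new_files)
            return len(self._versions) - 1

# An ETL cluster writes while a reporting query holds an older snapshot.
meta = MetadataService()
meta.commit(["part-001.parquet"])
snapshot = meta.current_version()          # reporting query pins v1
meta.commit(["part-002.parquet"])          # ETL appends, creating v2
assert meta.files_at(snapshot) == ["part-001.parquet"]  # snapshot unchanged
```

Because the reporting query resolves its file list once, at its pinned version, concurrent writes by the ETL cluster are invisible to it until it starts a new query against a newer version.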
Decoupling allows for distinct scaling behaviors. Storage costs grow linearly with data volume, while compute costs fluctuate based on query complexity and user concurrency. This elasticity prevents over-provisioning.
Consider a scenario where data volume remains constant, but query load spikes during business hours. A decoupled system allows the compute layer to expand horizontally (adding more clusters) or vertically (resizing clusters) without data redistribution.
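A back-of-the-envelope comparison in Python illustrates the cost difference; the hourly demand profile below is an assumed example, not measured data.

```python
# Hourly demand in "cluster units": quiet overnight, busy 08:00-18:00.
hourly_demand = [1] * 8 + [6] * 10 + [1] * 6

peak = max(hourly_demand)
coupled_cost = peak * 24             # coupled: provision for peak, all day
decoupled_cost = sum(hourly_demand)  # decoupled: pay only for actual use

print(f"coupled:   {coupled_cost} cluster-hours")    # 144
print(f"decoupled: {decoupled_cost} cluster-hours")  # 74
print(f"savings:   {1 - decoupled_cost / coupled_cost:.0%}")  # ~49%
```

Under this assumed profile, aligning compute with demand roughly halves the provisioned cluster-hours, which is the effect the chart below visualizes.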
The chart below demonstrates the efficiency of decoupled resource utilization compared to a traditional coupled approach over a 24-hour cycle.
Chart: Comparison of resource provisioning. The coupled model requires provisioning for peak load at all times, whereas the decoupled model allows compute to align with actual demand.
While separation offers flexibility, it introduces complexity in data governance. Because the storage layer is accessible to any authorized compute cluster, ensuring consistent access controls is the responsibility of the centralized metadata layer. Security policies (such as Row-Level Security) must be enforced by the query engine at runtime, not by the storage layer itself.
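As an illustration of runtime enforcement, the sketch below applies a hypothetical row-level policy inside the query engine after rows are read from storage; the policy registry, roles, and Row type are invented for this example and do not reflect any specific platform's implementation.

```python
from dataclasses import dataclass

@dataclass
class Row:
    region: str
    revenue: float

# Hypothetical policy registry: the storage layer holds raw files only;
# the query engine injects these predicates at runtime for each user.
ROW_POLICIES = {
    "regional_analyst": lambda row, user: row.region == user["region"],
    "admin": lambda row, user: True,
}

def scan_with_policy(rows, user):
    # The engine applies the policy predicate after reading raw storage,
    # so every compute cluster enforces the same rule consistently.
    policy = ROW_POLICIES[user["role"]]
    return [r for r in rows if policy(r, user)]

rows = [Row("EMEA", 120.0), Row("APAC", 80.0)]
print(scan_with_policy(rows, {"role": "regional_analyst", "region": "EMEA"}))
```

The key point is that the filter lives in the engine, not the files: any cluster reading the same storage must route its scans through the same policy check.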
Furthermore, the "time-travel" capabilities inherent in this architecture, enabled by immutable storage blocks, allow engineers to query data as it existed at a previous point in time. This is achieved by simply referencing the metadata pointers to older, non-deleted storage files, providing a safety net against accidental data corruption without requiring traditional backup restores.
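A minimal sketch of this mechanism, assuming a simple in-memory commit log keyed by timestamp (all names are hypothetical):

```python
import bisect
import time

class VersionLog:
    """Sketch of time travel: each commit records (timestamp, file list);
    querying 'as of' a past time resolves the newest version at that time."""

    def __init__(self):
        self._commits = []  # chronologically ordered (timestamp, files)

    def commit(self, files, ts=None):
        self._commits.append((ts if ts is not None else time.time(), files))

    def files_as_of(self, ts):
        # Binary search for the last commit at or before the timestamp.
        idx = bisect.bisect_right([c[0] for c in self._commits], ts) - 1
        if idx < 0:
            raise ValueError("no snapshot exists at that time")
        return self._commits[idx][1]

log = VersionLog()
log.commit(["part-001.parquet"], ts=100.0)
log.commit(["part-001.parquet", "part-002.parquet"], ts=200.0)
# Recover the table as it existed at t=150, before the second commit.
assert log.files_as_of(150.0) == ["part-001.parquet"]
```

Because the older files are never modified, "restoring" to a prior state is just a metadata lookup, with no data movement involved, until retention policies eventually delete the unreferenced files.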