Apache Parquet, Apache Software Foundation, 2023 - Official documentation for Apache Parquet, covering its columnar storage format, data types, compression, and performance benefits crucial for analytical workloads in data lakes.
Apache Iceberg: A Table Format for Analytic Datasets, Ryan Blue, Daniel Weeks, David Rolfe, 2020, Proceedings of the VLDB Endowment, Vol. 13 (VLDB Endowment), DOI: 10.14778/3415494.3415555 - A foundational paper introducing Apache Iceberg, explaining its architecture, metadata storage, snapshot isolation, and how it addresses challenges in large-scale data lake table management.