DVC Documentation, Iterative, Inc., 2024 - Official documentation for Data Version Control (DVC), offering guides on versioning data and models, managing ML pipelines, and integrating with Git.
Git Large File Storage, GitHub, Inc., 2024 - Official documentation for Git Large File Storage (Git LFS), describing its mechanisms, usage, and integration with Git for large binary files.
Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores, Matei Zaharia, Tathagata Das, Michael Armbrust, Ali Ghodsi, and Reynold Xin, 2020Proceedings of the VLDB Endowment, Vol. 13 (VLDB Endowment)DOI: 10.14778/3384345.3384365 - Describes the architecture and features of Delta Lake, an open-source storage layer that brings ACID transactions and data versioning to data lakes, useful for large tabular datasets.