re - Regular expression operations, Python Software Foundation, 2024 - Essential guide to Python's built-in module for working with regular expressions, directly applicable to the log parsing exercise.
Data Lineage Management in Big Data Environments: A Review, José D. Hernández-Cruz, Ricardo V. Teixeira, Flávio R. C. Fernandes and Carlos H. N. E. Costa, 2019, Computing, Vol. 101 (Springer Vienna), DOI: 10.1007/s00607-019-00778-9 - A comprehensive review of techniques and challenges for managing data lineage in large-scale data systems, providing context for the importance of lineage extraction.
OpenLineage Documentation, LF AI & Data Foundation, 2024 - Official documentation for the OpenLineage standard, representing a modern approach to collecting and exchanging data lineage metadata programmatically.