#10. Slack time
Topics: Apache Airflow, architecture, data lake, data lineage, data quality, Apache Hudi, storage engine.
Data Lineage at Slack — Samuel Bock @ Slack Engineering.
The Slack team’s perspective on the data lineage problem. The article describes the architecture of their in-house solution.
DAG Writing Best Practices in Apache Airflow — Astronomer Guides
We all love best practices. Especially short ones. Especially ones with examples.
Apache Hudi - The Data Lake Platform — Apache Hudi Blog
Take a look at this detailed description of Apache Hudi: what Hudi is, its storage architecture, indexes, concurrency, and caches.
What is Cost-based Optimization? — Alexey Goncharuk @ Querify Labs Blog
An answer to a popular question: in what mythical units is the cost of a query plan measured?
How Uber Achieves Operational Excellence in the Data Quality Experience — Uber Engineering Blog
What basic principles lie at the heart of Uber's data quality platform? As always, a detailed and accessible article from Uber engineers.