#16. We wish you a Merry Christmas Exactly-Once!

Topics: Apache Airflow, architecture, data lineage, data mesh, Apache Kafka, Apache Spark

A brief history of the metrics store — Nick Handel @ Towards Data Science.

We haven’t covered feature stores yet, but that won’t stop us from reading about metric stores :)

level:medium topic:architecture

Data Lineage with OpenLineage and Airflow — Astronomer.

A useful webinar on how to implement data lineage with Marquez and Airflow. With real examples, not just theory!

level:medium topic:airflow topic:data-lineage type:video

Building data platform in PySpark. Part 1. Python and Scala interop — Sergey Ivanychev @ Joom Blog.

Why and how to use Scala in PySpark.

level:medium topic:spark

HelloFresh Journey to the Data Mesh — HelloFresh Blog.

Reading other people’s stories is interesting because they might look a lot like your own.

level:beginner topic:data-mesh

Exactly-Once Semantics Are Possible: Here’s How Kafka Does It (Proposal: KIP-98) — Neha Narkhede @ Confluent Blog

This is a very interesting explanation of how exactly-once semantics can be achieved. It’s just two simple words, but it is far from trivial: you have to keep many different things in mind and consider different failure scenarios. I think this understanding can help you when you design your pipelines.

level:medium topic:kafka
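
To make one of those failure scenarios concrete, here is a minimal sketch of the deduplication idea behind the idempotent producer from KIP-98: every record carries a producer id and a sequence number, so a retry after a lost acknowledgement is not written twice. This is a toy in-memory model and not Kafka code. The `Broker` class, `send_with_retry`, and the `lost_acks` parameter are hypothetical names invented for illustration, and the real protocol is stricter (per-partition tracking, sequence gap checks, transactions).

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Toy in-memory log that deduplicates producer retries (not real Kafka)."""
    log: list = field(default_factory=list)
    # Highest sequence number appended so far, per producer id.
    last_seq: dict = field(default_factory=dict)

    def append(self, producer_id: int, seq: int, value: str) -> bool:
        """Append the record unless it is a retry of one already stored."""
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # duplicate retry: nothing is written again
        self.log.append(value)
        self.last_seq[producer_id] = seq
        return True


def send_with_retry(broker, producer_id, seq, value, lost_acks=1):
    """Simulate lost acks: the same (producer_id, seq) is sent again each time."""
    for _ in range(lost_acks + 1):
        broker.append(producer_id, seq, value)


broker = Broker()
for seq, value in enumerate(["a", "b", "c"]):
    # Each record is sent twice because its first ack is assumed to be lost.
    send_with_retry(broker, producer_id=7, seq=seq, value=value, lost_acks=1)

print(broker.log)  # ['a', 'b', 'c']
```

Without the sequence check, the same retries would leave six records in the log, which is the classic at-least-once duplicate problem.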

Written on December 24, 2021