#2. May the Force be with you

Topics: cost management, data quality, ETL, Apache Flink, Apache Iceberg, Apache Kafka, Kubernetes, streaming.

Some useful ideas for categorizing your infrastructure costs and keeping them under control.

How to perform data downtime analysis: where to look, and a rather controversial idea about the order in which to examine potential causes.

A bird’s-eye overview of Netflix’s data processing architecture. The interesting part is how they handle schema changes separately from the data.

In a world where k8s has won the race, we’re trying to run everything on it. Here is a recipe for running Flink on Kubernetes.

Three typical architectures for resilient message handling in Kafka. If you have a Kafka source in your data pipelines, this may be of interest.

Written on May 4, 2021