Topics: Apache Airflow, community, data quality, Apache Kafka, storage

The State of Data Engineering 2022 — Einat Orr @ lakeFS Blog

Let’s take a look at the top data engineering technologies of 2022. Read it, or you might lose at the "Pokémon or Big Data" game :)

level:medium topic:community

Real-Time Supply Chain with Apache Kafka in the Food and Retail Industry — Walmart

I see a lot of discussion around the question "do you really need streaming data?" Here is a real-world example from one of the biggest retailers of how and why to use real-time data streaming. The article is more of an overview than a technical one, but it is worth a look for the use cases where they adopted it.

level:beginner topic:kafka

Cache in distributed systems — AKT @ Medium

A detailed and easy-to-understand guide to caches (a 14-minute read).

level:beginner topic:storage

Airflow Summit 2022 — The Best Of — Jarek Potiuk

A short overview of the latest summit, with lists of talks grouped by topic. We hope this will inspire you to watch a couple of them.

level:medium topic:airflow topic:community

Towards data quality management at LinkedIn — Liangzhao Zeng @ Towards Data Science

A short article about the top-level architecture of LinkedIn's Data Health Monitor system.

level:medium topic:data-quality

Written on July 1, 2022