#21. Read me in any order
Topics: Architecture, consistency, culture, databases, data quality, education, storage engine
5 Books that Make You a Better Data Engineer — Steve Russo
You know, we really like these books!
New terms keep appearing in the world of data engineering. Today we’re talking about Data Observability.
Introduction to the Join Ordering Problem — Alexey Goncharuk @ Querify Labs Blog
Query optimization details from Querify Labs. With lots of pictures!
Machine Learning Operations (MLOps): Overview, Definition, and Architecture — Dominik Kreuzberger, Niklas Kühl, Sebastian Hirschl
This article comes up quite often in Twitter reposts and is definitely worth your attention. It does a very good job of systematising what MLOps is about, and it is easy to read.
Uber’s Highly Scalable and Distributed Shuffle as a Service — Mayank Bansal, Bo Yang, Mayur Bhosale, Kai Jiang @ Uber Engineering Blog
This article is worth reading if you want a deeper understanding of data shuffle in Spark.