#35. Get your data just-in-time

Topics: Architecture, data mesh, data warehouse, Delta Lake, PostgreSQL, Python

Seamlessly Migrate Your Apache Parquet Data Lake to Delta Lake — Dipankar Kushari, Uday Satapathy @ Databricks Engineering Blog

Databricks is the company behind the Delta Lake format. The post describes some drawbacks of building a data lake on Apache Parquet, shows how Delta Lake addresses them, and explains how to migrate.

level:beginner topic:deltalake

How Meta built the infrastructure for Threads — Laine Campbell, Chunqiang (CQ) Tang @ Engineering at Meta

It’s always interesting to read about or watch the design of a real system explained, especially when you are trying to design such systems yourself.

level:medium topic:architecture

Python 3.13 gets a JIT — Anthony Shaw

This goes beyond data engineering and will probably change the whole development landscape: Python gets a JIT compiler, which opens the door to massive performance improvements in the future!

level:medium topic:python

How we built our customer data warehouse all on Postgres — Adam Hendel @ Tembo

A very interesting account of using Postgres as a data warehouse, and not only for storage: the whole system, including orchestration, runs on it 🤯. Of course, the authors had to write some code in Rust, but the concept looks very interesting and maybe even promising.

level:medium topic:data-warehouse topic:postgresql

Data Domains — Where do I start? — Piethein Strengholt

A good, practical article about data domains, written not only from a data team's perspective but also from a development perspective.

level:medium topic:architecture topic:data-mesh

Written on March 6, 2024