#35. Get your data just-in-time
Topics: Architecture, data mesh, data warehouse, Delta Lake, PostgreSQL, Python
Seamlessly Migrate Your Apache Parquet Data Lake to Delta Lake — Dipankar Kushari, Uday Satapathy @ Databricks Engineering Blog
Databricks is the company behind the Delta Lake format. The post explains some drawbacks of building a data lake on Apache Parquet, how Delta Lake addresses them, and how to migrate.
How Meta built the infrastructure for Threads — Laine Campbell, Chunqiang (CQ) Tang @ Engineering at Meta
It’s always interesting to read or watch an explanation of a real system’s design, especially when you are trying to design such systems in your head.
Python 3.13 gets a JIT — Anthony Shaw
This is not only about data engineering; it will probably change the whole development landscape: Python gets a JIT, which opens the door to massive performance improvements in the future!
How we built our customer data warehouse all on Postgres — Adam Hendel @ Tembo
A very interesting account of using Postgres as a warehouse. I’m talking not only about storage but about the whole system, including orchestration 🤯. Of course, the authors had to write some code in Rust, but the concept looks very interesting and maybe even promising.
Data Domains — Where do I start? — Piethein Strengholt
A good, practical article about data domains, written not only from a data team’s perspective but also from a development perspective.