#28. No data — no problem

Topics: AWS, databases, data lineage, data thoughts, Python, streaming, testing


BIG DATA IS DEAD — Jordan Tigani

When size doesn’t matter.

level:medium topic:data-thoughts


5 Ways to Use Column Level Data Lineage — Monte Carlo Data Blog

Have you ever thought about implementing column-level lineage? If not, read why it can be extremely helpful to have.

level:medium topic:data-lineage


Why your mock doesn’t work — Ned Batchelder

I often run into situations where my Mock doesn't work as I expected, and I hate Python for that. This fairly short article showed me that I should hate someone else [most likely underinvestment in the language]: I had simply misunderstood the basics. A must-read article that briefly explains how variable assignment in import statements works in Python and what to keep in mind when working with Mock.
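To make the point concrete, here is a minimal sketch of my own (not code from the article). It assumes two hypothetical modules, `provider` and `consumer`, built in memory so that everything fits in one file. `from provider import get_value` copies the name into `consumer`, so patching `provider.get_value` does not affect code that already holds its own reference.

```python
import sys
import types
from unittest import mock

# Build two tiny in-memory modules so the sketch runs as a single file.
# "provider" and "consumer" are hypothetical names used only for illustration.
provider = types.ModuleType("provider")
exec("def get_value():\n    return 'real'\n", provider.__dict__)
sys.modules["provider"] = provider

consumer = types.ModuleType("consumer")
exec(
    "from provider import get_value\n"  # binds a new name inside consumer
    "def report():\n"
    "    return get_value()\n",
    consumer.__dict__,
)
sys.modules["consumer"] = consumer

# Patching the defining module rebinds provider.get_value only;
# consumer.get_value still points at the original function.
with mock.patch("provider.get_value", return_value="fake"):
    patched_at_definition = consumer.report()

# Patching the name where it is looked up is what actually takes effect.
with mock.patch("consumer.get_value", return_value="fake"):
    patched_at_use = consumer.report()

print(patched_at_definition, patched_at_use)
```

The first call still returns the real value and the second returns the mocked one, which is the usual "patch where the name is used, not where it is defined" rule.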

level:beginner topic:python topic:testing


Driving efficiency and developer productivity at Facebook scale, Asynchronous computing at Meta: Overview and learnings — Engineering at Meta Blog

From the titles you might think this is something from another world. You are right, but only partially. These are actually two articles on how to build a mix of scalable ETL and distributed computation in-house.

level:medium topic:streaming


How Amazon RDS Replication Works and Why the FAA’s Database Problem Won’t Happen in AWS — Bohan Zhang, Andy Pavlo @ OtterTune Blog

Basic overview of how Amazon RDS and Amazon Aurora replication works. A good starting point to better understand what database replication is.

level:beginner topic:aws topic:databases


Written on February 18, 2023