#36. A few moments later

Topics: Architecture, data quality, dbt, Apache Spark, streaming


Simplify PySpark testing with DataFrame equality functions — Haejoon Lee, Allison Wang and Amanda Liu @ Databricks Engineering Blog

Finally, PySpark has built-in functions for testing DataFrame and schema equality, available from Spark 3.5. No more additional libraries for testing. Or, maybe…

level:medium topic:spark


Data Quality Score: How We Evolved the Data Quality Strategy at Airbnb — Clark Wright @ Netflix Data Engineering Open Forum 2024

Some time ago, we published an article from Airbnb about the Data Quality framework they had built. Now you can watch the video for more details. We’re inspired by the level of platform solutions they’re building.

level:medium topic:data-quality type:video


Accelerating and Scaling dbt for the Enterprise — Dakota Kelley @ phData blog

This article highlights the problems you have to solve when offering dbt as part of the platform in your organization.

level:medium topic:dbt


Course Events and Event Streaming — Adam Bellemare @ Confluent Developer

If you, as a data engineer, have discussions about proper event modeling in your company, then this video course is for you (and for your software engineers).

level:beginner topic:streaming type:video


InfoQ Software Architecture and Design Trends Report - April 2024 — InfoQ

Have you already heard about architecture as a team sport? About cell-based architecture? Review the latest innovations and trends in this report.

level:medium topic:architecture


Written on June 24, 2024