Articles tagged with storage
Cost Efficiency @ Scale in Big Data File Format — Uber Engineering blog
If you need to choose a compression type for Parquet files in your data lake, this article is a good starting point.
Git’s database internals — Derrick Stolee @ GitHub Blog.
- Packed object store
- Commit history queries
- File history queries
- Distributed synchronization
- Scalability
A five-part series that looks at Git’s internals from the perspective of a database.
Cache in distributed systems — AKT @ Medium
A detailed, easy-to-understand guide to caches (14 min read).
Hudi, Iceberg and Delta Lake: Data Lake Table Formats Compared — Oz Katz @ lakeFS blog.
Mutable table formats are the new trend in data storage. Hudi, Iceberg, Delta Lake… which one suits your needs best? Check out this article by @lakeFS and choose wisely!