Articles tagged with kafka
Incremental Cooperative Rebalancing in Apache Kafka: Why Stop the World When You Can Change It? — Konstantine Karantasis @ Confluent Blog.
Kafka rebalancing again. What is the difference between the eager and the incremental cooperative rebalancing protocols? What problems made a new protocol necessary? How does it work at a high level?
Additionally, you can read through the KIP to understand the problem in more depth.
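To make the eager/cooperative difference a bit more tangible, here is a toy model in Python. It is not the real Kafka protocol: the function names, the group layout and the target assignment are all made up for illustration. It only counts how many partitions have to be revoked when a third consumer joins a group.

```python
def eager_revoked(current):
    """Eager protocol: every member gives up all its partitions before rejoining."""
    return {member: set(parts) for member, parts in current.items()}


def cooperative_revoked(current, target):
    """Cooperative protocol: a member revokes only the partitions it owns now
    but should no longer own under the new target assignment."""
    return {
        member: set(parts) - set(target.get(member, ()))
        for member, parts in current.items()
    }


# Hypothetical group: two consumers own six partitions, then c3 joins.
current = {"c1": {0, 1, 2}, "c2": {3, 4, 5}}
target = {"c1": {0, 1}, "c2": {3, 4}, "c3": {2, 5}}

eager = eager_revoked(current)
coop = cooperative_revoked(current, target)

eager_count = sum(len(parts) for parts in eager.values())
coop_count = sum(len(parts) for parts in coop.values())
print(eager_count, coop_count)  # prints "6 2"
```

In this sketch the eager protocol stops consumption on all six partitions, while the cooperative one only moves the two partitions that actually change owner.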
From Eager to Smarter in Apache Kafka Consumer Rebalances — Sophie Blee-Goldman @ Confluent Blog.
Continuing the Kafka theme: the new rebalancing protocol from the consumer client’s perspective.
2 talks from Apache Kafka® Meetup
Here are 2 talks from an Apache Kafka® Meetup. The first is The Silver Bullet for Endless Rebalances from Confluent. It is very useful for understanding the theory behind the rebalancing protocol and the trade-offs of different implementations. Even if it isn’t useful to you right now, it is interesting for a wide audience from a theoretical perspective. The second is Kafka as a service at Dropbox, in which engineers from Dropbox share their experience and the problems of providing Kafka as a service at scale.
Real-Time Supply Chain with Apache Kafka in the Food and Retail Industry — Walmart
I see a lot of discussion around the question: do you really need streaming data? Here is a real-world example from one of the biggest retailers of how and why to use real-time data streaming. This article is more of an overview than a technical one. It’s worth a look for the use cases where they adopted it.
Presto on Apache Kafka At Uber Scale — Uber Engineering Blog
We like Uber engineering posts a lot because they read like ADRs: the problem, a description of the current environment, the alternatives, the proposed architecture.
What does In-Sync Replicas in Apache Kafka Really Mean? — Lovisa Johansson @ CloudKarafka.
A short article that helps you understand the dividing line between producer and consumer guarantees. It’s also useful for understanding how exactly a given guarantee can be reached and what it costs.
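As a rough illustration of the producer-side cost, here is a minimal Python sketch of how the `acks` producer setting and the `min.insync.replicas` topic setting interact. It is a toy model with a hypothetical function name, not broker code, and it assumes the partition leader is alive.

```python
def write_accepted(acks, isr_size, min_insync_replicas):
    """Toy model: would the leader accept a produce request?

    acks is "0", "1" or "all"; isr_size counts the in-sync replicas,
    including the leader (assumed alive, so isr_size >= 1).
    """
    if acks == "all":
        # min.insync.replicas is enforced only for acks=all.
        return isr_size >= min_insync_replicas
    return True  # acks "0" or "1": only the leader matters


# Hypothetical topic: replication factor 3, min.insync.replicas = 2.
all_healthy = write_accepted("all", isr_size=3, min_insync_replicas=2)
one_follower_down = write_accepted("all", isr_size=2, min_insync_replicas=2)
only_leader_left = write_accepted("all", isr_size=1, min_insync_replicas=2)
leader_only_ack = write_accepted("1", isr_size=1, min_insync_replicas=2)
print(all_healthy, one_follower_down, only_leader_left, leader_only_ack)
```

The trade-off shows up in the last two cases: with `acks=all` the write is refused once too few replicas are in sync, while `acks=1` keeps accepting writes at the price of weaker durability.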
Exactly-Once Semantics Are Possible: Here’s How Kafka Does It (Proposal: KIP-98) — Neha Narkhede @ Confluent Blog
This is a very interesting explanation of how to achieve exactly-once semantics. It’s just two simple words, but it is far from trivial: you have to keep many different things in mind and consider different failure scenarios. I think this understanding can help when you design your pipelines.
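One of the building blocks is the idempotent producer: the broker remembers the last sequence number it appended per producer and drops retried duplicates. Below is a heavily simplified Python sketch of that idea. The class and its fields are hypothetical, and unlike a real broker it does not reject gaps in the sequence or track producer epochs.

```python
class PartitionLog:
    """Toy partition that ignores retried records it has already appended."""

    def __init__(self):
        self.records = []
        self.last_seq = {}  # producer id -> last sequence number appended

    def append(self, producer_id, seq, value):
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # duplicate caused by a retry, do not append again
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return True


log = PartitionLog()
log.append("p1", 0, "a")
log.append("p1", 1, "b")
retry_result = log.append("p1", 1, "b")  # producer retries after a lost ack
print(log.records, retry_result)  # prints "['a', 'b'] False"
```

The retry is acknowledged as already written instead of being stored twice, which is what turns at-least-once delivery into an effectively exactly-once append.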
Understanding Kafka partition assignment strategies and how to write your own custom assignor — Florian Hussonnois @ StreamThoughts.
Do you like Kafka as much as we do? This is a continuation of the article from Digest 7. The author goes deeper into how consumer groups work and how partitions are assigned.
5 More Reasons to Choose Apache Pulsar Over Apache Kafka — Chris Bartholomew @ DataStax Blog.
Do you like platform comparisons? I’ve read almost nothing about Apache Pulsar. But the point about tiered storage in Pulsar, where you can move old messages to cheaper storage and keep them indefinitely, inspired me.
Apache Kafka Rebalance Protocol, or the magic behind your streams applications — Florian Hussonnois @ StreamThoughts.
If you would like to dig a little deeper into the foundations of the Apache Kafka consumption mechanism, this is an excellent article. Personally, I didn’t clearly understand the tons of “rebalancing” log entries until I read it.
Understanding Kafka Topic Partitions — Dunith Dhanushka @ Event-driven Utopia.
Our editor Ksenia is a big fan of Dunith Dhanushka’s articles. Lots of diagrams, easy to read, practical!
Error Handling Patterns for Apache Kafka Applications — Gerardo Villeda @ Confluent Blog.
Do you handle error messages from Kafka in some way? Do you know the best practices? This is a great article about typical approaches to handling error messages. With our favorite Confluent pictures!
You Can Replace Kafka with a Database — Emil Koutanov @ Towards Data Science (Medium).
Kafka is the new gold. What if you don’t like it and want to replace it? Of course, there are options like Apache Pulsar, but… is it possible to replace Apache Kafka with a relational DB? Looks like the answer is “yes”. Now it’s your turn to decide if you need it.
Kafka Resiliency — Retry/Delay Topic, Dead Letter Queue (DLQ) — Sheshnath Kumar @ Medium.
Three typical architectures for resilient message handling in Kafka. If you have a Kafka source in your data pipelines, this may be interesting.
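To give a feel for the retry topic plus DLQ idea, here is a small in-memory Python sketch. Queues stand in for Kafka topics, and the retry budget, function names and handler are assumptions for illustration, not taken from the article.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry budget, not taken from the article


def run_pipeline(messages, handler, max_attempts=MAX_ATTEMPTS):
    """Process messages; failures go to a retry queue first, then to the DLQ."""
    main = deque((msg, 1) for msg in messages)  # (payload, attempt number)
    retry = deque()  # stands in for the retry topic
    processed, dead_letters = [], []

    while main or retry:
        # Drain the main queue first so a bad message does not block good ones.
        payload, attempt = main.popleft() if main else retry.popleft()
        try:
            processed.append(handler(payload))
        except Exception as exc:
            if attempt < max_attempts:
                retry.append((payload, attempt + 1))
            else:
                dead_letters.append((payload, str(exc)))
    return processed, dead_letters


def handler(payload):
    # Hypothetical handler: rejects negative numbers, doubles the rest.
    if payload < 0:
        raise ValueError("negative value")
    return payload * 2


processed, dead_letters = run_pipeline([1, -2, 3], handler)
print(processed)     # prints "[2, 6]"
print(dead_letters)  # prints "[(-2, 'negative value')]"
```

The failing message is retried off the main path and, once its attempts run out, parked in the dead letter list with the error so it can be inspected later.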