Kafka is a mainstream message streaming system, and it comes with quite a few concepts. This article uses illustrations to sort out Kafka's core concepts so that we have a clear picture of them in mind.
Kafka is a stream processing system that allows back-end services to easily communicate with each other and is a commonly used component in microservices architectures.
Producers and consumers: A producer sends messages to Kafka, and a consumer listens to Kafka to receive messages.
A service can be both a producer and a consumer.
Topics: A topic is the destination that producers send messages to and the target that consumers listen to.
A service can listen to multiple topics and send messages to multiple topics.
Kafka has the concept of a "consumer group": a set of services that together act as one consumer.
When a message arrives for a consumer group, Kafka routes it to one of the services in the group.
This helps balance the message load and makes it easy to scale out consumers.
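The routing described above can be sketched with a small in-memory model. This is only an illustration: `ToyBroker`, `join`, `publish`, and the group and service names are hypothetical, and the per-message round-robin is a simplification. Real Kafka achieves the same effect by assigning partitions to the members of a group rather than routing individual messages.

```python
from collections import defaultdict


class ToyBroker:
    """Toy in-memory stand-in for a broker; not Kafka's real delivery mechanism."""

    def __init__(self):
        # group name -> list of (member name, inbox) pairs
        self.groups = defaultdict(list)
        # group name -> counter used to pick the member for the next message
        self.cursor = defaultdict(int)

    def join(self, group, member):
        """Register a service in a consumer group and return its inbox."""
        inbox = []
        self.groups[group].append((member, inbox))
        return inbox

    def publish(self, message):
        """Deliver the message to exactly one member of every group."""
        for group, members in self.groups.items():
            index = self.cursor[group] % len(members)
            members[index][1].append(message)
            self.cursor[group] += 1


broker = ToyBroker()
billing_1 = broker.join("billing", "billing-1")
billing_2 = broker.join("billing", "billing-2")
audit_1 = broker.join("audit", "audit-1")

for n in range(4):
    broker.publish(f"order-{n}")

print(billing_1)  # ['order-0', 'order-2']
print(billing_2)  # ['order-1', 'order-3']
print(audit_1)    # ['order-0', 'order-1', 'order-2', 'order-3']
```

In this sketch the two "billing" services split the messages between them, while the single "audit" service, being its own group, sees every message. Adding a third member to "billing" would spread the same load over three services.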
A topic acts as a queue of messages. First, a message is sent to the topic.
The message is then recorded and stored in the topic, and it cannot be modified.
Next, consumers pull messages from the topic for consumption. However, a consumed message is not deleted; it remains in the queue.
More messages are then sent.
As before, each message is delivered to the consumer, cannot be altered, and remains in the queue (how long a message stays in the queue is determined by Kafka's configuration).
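A minimal sketch of this behaviour, assuming a topic with a single queue: an append-only list in which reading never removes anything, and each consumer only remembers how far it has read. `ToyTopicLog`, `append`, and `poll` are hypothetical names chosen for the illustration, and the configurable retention period is not modelled.

```python
class ToyTopicLog:
    """Toy append-only log: records are never edited or removed by a read."""

    def __init__(self):
        self._records = []   # stored messages, in arrival order
        self._positions = {}  # consumer name -> index of the next record to read

    def append(self, message):
        """Store a message and return its position in the log."""
        self._records.append(message)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Return unread records for this consumer and advance its position."""
        start = self._positions.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._positions[consumer] = start + len(batch)
        return batch

    def __len__(self):
        return len(self._records)


log = ToyTopicLog()
log.append("A")
log.append("B")

print(log.poll("service-1"))  # ['A', 'B']
print(len(log))               # 2, because reading did not delete anything

log.append("C")
print(log.poll("service-1"))  # ['C']
print(log.poll("service-2"))  # ['A', 'B', 'C']
```

Because the records stay in place, a second consumer that starts later can still read everything from the beginning, which is what the last line shows.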
Partitions: The description above treats a topic as a single queue. In fact, a topic is made up of multiple queues, called partitions, which makes the topic easier to scale.
When a producer sends a message, the message is routed to one of the partitions in this topic.
Consumers listen on all partitions.
When producers send messages, they address the topic by default, and the partition within it is then chosen for them, using a round-robin strategy by default.
This can also be configured so that related messages land in the same partition. For example, when processing user messages, you can keep all messages of a given user in a single partition (that is, the destination partition is determined by the hash of the user ID, typically supplied as the message key).
For example, user 1 sends 3 messages: A, B, and C. By default, these 3 messages may end up in different partitions (such as P1, P2, P3). After configuration, you can ensure that all messages for user 1 are sent to the same partition (such as P1).
What is this feature for? It provides message ordering. Ordering is not guaranteed across partitions; messages are ordered only within a single partition.
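The two strategies above can be sketched as follows. This is a toy model under stated assumptions: `ToyPartitioner` and `partition_for` are hypothetical names, and CRC32 stands in for the hash function (Kafka's own clients use a different hash, so the partition numbers here will not match a real cluster).

```python
import zlib


class ToyPartitioner:
    """Toy partition chooser: round-robin without a key, hash-based with one."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._next = 0  # round-robin counter for messages without a key

    def partition_for(self, key=None):
        if key is None:
            # No key: spread messages over the partitions in turn.
            partition = self._next % self.num_partitions
            self._next += 1
            return partition
        # With a key: the same key always hashes to the same partition.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = ToyPartitioner(num_partitions=3)
partitions = {p: [] for p in range(3)}

# Without a key, A, B and C are spread across the partitions.
for message in ["A", "B", "C"]:
    partitions[partitioner.partition_for()].append(("no key", message))

# With the user ID as key, A, B and C share one partition and keep their order.
for message in ["A", "B", "C"]:
    partitions[partitioner.partition_for(key="user-1")].append(("user-1", message))

for number, records in partitions.items():
    print(number, records)
```

Since each partition is an ordered queue, keeping one user's messages in one partition is what lets a consumer see A, B, and C in the order they were sent.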
Kafka has a cluster architecture, and ZooKeeper is an important component of it (newer Kafka releases can also run without ZooKeeper, in KRaft mode).
ZooKeeper manages all topics and partitions. Topics and partitions are stored on the physical nodes of the cluster, and ZooKeeper is responsible for maintaining these nodes.
For example, there are 2 topics, each with 2 partitions.
That is the logical view; the actual storage in a Kafka cluster might look like this:
Partition #1 of Topic A has 3 replicas, distributed across different nodes. This increases Kafka's reliability and resilience. Among the 3 replicas of partition #1, ZooKeeper designates one as the leader, which is responsible for receiving messages from producers.
The other 2 replicas of partition #1 act as followers, and messages received by the leader are copied to them.
> So each replica of a partition contains the full message data.
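
The leader and follower roles can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyReplicatedPartition` and `produce` are hypothetical names, the leader is simply the first node in the list rather than one chosen by ZooKeeper, and the copy to followers happens immediately inside `produce`, whereas a real cluster replicates asynchronously over the network.

```python
class ToyReplicatedPartition:
    """Toy model of one partition stored as several replicas on different nodes."""

    def __init__(self, nodes):
        # node name -> that node's copy of the partition
        self.replicas = {node: [] for node in nodes}
        # Assumption for the sketch: the first node acts as leader.
        self.leader = nodes[0]

    def produce(self, message):
        """The leader receives the message, then it is copied to each follower."""
        self.replicas[self.leader].append(message)
        for node, log in self.replicas.items():
            if node != self.leader:
                log.append(message)


partition = ToyReplicatedPartition(["node-1", "node-2", "node-3"])
for message in ["A", "B", "C"]:
    partition.produce(message)

for node, log in partition.replicas.items():
    role = "leader" if node == partition.leader else "follower"
    print(node, role, log)
```

After the three sends, every node in the sketch holds the full sequence A, B, C, so any single copy is enough to recover the partition's data.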
Even if a node fails somewhere, you don't have to worry about losing the messages. The distribution of all partitions for Topic A and Topic B might look like this:
Finally, thanks for reading; I hope it helps you 😃
Note: all the pictures in this article are from: