Apache Kafka
What is Apache Kafka?
Apache Kafka is:
- distributed event streaming platform;
- open-source;
- used for:
- high-performance data pipelines;
- streaming analytics;
- data integration;
- mission-critical applications.
Apache Kafka properties
Apache Kafka properties:
- distribution — not one server, but many servers working together;
- fault tolerance;
- high availability;
- data consistency and availability — CA, see CAP theorem;
- high performance (bandwidth);
- horizontal scaling;
- integrability.
History of Apache Kafka
History of Apache Kafka:
- Developed by LinkedIn.
- Program code published in 2011.
- Member of the Apache Software Foundation since 2012.
- The product is developed in
Scala
andJava
programming languages.
What task does Apache Kafka solve
The main entities of Apache Kafka
Kafka Cluster
: brings together several Kafka Brokers.Kafka Broker
(Kafka Server | Kafka Node): receives, stores and issues messages.Zookeper
:- Cluster status, configuration, address book (data).
- Selects Controller, provides consistency.
Kafka Message
(Record | Event) —key-value
pair.Topic
/Partition
— stream of data.Producer
.Consumer
.
Message
Message field | Description |
---|---|
Key |
Key (optional). Used to distribute messages across the cluster. |
Value |
Message content, byte array. |
Timestamp |
Message time (from epoch). Set when sending or processing within the cluster. |
Headers |
A set of key-value pairs with custom message attributes. |
Topic / Partition
- Partitions are used to speed up reading and writing data (parallelization).
- Support for FIFO ordering at the partition level.
-
When reading from a topic, the data is not deleted.
-
The data is stored in the Broker File System in log files.
-
Partition files:
- Stored in segments 1 GB (default). The last one is active. The file name is start offset.
*.log
— message with some meta information:Offset
,Position
,Timestamp
,Message
.*.index
— mappingOffset
toPosition
.*.timeindex
— mappingTimestamp
toOffset
.
-
Removing data from Apache Kafka Topic:
Warning
The delete operation from Apache Kafka Topic is not supported.
Tip
Supports automatic deletion of data after TTL.
Entire partition segments are deleted, not individual messages. Segment timestamp expired → to delete.
-
Set
replication-factor
> 1.- That is, for each partition there should be > 1 replica, and they should not be stored on one node.
Kafka Controller
assignsLeader
replicas. Read and write operations are performed only on theLeader
replica: produce → Leader → consume.- It is not recommended to host all
Leader
replicas of topics on the same node. So she will be overloaded, and the rest nodes will be disabled.
-
Followers
fetch data fromLeader
periodically.- Synchronous recording from
Leader
toFollower
. Therefore, the recording slows down a bit. - ISR
Follower
is a reliable candidate forLeader
. - Set
min.insync.replicas = 3
. If there are not enoughLeader
+ISR Followers
online, then a write error occurs.
- Synchronous recording from
Producer
acks
— delivery guarantee parameter, can take the following values:0
—Producer
does not wait for confirmation of sending a message; the most unreliable mode, messages can be lost;1
—Producer
waits for confirmation of sending a message only from theLeader
replica; compromise mode, messages may be lost in some cases, for example, if a broker with aLeader
replica falls before execution message replication; frequently used option;-1
(all
) —Producer
waits for confirmation of sending messages from all ISR replicas, includingLeader
; most reliable option.
- Delivery semantic support:
at most once
;at least once
;exactly once
(idempotence).
flowchart TD
title[<u>Producer send message flow diagram</u>]
title---A
style title fill:#FFF,stroke:#FFF
linkStyle 0 stroke:#FFF,stroke-width:0;
A["<b>Producer.send()</b>"<br/>message] --> B[fetch metadata];
B -. fetch .-> C[(Zookeper)];
C -. metadata .-> B;
B ---> D[serialize message];
D --> E[define partition];
E --> F[compress message];
F --> G[accumulate batch];
G --> H[Sending a message to the right brokers and partitions]
Producer.send()
message flow:
-
fetch metadata
fromZookeper
through a broker.- Gets the state of the cluster and topic placement.
- This operation will block
send()
until the metadata has been received or the timeout is 60 seconds (default).
Note
It is not necessary to create a new
Producer
for every message sent. Use singleton instead. -
serialize message
to the desired format:key.serializer
,value.serializer
;- for example,
StringSerializer()
— convert data into a byte array.
define partition
where the message will go:explicit partition
— specify the partition manually;round-robin
;key-defined
(key_hash %n
) — frequently used feature, used to record messages with the same key to the same partition; for example, log user messages with UUIDs.
- compress message;
- accumulate batch:
batch.size
- by size;linger.ms
- by timeout.
- Sending a message to the right brokers and partitions.
Consumer
flowchart TD
title[<u>Consumer poll messages flow diagram</u>]
title---A
style title fill:#FFF,stroke:#FFF
linkStyle 0 stroke:#FFF,stroke-width:0;
A["<b>Consumer.poll()</b>"<br/>messages] --> B[fetch metadata];
B -. fetch .-> C[(Zookeper)];
C -. metadata .-> B;
B ---> D[Connecting to Leader-replicas<br/>of all topic partitions];
Single thread processing can be slow. To speed up, it is possible to combine several Consumer
into Consumer Group
.
The optimal design is when one Consumer
reads data from one Topic
.
Apache Kafka stores in a separate Topic
named __consumer_offsets
offsets for each Consumer Group
and Partition
:
Field | Value |
---|---|
Partition |
A/0 |
Group |
X |
Offset |
2 |
Types of commits:
- Auto commit → at most once → risk of losing message when
Consumer
fails after commit. - Manual commit → at least once → risk of receiving duplicate messages.
- Custom offset management → exactly once → not missed, no duplicates.
Consumer Group
is inactive for a long time:
- max inactive period:
offsets.retention.minutes
; - offsets deleted from
__consumer_offsets
; - after group activatio →
auto.offset.reset
:earliest
orlatest
.