On behalf of the Apache Kafka community, I am pleased to announce the release of Apache Kafka® 3.0. Apache Kafka 3.0 is a big, multifaceted release. Apache Kafka 3.0 introduces a variety of new features, breaking API changes, and improvements to KRaft – Apache Kafka’s built-in consensus mechanism will replace Apache ZooKeeper™.
While KRaft
is not yet recommended for production (known gap list), we have made many improvements to KRaft metadata and APIs. Exactly-once and partition reallocation support is worth highlighting. We encourage you to review the new features of KRaft and try it out in your development environment.
Starting with Apache Kafka 3.0, producers have the strongest delivery guarantee enabled by default (acks=all, enable.idempotence=true). This means that users now get ordering and persistence by default.
Also, don’t miss the Kafka Connect task restart enhancements, improvements to KStreams timestamp-based synchronization, and more flexible configuration options for MirrorMaker2.
General changes
KIP-750 (Part I): Deprecation of Java 8 support in
Kafka
In 3.0, all components of the Apache Kafka project have deprecated support for Java 8. This will give users time to make adjustments before the next major release (4.0), when Java 8 support will be removed.
KIP-751 (Part I): Deprecation of support for
Scala 2.12 in KafkaSupport for
Scala 2.12 is also deprecated in Apache Kafka 3.0. As with Java 8, we give users time to adapt as support for Scala 2.12 is planned to be removed in the next major release (4.0).
Kafka Broker, Producer, Consumer, and Management Client
KIP-630: Kafka Raft Snapshot
One of the key features we introduced in 3.0 is the ability for KRaft controllers and KRaft agents to be named __cluster_metadata The metadata topic section generates, copies, and loads snapshots. Kafka clusters use this topic to store and replicate metadata information about the cluster, such as broker configuration, topic partition assignment, leadership, and so on. As this state grows, Kafka Raft Snapshot provides an efficient way to store, load, and replicate this information
KIP-746: Modifying KRaft metadata records
Experience and ongoing development since the first version of the Kafka Raft controller has shown that some metadata record types need to be modified to use them when Kafka is configured to run without ZooKeeper (ZK).
KIP-730: Producer ID generation in KRaft mode
In 3.0 and KIP-730, the Kafka controller now fully takes over the responsibility for generating Kafka producer IDs. The controller does this in both ZK and KRaft modes. This brings us closer to the bridge version, which will allow users to transition from a Kafka deployment using ZK to a new deployment using KRaft.
KIP-679: Producer will enable the strongest delivery guarantee
by default
starting with 3.0, and Kafka producers will enable idempotency and delivery confirmation of all replicas by default. This makes the documented delivery guarantee stronger by default.
KIP-735: Increase
the default value of the configuration property for the default consumer session timeout Kafka Consumer from 10 seconds to 45 seconds session.timeout.ms. This will allow consumers to better adapt to transient network failures by default and avoid continuous rebalancing when consumers appear to be only temporarily leaving the group.
KIP-709: Extend OffsetFetch requests to accept multiple group IDs
for the current offset of Kafka consumer groups for some time. But getting the offset for multiple consumer groups requires a separate request for each group. In 3.0 and KIP-709, fetch and AdminClient APIs were extended to support simultaneous reading of offsets of multiple consumer groups in a single request/response.
KIP-699: Update FindCoordinator to resolve
multiple coordinators at once
Support for operations that can be applied to multiple consumer groups simultaneously in an efficient manner depends heavily on the client’s ability to effectively discover the coordinators of those groups. This is made possible by KIP-699, which adds support for coordinators that discover multiple groups with a single request. Kafka clients have been updated to use this optimization when talking to new Kafka agents that support this request.
KIP-724: Removed support for message formats v0 and
v1 The message format
v2 has been the default message format since it was introduced with Kafka 0.11.0 in June 2017. So, after enough water (or stream) flowed under the bridge, the major version of 3.0 gave us a good opportunity to deprecate the old message format, i.e. v0 and v1. These formats are rarely used today. In 3.0, if users configure the broker to use message format v0 or v1, they will receive a warning. This option will be removed in Kafka 4.0 (see KIP-724 for details and the impact of deprecating v0 and v1 message formats).
KIP-707: The Future
of KafkaFuture
When KafkaFuture introduced the type to facilitate the implementation of Kafka AdminClient, versions prior to Java 8 were still widely used, and Kafka officially supported Java 7. Fast forward a few years, and Kafka now runs on versions of Java that support CompletionStage and CompletableFuture class types. With KIP-707, KafkaFuture adds a way to return CompletionStage objects and enhances usability in a backwards-compatible way with KafkaFuture.
KIP-466: Added support for List serialization
and deserializationKIP-466
adds new classes and methods for serializing and deserializing generic lists – a feature that is useful for both Kafka clients and Kafka Streams.
KIP-734: Improved AdminClient.listOffsets to return timestamps and offsets for records with maximum timestamps
The ability for users to list Kafka topic/partition offsets has been extended. With KIP-734, users can now ask AdminClient to return the offset and timestamp of the record with the highest timestamp in the subject/partition. (This is not confused with what the AdminClient earnings have been for the latest offset, which is the next recorded offset, written in the subject/partition.) This extension to the existing ListOffsets API allows users to probe lively partitions by asking which is the offset of the most recently written record and what its timestamp is.
kafka
Connect
KIP-745: Connect APIs to restart connectors and tasks
In Kafka Connect, a connector is represented at runtime as a set of Connector class instances and one or more Tasks class instances, and most operations on connectors available through the Connect REST API can be applied to the entire group. From the beginning, a notable exception to restart was the endpoints of the Connector and Task instances. To restart the entire connector, the user must call separately to restart the connector instance and task instance. In 3.0, KIP-745 enables users to restart all or only failed connector connector and Task instances with a single call. This feature is add-on, and the previous behavior of the restartREST API remains unchanged.
KIP-738: Remove
Connect’s internal converter properties
and inter.keynal.value.converter in Connect after deprecating them in a previous major release (Apache Kafka 2.0). The worker’s configuration is deleted as a configuration attribute and prefix. Going forward, the internal Connect theme will specifically use JsonConverter to store records without an embedded schema. Any existing Connect cluster that uses a different translator must port its internal topics to the new format (see KIP-738 for details on upgrade paths).
KIP-722: Connector client override enabled by default
Starting with Apache Kafka 2.3.0, connector workers can be configured to allow connector configuration to override Kafka client properties used by connectors. This is a widely used feature and now has the opportunity to release a major version that enables the ability to override connector client properties by default (default connector.client.config.override.policy is set to All).
KIP-721: Enable connector log context in connection Log4j configuration
Another feature introduced in 2.3.0 but not enabled by default so far is connector log context. This changed in 3.0, where the connector context adds log4j to the log mode of the Connect worker by default. Upgrading from a previous version to 3.0 will log4j change the format of exported log lines by adding connector contexts where appropriate.
Kafka
Streams KIP-695
: Further improvements Kafka Streams timestamp synchronization KIP-695
enhances the semantics of how the Streams task chooses to get records, and expands the meaning of configuration properties and the max.task.idle.ms of available values 。 This change requires a new approach in the Kafka Consumer API that enables currentLag to return consumer lag for a specific partition if known locally and without contacting Kafka Broker.
KIP-715: Starting with Offset 3.0 of Exposed Commits
in Streams, three new methods were added to the TaskMetadata interface: committedOffsets, endOffsets, and timeCurrentIdlingStarted 。 These methods can allow Streams applications to track the progress and health of their tasks.
KIP-740: Clean up the public API TaskId
KIP-740 represents a major innovation in the TaskId class. There are several methods and all internal fields have been deprecated, and the new subtopology() and partition() stems will replace the old topicGroupId and partition fields (see KIP-744 for related changes and amendments to KIP-740).
KIP-744: Migration of TaskMetadata, and Interface of ThreadMetadata with Internal Implementation
KIP-744
takes the changes proposed by KIP-740 a step further and separates the implementation from the public API of many classes. To achieve this, new interfaces TaskMetadata, ThreadMetadata, and StreamsMetadata were introduced, while existing classes with the same name were deprecated.
KIP-666: Adding Instant-based methods to the ReadOnlySessionStore interactive query API extends a new set of methods in the ReadOnlySessionStore
and SessionStore interfaces. These methods accept parameters of the Instant data type. This change will affect any custom read-only interactive query session store implementations that require the new method.
KIP-622: Add currentSystemTimeMs and currentStreamTimeMs to
ProcessorContext
The ProcessorContext adds two new methods in 3.0. currentSystemTimeMs and currentStreamTimeMs. The new approach enables users to query cached system times and stream times separately and can use them in a uniform way in production and test code.
KIP-743: Removed configuration value 0.10.0-2.4 Streams built-inmetric version configuration Removed
support for legacy metric structures for built-in metrics in Streams in Configuration 3.0. KIP-743 is removing the value built.in.metrics.version from configuration properties in 0.10.0-2.4. This latest is currently the only valid value for this property (which has been the default since 2.5).
KIP-741: Changing default SerDe to null
removes the previous default value for the default SerDe property. The stream past defaults to ByteArraySerde. Starting with 3.0, there is no default, and users need either set of their SerDes as needed in the API or by setting the default DEFAULT_KEY_SERDE_CLASS_CONFIG and DEFAULT_VALUE_SERDE_CLASS_CONFIG in their stream configuration. Previous defaults almost always don’t work for real-world applications and create more confusion than convenience.
KIP-733: Opportunity to change the Kafka
Streams default replication factor configuration
with a major version, default replication.factor for the Streams configuration property changes from 1 to -1. This will allow new Streams applications to use the default replication factor defined in the Kafka broker, so this configuration value does not need to be set when they are moved to production. Note that the new default requires Kafka Brokers 2.5 or later.
KIP-732: Deprecation of eos-alpha and replacement of eos-beta with eos-v2
Another Streams configuration value deprecated in 3.0 is exactly_once as the value of the propertyprocessing.guarantee. The value exactly_once corresponds to the original implementation of Exactly Once Semantics (EOS) and can be used to connect to any Streams application of Kafka cluster version 0.11.0 or later. The first implementation of this EOS has been implemented by stream second implementation, which is replaced by a value representation exactly_once_beta in the processing.guarantee nature. Going forward, the name exactly_once_beta has also been deprecated and replaced with a new name exactly_once_v2. In the next major release (4.0), both exactly_once and exactly_once_beta will be removed exactly_once_v2 as the only option for the EOS delivery guarantee.
KIP-725: Optimized configuration properties for WindowedSerializer and WindowedDeserializer
default.windowed
.key.serde.inner and default.windowed.value.serde.inner Deprecated in favor of a single new property of windowed.inner.class.serde for consumer clients. Kafka Streams users are advised to configure their windowed SerDe by passing it to the SerDe constructor, and then provide SerDe wherever it is used in the topology.
KIP-633: Deprecation of the default value of 24 hours for
grace period
in Streams in Kafka Streams, which allows window operations to process out-of-window records based on a configuration property called grace period. Previously, this configuration was optional and easy to miss, resulting in a default of 24 hours. This is why Suppression operator users often get confused as it buffers records until the grace period ends, thus adding a 24-hour delay. In 3.0, Windows classes are enhanced with factory methods that require them to be constructed with a custom grace period or no grace period at all. The old factory method with a default grace period of 24 hours has been deprecated, and the corresponding API that is incompatible with the new factory method where grace() has set this configuration.
KIP-623: internal-topics add ” ” option to the streaming
application reset tool
by adding new command line arguments via kafka-streams-application-reset, making the use of streams by the application reset tool more flexible:-- internal-topics. The new parameter accepts a comma-separated list of topic names that correspond to internal topics that can be scheduled for deletion using this application tool. Combining this new parameter with an existing parameter, —dry-run allows the user to confirm which topics will be deleted and specify a subset of them if necessary before actually performing the delete operation.
MirrorMaker
KIP-720: Deprecated MirrorMaker v1
In 3.0, the first version of
MirrorMaker
was deprecated. Going forward, new feature development and significant improvements will focus on MirrorMaker 2 (MM2).
KIP-716: Allows MirrorMaker2 to configure the location of offset synchronization
themes
In 3.0, users can now configure where MirrorMaker2 creates and stores internal themes for converting consumer group offsets. This will allow users of MirrorMaker2 to maintain the source Kafka cluster as strictly read-only and use a different Kafka cluster to store drift records (i.e., the target Kafka cluster, or even a third cluster in addition to the source and target clusters).
Apache Kafka
3.0 is an important step forward for the Apache Kafka project.