On behalf of the Apache Kafka community, I am pleased to announce the release of Apache Kafka® 3.0. Apache Kafka 3.0 is a big, multifaceted release. It introduces a variety of new features, breaking API changes, and improvements to KRaft, Apache Kafka’s built-in consensus mechanism that will replace Apache ZooKeeper™.

While KRaft is not yet recommended for production (see the list of known gaps), we have made many improvements to the KRaft metadata and APIs. Exactly-once semantics and partition reassignment support deserve special mention. We encourage you to review KRaft’s new features and try it out in your development environment.

Starting with Apache Kafka 3.0, producers have the strongest delivery guarantees enabled by default (acks=all, enable.idempotence=true). This means that users now get ordering and durability by default.

Also, don’t miss the Kafka Connect task restart enhancements, the improvements to Kafka Streams timestamp-based synchronization, and the more flexible configuration options for MirrorMaker2.

General changes

KIP-750 (Part I): Deprecate Java 8 support in Kafka

In 3.0, all components of the Apache Kafka project have deprecated support for Java 8. This will give users time to make adjustments before the next major release (4.0), when Java 8 support will be removed.

KIP-751 (Part I): Deprecate support for Scala 2.12 in Kafka

Support for Scala 2.12 is also deprecated in Apache Kafka 3.0. As with Java 8, we are giving users time to adapt, since support for Scala 2.12 is planned to be removed in the next major release (4.0).

Kafka Broker, Producer, Consumer, and Admin Client

KIP-630: Kafka Raft Snapshot

One of the key features introduced in 3.0 is the ability for KRaft controllers and KRaft brokers to generate, replicate, and load snapshots for the metadata topic partition named __cluster_metadata. Kafka clusters use this topic to store and replicate metadata about the cluster, such as broker configuration, topic partition assignments, leadership, and so on. As this state grows, Kafka Raft Snapshot provides an efficient way to store, load, and replicate this information.

KIP-746: Modifying KRaft metadata records

Experience and continuous development since the first version of the KRaft controller have shown that some metadata record types need to be revised for use when Kafka is configured to run without ZooKeeper (ZK).

KIP-730: Producer ID generation in KRaft mode

With 3.0 and KIP-730, the Kafka controller completely takes over responsibility for generating Kafka producer IDs. The controller does this in both ZK and KRaft modes. This brings us closer to the bridge release, which will allow users to transition from Kafka deployments that use ZK to new deployments that use KRaft.

KIP-679: Producer enables the strongest delivery guarantee by default

Starting with 3.0, Kafka producers enable idempotence and acknowledgment from all replicas by default. This makes the documented delivery guarantees stronger by default.
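The new defaults correspond to setting the following producer properties explicitly. A minimal sketch with plain string config keys (the broker address is a placeholder):

```java
import java.util.Properties;

// In 3.0 these values are the defaults; shown explicitly for illustration.
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // placeholder address
props.put("acks", "all");                 // wait for all in-sync replicas
props.put("enable.idempotence", "true");  // broker deduplicates retried sends
```

Before 3.0, applications wanting this guarantee had to set both properties themselves; now they get it without any configuration.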

KIP-735: Increase the default consumer session timeout

The default value of the Kafka Consumer configuration property session.timeout.ms increases from 10 seconds to 45 seconds. This allows consumers to tolerate transient network failures by default and avoids successive rebalances when a consumer appears to leave the group only temporarily.

KIP-709: Extend OffsetFetch requests to accept multiple group IDs

Kafka has offered a way to fetch the current offsets of a consumer group for some time, but fetching the offsets of multiple consumer groups required a separate request for each group. In 3.0, with KIP-709, the OffsetFetch protocol and AdminClient APIs were extended to support reading the offsets of multiple consumer groups in a single request/response.

KIP-699: Update FindCoordinator to resolve multiple coordinators at once

Supporting operations that can be applied to multiple consumer groups at the same time in an efficient manner depends heavily on the ability of clients to efficiently discover the coordinators of those groups. This is made possible by KIP-699, which adds support for discovering the coordinators of multiple groups with a single request. Kafka clients have been updated to use this optimization when talking to newer Kafka brokers that support the request.

KIP-724: Drop support for message formats v0 and v1

Message format v2 has been the default since it was introduced with Kafka 0.11.0 in June 2017. So, with enough water (or stream) having flowed under the bridge, the 3.0 major release gave us a good opportunity to deprecate the older message formats, v0 and v1. These formats are rarely used today. In 3.0, users who configure the broker to use message format v0 or v1 will receive a warning. The option will be removed in Kafka 4.0 (see KIP-724 for details and the impact of deprecating v0 and v1).

KIP-707: The future of KafkaFuture

When the KafkaFuture type was introduced to facilitate the implementation of the Kafka AdminClient, versions prior to Java 8 were still widely used and Kafka officially supported Java 7. Fast forward a few years: Kafka now runs on Java versions that support the CompletionStage and CompletableFuture types. With KIP-707, KafkaFuture adds a method that returns a CompletionStage object, enhancing KafkaFuture’s usability in a backward-compatible way.
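The benefit is that a CompletionStage composes with the rest of the modern Java async ecosystem. A stdlib-only sketch of the kind of chaining this enables — the toCompletionStage() call on a real KafkaFuture is the piece KIP-707 adds; here a plain CompletableFuture stands in for the result of an AdminClient call:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Stand-in for the CompletionStage that KafkaFuture.toCompletionStage() returns.
CompletionStage<Integer> partitions = CompletableFuture.completedFuture(6);

// Chain further work without blocking, as with any CompletionStage.
CompletionStage<String> summary =
        partitions.thenApply(n -> "topic has " + n + " partitions");

String result = summary.toCompletableFuture().join();
```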

KIP-466: Add support for List serialization and deserialization

KIP-466 adds new classes and methods for serializing and deserializing generic lists – a feature useful to both Kafka clients and Kafka Streams.
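To illustrate what a list SerDe does, here is a hand-rolled sketch of the wire-format idea — not the actual classes KIP-466 adds: write the list length followed by fixed-width elements, then reverse the process on read.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Serialize: a 4-byte element count, then each int as 4 bytes.
List<Integer> values = List.of(3, 1, 4);
ByteBuffer buf = ByteBuffer.allocate(4 + 4 * values.size());
buf.putInt(values.size());
for (int v : values) buf.putInt(v);
byte[] bytes = buf.array();

// Deserialize: read the count, then that many ints.
ByteBuffer in = ByteBuffer.wrap(bytes);
int count = in.getInt();
List<Integer> decoded = new ArrayList<>();
for (int i = 0; i < count; i++) decoded.add(in.getInt());
```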

KIP-734: Improved AdminClient.listOffsets to return timestamps and offsets for records with maximum timestamps

The ability for users to list the offsets of Kafka topics/partitions has been extended. With KIP-734, users can now ask the AdminClient to return the offset and timestamp of the record with the highest timestamp in a topic/partition. (This is not to be confused with what the AdminClient already returns as the latest offset, which is the offset of the next record to be written to the topic/partition.) This extension of the existing ListOffsets API lets users probe the liveness of a partition by asking for the offset of the most recently written record and its timestamp.
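The distinction can be made concrete with a toy partition (hypothetical data; timestamps need not be monotonic because producers may set them out of order):

```java
// (offset, timestamp) pairs of a toy partition.
long[][] records = { {0, 100}, {1, 250}, {2, 180} };

// End offset: the offset of the next record to be written.
long endOffset = records.length; // 3

// KIP-734: the offset of the record with the highest timestamp.
long maxTsOffset = -1, maxTs = Long.MIN_VALUE;
for (long[] r : records) {
    if (r[1] > maxTs) { maxTs = r[1]; maxTsOffset = r[0]; }
}
// maxTsOffset is 1 (timestamp 250), while endOffset is 3.
```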

Kafka Connect

KIP-745: Connect APIs to restart connectors and tasks

In Kafka Connect, a connector is represented at runtime as a group of one Connector class instance and one or more Task class instances, and most connector operations available through the Connect REST API can be applied to the group as a whole. A notable exception from the beginning has been the restart endpoints for the Connector and Task instances: to restart the whole connector, users had to make separate calls to restart the Connector instance and the Task instances. In 3.0, KIP-745 lets users restart all, or only the failed, Connector and Task instances of a connector with a single call. This feature is additive, and the previous behavior of the restart REST API remains unchanged.
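As a sketch, the single call is a POST to the connector’s restart endpoint with two query parameters; onlyFailed and includeTasks are the parameters KIP-745 adds, while the worker host and connector name below are placeholders:

```java
import java.net.URI;

// Placeholders: Connect worker host and connector name.
String worker = "http://localhost:8083";
String connector = "my-connector";

// KIP-745: restart the Connector instance and only the failed Task instances.
URI restart = URI.create(worker + "/connectors/" + connector
        + "/restart?includeTasks=true&onlyFailed=true");
```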

KIP-738: Remove Connect’s internal converter properties

Following their deprecation in a previous major release (Apache Kafka 2.0), the internal.key.converter and internal.value.converter properties are removed in 3.0, both as worker configuration properties and as prefixes. Going forward, internal Connect topics will exclusively use the JsonConverter to store records without embedded schemas. Any existing Connect cluster that uses a different converter must migrate its internal topics to the new format (see KIP-738 for details on the upgrade path).

KIP-722: Connector client override enabled by default

Starting with Apache Kafka 2.3.0, Connect workers can be configured to allow connector configurations to override the Kafka client properties used by the connector. This is a widely used feature, and with the opportunity of a major release, the ability to override connector client properties is now enabled by default (connector.client.config.override.policy defaults to All).
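For example, with the default All policy, a connector configuration can tune its own client directly via the consumer.override. (or producer.override.) prefix. The connector name and class in this sketch are made up for illustration:

```json
{
  "name": "my-sink",
  "config": {
    "connector.class": "org.example.MySinkConnector",
    "topics": "events",
    "consumer.override.max.poll.records": "100"
  }
}
```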

KIP-721: Enable connector log contexts in Connect’s Log4j configuration

Another feature introduced in 2.3.0 but so far not enabled by default is the connector log context. This changes in 3.0: the connector context is added by default to the log4j pattern of Connect workers’ logs. Upgrading to 3.0 from an earlier version will change the format of log lines exported via log4j, with connector contexts added where appropriate.

Kafka Streams

KIP-695: Further improve Kafka Streams timestamp synchronization

KIP-695 enhances the semantics of how Streams tasks choose which records to fetch, and expands the meaning and available values of the configuration property max.task.idle.ms. This change required a new method, currentLag, in the Kafka Consumer API, which returns the consumer lag of a specific partition if it is known locally, without contacting the Kafka broker.

KIP-715: Expose committed offsets in Streams

Starting with 3.0, three new methods are added to the TaskMetadata interface: committedOffsets, endOffsets, and timeCurrentIdlingStarted. These methods allow Streams applications to track the progress and health of their tasks.

KIP-740: Clean up the public API TaskId

KIP-740 represents a significant overhaul of the TaskId class. Several methods and all internal fields have been deprecated, and new subtopology() and partition() getters replace the old topicGroupId and partition fields (see KIP-744 for a related change and amendment to KIP-740).

KIP-744: Separate the TaskMetadata and ThreadMetadata interfaces from their internal implementation

KIP-744 takes the changes proposed by KIP-740 a step further and separates the public API from the implementation for a number of classes. To achieve this, the new interfaces TaskMetadata, ThreadMetadata, and StreamsMetadata were introduced, while the existing classes with the same names were deprecated.

KIP-666: Add Instant-based methods to ReadOnlySessionStore

KIP-666 extends the interactive query APIs with a new set of methods in the ReadOnlySessionStore and SessionStore interfaces that accept arguments of the Instant data type. This change affects any custom read-only interactive query session store implementation, which will need to implement the new methods.

KIP-622: Add currentSystemTimeMs and currentStreamTimeMs to ProcessorContext

The ProcessorContext gains two new methods in 3.0: currentSystemTimeMs and currentStreamTimeMs. The new methods let users query the cached system time and the stream time, respectively, and use them in a uniform way in production and test code.

KIP-743: Remove the 0.10.0-2.4 value of the Streams built-in metrics version configuration

Support for the legacy metrics structure of Streams’ built-in metrics is removed in 3.0. KIP-743 removes the value 0.10.0-2.4 from the configuration property built.in.metrics.version, leaving latest as the only valid value for this property (it has been the default since 2.5).

KIP-741: Changing default SerDe to null

KIP-741 removes the previous default value of the default SerDe properties. Streams previously defaulted to ByteArraySerde. Starting with 3.0, there is no default, and users must either set their SerDes as needed in the API or set the defaults via DEFAULT_KEY_SERDE_CLASS_CONFIG and DEFAULT_VALUE_SERDE_CLASS_CONFIG in their Streams configuration. The previous default almost never worked for real applications and created more confusion than convenience.
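A minimal sketch of setting explicit default SerDes in a Streams configuration file — the string keys shown are what the DEFAULT_KEY_SERDE_CLASS_CONFIG and DEFAULT_VALUE_SERDE_CLASS_CONFIG constants resolve to, and StringSerde is just an example choice:

```properties
# No default SerDes as of 3.0; set them explicitly if the topology needs them.
default.key.serde=org.apache.kafka.common.serialization.Serdes$StringSerde
default.value.serde=org.apache.kafka.common.serialization.Serdes$StringSerde
```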

KIP-733: Change the Kafka Streams default replication factor configuration

With the opportunity of a major release, the default value of the Streams configuration property replication.factor changes from 1 to -1. This allows new Streams applications to use the default replication factor defined at the Kafka broker, so this configuration value does not need to be set when moving to production. Note that the new default requires Kafka brokers 2.5 or newer.

KIP-732: Deprecation of eos-alpha and replacement of eos-beta with eos-v2

Another Streams configuration value deprecated in 3.0 is exactly_once as a value of the property processing.guarantee. The value exactly_once corresponds to the original implementation of exactly-once semantics (EOS), available to any Streams application connecting to a Kafka cluster of version 0.11.0 or newer. This first implementation of EOS has been superseded by a second implementation in Streams, represented by the value exactly_once_beta of the processing.guarantee property. Going forward, the name exactly_once_beta is also deprecated and replaced by the new name exactly_once_v2. In the next major release (4.0), both exactly_once and exactly_once_beta will be removed, leaving exactly_once_v2 as the only option for EOS delivery guarantees.
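In configuration terms, the migration is a one-line change in the Streams properties:

```properties
# Replaces the deprecated values exactly_once (EOS v1) and exactly_once_beta.
processing.guarantee=exactly_once_v2
```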

KIP-725: Streamline configuration properties for WindowedSerializer and WindowedDeserializer

The properties default.windowed.key.serde.inner and default.windowed.value.serde.inner are deprecated in favor of a single new property, windowed.inner.class.serde, for use by consumer clients. Kafka Streams users are advised to configure their windowed SerDes by passing them into the SerDe constructor instead, and then provide the SerDe wherever it is used in the topology.

KIP-633: Deprecate the 24-hour default for the grace period in Streams

In Kafka Streams, window operations are allowed to process records that arrive after their window closes, according to a configuration property called the grace period. Previously, this configuration was optional and easy to miss, with a default of 24 hours. This regularly confused users of the suppression operator, because it buffers records until the grace period elapses, thereby introducing a 24-hour delay. In 3.0, the Windows classes are enhanced with factory methods that require them to be constructed with a custom grace period or with no grace period at all. The old factory methods with a default grace period of 24 hours are deprecated, along with the corresponding grace() APIs, which are incompatible with setting this configuration via the new factory methods.
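Conceptually, a record belonging to a window is still processed as long as stream time has not advanced past the window end plus the grace period. A simplified stdlib sketch of that rule (not the actual Streams implementation):

```java
// Simplified model: a window covering [0, 60s) with a 5-second grace period.
long windowEndMs = 60_000L;
long graceMs = 5_000L;

// A late record for this window is accepted while
// streamTime < windowEnd + grace, and dropped afterwards.
boolean acceptedAt63s = 63_000L < windowEndMs + graceMs; // within grace
boolean acceptedAt66s = 66_000L < windowEndMs + graceMs; // too late
```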

KIP-623: Add an “internal-topics” option to the Streams application reset tool

The application reset tool kafka-streams-application-reset becomes more flexible with the addition of a new command-line argument: --internal-topics. The new argument accepts a comma-separated list of topic names that correspond to internal topics that can be scheduled for deletion with this tool. Combining the new argument with the existing --dry-run argument lets users confirm which topics will be deleted, and specify a subset of them if necessary, before actually performing the deletion.

MirrorMaker

KIP-720: Deprecated MirrorMaker v1

In 3.0, the first version of MirrorMaker is deprecated. Going forward, new feature development and major improvements will focus on MirrorMaker 2 (MM2).

KIP-716: Allow MirrorMaker2 to configure the location of its offset-syncs topic

In 3.0, users can now configure where MirrorMaker2 creates and stores the internal topic it uses to convert consumer group offsets. This allows users of MirrorMaker2 to keep the source Kafka cluster strictly read-only and use a different Kafka cluster to store the offset records (i.e., the target Kafka cluster, or even a third cluster beyond the source and target clusters).
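In MM2 configuration terms, this is a single property (a sketch; the default location is the source cluster):

```properties
# KIP-716: keep the source cluster read-only by placing the
# offset-syncs topic on the target cluster (default is "source").
offset-syncs.topic.location = target
```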

Apache Kafka 3.0 is an important step forward for the Apache Kafka project.

