Stateful Functions offers an Apache Kafka I/O Module for reading from and writing to Kafka topics.
It is based on Apache Flink's universal `Kafka connector <https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html>`_ and provides exactly-once processing semantics.
The Kafka I/O Module is configurable in Yaml or Java.
.. contents:: :local:
Dependency
==========
To use the Kafka I/O Module in Java, please include the following dependency in your pom.
.. code-block:: xml

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>statefun-kafka-io</artifactId>
        <version>${statefun.version}</version>
        <scope>provided</scope>
    </dependency>
Kafka Ingress Spec
==================
A ``KafkaIngressSpec`` declares an ingress spec for consuming from a Kafka cluster.

It accepts the following arguments:
1) The ingress identifier associated with this ingress
2) The topic name / list of topic names
3) The address of the bootstrap servers
4) The consumer group id to use
5) A ``KafkaIngressDeserializer`` for deserializing data from Kafka (Java only)
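
A complete ingress spec wiring these arguments together might look like the following sketch; the identifier, topic, group id, and deserializer names here are illustrative, not part of the API:

.. code-block:: java

    IngressIdentifier<User> ID =
        new IngressIdentifier<>(User.class, "example", "user-ingress");

    IngressSpec<User> kafkaIngress =
        KafkaIngressBuilder.forIdentifier(ID)
            .withKafkaAddress("kafka-broker:9092")
            .withTopic("user-topic")
            .withConsumerGroupId("statefun-consumers")
            .withDeserializer(UserDeserializer.class)
            .build();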
The ingress also accepts properties to directly configure the Kafka client, using ``KafkaIngressBuilder#withProperties(Properties)``.
Please refer to the Kafka `consumer configuration <https://docs.confluent.io/current/installation/configuration/consumer-configs.html>`_ documentation for the full list of available properties.
Note that configuration passed using named methods, such as ``KafkaIngressBuilder#withConsumerGroupId(String)``, will have higher precedence and overwrite their respective settings in the provided properties.
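
For example, a client-level setting that has no dedicated builder method can be passed through a ``Properties`` object; the property chosen here is just an illustration:

.. code-block:: java

    Properties properties = new Properties();
    properties.setProperty("max.poll.records", "500");

    KafkaIngressBuilder.forIdentifier(ID)
        .withProperties(properties)
        ...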
Startup Position
^^^^^^^^^^^^^^^^
The ingress allows configuring the startup position to be one of the following:
**From Group Offset (default)**
Starts from offsets that were committed to Kafka for the specified consumer group.

.. tabs::

    .. group-tab:: Java

        .. code-block:: none

            KafkaIngressStartupPosition#fromGroupOffsets();

    .. group-tab:: Yaml

        .. code-block:: yaml

            startupPosition:
                type: group-offsets

**Earliest**
Starts from the earliest offset.

.. tabs::

    .. group-tab:: Java

        .. code-block:: none

            KafkaIngressStartupPosition#fromEarliest();

    .. group-tab:: Yaml

        .. code-block:: yaml

            startupPosition:
                type: earliest

**Latest**
137
+
138
+
Starts from the latest offset.

.. tabs::

    .. group-tab:: Java

        .. code-block:: none

            KafkaIngressStartupPosition#fromLatest();

    .. group-tab:: Yaml

        .. code-block:: yaml

            startupPosition:
                type: latest

**Specific Offsets**
Starts from specific offsets, defined as a map of partitions to their target starting offset.

.. tabs::

    .. group-tab:: Java

        .. code-block:: none

            Map<TopicPartition, Long> offsets = new HashMap<>();
            offsets.put(new TopicPartition("user-topic", 0), 91L);
            offsets.put(new TopicPartition("user-topic", 1), 11L);

            KafkaIngressStartupPosition#fromSpecificOffsets(offsets);

**From Date**

Starts from offsets that have an ingestion time larger than or equal to a specified date.

.. tabs::

    .. group-tab:: Java

        .. code-block:: none

            KafkaIngressStartupPosition#fromDate(date);

On startup, if the specified startup offset for a partition is out-of-range or does not exist (which may be the case if the ingress is configured to start from group offsets, specific offsets, or from a date), then the ingress will fall back to the position configured using ``KafkaIngressBuilder#withAutoOffsetResetPosition(KafkaIngressAutoResetPosition)``.
By default, this is set to be the latest position.

Kafka Deserializer
^^^^^^^^^^^^^^^^^^

When using the Java API, the Kafka ingress needs to know how to turn the binary data in Kafka into Java objects.
The ``KafkaIngressDeserializer`` allows users to specify such a schema.
The ``T deserialize(ConsumerRecord<byte[], byte[]> record)`` method gets called for each Kafka message, passing the key, value, and metadata from Kafka.
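
As a sketch, a deserializer that interprets each record value as a UTF-8 string could be written as follows (the class name is illustrative):

.. code-block:: java

    public class StringValueDeserializer implements KafkaIngressDeserializer<String> {

        @Override
        public String deserialize(ConsumerRecord<byte[], byte[]> record) {
            // The raw key, value, and metadata of the record are available here;
            // this example only decodes the value bytes.
            return new String(record.value(), StandardCharsets.UTF_8);
        }
    }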

Kafka Egress Spec
=================

A ``KafkaEgressSpec`` declares an egress spec for writing data out to a Kafka cluster.

It accepts the following arguments:

1) The egress identifier associated with this egress
2) The address of the bootstrap servers
3) A ``KafkaEgressSerializer`` for serializing data into Kafka (Java only)
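
A complete egress spec built from these arguments might look like the following sketch (identifier and serializer names are illustrative):

.. code-block:: java

    EgressIdentifier<User> ID =
        new EgressIdentifier<>("example", "user-egress", User.class);

    EgressSpec<User> kafkaEgress =
        KafkaEgressBuilder.forIdentifier(ID)
            .withKafkaAddress("kafka-broker:9092")
            .withSerializer(UserSerializer.class)
            .build();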

The egress also accepts properties to directly configure the Kafka producer, using ``KafkaEgressBuilder#withProperties(Properties)``.
Please refer to the Kafka `producer configuration <https://docs.confluent.io/current/installation/configuration/producer-configs.html>`_ documentation for the full list of available properties.

Kafka Egress and Fault Tolerance
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
With fault tolerance enabled, the Kafka egress can provide exactly-once delivery guarantees.
You can choose three different modes of operation.
**None**
Nothing is guaranteed; produced records can be lost or duplicated.

.. tabs::

    .. group-tab:: Java

        .. code-block:: none

            KafkaEgressBuilder#withNoProducerSemantics();

    .. group-tab:: Yaml

        .. code-block:: yaml

            deliverySemantic:
                type: none

**At Least Once**
Stateful Functions will guarantee that no records will be lost but they can be duplicated.
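
Following the naming pattern of the other delivery modes, the configuration would plausibly be (the method and type names below are assumed by analogy with the ``None`` mode):

.. tabs::

    .. group-tab:: Java

        .. code-block:: none

            KafkaEgressBuilder#withAtLeastOnceProducerSemantics();

    .. group-tab:: Yaml

        .. code-block:: yaml

            deliverySemantic:
                type: at-least-once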