Server-sent events (HTTP) #86
Replies: 3 comments
Hi @derrickoswald, I agree, SSE makes a lot of sense. It's just not so common in public APIs, but if you are in control of the API implementation, it can definitely be a nice fit. As for how you would go about implementing a Kafka Connect source connector for it: it's not that different from any other source connector that subscribes to an external system, and you already described the gist of it. The `SourceTask` lifecycle is pull-based by design, with Kafka Connect continuously polling for new records; I'm afraid if you want to use Kafka Connect you can't fight this. One way to go about it is to create a buffer where you accumulate all messages pushed by the server, and serve them in batches whenever Kafka Connect polls for them. The main challenges I see with this approach come from the fact that this buffer is nothing but a finite piece of state shared between two threads, Kafka Connect's poll thread and the thread receiving server events: you need to make access to it thread-safe, and you need to decide what happens when events arrive faster than `poll()` drains them.
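The buffering pattern above can be sketched with a plain JDK bounded queue (no Kafka classes; the class name `SseBuffer` and the capacity are illustrative assumptions, not part of the kafka-connect-http plugin):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of a finite buffer shared between the SSE client thread (producer)
// and the Kafka Connect poll thread (consumer).
public class SseBuffer {
    // Bounded so a slow poll() cannot exhaust memory; put() blocks when the
    // buffer is full, which gives simple backpressure (dropping is another option).
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(10_000);

    // Called from the SSE client thread for each server-sent event.
    public void onEvent(String event) {
        try {
            queue.put(event);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Called from the Kafka Connect thread; in a real SourceTask this is where
    // each buffered event would be converted into a SourceRecord.
    public List<String> poll() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch); // non-blocking: returns whatever has accumulated
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        SseBuffer buffer = new SseBuffer();
        Thread sse = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                buffer.onEvent("event-" + i);
            }
        });
        sse.start();
        sse.join();
        System.out.println(buffer.poll()); // [event-0, event-1, event-2]
    }
}
```

`LinkedBlockingQueue` handles the thread-safety concern, and its fixed capacity is what makes the shared state "finite" in a controlled way.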
I hope this helps.
I forgot to mention: Kafka Connect is the way it is probably for a good reason, since this kind of generalized source-connector abstraction enables pretty much all sorts of integration patterns while still offering all of the benefits of Kafka Connect, such as managing the lifecycle of your running connectors, generalizing their configuration, deployment, etc. I'm sorry if I'm saying something too obvious, but you might be after something more specific, in which case you can just use the Kafka Producer client to push messages to Kafka straight from your own code on every server-sent event. Either way, I'm happy to help any way I can with your implementation. Best regards.
It looks like the way to go is a simple HTTP client feeding the Kafka Producer.
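A minimal sketch of that approach, showing only the SSE wire-format parsing runnably (per the SSE spec, an event's payload is the concatenation of its `data:` lines, terminated by a blank line). The actual `producer.send(...)` is left as a comment because it assumes the kafka-clients dependency and a running broker; the class name `SseLineParser` and the topic name are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

public class SseLineParser {
    // Collects "data:" lines into complete event payloads; a blank line ends
    // an event. Comment lines (starting with ':') and id:/event: fields are
    // ignored here to keep the sketch short; a production parser would follow
    // the spec's exact field and whitespace rules.
    public static List<String> parseEvents(List<String> lines) {
        List<String> events = new ArrayList<>();
        StringBuilder data = new StringBuilder();
        for (String line : lines) {
            if (line.startsWith("data:")) {
                if (data.length() > 0) data.append('\n');
                data.append(line.substring(5).trim());
            } else if (line.isEmpty() && data.length() > 0) {
                events.add(data.toString()); // end of one event
                data.setLength(0);
            }
        }
        return events;
    }

    public static void main(String[] args) {
        // Lines as they would arrive from the streaming HTTP response.
        List<String> wire = List.of(
            "data: {\"id\": 1}",
            "",
            "data: hello",
            "data: world",
            "");
        for (String event : parseEvents(wire)) {
            // In the real pipeline this is where the HTTP client feeds Kafka:
            //   producer.send(new ProducerRecord<>("sse-topic", event));
            System.out.println(event);
        }
    }
}
```

The HTTP side can be any client that exposes the response body line by line (e.g. `java.net.http.HttpClient` with a line-subscriber body handler), with each completed event handed to the producer.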
An alternative to Change Data Capture is to use server-sent events (SSE) where the endpoint has implemented support for it.
Just as an example:

```shell
curl --no-buffer --header "Accept:text/event-stream" https://cybre.space/api/v1/streaming/public
```

The naive approach of adding `http.request.params=Accept=text/event-stream` to the plugin properties for kafka-connect-http leads to `failed to poll records from SourceTask`, as obviously `org.apache.kafka.connect.source.SourceTask` is based on polling to request updates.

I suppose one could collect the streamed events and return the collection when asked by `poll()`, but this looks like the `SourceTask` is just getting in the way, and a better solution would be to use a `SourceConnector` where the `taskClass()` is just a `Task`, i.e. a hypothetical `HttpTask` extending `Task`.

Would Kafka still try to call `SourceTask` methods even though there is no override of the `taskClass()` method in `SourceConnector`? If not, how does that work, given that the `poll()` method is called on a `Task` instance? If so, is there a work-around that would force Kafka to only call `start()` and `stop()` on a hypothetical `HttpTask`?