Out of memory in case of Splunk indexer slowness/failure #423

Hello,
We are using the Splunk Sink Connector with the following main parameters:

```json
{
	"name": "SplunkHECSinkConnector",
	"config":{
		"connector.class": "com.splunk.kafka.connect.SplunkSinkConnector",
		"tasks.max": "6",
		"splunk.hec.ack.enabled": "true",
		"splunk.hec.max.outstanding.events": "50000",
		"splunk.hec.max.retries": "-1",
		"splunk.hec.backoff.threshhold.seconds": "60",
		"splunk.hec.threads": "1"
	}
}
```

As I understand it, with `splunk.hec.max.outstanding.events` set to 50000, a task should never keep more than 50000 events in memory.
That is not what we see when the Splunk indexers are slow or failing.

The Kafka Connect logs contain messages such as this one:

```text
[2024-02-27 06:39:24,527] INFO [SplunkHECSinkConnector|task-5] handled 394 failed batches with 193452 events (com.splunk.kafka.connect.SplunkSinkTask:154)
```
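To make the gap concrete, here is a minimal Python sketch that parses the log line above and compares its event count with the configured `splunk.hec.max.outstanding.events`. The regular expression is an assumption derived from this single log line, and `parse_failed_batches` is a hypothetical helper, not part of the connector. It also assumes the events in failed batches are all held in memory by the task, which is what this report suggests but has not confirmed.

```python
import re

# Log line copied verbatim from the Kafka Connect log above.
LOG_LINE = (
    "[2024-02-27 06:39:24,527] INFO [SplunkHECSinkConnector|task-5] "
    "handled 394 failed batches with 193452 events "
    "(com.splunk.kafka.connect.SplunkSinkTask:154)"
)

# Value of splunk.hec.max.outstanding.events from the connector config.
MAX_OUTSTANDING_EVENTS = 50000

# Assumption: the message format matches the single log line shown above.
FAILED_BATCHES = re.compile(
    r"\|task-(?P<task>\d+)\] handled (?P<batches>\d+) failed batches "
    r"with (?P<events>\d+) events"
)


def parse_failed_batches(line):
    """Return (task, batches, events) for a matching log line, else None."""
    match = FAILED_BATCHES.search(line)
    if match is None:
        return None
    return int(match["task"]), int(match["batches"]), int(match["events"])


task, batches, events = parse_failed_batches(LOG_LINE)
excess = events - MAX_OUTSTANDING_EVENTS
ratio = events / MAX_OUTSTANDING_EVENTS

print(f"task-{task}: {events} events in {batches} failed batches")
print(f"over the configured limit by {excess} events ({ratio:.2f}x)")
```

For task-5 alone, the failed batches hold several times more events than the configured limit.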

I have attached the Kafka Connect metrics captured during a Splunk indexer stress test.
They show the out-of-memory condition and the number of active records.

<img width="861" alt="out-of-memory-splunk-sink" src="https://github.com/splunk/kafka-connect-splunk/assets/13764888/7c210628-71c0-4b12-be9a-1fd25f5205ff">

