Allow for Out-of-Band Health-check in LoadBalancer to be Disabled #324

@hubert-s

Description

In deployments where a pool of heavy forwarders or indexers is fronted by an external load balancer, the configuration of the Kafka Connector contains only a single address. In this scenario, the out-of-band health check does not take into account the available capacity of the pool behind the external load balancer. When a health check fails, all channels are removed for a configurable period of time, including some that may otherwise be healthy. Although this is configurable (120 seconds by default), frequently adding and removing channels based on an out-of-band check seems neither elegant nor efficient.

Furthermore, even after a successful out-of-band health check, the indexer object of the Kafka Connector may still receive a 503 result code from an indexer or heavy forwarder. This triggers the back-pressure handling, which I would consider an in-band health check. In contrast to the out-of-band check, back-pressure is tied to a specific channel, that is, a specific TCP session that is typically also maintained by a keep-alive. Avoiding a channel that has back-pressure for a preset period of time is a reasonable thing for the indexer object to do.
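
To make the in-band behaviour concrete, here is a minimal sketch of per-channel back-pressure handling. It is not the connector's actual code: the class `BackPressureTracker`, its methods, and the string channel IDs are hypothetical names chosen for illustration. The only assumption taken from the description above is that a 503 pauses the affected channel for a preset period while other channels stay in use.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Kafka Connector's API.
final class BackPressureTracker {
    private final long cooldownMillis;
    // Channel ID -> timestamp (ms) until which the channel should be avoided.
    private final Map<String, Long> pausedUntil = new HashMap<>();

    BackPressureTracker(long cooldownMillis) {
        this.cooldownMillis = cooldownMillis;
    }

    // Record the HTTP status seen on one channel; a 503 pauses only that channel.
    void onResponse(String channelId, int statusCode, long nowMillis) {
        if (statusCode == 503) {
            pausedUntil.put(channelId, nowMillis + cooldownMillis);
        }
    }

    // A channel is usable if it was never paused or its cooldown has elapsed.
    boolean isAvailable(String channelId, long nowMillis) {
        Long until = pausedUntil.get(channelId);
        return until == null || nowMillis >= until;
    }
}
```

The point of the sketch is that the signal is scoped to the channel that actually returned the 503, so healthy channels behind the same load-balancer address are unaffected.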

In short, when an external load balancer is used, the out-of-band health check does not seem very useful. I therefore propose that setting splunk.hec.lb.poll.interval to “-1” (or any negative integer) should disable the out-of-band health check.
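
A rough sketch of how the proposed setting could be honoured is shown below. The class `HealthCheckScheduler` and its `start` method are hypothetical and do not reflect how the connector schedules its poller today; the sketch only assumes that the value of splunk.hec.lb.poll.interval is available as an integer number of seconds.

```java
import java.util.Optional;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not the connector's actual scheduler.
final class HealthCheckScheduler {
    private HealthCheckScheduler() {}

    // Schedules the out-of-band health check unless the interval is negative.
    // Returns an empty Optional when the check is disabled.
    static Optional<ScheduledFuture<?>> start(ScheduledExecutorService executor,
                                              int pollIntervalSeconds,
                                              Runnable healthCheck) {
        if (pollIntervalSeconds < 0) {
            // Proposed behaviour: a negative interval disables polling entirely.
            return Optional.empty();
        }
        // Note: scheduleWithFixedDelay rejects a delay of 0 with
        // IllegalArgumentException; the proposal does not define that case.
        return Optional.of(executor.scheduleWithFixedDelay(
                healthCheck, pollIntervalSeconds, pollIntervalSeconds, TimeUnit.SECONDS));
    }
}
```

With this shape, a configuration that sets the interval to -1 would never create the polling task, leaving the in-band 503 handling as the only mechanism that takes channels out of rotation.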
