Description
This is similar to #22509: the goal is to improve root cause analysis when probe endpoints return a non-200 response.
I am migrating our k8s HTTP probes to use the readiness and liveness health group endpoints (/actuator/health/[readiness|liveness]).
When these endpoints return a non-UP status (a response other than 200), k8s stops sending traffic to the pod or shuts it down. When such an event happens, the k8s HTTP probe only records the returned HTTP status as the reason for the probe failure.
This makes it hard to work out later WHY the readiness/liveness probes returned a non-200 response. Even if k8s could record the body of the probe response, it would be nicer to have this information in the application log.
I wrote this implementation for our services to log information when the health endpoints return a non-UP response:
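// Package names below are as of Spring Boot 2.3; adjust for other versions.
import java.util.Map;
import java.util.TreeMap;

import lombok.extern.slf4j.Slf4j;
import org.springframework.boot.actuate.endpoint.SecurityContext;
import org.springframework.boot.actuate.endpoint.http.ApiVersion;
import org.springframework.boot.actuate.endpoint.web.WebEndpointResponse;
import org.springframework.boot.actuate.health.CompositeHealth;
import org.springframework.boot.actuate.health.HealthComponent;
import org.springframework.boot.actuate.health.HealthContributorRegistry;
import org.springframework.boot.actuate.health.HealthEndpointGroups;
import org.springframework.boot.actuate.health.HealthEndpointWebExtension;
import org.springframework.boot.actuate.health.Status;

@Slf4j
public class LoggingHealthEndpointWebExtension extends HealthEndpointWebExtension {

    public LoggingHealthEndpointWebExtension(HealthContributorRegistry registry, HealthEndpointGroups groups) {
        super(registry, groups);
    }

    @Override
    public WebEndpointResponse<HealthComponent> health(ApiVersion apiVersion, SecurityContext securityContext,
            boolean showAll, String... path) {
        WebEndpointResponse<HealthComponent> response = super.health(apiVersion, securityContext, showAll, path);
        HealthComponent health = response.getBody();
        if (health == null) {
            return response;
        }
        Status status = health.getStatus();
        // Compare with equals(): a Status is not guaranteed to be the Status.UP constant instance.
        if (!Status.UP.equals(status)) {
            // Sort components by name so the log output is stable.
            Map<String, HealthComponent> components = new TreeMap<>();
            if (health instanceof CompositeHealth) {
                Map<String, HealthComponent> details = ((CompositeHealth) health).getComponents();
                if (details != null) {
                    components.putAll(details);
                }
            }
            log.warn("Health endpoint {} returned {}. components={}", path, status, components);
        }
        return response;
    }

}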
If HealthEndpointSupport had logging capability (or HealthEndpointWebExtension and ReactiveHealthEndpointWebExtension, for web only), we would not need this custom implementation.
Something like:
boolean enableLogging;

if (this.enableLogging && health.getStatus() != Status.UP) {
    log.warn(...);
}
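To make the opt-in behaviour concrete, here is a minimal, framework-free sketch of the flag described above. It is only an illustration: the class `HealthLoggingSketch`, the `report` method, and the in-memory `logged` list are hypothetical names used so the example runs without Spring Boot or a logging library, and plain strings stand in for `Status` and `HealthComponent`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the proposed behaviour; not part of Spring Boot.
public class HealthLoggingSketch {

    // Collects the would-be warnings instead of calling a real logger.
    final List<String> logged = new ArrayList<>();

    private final boolean enableLogging;

    HealthLoggingSketch(boolean enableLogging) {
        this.enableLogging = enableLogging;
    }

    // Returns the status unchanged; records a warning first when logging is
    // enabled and the status is anything other than "UP".
    String report(String path, String status, Map<String, String> components) {
        if (this.enableLogging && !"UP".equals(status)) {
            // TreeMap sorts component names so the message is deterministic.
            this.logged.add(String.format("Health endpoint %s returned %s. components=%s",
                    path, status, new TreeMap<>(components)));
        }
        return status;
    }

    public static void main(String[] args) {
        HealthLoggingSketch sketch = new HealthLoggingSketch(true);
        sketch.report("liveness", "UP", Map.of("ping", "UP"));
        sketch.report("readiness", "DOWN", Map.of("diskSpace", "UP", "db", "DOWN"));
        // Only the non-UP call produces a line.
        sketch.logged.forEach(System.out::println);
    }
}
```

With the flag off, `report` records nothing, so existing applications would see no change in log volume unless they opt in.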