Skip to content

MINOR: Reduce logging in persister. #19998

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 20, 2025
Merged

Conversation

smjn
Copy link
Collaborator

@smjn smjn commented Jun 19, 2025

  • Few logs in PersisterStateManager were noisy and not adding much
    value.
  • For the sake of reducing pollution, they have been moved to debug
    level.
  • Additional debug log in DefaultStatePersister to track epochs.

Reviewers: Apoorv Mittal [email protected], Andrew Schofield
[email protected]

@smjn smjn requested a review from AndrewJSchofield June 19, 2025 10:40
@github-actions github-actions bot added the triage PRs from the community label Jun 19, 2025
@smjn smjn requested a review from apoorvmittal10 June 19, 2025 10:40
@github-actions github-actions bot added core Kafka Broker small Small PRs labels Jun 19, 2025
@apoorvmittal10 apoorvmittal10 added KIP-932 Queues for Kafka ci-approved and removed triage PRs from the community labels Jun 19, 2025
Copy link
Contributor

@apoorvmittal10 apoorvmittal10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smjn Thanks for the PR. This will only suppress logs in Persister but as we saw the UnknownServerException in SharePartition from Persister which should be fixed as well, correct?

Copy link
Contributor

@apoorvmittal10 apoorvmittal10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have left some comments.

groupId, topicIdPartition, partitionData);
if (!(ex instanceof UnknownServerException)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seems to be correct, so if the error code is 13 then only we should not log the eror message. Example log from persister.

Failed to initialize the share partition: <.......>. Exception occurred: PartitionData(partition=144,stateEpoch=0,startOffset=0,errorCode=13,errorMessage=Error in read state RPC. The server disconnected before a response was received.,leaderEpoch=0,stateBatches=[]).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is because the SharePartition.fetchPersisterError is wrapping the error codes to UnknownServerException

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, just 2 minor change needed as well, move below line to debug or trace as they call writeShareGroupState which logs error message appropiratley hence these exception messages should go in debug or trace:

log.error("Failed to write state to persister for the share partition: {}-{}",

log.error("Failed to write the share group state on acquisition lock timeout for share partition: {}-{} memberId: {}",

groupId, topicIdPartition, ex);
if (!(ex instanceof UnknownServerException)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something similar to above but here the log line in SharePartition shows: org.apache.kafka.common.errors.UnknownServerException rather should Persister send NetworkException as like in ReadAPI...?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is because the SharePartition.fetchPersisterError is wrapping the error codes to UnknownServerException

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, thanks.

@smjn
Copy link
Collaborator Author

smjn commented Jun 20, 2025

@apoorvmittal10 Thanks for the review, incorporated comments.

@smjn smjn requested a review from apoorvmittal10 June 20, 2025 06:28
Copy link
Contributor

@apoorvmittal10 apoorvmittal10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the comments, some minor comments and change in 2 log lines as well.

Comment on lines 2582 to 2585
log.debug(message, e);
if (!(e instanceof NetworkException)) {
log.error(message, e);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not NetworkException then same error message will be twice in debug mode:

Suggested change
log.debug(message, e);
if (!(e instanceof NetworkException)) {
log.error(message, e);
}
if (e instanceof NetworkException) {
log.debug(message, e);
} else {
log.error(message, e);
}

Comment on lines +2328 to +2329
logError(String.format("Failed to write the share group state for share partition: %s-%s due to exception",
groupId, topicIdPartition), ex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for my understanding. What difference in log do we get by String.format now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially we were passing args to log.debug and log.error directly which allowed template based placeholders.
Since I created a function now we are passing args to it as String. There will be no difference in o/p.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read logError as log.error with String.format, hence was wondering why you needed the change. It makes sense.

groupId, topicIdPartition, partitionData);
if (!(ex instanceof UnknownServerException)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, just 2 minor change needed as well, move below line to debug or trace as they call writeShareGroupState which logs error message appropiratley hence these exception messages should go in debug or trace:

log.error("Failed to write state to persister for the share partition: {}-{}",

log.error("Failed to write the share group state on acquisition lock timeout for share partition: {}-{} memberId: {}",

groupId, topicIdPartition, ex);
if (!(ex instanceof UnknownServerException)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, thanks.

@smjn smjn requested a review from apoorvmittal10 June 20, 2025 10:32
Copy link
Contributor

@apoorvmittal10 apoorvmittal10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, LGTM!

Comment on lines +2328 to +2329
logError(String.format("Failed to write the share group state for share partition: %s-%s due to exception",
groupId, topicIdPartition), ex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read logError as log.error with String.format, hence was wondering why you needed the change. It makes sense.

@apoorvmittal10 apoorvmittal10 merged commit d5e2eca into apache:trunk Jun 20, 2025
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-approved core Kafka Broker KIP-932 Queues for Kafka small Small PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants