-
Notifications
You must be signed in to change notification settings - Fork 14.5k
MINOR: Reduce logging in persister. #19998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@smjn Thanks for the PR. This will only suppress logs in Persister but as we saw the UnknownServerException in SharePartition from Persister which should be fixed as well, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have left some comments.
groupId, topicIdPartition, partitionData); | ||
if (!(ex instanceof UnknownServerException)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seems to be correct, so if the error code is 13 then only we should not log the eror message. Example log from persister.
Failed to initialize the share partition: <.......>. Exception occurred: PartitionData(partition=144,stateEpoch=0,startOffset=0,errorCode=13,errorMessage=Error in read state RPC. The server disconnected before a response was received.,leaderEpoch=0,stateBatches=[]).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is because the SharePartition.fetchPersisterError
is wrapping the error codes to UnknownServerException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, just 2 minor change needed as well, move below line to debug or trace as they call writeShareGroupState
which logs error message appropiratley hence these exception messages should go in debug or trace:
log.error("Failed to write state to persister for the share partition: {}-{}", |
log.error("Failed to write the share group state on acquisition lock timeout for share partition: {}-{} memberId: {}", |
groupId, topicIdPartition, ex); | ||
if (!(ex instanceof UnknownServerException)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something similar to above but here the log line in SharePartition shows: org.apache.kafka.common.errors.UnknownServerException
rather should Persister send NetworkException as like in ReadAPI...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is because the SharePartition.fetchPersisterError
is wrapping the error codes to UnknownServerException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, thanks.
@apoorvmittal10 Thanks for the review, incorporated comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the comments, some minor comments and change in 2 log lines as well.
log.debug(message, e); | ||
if (!(e instanceof NetworkException)) { | ||
log.error(message, e); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not NetworkException then same error message will be twice in debug mode:
log.debug(message, e); | |
if (!(e instanceof NetworkException)) { | |
log.error(message, e); | |
} | |
if (e instanceof NetworkException) { | |
log.debug(message, e); | |
} else { | |
log.error(message, e); | |
} |
logError(String.format("Failed to write the share group state for share partition: %s-%s due to exception", | ||
groupId, topicIdPartition), ex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: for my understanding. What difference in log do we get by String.format now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially we were passing args to log.debug and log.error directly which allowed template based placeholders.
Since I created a function now we are passing args to it as String. There will be no difference in o/p.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read logError
as log.error
with String.format
, hence was wondering why you needed the change. It makes sense.
groupId, topicIdPartition, partitionData); | ||
if (!(ex instanceof UnknownServerException)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, just 2 minor change needed as well, move below line to debug or trace as they call writeShareGroupState
which logs error message appropiratley hence these exception messages should go in debug or trace:
log.error("Failed to write state to persister for the share partition: {}-{}", |
log.error("Failed to write the share group state on acquisition lock timeout for share partition: {}-{} memberId: {}", |
groupId, topicIdPartition, ex); | ||
if (!(ex instanceof UnknownServerException)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, LGTM!
logError(String.format("Failed to write the share group state for share partition: %s-%s due to exception", | ||
groupId, topicIdPartition), ex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read logError
as log.error
with String.format
, hence was wondering why you needed the change. It makes sense.
PersisterStateManager
were noisy and not adding muchvalue.
level.
DefaultStatePersister
to track epochs.Reviewers: Apoorv Mittal [email protected], Andrew Schofield
[email protected]