-
Notifications
You must be signed in to change notification settings - Fork 14.5k
KAFKA-19407 Fix potential IllegalStateException when appending to timeIndex #19972
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
This PR is supposed to introduce no any new lock contention, because (potentially blocking) lazy timeIndex materialization has already exclusive control by lock in LazyIndex#get |
@@ -192,8 +192,13 @@ public void sanityCheck(boolean timeIndexFileNewlyCreated) throws IOException { | |||
* the time index). | |||
*/ | |||
public TimestampOffset readMaxTimestampAndOffsetSoFar() throws IOException { | |||
if (maxTimestampAndOffsetSoFar == TimestampOffset.UNKNOWN) | |||
maxTimestampAndOffsetSoFar = timeIndex().lastEntry(); | |||
if (maxTimestampAndOffsetSoFar == TimestampOffset.UNKNOWN) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this volatile field become an Atomic instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Atomic doesn't make the thing simple, because we need double-check locking anyways to ensure:
- initialize the value only once at first time
- always return initialized value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought updateAndGet
could work to this effect
maxTimestampAndOffsetSoFar.updateAndGet(t -> if (t == TimestampOffset.UNKNOWN) timeIndex().lastEntry() else t)
But I have not fully considered whether it would end up being slower on any microbenchmark. It just looked simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ocadaruma nice find!
@@ -192,8 +192,13 @@ public void sanityCheck(boolean timeIndexFileNewlyCreated) throws IOException { | |||
* the time index). | |||
*/ | |||
public TimestampOffset readMaxTimestampAndOffsetSoFar() throws IOException { | |||
if (maxTimestampAndOffsetSoFar == TimestampOffset.UNKNOWN) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can prevent readMaxTimestampAndOffsetSoFar
from changing maxTimestampAndOffsetSoFar
by initializing the latter with timeIndex().lastEntry()
in the constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do so, memory map for all segments will be created eagerly at startup and make it slow against the motivation of lazy index loading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
memory map for all segments will be created eagerly at startup
You're right.
The lock approach works for me, but maybe there's another approach that doesn't need a lock. We can pre-load the maxTimestampAndOffsetSoFar
at startup for the active segment. If I understand this issue correctly, the issue happens only in the active segment. Fortunately, the memory map of the active segment is always pre-loaded. Hence, we can initialize the maxTimestampAndOffsetSoFar
at the same time. WDYT?
Summary