@jasontedor
Member

The local checkpoint tracker uses collections of bit sets to track which sequence numbers are complete, eventually removing these bit sets when the local checkpoint advances. However, these bit sets were eagerly allocated: if a sequence number far ahead of the checkpoint was marked as completed, every bit set between the "last" bit set and the bit set needed to track the marked sequence number was allocated as well. If this sequence number was too far ahead, the memory requirements could be excessive. This commit opts for a different strategy for holding on to these bit sets and enables them to be lazily allocated.

Relates #10708
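
The lazy strategy described above can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: it assumes `java.util.BitSet` and `java.util.HashMap` as stand-ins for Lucene's `FixedBitSet` and the map the tracker actually uses, and the class name `LazyBitSets` and its methods are hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: bit sets are keyed by seqNo / bitSetSize and created on first use.
class LazyBitSets {
    private final int bitSetSize;
    // Assumption: a plain HashMap stands in for the tracker's real map type.
    private final Map<Long, BitSet> processedSeqNo = new HashMap<>();

    LazyBitSets(int bitSetSize) {
        this.bitSetSize = bitSetSize;
    }

    // Marks a sequence number, allocating only the single bit set that covers it.
    void markSeqNoAsCompleted(long seqNo) {
        long key = seqNo / bitSetSize;
        int offset = (int) (seqNo % bitSetSize);
        processedSeqNo.computeIfAbsent(key, k -> new BitSet(bitSetSize)).set(offset);
    }

    // Returns true if the sequence number was marked; never allocates.
    boolean isCompleted(long seqNo) {
        BitSet bitSet = processedSeqNo.get(seqNo / bitSetSize);
        return bitSet != null && bitSet.get((int) (seqNo % bitSetSize));
    }

    int allocatedBitSets() {
        return processedSeqNo.size();
    }

    public static void main(String[] args) {
        LazyBitSets sets = new LazyBitSets(1024);
        // A sequence number far ahead of anything seen so far.
        sets.markSeqNoAsCompleted(Long.MAX_VALUE - 1);
        System.out.println("allocated bit sets: " + sets.allocatedBitSets());
    }
}
```

With an eager chain of bit sets, the same call would have to allocate every bit set up to the marked sequence number; with a keyed map only the one that is touched exists.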

@jasontedor changed the title from "Lazy seq no bit sets" to "Lazy initialize checkpoint tracker bit sets" on Oct 30, 2017
Contributor

@bleskes left a comment

LGTM. Left some nits.

long bitArrayKey = getBitArrayKey(checkpoint);
FixedBitSet current = processedSeqNo.get(bitArrayKey);
if (current == null) {
    // the bit set corresponding to the checkpoint has already been removed, set ourselves up for the next bit set
Contributor

when can this happen? it seems we only clean bit sets here?

Member Author

This happens when we clean the set (e.g., the checkpoint is equal to size - 1) but the checkpoint cannot yet advance to size. In this case, when size is later marked as completed, the checkpoint will still be equal to size - 1 but we have already cleaned the set.
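
The scenario in this reply can be traced with a small stand-in tracker. The sketch below is hypothetical and heavily simplified: `MiniCheckpointTracker`, its fields, and the counter `timesCurrentWasMissing` are invented for illustration, and `java.util.BitSet` with a `HashMap` replaces the real data structures. It only aims to show how the bit set for the current checkpoint can already be gone when the next sequence number is marked.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified tracker; not the Elasticsearch implementation.
class MiniCheckpointTracker {
    private final int bitSetSize;
    private final Map<Long, BitSet> processedSeqNo = new HashMap<>();
    private long checkpoint = -1;            // nothing completed yet
    private int timesCurrentWasMissing = 0;  // counts hits on the "current == null" branch

    MiniCheckpointTracker(int bitSetSize) {
        this.bitSetSize = bitSetSize;
    }

    synchronized void markSeqNoAsCompleted(long seqNo) {
        if (seqNo <= checkpoint) {
            return; // already covered by the checkpoint
        }
        processedSeqNo.computeIfAbsent(seqNo / bitSetSize, k -> new BitSet(bitSetSize))
            .set((int) (seqNo % bitSetSize));
        if (seqNo == checkpoint + 1) {
            updateCheckpoint();
        }
    }

    private void updateCheckpoint() {
        long key = checkpoint / bitSetSize;
        BitSet current = processedSeqNo.get(key);
        if (current == null) {
            // The bit set holding the checkpoint was cleaned earlier; continue with the next one.
            timesCurrentWasMissing++;
            current = processedSeqNo.get(++key);
        }
        do {
            checkpoint++;
            if (checkpoint == (1 + key) * bitSetSize - 1) {
                // The checkpoint reached the last bit of this bit set, so it can be cleaned.
                processedSeqNo.remove(key);
                current = processedSeqNo.get(++key);
            }
        } while (current != null && current.get((int) ((checkpoint + 1) % bitSetSize)));
    }

    synchronized long getCheckpoint() { return checkpoint; }
    synchronized int allocatedBitSets() { return processedSeqNo.size(); }
    synchronized int timesCurrentWasMissing() { return timesCurrentWasMissing; }

    public static void main(String[] args) {
        MiniCheckpointTracker tracker = new MiniCheckpointTracker(4);
        for (long seqNo = 0; seqNo < 4; seqNo++) {
            tracker.markSeqNoAsCompleted(seqNo);
        }
        // Checkpoint is size - 1 and the first bit set has been cleaned.
        System.out.println(tracker.getCheckpoint() + " " + tracker.allocatedBitSets());
        tracker.markSeqNoAsCompleted(4);
        // Marking "size" finds no bit set for the old checkpoint and moves on to the next one.
        System.out.println(tracker.getCheckpoint() + " " + tracker.timesCurrentWasMissing());
    }
}
```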

FixedBitSet current = processedSeqNo.getFirst();
long bitArrayKey = getBitArrayKey(checkpoint);
FixedBitSet current = processedSeqNo.get(bitArrayKey);
if (current == null) {
Contributor

since this is a FixedBitSet I wonder why you renamed everything to use bitArray? I don't mind. Just curious.

Member Author

Because I did not like the inconsistency that in some places it's referred to as a "bit array", and in some it's referred to as a "bit set". I wanted to make them all "bit set" but the problem is the setting index.seq_no.checkpoint.bit_arrays_size (currently never released to the world) refers to "bit array". If, and only if, you would be okay with changing the name of this setting to index.seq_no.checkpoint.bit_sets_size (in 6.0) then I would be okay with normalizing everything to "bit set".

Contributor

I'm fine with renaming the setting. I think I added it just to be able to control things from tests and as an extra escape hatch if the default size proved wrong. We might want to just remove it. We can solve the test part differently and I'm less concerned now about the default size (I'm also thinking that 1024 is maybe too small).

Member Author

I opened #27191.

 * current bit set, we can clean it.
 */
if (checkpoint == (1 + bitArrayKey) * bitArraysSize - 1) {
    assert current != null;
Contributor

can we maybe cache (1 + bitArrayKey) * bitArraysSize - 1 in something like lastSeqNoInArray? I think it will be easier to read.
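
The arithmetic behind this suggestion can be worked through in a few lines. This is only a sketch: the helper class `BitArrayKeys` and the method signatures are assumptions for illustration, the key computation as `seqNo / bitArraysSize` is inferred from the expression in the diff, and the code pushed in response may look different.

```java
// Hypothetical helpers that name the expression from the diff.
class BitArrayKeys {
    // Assumed key computation: which bit array a sequence number falls into.
    static long getBitArrayKey(long seqNo, int bitArraysSize) {
        return seqNo / bitArraysSize;
    }

    // The last sequence number tracked by the bit array with the given key.
    static long lastSeqNoInArray(long bitArrayKey, int bitArraysSize) {
        return (1 + bitArrayKey) * bitArraysSize - 1;
    }

    public static void main(String[] args) {
        int bitArraysSize = 1024;
        long key = getBitArrayKey(1500, bitArraysSize);
        // Sequence number 1500 lives in array 1, which covers 1024 through 2047.
        System.out.println(key + " " + lastSeqNoInArray(key, bitArraysSize));
    }
}
```

Naming the value once means the cleaning condition reads as a comparison between the checkpoint and the last sequence number in the array.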

Member Author

I pushed 91fbab3.

* Previously this would allocate the entire chain of bit sets to the one for the sequence number being marked; for very large
* sequence numbers this could lead to excessive memory usage resulting in out of memory errors.
*/
tracker.markSeqNoAsCompleted(randomNonNegativeLong());
Contributor

lol. can we assert we allocated just one array?

Member Author

I pushed b2cbfc5.

assertThat(tracker.getCheckpoint(), equalTo((long) localCheckpoint));
assertThat(tracker.getMaxSeqNo(), equalTo((long) maxSeqNo));
assertThat(tracker.processedSeqNo, empty());
assertThat(tracker.processedSeqNo, new BaseMatcher<LongObjectHashMap<FixedBitSet>>() {
Contributor

:(

Member Author

I despise assertTrue. Note that this gives a nicer error message if the assertion fails:

java.lang.AssertionError: 
Expected: empty
     but: was <[512=>org.apache.lucene.util.FixedBitSet@98761236]>
Expected :empty
     
Actual   :<[512=>org.apache.lucene.util.FixedBitSet@98761236]>

as opposed to

java.lang.AssertionError
	at __randomizedtesting.SeedInfo.seed([F9E3DE31F5D979AE:C4C54170CE328934]:0)
	at org.junit.Assert.fail(Assert.java:86)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at org.junit.Assert.assertTrue(Assert.java:52)
	at org.elasticsearch.index.seqno.LocalCheckpointTrackerTests.testResetCheckpoint(LocalCheckpointTrackerTests.java:276)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)

@jasontedor
Member Author

Thanks @bleskes, I've responded to your feedback.

Member

@martijnvg left a comment

LGTM. Left one suggestion.

}

private FixedBitSet getBitArrayForSeqNo(final long seqNo) {
    assert Thread.holdsLock(this);
Member

maybe use indexOf(...) and then indexInsert(...) and indexGet(...) respectively to avoid determining what the slot is for a key several times?

final int slot = processedSeqNo.indexOf(bitArrayKey);
if (processedSeqNo.indexExists(slot) == false) {
   processedSeqNo.indexInsert(slot, bitArrayKey, new FixedBitSet(bitArraysSize));
}
return processedSeqNo.indexGet(slot);

Member Author

Thanks @martijnvg, I pushed cabda87.

Member Author

And 68c4eab. 😐

Member Author

Actually the indexGet at the end is not correct because the returned index is negative if the slot is not actually used. Yet, ~slot also does not work because the table could have been resized after the insert. I pushed 97bb3ca.

Member

@martijnvg, Oct 31, 2017

ah, I didn't realize that. Maybe we should do this to make optimal use of the slot?

final int slot = processedSeqNo.indexOf(bitArrayKey);
if (processedSeqNo.indexExists(slot) == false) {
    // no bit set for this key yet: create it, insert it at the reserved slot, and return it directly
    FixedBitSet bitSet = new FixedBitSet(bitArraysSize);
    processedSeqNo.indexInsert(slot, bitArrayKey, bitSet);
    return bitSet;
} else {
    return processedSeqNo.indexGet(slot);
}

Member Author

When you see a chance to avoid == false, you seize it. 😉

I pushed f4f0dae.

Member

:)

* master:
  Remove checkpoint tracker bit sets setting
  Fix stable BWC branch detection logic
  Fix logic detecting unreleased versions
  Enhances exists queries to reduce need for `_field_names` (elastic#26930)
  Added new terms_set query
  Set request body to required to reflect the code base (elastic#27188)
  Update Docker docs for 6.0.0-rc2 (elastic#27166)
  Add version 6.0.0
  Docs: restore now fails if it encounters incompatible settings (elastic#26933)
  Convert index blocks to cluster block exceptions (elastic#27050)
  [DOCS] Link remote info API in Cross Cluster Search docs page
  Fix Laplace scorer to multiply by alpha (and not add) (elastic#27125)
  [DOCS] Clarify migrate guide and search request validation
  Raise IllegalArgumentException if query validation failed (elastic#26811)
  prevent duplicate fields when mixing parent and root nested includes (elastic#27072)
  TopHitsAggregator must propagate calls to `setScorer`. (elastic#27138)
@jasontedor merged commit 59657ad into elastic:master on Nov 2, 2017
@jasontedor added a commit that referenced this pull request on Nov 2, 2017

Relates #27179
@jasontedor deleted the lazy-seq-no-bit-sets branch on November 2, 2017 01:27
@clintongormley added the :Distributed Indexing/Engine label and removed the :Sequence IDs label on Feb 14, 2018

Labels

:Distributed Indexing/Engine, >enhancement, v6.1.0, v7.0.0-beta1

5 participants