Skip to content

Conversation

@grcevski
Copy link
Contributor

@grcevski grcevski commented Dec 29, 2021

This PR proposes to fix the Lucene split package issue in LazySoftDeletesDirectoryReaderWrapper.

I did two things to make this work, one is a hack until we perhaps make a change in Lucene:

  • (hack) I use SoftDeletesDirectoryReaderWrapper to make me a fresh CacheKey since we can't call the constructor.
  • I copy pasted the applySoftDeletes code and removed the bits that were irrelevant (and causing us to bring more packages). Fortunately, the call to DocValuesFieldExistsQuery.getDocValuesDocIdSetIterator(field, reader) can't get us a reader back that has a non-null hasValue, so we could live without the iterator in DocValuesFieldUpdates (which is package private).

If we manage to get the DelegatingCacheHelper implementation as a public implementation in Lucene, then we get much better implementation.

closes #81981

This PR proposes to fix the Lucene split package
issue in LazySoftDeletesDirectoryReaderWrapper.
@grcevski grcevski marked this pull request as draft December 29, 2021 21:17
}

private static CacheKey getCacheKey(DirectoryReader reader, String field) throws IOException {
SoftDeletesDirectoryReaderWrapper wrapper = new SoftDeletesDirectoryReaderWrapper(reader, field);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I realized there's an issue if reader.getReaderCacheHelper() is null. I can easily work around the null pointer here, but I need to somehow get a key inside the leaf and codec readers.

}

static CacheKey createCacheKey(DirectoryReader in) {
try (DirectoryReader clone = DirectoryReader.open(in.directory())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like too much of a hack to bring into ES. It's effectively opening up the index multiple times now whenever a search hits a shard in the frozen tier (one extra time per lucene index, and then once per segment in the Lucene index) . Let's rather work towards exposing the right abstraction in Lucene.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree, it's a total hack. I couldn't figure out a better way to get to a unique CacheKey. Let's try to get a change in Lucene.

@mark-vieira mark-vieira added v8.2.0 and removed v8.1.0 labels Feb 2, 2022
@grcevski grcevski marked this pull request as ready for review March 4, 2022 20:28
@grcevski grcevski added :Core/Infra/Core Core issues without another label >enhancement Team:Core/Infra Meta label for core/infra team labels Mar 4, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine
Copy link
Collaborator

Hi @grcevski, I've created a changelog YAML for you.

@grcevski
Copy link
Contributor Author

grcevski commented Mar 4, 2022

@elasticmachine run Check labels

Copy link
Contributor

@ChrisHegarty ChrisHegarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@elasticsearchmachine
Copy link
Collaborator

Hi @grcevski, I've updated the changelog YAML for you.

@grcevski grcevski merged commit 487077c into elastic:master Mar 7, 2022
@grcevski grcevski deleted the enhancement/no-lucene-split-packages branch March 7, 2022 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Core/Infra/Core Core issues without another label >enhancement Team:Core/Infra Meta label for core/infra team v8.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix split package LazySoftDeletesDirectoryReaderWrapper

6 participants