Skip to content

Conversation

@henningandersen
Copy link
Contributor

FilterDirectory.getPendingDeletions does not delegate, fixed
temporarily by overriding in StoreDirectory.

This in turn caused duplicate file name use after a trimUnsafeCommits
had been done, since a new IndexWriter would not consider the pending
deletes in IndexFileDeleter. This should only happen on windows (AFAIK).

Reenabled doing index updates for all tests using
IndexShardTests.indexOnReplicaWithGaps (which could fail due to above
when using mocked WindowsFS).

Follow-up to #40249

A follow-up should fix this in FilterDirectory.

FilterDirectory.getPendingDeletions does not delegate, fixed
temporarily by overriding in StoreDirectory.

This in turn caused duplicate file name use after a trimUnsafeCommits
had been done, since a new IndexWriter would not consider the pending
deletes in IndexFileDeleter. This should only happen on windows (AFAIK).

Reenabled doing index updates for all tests using
IndexShardTests.indexOnReplicaWithGaps (which could fail due to above
when using mocked WindowsFS).
@henningandersen henningandersen added >bug :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. v7.0.0 v8.0.0 v7.2.0 v6.6.3 v6.7.1 labels Mar 22, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@henningandersen henningandersen requested a review from ywelsch March 22, 2019 07:38
@henningandersen henningandersen changed the title Store Pending Files Fix Store Pending Deletions Fix Mar 22, 2019
Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch! LGTM

@dnhatn
Copy link
Member

dnhatn commented Mar 22, 2019

@henningandersen I also dug into a recent test failure of testRestoreLocalHistoryFromTranslogOnPromotion (https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=centos-6/312/console). I think that test failure has the same cause as the one you are fixing here. I proposed https://issues.apache.org/jira/browse/LUCENE-8734 to record nextWriteDVGen.

@ywelsch ywelsch requested review from dnhatn and s1monw March 22, 2019 09:54
@henningandersen
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/packaging-sample

public Set<String> getPendingDeletions() throws IOException {
// FilterDirectory.getPendingDeletions does not delegate, working around it here.
// to be removed once fixed in FilterDirectory.
return unwrap(this).getPendingDeletions();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this fix is trappy, but I am ok with it for now. I would like us to delegate all our subclasses of FilterDirectory as well. We have a few more. I am torn on the unwrap to be honest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with that route is that most of the subclasses of this are in lucene, including the MockDirectoryWrapper created by LuceneTestCase and for instance HardlinkCopyDirectoryWrapper used by StoreRecovery.
I agree that using unwrap is questionable, but in the given context this should be safe here until fixed in lucene.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, we should still pro-actively implement the right fix in all subclasses and remove the unwrap once lucene is upgraded

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @s1monw for clarifying. Except for two trivial test-only subclasses, I have added the override to all subclasses in 85f2066.

Please have another look at your convenience.

@henningandersen
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/packaging-sample

1 similar comment
@henningandersen
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/packaging-sample

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great find!

Added getPendingDeletions delegation to all elasticsearch
FilterDirectory subclasses that were not trivial test-only overrides to
minimize the risk of hitting this issue in another case.
@henningandersen henningandersen merged commit cca77c1 into elastic:master Mar 26, 2019
henningandersen added a commit that referenced this pull request Mar 26, 2019
FilterDirectory.getPendingDeletions does not delegate, fixed
temporarily by overriding in StoreDirectory.

This in turn caused duplicate file name use after a trimUnsafeCommits
had been done, since a new IndexWriter would not consider the pending
deletes in IndexFileDeleter. This should only happen on windows (AFAIK).

Reenabled doing index updates for all tests using
IndexShardTests.indexOnReplicaWithGaps (which could fail due to above
when using mocked WindowsFS).

Added getPendingDeletions delegation to all elasticsearch
FilterDirectory subclasses that were not trivial test-only overrides to
minimize the risk of hitting this issue in another case.
henningandersen added a commit that referenced this pull request Mar 26, 2019
FilterDirectory.getPendingDeletions does not delegate, fixed
temporarily by overriding in StoreDirectory.

This in turn caused duplicate file name use after a trimUnsafeCommits
had been done, since a new IndexWriter would not consider the pending
deletes in IndexFileDeleter. This should only happen on windows (AFAIK).

Reenabled doing index updates for all tests using
IndexShardTests.indexOnReplicaWithGaps (which could fail due to above
when using mocked WindowsFS).

Added getPendingDeletions delegation to all elasticsearch
FilterDirectory subclasses that were not trivial test-only overrides to
minimize the risk of hitting this issue in another case.
henningandersen added a commit that referenced this pull request Mar 26, 2019
FilterDirectory.getPendingDeletions does not delegate, fixed
temporarily by overriding in StoreDirectory.

This in turn caused duplicate file name use after a trimUnsafeCommits
had been done, since a new IndexWriter would not consider the pending
deletes in IndexFileDeleter. This should only happen on windows (AFAIK).

Reenabled doing index updates for all tests using
IndexShardTests.indexOnReplicaWithGaps (which could fail due to above
when using mocked WindowsFS).

Added getPendingDeletions delegation to all elasticsearch
FilterDirectory subclasses that were not trivial test-only overrides to
minimize the risk of hitting this issue in another case.
henningandersen added a commit that referenced this pull request Mar 26, 2019
Added missing file/commit.
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Apr 16, 2019
The recent upgrade to lucene 8.1.0 included the fix for LUCENE-8735
(pending deletions did not delegate). Removed the temporary workaround
for this.

Follow up to elastic#40345
henningandersen added a commit that referenced this pull request Apr 17, 2019
The recent upgrade to lucene 8.1.0 included the fix for LUCENE-8735
(pending deletions did not delegate). Removed the temporary workaround
that was previously added for this issue.

Follow up to #40345
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
The recent upgrade to lucene 8.1.0 included the fix for LUCENE-8735
(pending deletions did not delegate). Removed the temporary workaround
that was previously added for this issue.

Follow up to elastic#40345
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Distributed Indexing/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. v6.7.1 v7.0.0-rc2 v7.2.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants