Closed
Labels
:Distributed Indexing/Engine (anything around managing Lucene and the Translog in an open shard), >test (issues or PRs that are addressing/adding tests)
Description
This test sometimes fails on the 5.x and 5.4 branches, but the failure does not reproduce locally:
gradle :core:integTest -Dtests.seed=EB43781CD72967EA -Dtests.class=org.elasticsearch.index.store.CorruptedFileIT -Dtests.method="testReplicaCorruption" -Dtests.security.manager=true -Dtests.locale=da -Dtests.timezone=America/Indianapolis
Build output:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-darwin-compatibility/461/consoleFull
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-darwin-compatibility/410/consoleFull
It seems that the recovery is blocked if the segments_* file is corrupted:
1> Caused by: java.io.EOFException: read past EOF: MMapIndexInput(path="/private/var/lib/jenkins/workspace/elastic+elasticsearch+5.x+multijob-darwin-compatibility/core/build/testrun/integTest/J2/temp/org.elasticsearch.index.store.CorruptedFileIT_D04C89BC1BDC6CB3-001/tempDir-002/data/nodes/4/indices/djB78NKbThaDzb0A0xah6A/1/index/segments_2")
1> at org.apache.lucene.store.ByteBufferIndexInput.readByte(ByteBufferIndexInput.java:75) ~[lucene-core-6.5.0.jar:6.5.0 4b16c9a10c3c00cafaf1fc92ec3276a7bc7b8c95 - jimczi - 2017-03-21 20:40:22]
1> at org.apache.lucene.store.MockIndexInputWrapper.readByte(MockIndexInputWrapper.java:140) ~[lucene-test-framework-6.5.0.jar:6.5.0 4b16c9a10c3c00cafaf1fc92ec3276a7bc7b8c95 - jimczi - 2017-03-21 20:40:22]
1> at org.apache.lucene.store.BufferedChecksumIndexInput.readByte(BufferedChecksumIndexInput.java:41) ~[lucene-core-6.5.0.jar:6.5.0 4b16c9a10c3c00cafaf1fc92ec3276a7bc7b8c95 - jimczi - 2017-03-21 20:40:22]
1> at org.apache.lucene.store.DataInput.readInt(DataInput.java:101) ~[lucene-core-6.5.0.jar:6.5.0 4b16c9a10c3c00cafaf1fc92ec3276a7bc7b8c95 - jimczi - 2017-03-21 20:40:22]
1> at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:300) ~[lucene-core-6.5.0.jar:6.5.0 4b16c9a10c3c00cafaf1fc92ec3276a7bc7b8c95 - jimczi - 2017-03-21 20:40:22]
1> at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:288) ~[lucene-core-6.5.0.jar:6.5.0 4b16c9a10c3c00cafaf1fc92ec3276a7bc7b8c95 - jimczi - 2017-03-21 20:40:22]
1> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:448) ~[lucene-core-6.5.0.jar:6.5.0 4b16c9a10c3c00cafaf1fc92ec3276a7bc7b8c95 - jimczi - 2017-03-21 20:40:22]
1> at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:445) ~[lucene-core-6.5.0.jar:6.5.0 4b16c9a10c3c00cafaf1fc92ec3276a7bc7b8c95 - jimczi - 2017-03-21 20:40:22]
1> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:692) ~[lucene-core-6.5.0.jar:6.5.0 4b16c9a10c3c00cafaf1fc92ec3276a7bc7b8c95 - jimczi - 2017-03-21 20:40:22]
1> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:644) ~[lucene-core-6.5.0.jar:6.5.0 4b16c9a10c3c00cafaf1fc92ec3276a7bc7b8c95 - jimczi - 2017-03-21 20:40:22]
1> at org.apache.lucene.index.SegmentInfos.readLatestCommit(SegmentInfos.java:450) ~[lucene-core-6.5.0.jar:6.5.0 4b16c9a10c3c00cafaf1fc92ec3276a7bc7b8c95 - jimczi - 2017-03-21 20:40:22]
1> at org.elasticsearch.common.lucene.Lucene.readSegmentInfos(Lucene.java:129) ~[main/:?]
1> at org.elasticsearch.index.store.Store.readSegmentsInfo(Store.java:198) ~[main/:?]
1> at org.elasticsearch.index.store.Store.access$200(Store.java:126) ~[main/:?]
1> at org.elasticsearch.index.store.Store$MetadataSnapshot.loadMetadata(Store.java:785) ~[main/:?]
1> at org.elasticsearch.index.store.Store$MetadataSnapshot.<init>(Store.java:718) ~[main/:?]
1> at org.elasticsearch.index.store.Store.getMetadata(Store.java:240) ~[main/:?]
1> at org.elasticsearch.index.shard.IndexShard.snapshotStoreMetadata(IndexShard.java:874) ~[main/:?]
1> at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.doRecovery(PeerRecoveryTargetService.java:183) ~[main/:?]
1> at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.access$900(PeerRecoveryTargetService.java:73) ~[main/:?]
1> at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRunner.doRun(PeerRecoveryTargetService.java:555) ~[main/:?]
1> at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) ~[main/:?]
1> at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[main/:?]
1> ... 3 more
... and after some time the test fails on a timeout:
FAILURE 33.7s J2 | CorruptedFileIT.testReplicaCorruption <<< FAILURES!
> Throwable #1: java.lang.AssertionError: timed out waiting for green state
> at __randomizedtesting.SeedInfo.seed([EB43781CD72967EA:74122634059C0417]:0)
> at org.elasticsearch.test.ESIntegTestCase.ensureColor(ESIntegTestCase.java:932)
> at org.elasticsearch.test.ESIntegTestCase.ensureGreen(ESIntegTestCase.java:898)
> at org.elasticsearch.test.ESIntegTestCase.ensureGreen(ESIntegTestCase.java:887)
> at org.elasticsearch.index.store.CorruptedFileIT.testReplicaCorruption(CorruptedFileIT.java:590)
> at java.lang.Thread.run(Thread.java:745)
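
For readers less familiar with the trace above: the innermost frames show a `readInt` call hitting the end of the file while reading the commit file. The sketch below reproduces only that failure mode, using `java.io.DataInputStream` as a stand-in for Lucene's `DataInput`. The class name, the `readHeader` and `classify` helpers, and the byte values are all hypothetical and are not taken from Elasticsearch or Lucene. It does not model how peer recovery reacts to the error.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class TruncatedCommitSketch {

    // Hypothetical stand-in for reading the first 4-byte field of a commit file.
    // DataInputStream.readInt() throws EOFException when fewer than 4 bytes remain,
    // which is the stdlib analogue of the "read past EOF" error in the trace.
    static int readHeader(byte[] fileBytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileBytes))) {
            return in.readInt();
        }
    }

    // Classifies a file image as readable or truncated (hypothetical helper).
    static String classify(byte[] fileBytes) {
        try {
            readHeader(fileBytes);
            return "readable";
        } catch (EOFException e) {
            return "truncated";
        } catch (IOException e) {
            return "io-error";
        }
    }

    public static void main(String[] args) {
        byte[] intact = {0, 0, 0, 1};   // arbitrary 4-byte header
        byte[] truncated = {0, 0};      // file cut short by the corruption step
        System.out.println(classify(intact));
        System.out.println(classify(truncated));
    }
}
```

In this sketch the truncated input is detected on the very first read, which matches the trace in that the metadata snapshot fails before any file comparison can start. Whether that is the cause of the blocked recovery here is still only the hypothesis stated above.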