@rahulgoswami
Contributor

@rahulgoswami rahulgoswami commented Nov 18, 2025

Description

Backport #14607 to branch_10x

@rahulgoswami
Contributor Author

rahulgoswami commented Nov 18, 2025

gradlew check passes fine, but the nightly tests in TestBinaryBackwardsCompatibility are failing, specifically TestBinaryBackwardsCompatibility.testReadNMinusTwoCommit and TestBinaryBackwardsCompatibility.testReadNMinusTwoSegmentInfos.

These tests open an N-2 index (= 8.x on this branch) and expect success. This used to work with VERSION_74 in checkHeaderNoMagic(), but now fails for versions < 8.6.0 since we moved to VERSION_86. Staying with VERSION_74 instead fails with a complaint about the missing Lucene_70 codec (same as main). "main" doesn't have this problem because there N-2 = 9.x.
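To make the header check concrete, here is a minimal sketch of a min/max version gate of the kind checkHeaderNoMagic() applies. The class, the check() helper, and the numeric values given to VERSION_74 and VERSION_86 are assumptions for illustration only, not taken from SegmentInfos.

```java
// Hypothetical stand-in for the version-range part of a header check.
final class HeaderVersionSketch {
  // Assumed numeric values, for illustration only; the real constants live in SegmentInfos.
  static final int VERSION_74 = 9;
  static final int VERSION_86 = 10;
  static final int VERSION_CURRENT = VERSION_86;

  // Classifies the version found in a header against the accepted [min, max] range.
  static String check(int actual, int min, int max) {
    if (actual < min) {
      return "too old";
    }
    if (actual > max) {
      return "too new";
    }
    return "ok";
  }

  public static void main(String[] args) {
    // A segments file written in the VERSION_74 format (8.x before 8.6.0):
    System.out.println(check(VERSION_74, VERSION_86, VERSION_CURRENT)); // rejected by the header check
    System.out.println(check(VERSION_74, VERSION_74, VERSION_CURRENT)); // passes the header check
  }
}
```

With the minimum at VERSION_86 the older 8.x segments file is rejected at the header; with the minimum at VERSION_74 the header passes and any failure moves later, to codec loading.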

I have not yet understood why we test N-2 when we don't support that index anyway, so I am still contemplating the right way forward on these failures.

To reproduce:
./gradlew test --tests TestBinaryBackwardsCompatibility.testReadNMinusTwoSegmentInfos -Dtests.seed=738894B1606DB252 -Dtests.nightly=true -Dtests.locale=en-IM -Dtests.timezone=Australia/Queensland -Dtests.asserts=true -Dtests.file.encoding=UTF-8

@msokolov
Contributor

I'm trying to understand what's going on here. One thing that confuses me is why we have:

 Version.MIN_SUPPORTED_MAJOR = Version.LATEST.major - 1;

and also

 TestBinaryBackwardsCompatibility.MIN_BINARY_SUPPORTED_MAJOR = Version.MIN_SUPPORTED_MAJOR - 1;

What is the difference between "supported" and "binary supported"?
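The arithmetic behind the two constants above can be sketched as follows. MajorVersionMath and its methods are hypothetical names that only mirror the two expressions quoted above; the inputs 10 and 11 assume that branch_10x and main have latest majors 10 and 11, which matches the N-2 values mentioned earlier in this thread.

```java
// Hypothetical helper mirroring the two constants quoted above.
final class MajorVersionMath {
  // Mirrors Version.MIN_SUPPORTED_MAJOR = Version.LATEST.major - 1
  static int minSupportedMajor(int latestMajor) {
    return latestMajor - 1;
  }

  // Mirrors MIN_BINARY_SUPPORTED_MAJOR = Version.MIN_SUPPORTED_MAJOR - 1
  static int minBinarySupportedMajor(int latestMajor) {
    return minSupportedMajor(latestMajor) - 1;
  }

  public static void main(String[] args) {
    // Assumed latest majors: 10 for branch_10x, 11 for main.
    for (int latest : new int[] {10, 11}) {
      System.out.println(
          "latest=" + latest
              + " minSupported=" + minSupportedMajor(latest)
              + " minBinarySupported=" + minBinarySupportedMajor(latest));
    }
  }
}
```

Under those assumptions "binary supported" reaches one major further back than "supported": 8 versus 9 on branch_10x, and 9 versus 10 on main.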

@msokolov
Contributor

msokolov commented Nov 21, 2025

I think what happened is that CheckIndex is now able to read some more of the back-compat indexes that we previously said were incompatible. But this doesn't really make sense since the 10x branch does not include any additional backwards codecs that were removed from main. Some of these indexes cannot be opened, but CheckIndex is able to check them and reports they are clean.

I was able to get tests passing by relaxing a few version numbers and by changing the exception type thrown when we are unable to read the segments file, from IllegalArgumentException to IndexFormatTooOldException, to match the expectations of the test. Maybe that was bad, but it seems pretty harmless to me? I'm not sure if this change is safe or not.

diff --git a/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestAncientIndicesCompatibility.java b/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestAncientIndicesCompatibility.java
index a06a96b2ed5..56608b0b506 100644
--- a/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestAncientIndicesCompatibility.java
+++ b/lucene/backward-codecs/src/test/org/apache/lucene/backward_index/TestAncientIndicesCompatibility.java
@@ -199,7 +199,7 @@ public class TestAncientIndicesCompatibility extends LuceneTestCase {
       checker.setInfoStream(new PrintStream(bos, false, UTF_8));
       checker.setLevel(CheckIndex.Level.MIN_LEVEL_FOR_INTEGRITY_CHECKS);
       CheckIndex.Status indexStatus = checker.checkIndex();
-      if (getVersion(version).onOrAfter(Version.fromBits(8, 6, 0))) {
+      if (getVersion(version).onOrAfter(Version.fromBits(8, 0, 0))) {
         assertTrue(indexStatus.clean);
       } else {
         assertFalse(indexStatus.clean);
@@ -209,10 +209,9 @@ public class TestAncientIndicesCompatibility extends LuceneTestCase {
         boolean formatTooOld =
             bos.toString(UTF_8).contains(IndexFormatTooOldException.class.getName());
         boolean missingCodec = bos.toString(UTF_8).contains("Could not load codec");
-        assertTrue(formatTooOld || missingCodec);
+        assertTrue("version=" + version, formatTooOld || missingCodec);
       }
       checker.close();
-
       dir.close();
     }
   }
diff --git a/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java b/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java
index 131518983a8..621ecb0b529 100644
--- a/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java
+++ b/lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java
@@ -328,7 +328,7 @@ public final class SegmentInfos implements Cloneable, Iterable<SegmentCommitInfo
         throw new IndexFormatTooOldException(
             input, magic, CodecUtil.CODEC_MAGIC, CodecUtil.CODEC_MAGIC);
       }
-      format = CodecUtil.checkHeaderNoMagic(input, "segments", VERSION_86, VERSION_CURRENT);
+      format = CodecUtil.checkHeaderNoMagic(input, "segments", VERSION_74, VERSION_CURRENT);
       byte[] id = new byte[StringHelper.ID_LENGTH];
       input.readBytes(id, 0, id.length);
       CodecUtil.checkIndexHeaderSuffix(input, Long.toString(generation, Character.MAX_RADIX));
@@ -529,11 +529,13 @@ public final class SegmentInfos implements Cloneable, Iterable<SegmentCommitInfo
     } catch (IllegalArgumentException e) {
       // maybe it's an old default codec that moved
       if (name.startsWith("Lucene")) {
-        throw new IllegalArgumentException(
+        throw new IndexFormatTooOldException(
+            input,
             "Could not load codec '"
                 + name
-                + "'. Did you forget to add lucene-backward-codecs.jar?",
-            e);
+                + "'. "
+                + e.getMessage()
+                + " Did you forget to add lucene-backward-codecs.jar?");
       }
       throw e;
     }
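On whether the exception-type change is safe, one thing to weigh is that IllegalArgumentException is unchecked while IndexFormatTooOldException is an IOException, so a caller that catches the former would no longer see the failure there. A rough, self-contained sketch of that difference follows. TooOldException is a stand-in for the real class, and loadCodec, callerOutcome, and the codec name are hypothetical.

```java
import java.io.IOException;

// Stand-in for org.apache.lucene.index.IndexFormatTooOldException, which extends IOException.
final class TooOldException extends IOException {
  TooOldException(String message) {
    super(message);
  }
}

final class ExceptionTypeSketch {
  // Hypothetical loader: useNewType selects the behavior before (false) or after (true) the patch.
  static void loadCodec(String name, boolean useNewType) throws IOException {
    String message = "Could not load codec '" + name + "'.";
    if (useNewType) {
      throw new TooOldException(message);
    }
    throw new IllegalArgumentException(message);
  }

  // Hypothetical caller written against the old, unchecked exception.
  static String callerOutcome(boolean useNewType) {
    try {
      loadCodec("Lucene70", useNewType);
      return "loaded";
    } catch (IllegalArgumentException e) {
      return "handled as IllegalArgumentException";
    } catch (IOException e) {
      return "propagated as IOException";
    }
  }

  public static void main(String[] args) {
    System.out.println(callerOutcome(false));
    System.out.println(callerOutcome(true));
  }
}
```

Whether any real caller depends on catching IllegalArgumentException here is not something this sketch establishes; it only shows which handler each type reaches.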


2 participants