@Johnson-zs
Contributor

This commit fixes crashes that occur when reading Lucene index files whose field data is corrupted or stored in an incompatible format.

Root causes:

  1. Assertion failures in FieldsReader::doc() when the bits flag or the compressed field format does not match the expected values
  2. Null pointer dereference in boost::shared_ptr when adding fields built from potentially corrupted data to a document
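To illustrate the first root cause, here is a minimal sketch of checking the bits flag and format explicitly rather than asserting on them. It is not the code from this PR: the flag values, the format threshold, and the names `validateField` and `FieldStatus` are all hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values for illustration; not taken from the Lucene++ source.
constexpr std::uint8_t kTokenized  = 0x1;
constexpr std::uint8_t kBinary     = 0x2;
constexpr std::uint8_t kCompressed = 0x4;
constexpr std::uint8_t kKnownFlags = kTokenized | kBinary | kCompressed;

// Hypothetical threshold: formats at or above this value are assumed
// not to permit compressed fields.
constexpr int kFormatNoCompressedFields = 2;

enum class FieldStatus { Ok, UnknownBits, CompressedNotAllowed };

// Returns a status code instead of asserting, so the caller can skip a
// bad field and keep reading the rest of the document.
FieldStatus validateField(std::uint8_t bits, int format) {
    // Any bit outside the known set suggests corrupt or unsupported data.
    if ((bits & static_cast<std::uint8_t>(~kKnownFlags)) != 0) {
        return FieldStatus::UnknownBits;
    }
    const bool compressed = (bits & kCompressed) != 0;
    if (compressed && format >= kFormatNoCompressedFields) {
        return FieldStatus::CompressedNotAllowed;
    }
    return FieldStatus::Ok;
}
```

A caller could test the returned status and skip the offending field, where an assertion would abort the process in a debug build and leave the condition unchecked in a release build.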

The fixes include:

  • Replace assertions with safe handling for invalid bit flags and format incompatibilities
  • Add null checks before dereferencing shared pointers in critical paths
  • Create field objects separately before adding them to documents
  • Add proper error handling for decompression failures
  • Ensure uncompress and string conversion methods never return null values
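The pattern behind these fixes might look roughly like the sketch below. It is an illustration under stated assumptions, not the merged change: it uses `std::shared_ptr` in place of `boost::shared_ptr` so that it compiles without Boost, and `Field`, `Document`, `makeField`, and `uncompressOrEmpty` are hypothetical stand-ins for the real Lucene++ types and methods.

```cpp
#include <cassert>
#include <exception>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real Lucene++ classes.
struct Field {
    std::string name;
    std::string value;
};
using FieldPtr = std::shared_ptr<Field>;

struct Document {
    std::vector<FieldPtr> fields;

    // Null check before the pointer is stored or dereferenced.
    bool add(const FieldPtr& field) {
        if (!field) {
            return false;
        }
        fields.push_back(field);
        return true;
    }
};

// Stand-in decompressor: on failure it returns an empty string, never null.
// The `corrupt` parameter simulates a decompression error.
std::shared_ptr<std::string> uncompressOrEmpty(
        const std::vector<unsigned char>& data, bool corrupt) {
    try {
        if (corrupt) {
            throw std::runtime_error("corrupt compressed field");
        }
        return std::make_shared<std::string>(data.begin(), data.end());
    } catch (const std::exception&) {
        return std::make_shared<std::string>();
    }
}

// Builds the field as a separate step; returns null for unusable input
// so the caller can decide whether to add it.
FieldPtr makeField(const std::string& name,
                   const std::shared_ptr<std::string>& value) {
    if (name.empty() || !value) {
        return nullptr;
    }
    return std::make_shared<Field>(Field{name, *value});
}
```

Creating the field first and adding it second gives the reader one place to reject a bad field, so a single corrupt entry does not prevent the remaining fields from being loaded.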

These minimal changes maintain the original logic but make the code more robust when dealing with unexpected or corrupt index data. Instead of crashing, the code now gracefully handles these edge cases and continues processing where possible.

Collaborator

@alanw left a comment

Thank you!

@alanw merged commit 5a74bd5 into luceneplusplus:master on May 22, 2025