Incremental restores treat segments containing soft-deleted docs as "different" #55142

@DaveCTurner

Today in restores and peer recoveries we treat any .liv files as per-commit rather than per-segment since they may change independently of the rest of the segment as documents are deleted:

if (IndexFileNames.SEGMENTS.equals(segmentId) ||
        DEL_FILE_EXTENSION.equals(extension) || LIV_FILE_EXTENSION.equals(extension)) {
    // only treat del files as per-commit files fnm files are generational but only for upgradable DV
    perCommitStoreFiles.add(meta);
} else {
    perSegment.computeIfAbsent(segmentId, k -> new ArrayList<>()).add(meta);
}

This means that a restore or peer recovery whose only change is the deletion of documents need only copy the corresponding .liv files. However, today we soft-delete documents instead, which updates their doc values rather than their liveness markers, so we fully restore/recover any segment whose set of deleted documents has changed.
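
To make the effect concrete, here is a minimal sketch of the grouping and comparison, assuming files are compared by name only (the real code also looks at length and checksum). The class `RecoveryDiffSketch`, its helper methods, and the file names (in particular the `Lucene80` codec suffix) are hypothetical illustrations, not the actual Store.java code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified model of the per-commit / per-segment grouping quoted above.
class RecoveryDiffSketch {

    // Segment ID: the name up to the second '_' or the first '.', e.g. "_0_1.liv" -> "_0".
    static String segmentId(String fileName) {
        int end = fileName.length();
        int underscore = fileName.indexOf('_', 1);
        int dot = fileName.indexOf('.');
        if (underscore != -1) end = Math.min(end, underscore);
        if (dot != -1) end = Math.min(end, dot);
        return fileName.substring(0, end);
    }

    static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot == -1 ? "" : fileName.substring(dot + 1);
    }

    // Mirrors the condition in the snippet above: segments_N, .del and .liv are per-commit.
    static boolean isPerCommit(String fileName) {
        String ext = extension(fileName);
        return "segments".equals(segmentId(fileName)) || "del".equals(ext) || "liv".equals(ext);
    }

    // Groups the per-segment files by segment ID; per-commit files are left out.
    static Map<String, Set<String>> perSegment(List<String> files) {
        Map<String, Set<String>> result = new HashMap<>();
        for (String f : files) {
            if (!isPerCommit(f)) {
                result.computeIfAbsent(segmentId(f), k -> new TreeSet<>()).add(f);
            }
        }
        return result;
    }

    // Segments on the source whose per-segment file set differs from the target's (names only).
    static Set<String> segmentsToCopy(List<String> source, List<String> target) {
        Map<String, Set<String>> src = perSegment(source);
        Map<String, Set<String>> tgt = perSegment(target);
        Set<String> differing = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : src.entrySet()) {
            if (!e.getValue().equals(tgt.get(e.getKey()))) {
                differing.add(e.getKey());
            }
        }
        return differing;
    }

    public static void main(String[] args) {
        List<String> target = Arrays.asList("segments_2", "_0.cfs", "_0.cfe", "_0.si");
        // Hard delete: only a new .liv file appears, and it is per-commit.
        List<String> hardDeleted = Arrays.asList("segments_3", "_0.cfs", "_0.cfe", "_0.si", "_0_1.liv");
        // Soft delete: a doc-values update adds generational files (illustrative names).
        List<String> softDeleted = Arrays.asList("segments_3", "_0.cfs", "_0.cfe", "_0.si",
                "_0_1.fnm", "_0_1_Lucene80_0.dvd", "_0_1_Lucene80_0.dvm");
        System.out.println(segmentsToCopy(hardDeleted, target)); // prints []
        System.out.println(segmentsToCopy(softDeleted, target)); // prints [_0]
    }
}
```

In this model the hard-delete case leaves segment `_0` reusable because the new `.liv` file never enters the per-segment comparison, whereas the soft-delete case marks all of `_0` as different even though only the small generational files are new.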

Relates #55013
Relates https://discuss.elastic.co/t/227184
Relates https://issues.apache.org/jira/browse/LUCENE-9324

Labels

:Distributed Indexing/Store
Team:Distributed (Obsolete)
