@tlrx tlrx commented Feb 23, 2021

Searchable snapshots IndexInput implementations detect read operations on the last 16 bytes of a file, which contain the footer checksum, and serve them from memory (FileInfo) instead of relying on the cache mechanisms.

But the current implementation expects an exact checksum read, i.e. a read operation that starts at file length - 16 bytes and reads exactly 16 bytes. This is not always the case: IndexInput implementations extend BufferedIndexInput, which uses an internal buffer of 1024 bytes, so in some cases the remaining length (i.e. the last portion of the file) is less than 16 bytes. A very small read is then executed through the cache (or as a direct read) when it could also be served from memory.
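
To illustrate the difference, here is a minimal sketch in Java. It is not the code from this pull request: the class `FooterChecksumSketch` and its methods are hypothetical names chosen for the example, and the in-memory footer is modelled as a plain 16-byte array rather than whatever FileInfo actually stores. It contrasts the strict "exact 16-byte read" check with a relaxed check that accepts any read lying entirely within the last 16 bytes.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: names are illustrative and are not Elasticsearch or Lucene APIs.
final class FooterChecksumSketch {

    // Length of the footer holding the checksum, as stated in the description above.
    static final int FOOTER_LENGTH = 16;

    // Strict check: matches only a read that starts at (file length - 16) and asks for exactly 16 bytes.
    static boolean isExactFooterRead(long fileLength, long position, int length) {
        return position == fileLength - FOOTER_LENGTH && length == FOOTER_LENGTH;
    }

    // Relaxed check: matches any non-empty read that lies entirely within the last 16 bytes.
    static boolean isWithinFooter(long fileLength, long position, int length) {
        return length > 0
            && fileLength >= FOOTER_LENGTH
            && position >= fileLength - FOOTER_LENGTH
            && position + length <= fileLength;
    }

    // Copies the requested slice of the in-memory footer into dst.
    // Returns false when the read is not a footer read, so the caller can fall back to the cache or a direct read.
    static boolean tryReadFromFooter(byte[] footer, long fileLength, long position, int length, ByteBuffer dst) {
        if (footer.length != FOOTER_LENGTH
            || isWithinFooter(fileLength, position, length) == false
            || dst.remaining() < length) {
            return false;
        }
        int offsetInFooter = (int) (position - (fileLength - FOOTER_LENGTH));
        dst.put(footer, offsetInFooter, length);
        return true;
    }
}
```

For example, take a 1030-byte file read through a 1024-byte buffer: after the first buffer refill only 6 bytes remain, at position 1024. The strict check rejects that read because it neither starts at position 1014 nor asks for 16 bytes, while the relaxed check accepts it and serves bytes 10 to 15 of the in-memory footer.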

@tlrx tlrx added >bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.11.2 v7.12.1 labels Feb 23, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Feb 23, 2021
@elasticmachine commented

Pinging @elastic/es-distributed (Team:Distributed)

@original-brownbear original-brownbear left a comment

LGTM nice find!

@ywelsch ywelsch left a comment

LGTM

@tlrx tlrx merged commit 5322aa6 into elastic:master Feb 23, 2021
@tlrx tlrx deleted the footer-checksum branch February 23, 2021 10:54
tlrx commented Feb 23, 2021

Thanks Armin and Yannick

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Feb 23, 2021
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Feb 23, 2021
@tlrx tlrx removed the v7.11.2 label Feb 23, 2021
tlrx added a commit that referenced this pull request Feb 23, 2021
Backport of #69415
tlrx added a commit that referenced this pull request Feb 23, 2021
Backport of #69415
Labels

>bug :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.12.1 v7.13.0 v8.0.0-alpha1