Some documents are missed when using search_after in a search request on an index with index sorting #28023

@LinPower

Description

Elasticsearch version (bin/elasticsearch --version):

Version: 6.1.1, Build: bd92e7f/2017-12-17T20:23:25.338Z, JVM: 1.8.0_66

Plugins installed: []

JVM version (java -version):

java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)

OS version (uname -a if on a Unix-like system):

CentOS release 6.3 (Final)

Description of the problem including expected versus actual behavior:

Steps to reproduce:

  1. Create an index with index sorting.
curl -XPUT 'http://localhost:9200/test' -H 'Content-Type: application/json' -d'
{
    "settings": {
        "index": {
            "number_of_shards": "1",
            "number_of_replicas": "0",
            "refresh_interval": "100ms",
            "sort.field": "field",
            "sort.order": "asc"
        }
    },
    "mappings": {
        "type": {
            "_all": {
                "enabled": false
            },
            "properties": {
                "field": {
                    "type": "keyword"
                }
            }
        }
    }
}
'

  2. Write documents with the bulk API.
curl -XPOST 'localhost:9200/_bulk?pretty' -H 'Content-Type: application/json' -d'
{ "index" : { "_index" : "test", "_type" : "type", "_id" : "1" } }
{ "field" : "value1" }
{ "index" : { "_index" : "test", "_type" : "type", "_id" : "2" } }
{ "field" : "value2" }
{ "index" : { "_index" : "test", "_type" : "type", "_id" : "3" } }
{ "field" : "value3" }
'
curl -XPOST 'localhost:9200/_bulk?pretty' -H 'Content-Type: application/json' -d'
{ "index" : { "_index" : "test", "_type" : "type", "_id" : "4" } }
{ "field" : "value4" }
{ "index" : { "_index" : "test", "_type" : "type", "_id" : "5" } }
{ "field" : "value5" }
{ "index" : { "_index" : "test", "_type" : "type", "_id" : "6" } }
{ "field" : "value6" }
'

  3. Search documents with search_after.
curl -XGET 'http://localhost:9200/test/_search?pretty' -H 'Content-Type: application/json' -d'
{
    "size": 1,
    "query" : {
        "match_all": {}
    },
    "sort": [
        {"field": "asc"}
    ],
    "search_after": ["value2"]
}
'

Response is:

{
  "took" : 67,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 6,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "type",
        "_id" : "4",
        "_score" : null,
        "_source" : {
          "field" : "value4"
        },
        "sort" : [
          "value4"
        ]
      }
    ]
  }
}

The expected document is {"field": "value3"}, but the response contains {"field": "value4"}.
If I set "size" to 2, the response still does not contain {"field": "value3"}. But if I set "size" to 3, the response contains {"field": "value3"}.
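
The expected search_after semantics can be modeled in a few lines of Python. This is only a sketch of the expected behavior, not Elasticsearch's implementation: the function name `expected_search_after` is made up for illustration, and the two lists assume that each bulk request from step 2 ended up in its own segment.

```python
import heapq


def expected_search_after(segments, after, size):
    """Return the first `size` sort values strictly greater than `after`.

    Each segment is a list already sorted ascending, which is what
    index sorting on "field" (asc) would produce within a segment.
    """
    merged = heapq.merge(*segments)  # global ascending order across segments
    hits = [value for value in merged if value > after]
    return hits[:size]


# Assumed layout: one segment per bulk request from step 2.
segments = [
    ["value1", "value2", "value3"],
    ["value4", "value5", "value6"],
]

print(expected_search_after(segments, "value2", 1))  # prints ['value3']
```

Under this model, "size": 1 should return value3 and "size": 2 should return value3 and value4, which is not what the response above shows.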

I think this may be related to segments. The index currently has two segments:

curl -XGET 'localhost:9200/_cat/segments?v'
index shard prirep ip        segment generation docs.count docs.deleted  size size.memory committed searchable version compound
test  0     p      127.0.0.1 _0               0          3            0 2.7kb         895 false     true       7.1.0   true
test  0     p      127.0.0.1 _1               1          3            0 2.7kb         895 false     true       7.1.0   true
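
One possible way to check the segment hypothesis would be to force-merge the index into a single segment and then repeat the search from step 3. If value3 is returned after the merge, the multi-segment layout is likely involved. The sketch below assumes the same local node and index name as in the steps above; the helper names are hypothetical, and this check has not been run as part of this report.

```python
import urllib.request


def build_forcemerge_request(base_url="http://localhost:9200", index="test"):
    """Build a POST request that asks Elasticsearch to merge the index
    down to a single segment via the _forcemerge API."""
    url = f"{base_url}/{index}/_forcemerge?max_num_segments=1"
    return urllib.request.Request(url, method="POST")


def run_forcemerge():
    """Send the request to the node and return the raw JSON response text.
    Requires a running Elasticsearch node at the assumed address."""
    with urllib.request.urlopen(build_forcemerge_request()) as response:
        return response.read().decode("utf-8")
```

Calling `run_forcemerge()` and then re-running `_cat/segments` should show a single segment before the search_after request is repeated.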
