Reindex can ignore op_type create when using external version #37855

@ismael-hasan

Elasticsearch version (bin/elasticsearch --version): 6.5.2

Plugins installed: []

JVM version (java -version): Java 10.0.2

OS version (uname -a if on a Unix-like system): Windows 10

Description of the problem including expected versus actual behavior:
In a reindex with op_type set to create and external versioning, documents that already exist in the destination index can still be updated. With op_type set to create, updating existing documents is not expected; quoting the reindex documentation:

Settings op_type to create will cause _reindex to only create missing documents in the target index. All existing documents will cause a version conflict:

Steps to reproduce:
Create a test index with 1 document:

POST test/doc
{
  "test" : "test"
}

Reindex into a second index, test2, with op_type create (this succeeds):

POST _reindex
{
  "source": {
    "index": "test"
  },
  "dest": {
    "index": "test2",
    "op_type": "create",
    "version_type": "external"
  }
}

Response:

{
  "took" : 1500,
  "timed_out" : false,
  "total" : 1,
  "updated" : 0,
  "created" : 1,
  "deleted" : 0,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

Reindex again into the same index. This time it fails with a version conflict. A failure is expected for op_type create, but it is unclear whether op_type is actually being honored here or whether the request fails only because of the version check. The request is the same as in the previous step; the response is:

{
  "took": 2,
  "timed_out": false,
  "total": 1,
  "updated": 0,
  "created": 0,
  "deleted": 0,
  "batches": 1,
  "version_conflicts": 1,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1,
  "throttled_until_millis": 0,
  "failures": [
    {
      "index": "test2",
      "type": "doc",
      "id": "YjULhGgBIniEFhEZQcaI",
      "cause": {
        "type": "version_conflict_engine_exception",
        "reason": "[doc][YjULhGgBIniEFhEZQcaI]: version conflict, current version [1] is higher or equal to the one provided [1]",
        "index_uuid": "VNd0kK6aSxWCU6v-NO5m8g",
        "shard": "3",
        "index": "test2"
      },
      "status": 409
    }
  ]
}
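
The conflict message in the response above reflects the rule for external versioning: a write is accepted only when the provided version is strictly greater than the stored one. A minimal Python sketch of that comparison follows; `external_version_conflicts` is a hypothetical helper name used for illustration, not an Elasticsearch API.

```python
def external_version_conflicts(current_version: int, provided_version: int) -> bool:
    """Return True when an external-versioned write would be rejected.

    Models the rule quoted in the error message: the write conflicts when
    the stored version is higher than or equal to the provided one.
    """
    return current_version >= provided_version


# Second reindex in this report: stored version 1, provided version 1.
second_run_conflicts = external_version_conflicts(1, 1)

# A provided version of 6 against a stored version of 1 passes the check.
higher_version_conflicts = external_version_conflicts(1, 6)
```

Under this rule alone, the second run conflicts with or without op_type create, so that step cannot show whether op_type is being honored.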

Reindex again into the same index, but override the version with a script. In this case the existing document is updated, even though it already exists in the destination and op_type is create:

POST _reindex
{
  "source": {
    "index": "test"
  },
  "dest": {
    "index": "test2",
    "op_type": "create",
    "version_type": "external"
  },
  "script": {
    "source": "ctx._version = 6",
    "lang": "painless"
  }
}

Response:

{
  "took" : 65,
  "timed_out" : false,
  "total" : 1,
  "updated" : 1,
  "created" : 0,
  "deleted" : 0,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}
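
One way to catch this behavior in an automated check is to inspect the reindex response: going by the documentation quoted earlier, a run with op_type create should never report a non-zero `updated` count. The sketch below assumes that reading; `violates_create_semantics` is a hypothetical helper, and the response dictionaries are abbreviated from the ones in this report.

```python
def violates_create_semantics(response: dict) -> bool:
    """Return True if a reindex run with op_type=create updated any document."""
    return response.get("updated", 0) > 0


# Abbreviated copies of the three responses shown in this report.
first_run = {"created": 1, "updated": 0, "version_conflicts": 0}
second_run = {"created": 0, "updated": 0, "version_conflicts": 1}
scripted_run = {"created": 0, "updated": 1, "version_conflicts": 0}

# Only the scripted run (ctx._version = 6) should be flagged.
flagged = [
    violates_create_semantics(r)
    for r in (first_run, second_run, scripted_run)
]
```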

Metadata

Labels

:Distributed Indexing/Reindex (Issues relating to reindex that are not caused by issues further down)
>bug
Team:Distributed (Obsolete) (Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.)
good first issue (low hanging fruit)
