Bulk index tasks stuck forever #31099

@jeancornic

Elasticsearch version (bin/elasticsearch --version): 6.1.3

Plugins installed: []

JVM version (java -version):

java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)

OS version (uname -a if on a Unix-like system):

Linux *** 4.4.0-1049-aws #58-Ubuntu SMP Fri Jan 12 23:17:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:

We are seeing weird behaviour: some bulk index actions never finish.
They are triggered by the Elasticsearch Java client (version 6.1.3) through a BulkRequestBuilder with the default timeout (1m). It seems the ActionListener callbacks are never invoked, neither onResponse nor onFailure.
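
To make the symptom detectable on the client side, a listener can be wrapped with its own deadline. The following is only a minimal sketch under stated assumptions: `Listener` is a hypothetical stand-in with the same onResponse/onFailure shape as ActionListener, and `DeadlineListener` is a hypothetical helper, not part of the Elasticsearch client. It does not cancel the task on the server; it only ensures the caller hears back.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in with the same two-callback shape as ActionListener.
interface Listener<T> {
    void onResponse(T response);
    void onFailure(Exception e);
}

// Hypothetical helper: notifies the delegate exactly once, either from the
// real callback or from a client-side deadline, whichever happens first.
final class DeadlineListener<T> implements Listener<T> {
    private final Listener<T> delegate;
    private final AtomicBoolean done = new AtomicBoolean(false);

    DeadlineListener(Listener<T> delegate, ScheduledExecutorService scheduler,
                     long timeout, TimeUnit unit) {
        this.delegate = delegate;
        // If neither callback has fired by the deadline, fail the delegate.
        scheduler.schedule(() -> {
            if (done.compareAndSet(false, true)) {
                delegate.onFailure(new TimeoutException(
                        "no callback within " + timeout + " " + unit));
            }
        }, timeout, unit);
    }

    @Override
    public void onResponse(T response) {
        // Ignored if the deadline already fired.
        if (done.compareAndSet(false, true)) {
            delegate.onResponse(response);
        }
    }

    @Override
    public void onFailure(Exception e) {
        // Ignored if the deadline already fired.
        if (done.compareAndSet(false, true)) {
            delegate.onFailure(e);
        }
    }
}
```

With a wrapper like this, a bulk request whose callbacks are never invoked would surface as a TimeoutException in the client logs instead of disappearing silently. The deadline task is not cancelled when a real callback arrives; it simply becomes a no-op.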

I recently found out that this is linked to tasks that run indefinitely. When checking the /_tasks API, I see bulk tasks that have been running for days (!), 53.4 days in the example below:

$ curl 'localhost:9200/_tasks?pretty&human&actions=indices:data/write/bulk&detailed'
...
    "pd5MT4eUSEqsOy_oq_q-vw" : {
      "name" : "***",
      "transport_address" : "***",
      "host" : "***",
      "ip" : "***",
      "roles" : [
        "data"
      ],
      "attributes" : {
        "availability_zone" : "us-east-1c",
        "tag" : "fresh"
      },
      "tasks" : {
        "pd5MT4eUSEqsOy_oq_q-vw:208477658" : {
          "node" : "pd5MT4eUSEqsOy_oq_q-vw",
          "id" : 208477658,
          "type" : "netty",
          "action" : "indices:data/write/bulk",
          "start_time" : "2018-04-12T21:17:45.175Z",
          "start_time_in_millis" : 1523567865175,
          "running_time" : "53.4d",
          "running_time_in_nanos" : 4622108629250516,
          "cancellable" : false
        }
      }
    },
...

I would expect the bulk action to run for at most 1 minute and then time out.
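
To put the numbers from the task above next to that expectation, here is a small sketch using only the JDK. `StuckTaskCheck` and `exceedsTimeout` are hypothetical names used for illustration; the two constants are copied from the `running_time_in_nanos` and `start_time_in_millis` fields shown above.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for interpreting fields from the /_tasks response.
final class StuckTaskCheck {
    // A task is suspicious if it has run longer than the request timeout.
    static boolean exceedsTimeout(long runningTimeInNanos, Duration timeout) {
        return Duration.ofNanos(runningTimeInNanos).compareTo(timeout) > 0;
    }

    public static void main(String[] args) {
        long runningTimeInNanos = 4622108629250516L; // running_time_in_nanos above
        long startTimeInMillis = 1523567865175L;     // start_time_in_millis above
        Duration running = Duration.ofNanos(runningTimeInNanos);
        Duration timeout = Duration.ofMinutes(1);    // the default 1m timeout

        System.out.println("started: " + Instant.ofEpochMilli(startTimeInMillis));
        System.out.println("running for " + running.toDays() + " full days");
        System.out.println("exceeds timeout: "
                + exceedsTimeout(runningTimeInNanos, timeout));
    }
}
```

The same comparison could be applied to every task returned by the /_tasks call to list the bulk tasks that have outlived their timeout.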

I came across several issues / PRs on GitHub:

Steps to reproduce: I have no idea. It happens randomly, and has been happening since we migrated from 5.6 to 6.1.3.

Provide logs (if relevant): I found nothing relevant in either the ES logs or the client logs.

Labels: :Distributed Indexing/CRUD, >bug
