Description
Elasticsearch version (bin/elasticsearch --version): 6.1.3
Plugins installed: []
JVM version (java -version):
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
OS version (uname -a if on a Unix-like system):
Linux *** 4.4.0-1049-aws #58-Ubuntu SMP Fri Jan 12 23:17:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Description of the problem including expected versus actual behavior:
We're seeing odd behaviour: some bulk index actions never finish.
They are triggered by the Elasticsearch Java client (version 6.1.3) through a BulkRequestBuilder, with the default timeout (1m). The ActionListener callbacks never seem to be invoked.
I recently found that this is linked to tasks that run indefinitely. When checking the /_tasks API, I see tasks that have been running for several days (!).
$ curl 'localhost:9200/_tasks?pretty&human&actions=indices:data/write/bulk&detailed'
...
    "pd5MT4eUSEqsOy_oq_q-vw" : {
      "name" : "***",
      "transport_address" : "***",
      "host" : "***",
      "ip" : "***",
      "roles" : [
        "data"
      ],
      "attributes" : {
        "availability_zone" : "us-east-1c",
        "tag" : "fresh"
      },
      "tasks" : {
        "pd5MT4eUSEqsOy_oq_q-vw:208477658" : {
          "node" : "pd5MT4eUSEqsOy_oq_q-vw",
          "id" : 208477658,
          "type" : "netty",
          "action" : "indices:data/write/bulk",
          "start_time" : "2018-04-12T21:17:45.175Z",
          "start_time_in_millis" : 1523567865175,
          "running_time" : "53.4d",
          "running_time_in_nanos" : 4622108629250516,
          "cancellable" : false
        }
      }
    },
...
I would expect the bulk action to run for at most 1 minute and then time out.
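For anyone who wants to check for the same symptom, here is a minimal Python sketch that scans a /_tasks response for tasks running longer than a threshold. It is only an illustration: the function name `find_stuck_tasks`, the `sample` data, and the 1-minute threshold are my own choices, and it assumes the response wraps the per-node entries in a top-level "nodes" object (the part elided by "..." in the output above).

```python
import json

# Assumed threshold: the 1m default bulk timeout, expressed in nanoseconds.
ONE_MINUTE_NANOS = 60 * 1_000_000_000


def find_stuck_tasks(tasks_response, max_nanos=ONE_MINUTE_NANOS):
    """Return (task_id, action, running_time_in_nanos) for every task
    whose running time exceeds max_nanos."""
    stuck = []
    # Assumption: per-node entries live under a top-level "nodes" key.
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            running = task.get("running_time_in_nanos", 0)
            if running > max_nanos:
                stuck.append((task_id, task.get("action"), running))
    return stuck


# Sample trimmed from the output above; only the fields the function reads.
sample = json.loads("""
{
  "nodes": {
    "pd5MT4eUSEqsOy_oq_q-vw": {
      "tasks": {
        "pd5MT4eUSEqsOy_oq_q-vw:208477658": {
          "action": "indices:data/write/bulk",
          "running_time_in_nanos": 4622108629250516,
          "cancellable": false
        }
      }
    }
  }
}
""")

for task_id, action, nanos in find_stuck_tasks(sample):
    # Convert nanoseconds to days for readability.
    print(task_id, action, round(nanos / 1e9 / 86400, 1), "days")
```

In practice the JSON would come from the curl command above (without `human`, the nanosecond field is still present), and the threshold could be raised to allow for queueing on a busy cluster.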
I came across several issues / PRs on GitHub:
- Handle throws on tasks submitted to thread pools #28667: could this be related?
- Tasks stuck forever #25863: this looks similar, but we're running a newer version and we see no rejection logs in ES, so I don't believe that's it.
Steps to reproduce: unknown. It has happened randomly since we migrated from 5.6 to 6.1.3.
Provide logs (if relevant): I found nothing relevant in either the ES logs or the client logs.