Long running tasks never stop. #36826

@stanleyyang1987

Description

Elasticsearch version (bin/elasticsearch --version):
6.3.0
Plugins installed: [analysis-ik]

JVM version (java -version):
java version "1.8.0_181"
OS version (uname -a if on a Unix-like system):
Linux 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Description of the problem including expected versus actual behavior:

**Tasks have been running for 7.8 days and never stop.**
  • [1] First, I found some tasks that have been running for an incredibly long time, using:

      GET _cat/tasks?v
    

indices:data/write/bulk dvuBObXQTHaKH6-zZVTFwg:61134252 - transport 1544532141642 20:42:21 7.8d 10.200.3.15 applog02
indices:data/write/bulk[s] dvuBObXQTHaKH6-zZVTFwg:61134258 dvuBObXQTHaKH6-zZVTFwg:61134252 transport 1544532141642 20:42:21 7.8d 10.200.3.15 applog02
indices:data/write/bulk[s] 9QmM0GhMQROuHQ2Bq4bzIA:61066347 dvuBObXQTHaKH6-zZVTFwg:61134258 netty 1544532112903 20:41:52 7.8d 10.200.3.16 applog03
indices:data/write/bulk[s][p] 9QmM0GhMQROuHQ2Bq4bzIA:61066348 9QmM0GhMQROuHQ2Bq4bzIA:61066347 direct 1544532112903 20:41:52 7.8d 10.200.3.16 applog03
indices:data/write/bulk[s][r] dvuBObXQTHaKH6-zZVTFwg:61134292 9QmM0GhMQROuHQ2Bq4bzIA:61066347 netty 1544532141647 20:42:21 7.8d 10.200.3.15 applog02

  • [2] Then I checked the details of the long-running task:
    GET /_tasks/dvuBObXQTHaKH6-zZVTFwg:61134252?pretty
    {
      "completed" : false,
      "task" : {
        "node" : "dvuBObXQTHaKH6-zZVTFwg",
        "id" : 61134252,
        "type" : "transport",
        "action" : "indices:data/write/bulk",
        "description" : "requests[125], indices[logs-galaxy-2018-12-11]",
        "start_time_in_millis" : 1544532141642,
        "running_time_in_nanos" : 681462436191915,
        "cancellable" : false,
        "headers" : { }
      }
    }
  • [3] In fact, I had already deleted the index logs-galaxy-2018-12-11 a few days earlier, as the following command shows:
    GET /_cat/segments/logs-galaxy-2018-12-11?pretty
    {
      "error" : {
        "root_cause" : [
          {
            "type" : "index_not_found_exception",
            "reason" : "no such index",
            "resource.type" : "index_or_alias",
            "resource.id" : "logs-galaxy-2018-12-11",
            "index_uuid" : "na",
            "index" : "logs-galaxy-2018-12-11"
          }
        ],
        "type" : "index_not_found_exception",
        "reason" : "no such index",
        "resource.type" : "index_or_alias",
        "resource.id" : "logs-galaxy-2018-12-11",
        "index_uuid" : "na",
        "index" : "logs-galaxy-2018-12-11"
      },
      "status" : 404
    }
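
For anyone who wants to check their own cluster for the same symptom, here is a rough Python sketch of the check done by hand in steps [2] and [3]. It takes the "task" object from the response in step [2], converts running_time_in_nanos to days, and reports any index named in the description that is not in a list of existing indices. The helper names (running_days, indices_in_description, missing_indices) are made up for this example, the existing list is a hypothetical placeholder that would normally come from something like GET _cat/indices, and the comma-separated handling of several names inside indices[...] is an assumption, since the description above only shows a single index.

```python
import re

NANOS_PER_DAY = 86_400 * 1_000_000_000


def running_days(task):
    """Convert the task's running_time_in_nanos field to days."""
    return task["running_time_in_nanos"] / NANOS_PER_DAY


def indices_in_description(description):
    """Extract index names from a description such as
    'requests[125], indices[logs-galaxy-2018-12-11]'.

    Assumption: several indices would be comma-separated inside the brackets.
    """
    match = re.search(r"indices\[([^\]]*)\]", description)
    if not match:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def missing_indices(task, existing_indices):
    """Return the indices the task refers to that no longer exist."""
    existing = set(existing_indices)
    names = indices_in_description(task.get("description", ""))
    return [name for name in names if name not in existing]


# The "task" object copied from the response in step [2].
task = {
    "node": "dvuBObXQTHaKH6-zZVTFwg",
    "id": 61134252,
    "type": "transport",
    "action": "indices:data/write/bulk",
    "description": "requests[125], indices[logs-galaxy-2018-12-11]",
    "start_time_in_millis": 1544532141642,
    "running_time_in_nanos": 681462436191915,
    "cancellable": False,
    "headers": {},
}

# Hypothetical list of indices that still exist in the cluster.
existing = ["logs-galaxy-2018-12-19"]

print(f"task {task['node']}:{task['id']} has run for {running_days(task):.2f} days")
print("indices referenced but missing:", missing_indices(task, existing))
```

With the values from this report, the task comes out as still referring to logs-galaxy-2018-12-11, which matches the 404 in step [3].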

Steps to reproduce:

Sorry, I do not have the information needed to reproduce the bug. I have deleted other indices in the same way, such as logs-galaxy-2018-12-12 and logs-galaxy-2018-13.
I do this regularly, and everything normally goes fine.

By the way, each shard of logs-galaxy-2018-12-11 was about 850G in size.

Provide logs (if relevant):
No

Labels: :Distributed Indexing/CRUD, >bug
