Description
If a transform task fails in the search phase because of a mapping conflict or a scripting error, the error is handled as a temporary search problem: the search is retried (10 times) and the task is eventually put into the FAILED state with the reason "task encountered more than 10 failures; latest failure: Partial shards failure". The audit message contains only "Partial shards failure".
The real issue can only be found in the logs, e.g.
Caused by: org.elasticsearch.ElasticsearchException$1: Fielddata is disabled on text fields by default. Set fielddata=true on [...] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.
or
org.elasticsearch.script.ScriptException: runtime error
...
Caused by: java.lang.IllegalArgumentException: No field found for [field_b] in mapping
Solution
We need to unwrap search failures and check for inner problems:
- do not retry if it turns out to be an irrecoverable error (as we already do for other errors of this kind)
- report the real error as the reason in _stats and as the audit message
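To make the proposal concrete, here is a minimal sketch of the unwrap-and-classify idea in plain Python. It is not the Elasticsearch implementation: the exception classes, the IRRECOVERABLE tuple and the decide function are hypothetical stand-ins, and Python's `__cause__` chain plays the role of the Java cause chain seen in the logs.

```python
# Hypothetical stand-ins for the exception types named in this issue.
class SearchPhaseError(Exception):
    """Outer wrapper, e.g. 'Partial shards failure'."""

class ScriptError(Exception):
    """Stands in for org.elasticsearch.script.ScriptException."""

class FielddataDisabledError(Exception):
    """Stands in for the 'Fielddata is disabled on text fields' error."""

# Assumption: which inner errors count as irrecoverable.
IRRECOVERABLE = (ScriptError, FielddataDisabledError)


def cause_chain(exc):
    """Return [exc, exc.__cause__, ...], guarding against cycles."""
    chain, seen = [], set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        chain.append(exc)
        exc = exc.__cause__
    return chain


def decide(exc, failure_count, max_failures=10):
    """Return ("fail" | "retry", reason) for a search failure."""
    chain = cause_chain(exc)
    root = chain[-1]
    # Report the innermost cause instead of the generic wrapper message.
    reason = f"{type(root).__name__}: {root}"
    irrecoverable = any(isinstance(e, IRRECOVERABLE) for e in chain)
    if irrecoverable or failure_count > max_failures:
        return "fail", reason
    return "retry", reason
```

With a wrapper whose cause chain contains a script error, decide returns "fail" on the first failure and carries the inner message as the reason; a bare wrapper with no recognizable inner cause is still retried until the failure limit is exceeded.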
Repro
Case 1
- create 2 indexes with 2 fields each, using keyword fields: field_a, field_b in the first index and field_a, field_c in the second
- create a transform that groups by field_a, with a scripted metric agg that accesses field_b without a guard:
"scripted_metric": {
"init_script": "state.b = new String()",
"map_script": "state.b = doc['field_b']",
"combine_script": "return state.b",
"reduce_script": "return states"
}
The transform should fail with a ScriptException
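For anyone reproducing Case 1, the request bodies could look roughly like the sketch below. The index names, the destination index and the aggregation name b_values are hypothetical (the issue does not name them), and the mapping layout assumes typeless mappings. The bodies are only built and printed here; send them to the index-create and transform-create endpoints of your version.

```python
import json

keyword = {"type": "keyword"}

# Hypothetical index names; only the field layout comes from the issue.
indices = {
    "repro-index-1": {"mappings": {"properties": {"field_a": keyword, "field_b": keyword}}},
    "repro-index-2": {"mappings": {"properties": {"field_a": keyword, "field_c": keyword}}},
}

# Scripted metric copied verbatim from the issue: map_script reads
# field_b without a guard, and field_b is missing in the second index.
scripted_metric = {
    "init_script": "state.b = new String()",
    "map_script": "state.b = doc['field_b']",
    "combine_script": "return state.b",
    "reduce_script": "return states",
}

transform = {
    "source": {"index": list(indices)},
    "dest": {"index": "repro-dest"},  # hypothetical destination index
    "pivot": {
        "group_by": {"field_a": {"terms": {"field": "field_a"}}},
        "aggregations": {"b_values": {"scripted_metric": scripted_metric}},
    },
}

print(json.dumps(transform, indent=2))
```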
Case 2
- create 2 indexes with 2 fields each (field_a, field_b in both), but map field_a in the 2nd index to text
- create a transform, group by field_a
The transform should fail with an ElasticsearchException: Fielddata is disabled on text fields by default.
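The mapping conflict behind Case 2 can be shown with a small sketch. The index names and the conflicting_fields helper are hypothetical and only illustrate why a terms group-by on field_a hits the text mapping in the second index; the mapping layout assumes typeless mappings.

```python
# Hypothetical index names; field layout as described in Case 2.
mappings = {
    "repro-index-1": {"mappings": {"properties": {
        "field_a": {"type": "keyword"},
        "field_b": {"type": "keyword"},
    }}},
    "repro-index-2": {"mappings": {"properties": {
        "field_a": {"type": "text"},  # the conflicting mapping
        "field_b": {"type": "keyword"},
    }}},
}


def conflicting_fields(mappings_by_index):
    """Return fields that are mapped to more than one type across indices."""
    types = {}
    for mapping in mappings_by_index.values():
        for field, spec in mapping["mappings"]["properties"].items():
            types.setdefault(field, set()).add(spec["type"])
    return {f: sorted(t) for f, t in types.items() if len(t) > 1}


print(conflicting_fields(mappings))  # {'field_a': ['keyword', 'text']}
```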
/CC @tsg