-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Description
Elasticsearch version (bin/elasticsearch --version):
Version: 6.3.0, Build: default/tar/424e937/2018-06-11T23:38:03.357887Z, JVM: 1.8.0_102
Plugins installed: []
JVM version (java -version):
java version "1.8.0_102"
Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
OS version (uname -a:
Darwin Kernel Version 17.6.0: Tue May 8 15:22:16 PDT 2018; root:xnu-4570.61.1~1/RELEASE_X86_64 x86_64
Description of the problem including expected versus actual behavior:
When terms aggregation is applied to a Painless script, then empty strings, returned by the script, are ignored (there is no bucket for them in the response).
Expected behavior: there should be a bucket for empty strings.
Steps to reproduce:
# delete the index
curl -XDELETE localhost:9200/test
# re-create the index
curl -XPUT localhost:9200/test -H "Content-Type: application/json" -d '{
"mappings": {
"test": {
"properties" : {
"string_with_empty_values": {
"type" : "keyword"
}
}
}
}
}'
# insert documents
curl -XPOST localhost:9200/test/test -H "Content-Type: application/json" -d '{
"string_with_empty_values": "Not empty"
}'
curl -XPOST localhost:9200/test/test -H "Content-Type: application/json" -d '{
"string_with_empty_values": ""
}'
curl -XPOST localhost:9200/test/test -H "Content-Type: application/json" -d '{
"string_with_empty_values": null
}'
# aggregate using a script
curl -XPOST localhost:9200/test/_search -H "Content-Type: application/json" -d '{
"size": 0,
"aggregations": {
"string_with_empty_values_terms": {
"terms": {
"script": {
"source": "doc['\''string_with_empty_values'\''].value",
"lang": "painless"
}
}
},
"string_with_empty_values_missing": {
"missing": {
"script": {
"source": "doc['\''string_with_empty_values'\''].value",
"lang": "painless"
}
}
}
}
}'
The response will look like the following (note the absence of a bucket for an empty term):
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0,
"hits": []
},
"aggregations": {
"string_with_empty_values_terms": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Not empty",
"doc_count": 1
}
]
},
"string_with_empty_values_missing": {
"doc_count": 1
}
}
}
The same response will be received for the following query:
curl -XPOST localhost:9200/test/_search -H "Content-Type: application/json" -d '{
"size": 0,
"aggregations": {
"string_with_empty_values_terms": {
"terms": {
"script": {
"source": "params._source.string_with_empty_values",
"lang": "painless"
}
}
},
"string_with_empty_values_missing": {
"missing": {
"script": {
"source": "params._source.string_with_empty_values",
"lang": "painless"
}
}
}
}
}'
At the same time, an aggregation by the field directly:
curl -XPOST localhost:9200/test/_search -H "Content-Type: application/json" -d '{
"size": 0,
"aggregations": {
"string_with_empty_values_terms": {
"terms": {
"field": "string_with_empty_values"
}
},
"string_with_empty_values_missing": {
"missing": {
"field": "string_with_empty_values"
}
}
}
}'
returns correct result:
{
"took": 12,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0,
"hits": []
},
"aggregations": {
"string_with_empty_values_terms": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "",
"doc_count": 1
},
{
"key": "Not empty",
"doc_count": 1
}
]
},
"string_with_empty_values_missing": {
"doc_count": 1
}
}
}