Elasticsearch currently has a bug (#4047): empty objects are not stored in _source when an include/exclude list is present.
This happens because Elasticsearch aggressively removes empty objects from _source.
For example, if I have an object
{ 'data': { 'key': 'value' } }
the following filters all result in data being removed from _source: excludes = ['data'], excludes = ['data.*'], excludes = ['data.key'], excludes = ['*.key'], and includes = ['data.other'].
I believe this behavior is incorrect. We should only remove an object if it is explicitly excluded (excludes = ['data'], excludes = ['data.*']) or if none of its elements are included (includes = ['other_data.*']). When the object is referenced in an includes list but nothing inside it matches (includes = ['data.other']), the object should remain as an empty object. The same applies when an excludes pattern removes every key inside it (excludes = ['data.key'], excludes = ['*.key']).
The use case is clearer when we are talking about a nested object that is indexed.
Example:
{
"name": "John Doe",
"identifiers": {
"ssn": "987-65-4320",
"facebook_uid": "12345",
"twitter_uid": "54321"
}
}
With this document, excludes = ["*.ssn"] would drop the entire identifiers object if ssn were its only key, even if we want the empty identifiers object to remain under all circumstances. We have the same problem with includes = ["*.instagram_uid"].
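To make the proposed rule concrete, here is a minimal Python sketch of the filtering behavior I am asking for. It is not Elasticsearch's actual implementation; `filter_source` and its helpers are hypothetical names, patterns are matched one dotted segment at a time with `fnmatch`, and arrays are simply treated as leaf values.

```python
from fnmatch import fnmatchcase


def _segments_match(pattern_segs, path_segs):
    # Compare segment by segment; zip stops at the shorter sequence.
    return all(fnmatchcase(seg, pat) for pat, seg in zip(pattern_segs, path_segs))


def _covers(pattern, path):
    # The pattern matches this path itself or one of its ancestors.
    pat = pattern.split(".")
    return len(pat) <= len(path) and _segments_match(pat, path)


def _reaches_into(pattern, path):
    # The pattern refers to something nested inside this path.
    pat = pattern.split(".")
    return len(pat) > len(path) and _segments_match(pat, path)


def _excludes_whole_object(pattern, path):
    # Explicit exclusion: 'data' or 'data.*' style patterns (assumed rule).
    pat = pattern.split(".")
    if _covers(pattern, path):
        return True
    return (len(pat) == len(path) + 1 and pat[-1] == "*"
            and _segments_match(pat, path))


def filter_source(source, includes=(), excludes=(), _path=()):
    """Hypothetical filter: objects are dropped only when explicitly
    excluded or when no include pattern refers to them."""
    result = {}
    for key, value in source.items():
        path = _path + (key,)
        if isinstance(value, dict):
            if any(_excludes_whole_object(p, path) for p in excludes):
                continue
            if includes and not any(
                _covers(p, path) or _reaches_into(p, path) for p in includes
            ):
                continue
            # Keep the object even if it ends up empty.
            result[key] = filter_source(value, includes, excludes, path)
        else:
            if any(_covers(p, path) for p in excludes):
                continue
            if includes and not any(_covers(p, path) for p in includes):
                continue
            result[key] = value
    return result


doc = {"data": {"key": "value"}}
print(filter_source(doc, excludes=["data"]))          # {}
print(filter_source(doc, excludes=["data.key"]))      # {'data': {}}
print(filter_source(doc, includes=["data.other"]))    # {'data': {}}
print(filter_source(doc, includes=["other_data.*"]))  # {}
```

With the person document above reduced to a single ssn key, `excludes=["*.ssn"]` would leave `identifiers` as `{}` rather than dropping it, and `includes=["*.instagram_uid"]` would return only the empty `identifiers` object.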