Include/Exclude Filtering Behavior #4491

@RobCherry

Currently there is a bug in Elasticsearch (#4047) where empty objects are not stored in _source when an include/exclude list is present.

This is because Elasticsearch aggressively removes empty objects from _source after filtering.

For example, if I have an object

{ 'data': { 'key': 'value' } }

the following filters will all result in the data object being removed from _source: excludes = ['data'], excludes = ['data.*'], excludes = ['data.key'], excludes = ['*.key'], includes = ['data.other']
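To make the current behavior concrete, here is a minimal Python sketch of it. This is not Elasticsearch's actual implementation: the function name filter_and_prune and the glob-style matching via fnmatchcase are assumptions made for illustration. The point is only that any object left empty after filtering is dropped.

```python
from fnmatch import fnmatchcase


def filter_and_prune(source, includes=(), excludes=(), prefix=""):
    """Hypothetical model of the behavior described above:
    filter leaf fields, then drop every object that ends up empty."""
    result = {}
    for key, value in source.items():
        path = prefix + key
        # A path matching any exclude pattern is removed outright.
        if any(fnmatchcase(path, p) for p in excludes):
            continue
        if isinstance(value, dict):
            child = filter_and_prune(value, includes, excludes, path + ".")
            # Aggressive pruning: an empty object is never kept.
            if child:
                result[key] = child
        elif not includes or any(
            # A field is included if it, or one of its ancestors, matches.
            fnmatchcase(path, p) or fnmatchcase(path, p + ".*")
            for p in includes
        ):
            result[key] = value
    return result


doc = {"data": {"key": "value"}}

# Every filter from the list above yields an empty _source in this model.
for kwargs in (
    {"excludes": ["data"]},
    {"excludes": ["data.*"]},
    {"excludes": ["data.key"]},
    {"excludes": ["*.key"]},
    {"includes": ["data.other"]},
):
    print(kwargs, "->", filter_and_prune(doc, **kwargs))
```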

I believe this behavior is incorrect. I think that we should only remove an object if it is explicitly excluded (excludes = ['data'], excludes = ['data.*']) or if none of its elements are included (includes = ['other_data.*']). For situations where the object is referenced in an includes list but there is no match, I think the object should remain as an empty object (includes = ['data.other']). The same applies when an excludes pattern removes only the object's fields (excludes = ['data.key'], excludes = ['*.key']).

This use case makes more sense if we are talking about some nested object that is indexed...

Example:

{
  "name": "John Doe",
  "identifiers": {
    "ssn": "987-65-4320",
    "facebook_uid": "12345",
    "twitter_uid": "54321"
  }
}

Here, excludes = ["*.ssn"] would drop the entire identifiers object if ssn were its only key, even though we want the empty identifiers object to remain under all circumstances. We have the same problem with includes = ["*.instagram_uid"].
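The rules proposed above could be sketched roughly as follows. This is only an illustration in Python, not a patch for Elasticsearch: the name filter_source and its helpers are hypothetical, and the sketch assumes that patterns are matched segment by segment (so "*" stands for exactly one path segment), which may differ from how Elasticsearch treats wildcards.

```python
from fnmatch import fnmatchcase


def _segments_match(pattern_parts, path_parts):
    # Compare position by position; zip stops at the shorter list.
    return all(fnmatchcase(seg, pat) for pat, seg in zip(pattern_parts, path_parts))


def _matches(pattern, path):
    """Pattern matches the path itself or one of its ancestors."""
    pat, parts = pattern.split("."), path.split(".")
    return len(pat) <= len(parts) and _segments_match(pat, parts)


def _references(pattern, path):
    """Pattern names something strictly below the path."""
    pat, parts = pattern.split("."), path.split(".")
    return len(pat) > len(parts) and _segments_match(pat, parts)


def _explicitly_excludes(pattern, path):
    # Both 'data' and 'data.*' count as excluding the object 'data' itself.
    if pattern.endswith(".*") and _matches(pattern[:-2], path):
        return True
    return _matches(pattern, path)


def filter_source(source, includes=(), excludes=(), _prefix=""):
    """Hypothetical filter: objects are dropped only when explicitly
    excluded or when no include pattern refers to them."""
    result = {}
    for key, value in source.items():
        path = _prefix + key
        if any(_explicitly_excludes(p, path) for p in excludes):
            continue
        included = not includes or any(_matches(p, path) for p in includes)
        if isinstance(value, dict):
            referenced = any(_references(p, path) for p in includes)
            if included or referenced:
                # Kept even if the filtered result is an empty object.
                result[key] = filter_source(value, includes, excludes, path + ".")
        elif included:
            result[key] = value
    return result


doc = {"data": {"key": "value"}}
print(filter_source(doc, excludes=["data"]))          # {}
print(filter_source(doc, excludes=["data.key"]))      # {'data': {}}
print(filter_source(doc, includes=["data.other"]))    # {'data': {}}
print(filter_source(doc, includes=["other_data.*"]))  # {}

person = {"name": "John Doe", "identifiers": {"ssn": "987-65-4320"}}
print(filter_source(person, excludes=["*.ssn"]))
# {'name': 'John Doe', 'identifiers': {}}
print(filter_source(person, includes=["*.instagram_uid"]))
# {'identifiers': {}}
```

In this sketch the empty identifiers object survives both filters, while an object that is explicitly excluded, or that no include pattern refers to, is still removed.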
