Memory efficient source filtering

**Example:** 

Using Twitter as an example: each user is a document, and each tweet is a nested document under that user. Active users can accumulate thousands of tweets, so a single user document can reach a few megabytes in size.

``` json
{
  "userId": "1",
  "tweets": [
    {
      "id": 1,
      "message": "tweet 1"
    },
    {
      "id": 2,
      "message": "tweet 2"
    },
   ...
  ]
}
```

**Use Case:**
We want to find users who have used a specific hashtag in their tweets and view only those tweets. We use source filtering and a nested query with inner hits to get back just the users and their matching tweets.
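
For reference, the kind of request we send looks roughly like the sketch below. It only builds the search body; the `build_hashtag_query` helper and the match on `tweets.message` are illustrative assumptions, and it presumes `tweets` is mapped as a `nested` field. How a hashtag actually matches depends on the analyzer configured for that field.

```python
import json


def build_hashtag_query(hashtag: str) -> dict:
    """Build a search body: source filtering plus a nested query with inner hits.

    Hypothetical helper for illustration; field names follow the example
    document above, and "tweets" is assumed to be mapped as a nested field.
    """
    return {
        # Source filtering: return only userId from the top-level document.
        "_source": ["userId"],
        "query": {
            "nested": {
                "path": "tweets",
                # Assumption: the hashtag is searched for in the tweet text.
                "query": {"match": {"tweets.message": hashtag}},
                # inner_hits returns the matching nested tweets with each user hit.
                "inner_hits": {},
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_hashtag_query("#elasticsearch"), indent=2))
```

The top-level `_source` keeps the user hit small, while `inner_hits` carries the matching tweets.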

**Problem:**
Even though we are using source filtering, Elasticsearch loads the entire document into memory before applying the filter. Because each document is so large, under any real throughput we see constant garbage collection in the logs.
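
To make the cost concrete, here is a toy sketch of the "parse everything, then filter" pattern. It is not Elasticsearch's implementation, only an illustration of the behaviour described above; the document size and the `filter_after_full_parse` helper are made up for the example.

```python
import json


def filter_after_full_parse(raw: bytes, keep: set) -> dict:
    # The whole document is parsed into objects before any field is dropped.
    doc = json.loads(raw)
    return {k: v for k, v in doc.items() if k in keep}


# Synthetic document: one user with many tweets (the count is arbitrary).
raw = json.dumps({
    "userId": "1",
    "tweets": [{"id": i, "message": f"tweet {i}"} for i in range(50_000)],
}).encode("utf-8")

filtered = filter_after_full_parse(raw, {"userId"})
print(len(raw), "bytes parsed to return", len(json.dumps(filtered)), "bytes")
```

Every tweet object is allocated and then immediately discarded, which is the kind of short-lived garbage we believe is driving the collections.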

**Feature Request:**
Could filtered source be loaded in a more memory-efficient manner, so that the entire source does not have to be loaded into memory first?

Memory efficient source filtering #25168
