Give _uid doc values

We already use fielddata on the `_uid` field today in order to implement random sorting. However, because doc values are disabled on `_uid`, that fielddata has to be loaded in memory, and since this field only has unique values, doing so uses an insane amount of memory.

Having better fielddata for `_uid` would also be useful in order to get a more consistent sort order when paginating or hitting different replicas: we could always add a tie-break on the value of the `_uid` field.
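To illustrate the tie-break idea, here is a minimal sketch in plain Python (not Elasticsearch code). The hit dictionaries, the `score` key and the `sort_hits` helper are hypothetical names made up for this example. It only shows that adding `_uid` as a secondary sort key makes the order independent of the order in which a replica happens to return equally scored documents.

```python
def sort_hits(hits):
    """Sort by descending score, breaking ties on the _uid string."""
    return sorted(hits, key=lambda h: (-h["score"], h["_uid"]))


# Two hypothetical replicas returning the same hits in a different order.
replica_a = [
    {"_uid": "doc#2", "score": 1.0},
    {"_uid": "doc#1", "score": 1.0},
    {"_uid": "doc#3", "score": 2.0},
]
replica_b = list(reversed(replica_a))

order_a = [h["_uid"] for h in sort_hits(replica_a)]
order_b = [h["_uid"] for h in sort_hits(replica_b)]
```

Without the second key, the relative order of `doc#1` and `doc#2` would depend on the input order, which is what causes inconsistent pages across replicas.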

I think we have several options:
- Option 1: Add SORTED doc values to `_uid`
- Option 2: Add BINARY doc values to `_uid`
- Option 3: Add SORTED doc values to `_type` and `_id`
- Option 4: Add SORTED doc values to `_type` and BINARY to `_id` 

Option 2 would probably be wasteful in terms of disk space given that we don't have good compression available for binary doc values (and good compression is hard to implement given that the values can store pretty much anything).

Options 3 and 4 have the benefit of not having to duplicate information if we also want to have doc values on `_type` and `_id`: we could even build a BINARY fielddata view for `_uid`.
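As a rough sketch of what such a view could look like, the toy Python class below composes a per-document `_uid` value on the fly from separate `_type` and `_id` columns. It assumes that `_uid` is the type and the id joined by a `#` delimiter, which is not stated in this issue. `UidView` and its `value` method are hypothetical names, not existing APIs.

```python
UID_DELIMITER = "#"  # assumption: _uid is "<type>#<id>"


class UidView:
    """Toy binary view of _uid built from per-document _type and _id values."""

    def __init__(self, types, ids):
        if len(types) != len(ids):
            raise ValueError("types and ids must have one entry per document")
        self._types = types
        self._ids = ids

    def value(self, doc):
        # Nothing is stored for _uid itself: the bytes are assembled per lookup.
        uid = self._types[doc] + UID_DELIMITER + self._ids[doc]
        return uid.encode("utf-8")


view = UidView(["tweet", "user"], ["1", "kimchy"])
```

The point of the design is that `_uid` costs no extra disk space: only `_type` and `_id` are stored, and the combined value is a cheap concatenation at read time.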

The other question is whether we should use sorted or binary doc values: the former are better for sorting (useful for the consistent sorting use-case) and the latter are better for value lookups (useful for random sorting).
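The trade-off can be sketched with a toy in-memory model in Python. This is a simplification for illustration only, not how Lucene encodes doc values on disk, and the class and method names are hypothetical. The sorted variant keeps a sorted term dictionary plus one ordinal per document, so comparing two documents is an integer comparison but reading a value needs an extra dictionary lookup. The binary variant stores the bytes per document, so a value lookup is direct but sorting has to compare byte strings.

```python
class ToySortedDocValues:
    """Sorted term dictionary plus one ordinal per document."""

    def __init__(self, values):
        self.terms = sorted(set(values))
        ord_of = {term: i for i, term in enumerate(self.terms)}
        self.ords = [ord_of[v] for v in values]

    def ord(self, doc):
        # Ordinals follow term order, so they can be compared instead of bytes.
        return self.ords[doc]

    def value(self, doc):
        # Value lookup goes through the ordinal, then the term dictionary.
        return self.terms[self.ords[doc]]


class ToyBinaryDocValues:
    """One byte string stored directly per document."""

    def __init__(self, values):
        self.values = list(values)

    def value(self, doc):
        return self.values[doc]


uids = [b"doc#b", b"doc#a", b"doc#c"]
sorted_dv = ToySortedDocValues(uids)
binary_dv = ToyBinaryDocValues(uids)

# Sorting documents with ordinals only (the consistent-sort use-case).
docs_by_uid = sorted(range(len(uids)), key=sorted_dv.ord)
```

Note that with a field like `_uid`, where every value is unique, the term dictionary in the sorted variant has as many entries as there are documents, so it gives no deduplication benefit.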


Give _uid doc values #11887
