@cbuescher
Member

The _terms_enum API does not currently support `ip` fields, even though
type-ahead-style completion on them is useful for UI purposes.
This change adds the ability to query `ip` fields via the _terms_enum API by
leveraging the terms enumeration that is available when doc_values are enabled on the
field, which is the default. To keep prefix filtering fast, we
internally build a prefix automaton from the user-supplied prefix and
intersect it with the shard's terms enumeration, similar to what we already do for
keyword fields.
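
To illustrate the idea of matching a prefix against a sorted terms enumeration, here is a minimal sketch. It is not the code from this change: it substitutes a plain string prefix check over an in-memory sorted set for the prefix automaton and the doc-values terms enumeration, and the class `IpPrefixSketch`, the method `matchPrefix`, and the sample addresses are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class IpPrefixSketch {

    // Hypothetical helper, not part of Elasticsearch: returns up to `size`
    // terms from a sorted term set that start with `prefix`.
    static List<String> matchPrefix(NavigableSet<String> sortedTerms, String prefix, int size) {
        List<String> matches = new ArrayList<>();
        // Seek to the first term >= prefix, then scan forward. In sorted order
        // all terms sharing the prefix are contiguous, so we can stop at the
        // first term that no longer matches.
        for (String term : sortedTerms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix) || matches.size() >= size) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample data, assumed for illustration only.
        NavigableSet<String> terms = new TreeSet<>(List.of(
            "10.0.0.1", "192.168.0.1", "192.168.1.7", "192.169.0.1", "2001:db8::1"));
        System.out.println(matchPrefix(terms, "192.168", 10));
    }
}
```

The sketch only conveys the seek-and-scan pattern on textual terms; the change described here works against the field's doc values and uses an automaton for the prefix match.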

Closes #89933

@cbuescher cbuescher added >enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types v8.7.1 labels Mar 6, 2023
@github-actions
Contributor

github-actions bot commented Mar 6, 2023

Documentation preview:

@cbuescher cbuescher requested a review from romseygeek March 6, 2023 11:19
@elasticsearchmachine elasticsearchmachine added Team:Search Meta label for search team v8.8.0 labels Mar 6, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine
Collaborator

Hi @cbuescher, I've created a changelog YAML for you.

Contributor

@romseygeek romseygeek left a comment

LGTM, neatly done.

@cbuescher
Member Author

@elasticmachine run elasticsearch-ci/bwc

@cbuescher
Member Author

@romseygeek thanks a lot for reviewing

@cbuescher
Member Author

Just out of curiosity and for documentation purposes: I was interested in the average size of the prefix automata this creates for short prefixes (1-7 characters seems common enough). Since the automaton exposes its number of states, number of transitions, and estimated size in bytes, I ran a few very limited experiments (1000 random IPs, with a random prefix length of 1-7 characters) and got the following rough estimates:

Avg. num states: 21
Avg. num transitions: 21
Avg. ram bytes: 675

@cbuescher cbuescher merged commit d802136 into elastic:main Mar 7, 2023
@cbuescher cbuescher deleted the terms-enum-ipField branch March 7, 2023 18:26
Linked issue: Add _terms_enum support for the ip field type