Enable _terms_enum on ip fields
#94322
The _terms_enum API currently does not support `ip` fields. However, type-ahead-style completion is useful for UI purposes. This change adds the ability to query `ip` fields via the _terms_enum API by leveraging the terms enumeration that is available when doc_values are enabled on the field, which is the default. To make prefix filtering fast, we internally build a prefix automaton from the user-supplied prefix and intersect it with the shard's terms enumeration, similar to what we already do for keyword fields. Closes elastic#89933
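To illustrate the general idea of prefix filtering over a sorted terms enumeration, here is a minimal, stdlib-only sketch. It is not the code from this change: `IpPrefixSketch` and `termsWithPrefix` are hypothetical names, a `TreeSet` of address strings stands in for the shard's terms enumeration, and a plain `startsWith` check stands in for the prefix automaton. It also assumes terms are compared in their textual form, which may not match how `ip` doc values are actually encoded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration only; not the implementation in this PR.
final class IpPrefixSketch {

    // Returns up to `size` terms that start with `prefix`, in sorted order.
    static List<String> termsWithPrefix(NavigableSet<String> sortedTerms, String prefix, int size) {
        List<String> matches = new ArrayList<>();
        // Seek to the first term >= prefix instead of scanning from the start.
        for (String term : sortedTerms.tailSet(prefix, true)) {
            // In sorted order all matches are contiguous, so we can stop at the
            // first non-matching term or once enough terms were collected.
            if (!term.startsWith(prefix) || matches.size() >= size) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }
}
```

For example, with the terms `10.0.0.1`, `192.168.0.1`, `192.168.1.20`, `192.17.3.4` and `2001:db8::1`, calling `termsWithPrefix(terms, "192.16", 10)` returns the two `192.168.*` addresses. The seek-then-stop pattern is what keeps the cost proportional to the number of matches rather than to the total number of terms.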
Pinging @elastic/es-search (Team:Search) |
|
Hi @cbuescher, I've created a changelog YAML for you. |
romseygeek
left a comment
LGTM, neatly done.
|
@elasticmachine run elasticsearch-ci/bwc |
|
@romseygeek thanks a lot for reviewing |
|
Just out of curiosity and for documentation purposes: I was interested in the average size of the prefix automata this creates for small prefix sizes (1-7 characters seems common enough). Since the automaton lets you see the number of states, the number of transitions and the estimated size in bytes, I ran a few very limited experiments (1000 random IPs, prefix length chosen randomly between 1 and 7 characters) and got the following rough estimates: