QL: handle IP type fields extraction with ignore_malformed property #66622
Pinging @elastic/es-ql (Team:QL)
costin left a comment:
LGTM. It's worth clarifying why the type is relevant when parsing ignore malformed.
      && hit.getFields().containsKey(IgnoredFieldMapper.NAME)
      && isFromDocValuesOnly(dataType) == false
-     && dataType.isNumeric()) {
+     && (dataType.isNumeric() || dataType == DataTypes.IP)) {
Are there any cases where ignore_malformed is set, the type is relevant, and the return value should not be null?
ignore_malformed is currently available for numeric, date, geo-point and IP fields (https://www.elastic.co/guide/en/elasticsearch/reference/current/ignore-malformed.html).
I would argue that if it is set, null should be returned regardless of the data type.
@costin Not 100% sure that null is always the answer. The null_value parameter, for example, doesn't apply here: if parsing fails and ignore_malformed is true, setting null_value will not make that value come back. The field will still not be indexed.
Right, we don't know whether the null value was set and thus return null. However, my comment is about the data type: why does it matter?
If ignore_malformed is set and the type is geo, for example (which is not part of the if), the malformed source data is returned.
That is the same behavior IP has now, which this ticket addresses.
I think returning null is a safer choice than returning potentially invalid data.
geo_point is taken from docvalues, not from _source. shape doesn't seem to support docvalues and thus can only be extracted from _source. The same goes for keyword, date, datetime, scaled_float.
And in this case we already have the null value from docvalues and we don't need the _source to extract data.
But I see your point about checking the type itself. I think there is no need for this check; the presence of the ignored section in the response is enough.
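The conclusion reached in this thread can be sketched in plain Java. This is a minimal illustration, not the code from the PR: the class `IgnoredFieldSketch`, the method `extract`, and the map-based representation of a hit are all hypothetical stand-ins, and checking for the specific field name inside `_ignored` is an assumption made for the example. The point it shows is that the decision to return null depends only on the `_ignored` metadata, not on the data type.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the hit extraction logic discussed above.
// None of these names are real Elasticsearch classes or methods.
public class IgnoredFieldSketch {

    // Name of the metadata field that lists the ignored (malformed) fields of a hit.
    static final String IGNORED = "_ignored";

    /**
     * Returns the value to expose for fieldName, or null when the field is
     * listed under _ignored, regardless of its data type.
     *
     * @param metadataFields metadata of the hit, e.g. {"_ignored": ["ip_field"]}
     * @param source         the raw _source of the hit
     */
    static Object extract(String fieldName,
                          Map<String, List<String>> metadataFields,
                          Map<String, Object> source) {
        List<String> ignored = metadataFields.get(IGNORED);
        if (ignored != null && ignored.contains(fieldName)) {
            // The value was malformed and never indexed: do not leak it from _source.
            return null;
        }
        return source.get(fieldName);
    }

    public static void main(String[] args) {
        Map<String, Object> source = Map.of("ip_field", "not-an-ip", "name", "host-1");
        Map<String, List<String>> metadata = Map.of(IGNORED, List.of("ip_field"));

        System.out.println(extract("ip_field", metadata, source));
        System.out.println(extract("name", metadata, source));
    }
}
```

With no type check in the condition, a malformed geo or date value would be handled the same way as a malformed IP, which is the behavior argued for above.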
Return null for any field that is present in the _ignored section of the response, not only numerics and IPs. Add more tests
…o ip_field_hit_extractor
@elasticmachine run elasticsearch-ci/1
…lastic#66622) Return null for any field that is present in the _ignored section of the response, not only numerics and IPs. (cherry picked from commit 106719f)
Pushed the fix into 7.11 branch as well.
Fixes a bug where the field value extraction for an ip field that has ignore_malformed=true still happens from _source even though the value might be ignored because it's not a valid IP. Fixes also #66675.