Skip to content

Conversation

@astefan
Copy link
Contributor

@astefan astefan commented Dec 18, 2020

Fixes a bug where the field value extraction for an ip field that has ignore_malformed=true still happens from _source even though the value might be ignored because it's not a valid IP.

Fixes also #66675.

@astefan astefan requested a review from costin December 18, 2020 18:07
@elasticmachine elasticmachine added Team:QL (Deprecated) Meta label for query languages team labels Dec 18, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

1 similar comment
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. It's worth clarifying why the type is relevant when parsing ignore malformed.

&& hit.getFields().containsKey(IgnoredFieldMapper.NAME)
&& isFromDocValuesOnly(dataType) == false
&& dataType.isNumeric()) {
&& (dataType.isNumeric() || dataType == DataTypes.IP)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any cases where the type is relevant for ignore malformed is set and the return value should not be null?
It is currently available for numeric, date , geo-points and IP (https://www.elastic.co/guide/en/elasticsearch/reference/current/ignore-malformed.html).
I would argue that if it is set, null should be returned regardless of the data type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@costin Not 100% sure if null is always the answer. A null_value parameter, for example, doesn't apply in this case. If the parsing fails and ignore_malformed is true setting the null_value parameter will not return that value in this case. That field will still not be indexed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we don't know if the null value was set and thus return null. However my comment is regarding the data type, why does it matter?
If ignore malformed is set and the type is geo for example (which is not part of the if) the source data which is malformed is returned.
That is just like now for IP which this ticket addresses.
I think returning null is a safer choice the returning potentially invalid data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

geo_point is taken from docvalues, not from _source. shape doesn't seem to support docvalues and thus can only be extracted from _source. The same goes for keyword, date, datetime, scaled_float.
And in this case we already have the null value from docvalues and we don't need the _source to extract data.

But, I see your point about checking the type itself. I think there is no need for this check, the presence of the ignored section in the response is enough

@astefan
Copy link
Contributor Author

astefan commented Dec 21, 2020

@elasticmachine run elasticsearch-ci/1

@astefan astefan merged commit 106719f into elastic:master Dec 21, 2020
@astefan astefan deleted the ip_field_hit_extractor branch December 21, 2020 12:14
astefan added a commit to astefan/elasticsearch that referenced this pull request Dec 21, 2020
…lastic#66622)

Return null for any field that is present in the _ignored section of the
response, not only numerics and IPs.

(cherry picked from commit 106719f)
astefan added a commit to astefan/elasticsearch that referenced this pull request Dec 21, 2020
…lastic#66622)

Return null for any field that is present in the _ignored section of the
response, not only numerics and IPs.

(cherry picked from commit 106719f)
astefan added a commit that referenced this pull request Dec 21, 2020
…66622) (#66685)

Return null for any field that is present in the _ignored section of the
response, not only numerics and IPs.

(cherry picked from commit 106719f)
astefan added a commit that referenced this pull request Dec 21, 2020
…66622) (#66686)

Return null for any field that is present in the _ignored section of the
response, not only numerics and IPs.

(cherry picked from commit 106719f)
costin pushed a commit that referenced this pull request Dec 22, 2020
…66622) (#66685)

Return null for any field that is present in the _ignored section of the
response, not only numerics and IPs.

(cherry picked from commit 106719f)
(cherry picked from commit b27d167)
@costin
Copy link
Member

costin commented Dec 22, 2020

Pushed the fix into 7.11 branch as well.

@jakelandis jakelandis removed the v8.0.0 label Jul 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants