-
Notifications
You must be signed in to change notification settings - Fork 25.6k
ObjectParser: add method to parse inner objects with names matching a given predicate #24020
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ObjectParser: add method to parse inner objects with names matching a given predicate #24020
Conversation
tlrx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @cbuescher, I left some comments. Since this is really related to aggregation parsing, I guess it could have been opened against the feature branch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm bad at naming things but I'd call this fallbackFieldParser
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it confusing to have a fallback/unknown field parser and still be able to ignore unknown fields. I'd make these two mutually exclusive, ie not possible to define an unknown field parser if ignoreUnknownFields is set to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add a check for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but we do have to do both right? the logic should be: check whether the agg is a registered one by type, where the type is part of the name (type#name). If not, ignore the field (for forward compatibility).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super comfortable making this. I think maybe instead we should add the match skipping at the FieldParser level. Maybe some kind of subclass that skips or something. Not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an update that subclasses FieldParser but I found the previous solution more convincing tbh. I'd like to change it back if you don't have other ideas but added a commit to discuss it. Maybe I did it in a bad way.
javanna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that this solution addresses our parsing problems. We would rather need a way to match based on a prefix I guess, and still be able to ignore other fields that don't match anything else. With this solution I think we could declare a match all parser that calls parser.namedObject, but then we should be able to ignore unknown fields if there isn't a named object with such a name. It could also be that I don't see how we would use this, in that case please educate me :)
I was rather thinking that we'd need a name to declare a named object among other fields, while at the moment we only allow it under a specific object container which can only contain named objects of a specific type. Hope this makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but we do have to do both right? the logic should be: check whether the agg is a registered one by type, where the type is part of the name (type#name). If not, ignore the field (for forward compatibility).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this example is not that close to reality. we cannot really parse such an agg as we need its type in the name (type#total_number_of_ratings). That is what makes the field not unknown, which is why I am not convinced that the mechanism should be a match all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the example. The test should now cover a more complex case with inner objects, although I didn't use any special aggregation parsers in the test to keep them more simple.
e05a94b to
2a91a5e
Compare
When parsing aggregations and sun-aggregations we encounter cases where the aggregation name is written to the response as key for a new nested aggregation object on the same level that we want to parse other fields. Currently we cannot use ObjectParser in this situations because it requires some ParseField for matching the field name. This adds a new `declareUnknownFieldParser()` method to ObjectParser that can be used to declare one specific ContextParser that gets called if no other FieldParsers that have been declared match.
8b21a61 to
6a1169a
Compare
6a1169a to
59118cd
Compare
cbuescher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the PR to address some previous concerns. I renamed the new method to declareMatchFieldParser and it requires an additional Predicate argument now that needs to match the field name we are currently parsing to apply the provided ContextParser. Otherwise we fall back to the current ignoreUnknownFields logic. I also changed the test to reflect those cases.
javanna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a few comments, I would love if @nik9000 can have a look and see if this is generic enough to go to ObjectParser and be useful in other cases than aggregations parsing.
| */ | ||
| private final boolean ignoreUnknownFields; | ||
|
|
||
| /** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/precidate/predicate
| /** | ||
| * A special purpose field parser that gets used when the provided matchFieldPrecidate accepts it | ||
| */ | ||
| private FieldParser matchFieldParser; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of the predicate. It could for instance be something like registry.supports(name, Aggregation.class). This way we could also support ignoreUnknownFields for forward compatibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am trying to think if these can be a single field rather than two that are so tightly coupled. I couldn't come up with a good idea but maybe you will...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can combine these into an inner private struct-like class. I think they should be separate arguments in the declareMatchFieldParser method though, that way it's easier to use and we don't need to expose another class for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could for instance be something like
registry.supports(name, Aggregation.class).
Upon further thought, with the current signature (Predicate<String>) this would be difficult since the only information the function "sees" is the field name, and the registry is not visible when declaring the parser in the static initialization context. But I think we can postpone this extensions for later, this is useful as is, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BiPredicate<String, NamedXContentRegistry> ? is that ugly? :)
| /** | ||
| * Declares a parser for fields where the exact field name is not known when | ||
| * creating the ObjectParser. In order for this field name to match, it has | ||
| * to be accepted by a provided predicate As an example, in the aggregation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a punctuation mark after predicate?
| * the accepted values for this field | ||
| */ | ||
| public <T> void declareMatchFieldParser(BiConsumer<Value, T> consumer, ContextParser<Context, T> parser, | ||
| Predicate<String> fieldNameMatcher, ValueType type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename fieldNameMatcher to fieldNamePredicate?
| this.matchFieldParser = new MatchAllFieldParser((p, v, c) -> consumer.accept(v, parser.parse(p, c)), type); | ||
| } | ||
|
|
||
| private class MatchAllFieldParser extends FieldParser { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it still be called MatchAll although it parses only what matches a predicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, will think about a better name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not very original, but I renamed this to MatchfieldParser, similar to the method that it uses. Its not a public class so I think thats is okay.
| return new Tuple<>(name, value); | ||
| }, s -> s.contains("#"), ObjectParser.ValueType.STRING); | ||
|
|
||
| if (ignoreUnknown == true) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need == true ? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I might have added that for readablility but I don't like it now that I see it.
|
For an example of how this would be used in the context of aggregation parsing see https://github.com/elastic/elasticsearch/pull/24386/files#diff-76a2fb6a5f584d4c41b8f7046955bf6bR66 |
nik9000
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of this. I imagine there are other contexts where this might be useful in addition to aggregation parsing.
| Objects.requireNonNull(parser, "[parser] is required"); | ||
| Objects.requireNonNull(fieldNameMatcher, "[fieldNameMatcher] is required"); | ||
| Objects.requireNonNull(type, "[type] is required"); | ||
| this.matchFieldPredicate = fieldNameMatcher; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also require that this.matchFieldPredicate is not already set. For now, let's limit ourselves to only supporting one of these on each parser.
| } | ||
|
|
||
| /** | ||
| * test parsing fields with a random name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be more right to say that this tests support for parsing unknown fields.
javanna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM I am good with getting this in once the few minors you have are addressed
javanna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple of comments/ideas, but LGTM already.
| } | ||
| } | ||
|
|
||
| private class MatchFieldParser extends FieldParser { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why we need this class at this point. I guess it is because we don't provide a ParseField and the parseField.match part in assertSupports would not work. Would it be an idea to change parse field to always work with a predicate internally, and allow to pass the predicate in through a new constructor?
|
|
||
| /** | ||
| * A special purpose field parser that gets used when the provided matchFieldPredicate accepts it | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be final?
|
After discussing with @tlrx and @cbuescher we decided not to get this in for now. It seems like it would add some complexity to |
|
given that we completed the aggs parsing for the high level client and we didn't feel the need to get this PR in, shall we close this PR at this point? WDYT @cbuescher ? |
|
@javanna sure, I still have the code around if needed for something else later on |
When parsing aggregations and sub-aggregations we encounter cases where the
aggregation name is written to the response as key for a new nested aggregation
object on the same level that we want to parse other fields. Currently we cannot
use ObjectParser in this situations because it requires some ParseField for
matching the field name.
This adds a new
declareUnknownFieldParser()method to ObjectParser that can beused to declare one specific ContextParser that gets called if no other
FieldParsers that have been declared match.