Skip to content

Conversation

@cbuescher
Copy link
Member

@cbuescher cbuescher commented Apr 10, 2017

When parsing aggregations and sub-aggregations we encounter cases where the
aggregation name is written to the response as key for a new nested aggregation
object on the same level that we want to parse other fields. Currently we cannot
use ObjectParser in this situations because it requires some ParseField for
matching the field name.

This adds a new declareUnknownFieldParser() method to ObjectParser that can be
used to declare one specific ContextParser that gets called if no other
FieldParsers that have been declared match.

Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cbuescher, I left some comments. Since this is really related to aggregation parsing, I guess it could have been opened against the feature branch?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm bad at naming things but I'd call this fallbackFieldParser

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it confusing to have a fallback/unknown field parser and still be able to ignore unknown fields. I'd make these two mutually exclusive, ie not possible to define an unknown field parser if ignoreUnknownFields is set to true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a check for this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we do have to do both right? the logic should be: check whether the agg is a registered one by type, where the type is part of the name (type#name). If not, ignore the field (for forward compatibility).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super comfortable making this. I think maybe instead we should add the match skipping at the FieldParser level. Maybe some kind of subclass that skips or something. Not sure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an update that subclasses FieldParser but I found the previous solution more convincing tbh. I'd like to change it back if you don't have other ideas but added a commit to discuss it. Maybe I did it in a bad way.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that this solution addresses our parsing problems. We would rather need a way to match based on a prefix I guess, and still be able to ignore other fields that don't match anything else. With this solution I think we could declare a match all parser that calls parser.namedObject, but then we should be able to ignore unknown fields if there isn't a named object with such a name. It could also be that I don't see how we would use this, in that case please educate me :)

I was rather thinking that we'd need a name to declare a named object among other fields, while at the moment we only allow it under a specific object container which can only contain named objects of a specific type. Hope this makes sense.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we do have to do both right? the logic should be: check whether the agg is a registered one by type, where the type is part of the name (type#name). If not, ignore the field (for forward compatibility).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this example is not that close to reality. we cannot really parse such an agg as we need its type in the name (type#total_number_of_ratings). That is what makes the field not unknown, which is why I am not convinced that the mechanism should be a match all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the example. The test should now cover a more complex case with inner objects, although I didn't use any special aggregation parsers in the test to keep them more simple.

When parsing aggregations and sun-aggregations we encounter cases where the
aggregation name is written to the response as key for a new nested aggregation
object on the same level that we want to parse other fields. Currently we cannot
use ObjectParser in this situations because it requires some ParseField for
matching the field name.

This adds a new `declareUnknownFieldParser()` method to ObjectParser that can be
used to declare one specific ContextParser that gets called if no other
FieldParsers that have been declared match.
@cbuescher cbuescher force-pushed the addUnknownFieldParsing-ObjectParser branch 5 times, most recently from 8b21a61 to 6a1169a Compare May 9, 2017 09:58
@cbuescher cbuescher force-pushed the addUnknownFieldParsing-ObjectParser branch from 6a1169a to 59118cd Compare May 9, 2017 10:14
Copy link
Member Author

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the PR to address some previous concerns. I renamed the new method to declareMatchFieldParser and it requires an additional Predicate argument now that needs to match the field name we are currently parsing to apply the provided ContextParser. Otherwise we fall back to the current ignoreUnknownFields logic. I also changed the test to reflect those cases.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments, I would love if @nik9000 can have a look and see if this is generic enough to go to ObjectParser and be useful in other cases than aggregations parsing.

*/
private final boolean ignoreUnknownFields;

/**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/precidate/predicate

/**
* A special purpose field parser that gets used when the provided matchFieldPrecidate accepts it
*/
private FieldParser matchFieldParser;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of the predicate. It could for instance be something like registry.supports(name, Aggregation.class). This way we could also support ignoreUnknownFields for forward compatibility

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to think if these can be a single field rather than two that are so tightly coupled. I couldn't come up with a good idea but maybe you will...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can combine these into an inner private struct-like class. I think they should be separate arguments in the declareMatchFieldParser method though, that way it's easier to use and we don't need to expose another class for this.

Copy link
Member Author

@cbuescher cbuescher May 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could for instance be something like registry.supports(name, Aggregation.class).

Upon further thought, with the current signature (Predicate<String>) this would be difficult since the only information the function "sees" is the field name, and the registry is not visible when declaring the parser in the static initialization context. But I think we can postpone this extensions for later, this is useful as is, wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BiPredicate<String, NamedXContentRegistry> ? is that ugly? :)

/**
* Declares a parser for fields where the exact field name is not known when
* creating the ObjectParser. In order for this field name to match, it has
* to be accepted by a provided predicate As an example, in the aggregation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a punctuation mark after predicate?

* the accepted values for this field
*/
public <T> void declareMatchFieldParser(BiConsumer<Value, T> consumer, ContextParser<Context, T> parser,
Predicate<String> fieldNameMatcher, ValueType type) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename fieldNameMatcher to fieldNamePredicate?

this.matchFieldParser = new MatchAllFieldParser((p, v, c) -> consumer.accept(v, parser.parse(p, c)), type);
}

private class MatchAllFieldParser extends FieldParser {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it still be called MatchAll although it parses only what matches a predicate?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, will think about a better name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very original, but I renamed this to MatchfieldParser, similar to the method that it uses. Its not a public class so I think thats is okay.

return new Tuple<>(name, value);
}, s -> s.contains("#"), ObjectParser.ValueType.STRING);

if (ignoreUnknown == true) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need == true ? :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I might have added that for readablility but I don't like it now that I see it.

@cbuescher
Copy link
Member Author

For an example of how this would be used in the context of aggregation parsing see https://github.com/elastic/elasticsearch/pull/24386/files#diff-76a2fb6a5f584d4c41b8f7046955bf6bR66

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of this. I imagine there are other contexts where this might be useful in addition to aggregation parsing.

Objects.requireNonNull(parser, "[parser] is required");
Objects.requireNonNull(fieldNameMatcher, "[fieldNameMatcher] is required");
Objects.requireNonNull(type, "[type] is required");
this.matchFieldPredicate = fieldNameMatcher;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also require that this.matchFieldPredicate is not already set. For now, let's limit ourselves to only supporting one of these on each parser.

}

/**
* test parsing fields with a random name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be more right to say that this tests support for parsing unknown fields.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM I am good with getting this in once the few minors you have are addressed

@javanna javanna added v5.5.0 and removed v5.4.1 labels May 9, 2017
Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments/ideas, but LGTM already.

}
}

private class MatchFieldParser extends FieldParser {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we need this class at this point. I guess it is because we don't provide a ParseField and the parseField.match part in assertSupports would not work. Would it be an idea to change parse field to always work with a predicate internally, and allow to pass the predicate in through a new constructor?


/**
* A special purpose field parser that gets used when the provided matchFieldPredicate accepts it
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be final?

@javanna javanna changed the title Add method to parse inner objects with unknown field names to ObjectParser ObjectParser: add method to parse inner objects with names matching a given predicate May 10, 2017
@javanna
Copy link
Member

javanna commented May 10, 2017

After discussing with @tlrx and @cbuescher we decided not to get this in for now. It seems like it would add some complexity to ObjectParser, for something that is not that complicated to do when manually parsing aggregations. We may revisit this decision in the future, especially if more users of this pop up.

@javanna
Copy link
Member

javanna commented Jun 9, 2017

given that we completed the aggs parsing for the high level client and we didn't feel the need to get this PR in, shall we close this PR at this point? WDYT @cbuescher ?

@cbuescher
Copy link
Member Author

@javanna sure, I still have the code around if needed for something else later on

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants