Skip to content

Conversation

@nik9000
Copy link
Member

@nik9000 nik9000 commented Dec 6, 2016

Introduces XContentParser#namedObject which works a little like StreamInput#readNamedWriteable: on startup components register parsers under names and a superclass. At runtime we look up the parser and call it to parse the object.

Right now the parsers take a context object they use to help with the parsing but I hope to be able to eliminate the need for this context as most what it is used for at this point is to move around parser registries which should be replaced by this method eventually. I make no effort to do so in this PR because it is big enough already. This is meant to the a start down a road that allows us to remove classes like QueryParseContext, AggregatorParsers, IndicesQueriesRegistry, and ParseFieldRegistry.

The goal here is to reduce the amount of plumbing required to allow parsing pluggable things. With this you don't have to pass registries all over the place. Instead you must pass a super registry to fewer places and use it to wrap the reader. This is the same tradeoff that we use for NamedWriteable and it allows much, much simpler binary serialization. We think we want that same thing for xcontent serialization.

The only parsing actually converted to this method is parsing ScoreFunctions inside of FunctionScoreQuery. I chose this because it is relatively self contained.

.flatMap(Function.identity()).collect(Collectors.toList());
final NamedWriteableRegistry namedWriteableRegistry = new NamedWriteableRegistry(namedWriteables);
NamedXContentRegistry xContentRegistry = new NamedXContentRegistry(Stream.of(
searchModule.getNamedXContents().stream()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use this funny looking construct to match with the constructs above. It'll eventually have more than one module it has to stream, just like above.

b.bind(IndicesQueriesRegistry.class).toInstance(searchModule.getQueryParserRegistry());
b.bind(SearchRequestParsers.class).toInstance(searchModule.getSearchRequestParsers());
b.bind(SearchExtRegistry.class).toInstance(searchModule.getSearchExtRegistry());
b.bind(NamedXContentRegistry.class).toInstance(xContentRegistry);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate binding another thing but in subsequent PRs I'll use this to remove a bunch of things.

@nik9000 nik9000 added review and removed WIP labels Dec 6, 2016
@nik9000
Copy link
Member Author

nik9000 commented Dec 6, 2016

I think at least @javanna, @rjernst, @martijnvg and maybe others might have an interest in this.

@rjernst
Copy link
Member

rjernst commented Dec 6, 2016

I love the concept; it has been a pain to deal with those registries like QueryParserRegistry when trying to deguice the rest side of things. I left some comments about wrapping, as I think it makes things confusing and will be used wrong (hence all the need for the extra checks you have on passing wrapped vs unwrapped parsers).

@nik9000
Copy link
Member Author

nik9000 commented Dec 6, 2016

I left some comments about wrapping

They didn't stick. Do you want to make the registry required for building the XContentParser or something like that?

*/
public NamedXContentRegistry(List<Entry> entries) {
Map<Class<?>, Map<String, Entry>> registry = new HashMap<>();
for (Entry entry : entries) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this algorithm the same as that in the NamedWriteableRegistry? I think it is cleaner (does not require replacing the inner maps to make them unmodifiable.

* Wrap an {@link XContentParser} in one that implements {@link XContentParser#namedObject(Class, String, ParseFieldMatcherSupplier)}
* against this registry.
*/
public XContentParser wrap(XContentParser parser) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this inside xcontentfactory.creatParser? In fact why do we need a wrapper at all? Can't this just be native methods on XContentParser?

@nik9000
Copy link
Member Author

nik9000 commented Dec 6, 2016

I talked with @rjernst over another channel. I'm going to implement the suggestion to use NamedWriteableRegistry's algorithm for building the map of maps. I'm going to try and make the registry a required parameter for building the XContentParser. That'll balloon this PR significantly but this is the right time for it. I'll try and take some shortcuts, mostly moving where the parsers are built so we don't have thousands of spots where we build the parsers. That also means I'll remove all the methods called wrap.

@nik9000 nik9000 added WIP and removed review labels Dec 6, 2016
nik9000 added a commit that referenced this pull request Dec 10, 2016
To get #22003 in cleanly we need to centralize as much `XContentParser` creation as possible into `RestRequest`. That'll mean we have to plumb the `NamedXContentRegistry` into fewer places.

This removes `RestAction.hasBody`, `RestAction.guessBodyContentType`, and `RestActions.getRestContent`, moving callers over to `RestRequest.hasContentOrSourceParam`, `RestRequest.contentOrSourceParam`, and `RestRequest.contentOrSourceParamParser` and `RestRequest.withContentOrSourceParamParserOrNull`. The idea is to use `withContentOrSourceParamParserOrNull` if you need to handle requests without any sort of body content and to use `contentOrSourceParamParser` otherwise.

I believe the vast majority of this PR to be purely mechanical but I know I've made the following behavioral change (I'll add more if I think of more):
* If you make a request to an endpoint that requires a request body and has cut over to the new APIs instead of getting `Failed to derive xcontent` you'll get `Body required`.
* Template parsing is now non-strict by default. This is important because we need to be able to deprecate things without requests failing.
@nik9000 nik9000 changed the title Introduce XContentParser#namedXObject Introduce XContentParser#namedObject Dec 16, 2016
@nik9000
Copy link
Member Author

nik9000 commented Dec 16, 2016

OK! I've got the code compiling. I've taken a bunch of shortcuts, used NamedXContentRegistry.EMPTY in a bunch of places where it probably can't be forever. I'm now working to get the tests passing. Once they pass I'll update.

@nik9000
Copy link
Member Author

nik9000 commented Dec 16, 2016

Everything is passing locally!

Copy link
Contributor

@imotov imotov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When parsing stored metadata I will need a way to distinguish between unknown custom element and an error parsing such element. The former typically comes when elasticsearch starts after a plugin with custom metadata was removed from the cluster and therefore can be ignored and the latter means that something got very wrong and we cannot ignore it. I think this warrants some other exception.

Copy link
Member

@martijnvg martijnvg Dec 19, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately we have to be lenient when reading cluster state with custom metadata...

(that is why we have all these lookupPrototypeSafe(...) and lookupPrototype(...) static methods on ClusterState, Metadata and IndexMetadata)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NamedXContentRegistry.EMPTY is currently used in two use cases. 1) when we use XContentParser as a lexer to convert XContent from one format to another or convert it into map and 2) where XContentParser is feeding into a parser, which doesn't used named objects at the moment. As we discussed yesterday, it would be great to distinguish between these two use cases by moving all uses in the first category into helper methods.

@nik9000
Copy link
Member Author

nik9000 commented Dec 19, 2016

@martijnvg and @imotov: for the unknown custom prototype stuff, would you prefer a new subclass of ParsingException or another version of namedObject that returns Optional and is empty when it can't find the right parser? I kind of prefer the former because that means I only need one method.

@nik9000
Copy link
Member Author

nik9000 commented Dec 19, 2016

@imotov I pushed some commits that add an explanation comment for every non-test usage of NamedXContent.EMPTY and removes many of the usages.

@imotov
Copy link
Contributor

imotov commented Dec 19, 2016

@nik9000 @martijnvg the reason I asked to add another exception is because I think that a missing parser is still an error, but we react to this error in a different way comparing to the corrupted xcontent stream that we cannot parse. Returning Optional would work for me, but I think it wouldn't be semantically correct since in my mind it would indicate that object doesn't exist (rather than object exits but we cannot read it)

@nik9000
Copy link
Member Author

nik9000 commented Dec 19, 2016

I'm happy to make a new exception!

@nik9000
Copy link
Member Author

nik9000 commented Dec 19, 2016

@imotov, can you have another look?

Copy link
Contributor

@imotov imotov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple of minor comments. Otherwise LGTM.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fairly confused about what's going on here. Could you add a comment or explain here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is in.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah. I'll rebase before merging and use it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I will need the registry here, but for this PR, it's probably OK :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EMPTY here is good though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one of these nulls should be xContentRegistry. I will actually need it there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and most likely here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xContentRegistry is not used in this method

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed!

Introduces `XContentParser#namedObject which works a little like
`StreamInput#readNamedWriteable`: on startup components register
parsers under names and a superclass. At runtime we look up the
parser and call it to parse the object.

Right now the parsers take a context object they use to help with
the parsing but I hope to be able to eliminate the need for this
context as most what it is used for at this point is to move
around parser registries which should be replaced by this method
eventually. I make no effort to do so in this PR because it is
big enough already. This is meant to the a start down a road that
allows us to remove classes like `QueryParseContext`,
`AggregatorParsers`, `IndicesQueriesRegistry`, and
`ParseFieldRegistry`.

The goal here is to reduce the amount of plumbing required to
allow parsing pluggable things. With this you don't have to pass
registries all over the place. Instead you must pass a super
registry to fewer places and use it to wrap the reader. This is
the same tradeoff that we use for NamedWriteable and it allows
much, much simpler binary serialization. We think we want that
same thing for xcontent serialization.

The only parsing actually converted to this method is parsing
`ScoreFunctions` inside of `FunctionScoreQuery`. I chose this
because it is relatively self contained.
@nik9000 nik9000 merged commit a04dcfb into elastic:master Dec 20, 2016
@nik9000
Copy link
Member Author

nik9000 commented Dec 20, 2016

Thanks for reviewing @imotov! I've merged to:

master: a04dcfb
5.x: 08167d2

nik9000 added a commit that referenced this pull request Dec 20, 2016
Introduces `XContentParser#namedObject which works a little like
`StreamInput#readNamedWriteable`: on startup components register
parsers under names and a superclass. At runtime we look up the
parser and call it to parse the object.

Right now the parsers take a context object they use to help with
the parsing but I hope to be able to eliminate the need for this
context as most what it is used for at this point is to move
around parser registries which should be replaced by this method
eventually. I make no effort to do so in this PR because it is
big enough already. This is meant to the a start down a road that
allows us to remove classes like `QueryParseContext`,
`AggregatorParsers`, `IndicesQueriesRegistry`, and
`ParseFieldRegistry`.

The goal here is to reduce the amount of plumbing required to
allow parsing pluggable things. With this you don't have to pass
registries all over the place. Instead you must pass a super
registry to fewer places and use it to wrap the reader. This is
the same tradeoff that we use for NamedWriteable and it allows
much, much simpler binary serialization. We think we want that
same thing for xcontent serialization.

The only parsing actually converted to this method is parsing
`ScoreFunctions` inside of `FunctionScoreQuery`. I chose this
because it is relatively self contained.
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jan 9, 2017
Removes `AggregatorParsers`, replacing all of its functionality with
`XContentParser#namedObject`.

This is the third bit of payoff from elastic#22003, one less thing to pass
around the entire application.
nik9000 added a commit that referenced this pull request Jan 9, 2017
Removes `AggregatorParsers`, replacing all of its functionality with
`XContentParser#namedObject`.

This is the third bit of payoff from #22003, one less thing to pass
around the entire application.
nik9000 added a commit that referenced this pull request Jan 9, 2017
Removes `AggregatorParsers`, replacing all of its functionality with
`XContentParser#namedObject`.

This is the third bit of payoff from #22003, one less thing to pass
around the entire application.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants