Introduce XContentParser#namedObject #22003
.flatMap(Function.identity()).collect(Collectors.toList());
final NamedWriteableRegistry namedWriteableRegistry = new NamedWriteableRegistry(namedWriteables);
NamedXContentRegistry xContentRegistry = new NamedXContentRegistry(Stream.of(
    searchModule.getNamedXContents().stream()
I use this funny looking construct to match with the constructs above. It'll eventually have more than one module it has to stream, just like above.
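The construct described in this comment can be sketched in isolation. This is a minimal illustration, assuming plain `String` entries and a hypothetical second source of entries (`otherEntries`); it is not the code from the PR.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only.
final class FlattenSketch {
    // Stream.of(...) produces a stream of streams, one per module;
    // flatMap(Function.identity()) flattens them into a single stream.
    // Adding another module later only means adding another argument.
    static List<String> collectEntries(List<String> searchEntries, List<String> otherEntries) {
        return Stream.of(
                searchEntries.stream(),
                otherEntries.stream())
            .flatMap(Function.identity())
            .collect(Collectors.toList());
    }
}
```

With a single argument to `Stream.of` the construct looks redundant, which is presumably why the comment calls it funny looking; it pays off once a second module is streamed.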
b.bind(IndicesQueriesRegistry.class).toInstance(searchModule.getQueryParserRegistry());
b.bind(SearchRequestParsers.class).toInstance(searchModule.getSearchRequestParsers());
b.bind(SearchExtRegistry.class).toInstance(searchModule.getSearchExtRegistry());
b.bind(NamedXContentRegistry.class).toInstance(xContentRegistry);
I hate binding another thing but in subsequent PRs I'll use this to remove a bunch of things.
I think at least @javanna, @rjernst, @martijnvg and maybe others might have an interest in this.
I love the concept; it has been a pain to deal with those registries like QueryParserRegistry when trying to deguice the rest side of things. I left some comments about wrapping, as I think it makes things confusing and will be used wrong (hence all the need for the extra checks you have on passing wrapped vs unwrapped parsers).
They didn't stick. Do you want to make the registry required for building the XContentParser or something like that?
 */
public NamedXContentRegistry(List<Entry> entries) {
    Map<Class<?>, Map<String, Entry>> registry = new HashMap<>();
    for (Entry entry : entries) {
Can you make this algorithm the same as that in the NamedWriteableRegistry? I think it is cleaner (it does not require replacing the inner maps to make them unmodifiable).
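One way to build such a map of maps without going back to replace the inner maps is to sort the entries by category first, so each inner map is complete before it is stored. The sketch below is an assumption about the general shape of that approach, not a copy of `NamedWriteableRegistry`; the `Entry` class here is a hypothetical, stripped-down stand-in.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and fields are illustrative only.
final class RegistrySketch {
    static final class Entry {
        final Class<?> categoryClass;
        final String name;

        Entry(Class<?> categoryClass, String name) {
            this.categoryClass = categoryClass;
            this.name = name;
        }
    }

    static Map<Class<?>, Map<String, Entry>> build(List<Entry> entries) {
        // Sort a copy so that all entries of one category are adjacent.
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparing((Entry e) -> e.categoryClass.getName()));

        Map<Class<?>, Map<String, Entry>> registry = new HashMap<>();
        Map<String, Entry> current = null;
        Class<?> currentCategory = null;
        for (Entry entry : sorted) {
            if (currentCategory != entry.categoryClass) {
                // The previous category is complete: freeze it exactly once.
                if (currentCategory != null) {
                    registry.put(currentCategory, Collections.unmodifiableMap(current));
                }
                current = new HashMap<>();
                currentCategory = entry.categoryClass;
            }
            if (current.put(entry.name, entry) != null) {
                throw new IllegalArgumentException("duplicate entry [" + entry.name
                    + "] for [" + entry.categoryClass.getName() + "]");
            }
        }
        if (currentCategory != null) {
            registry.put(currentCategory, Collections.unmodifiableMap(current));
        }
        return Collections.unmodifiableMap(registry);
    }
}
```

Each inner map is wrapped as unmodifiable at the moment it is inserted, so no second pass over the outer map is needed.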
 * Wrap an {@link XContentParser} in one that implements {@link XContentParser#namedObject(Class, String, ParseFieldMatcherSupplier)}
 * against this registry.
 */
public XContentParser wrap(XContentParser parser) {
Why isn't this inside xcontentfactory.createParser? In fact, why do we need a wrapper at all? Can't this just be native methods on XContentParser?
I talked with @rjernst over another channel. I'm going to implement the suggestion to use NamedWriteableRegistry's algorithm for building the map of maps. I'm going to try and make the registry a required parameter for building the XContentParser. That'll balloon this PR significantly but this is the right time for it. I'll try and take some shortcuts, mostly moving where the parsers are built so we don't have thousands of spots where we build the parsers. That also means I'll remove all the methods called
To get #22003 in cleanly we need to centralize as much `XContentParser` creation as possible into `RestRequest`. That'll mean we have to plumb the `NamedXContentRegistry` into fewer places. This removes `RestAction.hasBody`, `RestAction.guessBodyContentType`, and `RestActions.getRestContent`, moving callers over to `RestRequest.hasContentOrSourceParam`, `RestRequest.contentOrSourceParam`, `RestRequest.contentOrSourceParamParser`, and `RestRequest.withContentOrSourceParamParserOrNull`. The idea is to use `withContentOrSourceParamParserOrNull` if you need to handle requests without any sort of body content and to use `contentOrSourceParamParser` otherwise.

I believe the vast majority of this PR to be purely mechanical but I know I've made the following behavioral changes (I'll add more if I think of more):
* If you make a request to an endpoint that requires a request body and has cut over to the new APIs, instead of getting `Failed to derive xcontent` you'll get `Body required`.
* Template parsing is now non-strict by default. This is important because we need to be able to deprecate things without requests failing.
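The difference between the two access patterns can be sketched with a small stand-in. Everything here is hypothetical: `RequestSketch` is not `RestRequest`, it hands out raw strings rather than parsers, and the exception type and the precedence of body over `source` parameter are assumptions made for the example.

```java
import java.util.function.Function;

// Hypothetical stand-in for a REST request; not the real RestRequest class.
final class RequestSketch {
    private final String body;        // request body, may be null or empty
    private final String sourceParam; // value of a `source` URL parameter, may be null

    RequestSketch(String body, String sourceParam) {
        this.body = body;
        this.sourceParam = sourceParam;
    }

    private boolean hasBody() {
        return body != null && !body.isEmpty();
    }

    boolean hasContentOrSourceParam() {
        return hasBody() || sourceParam != null;
    }

    // For endpoints that require content: fails with "Body required" when there is none.
    String contentOrSourceParam() {
        if (!hasContentOrSourceParam()) {
            throw new IllegalStateException("Body required");
        }
        // Assumption: the body wins over the source parameter when both are present.
        return hasBody() ? body : sourceParam;
    }

    // For endpoints where content is optional: the handler receives null when there is none.
    <R> R withContentOrSourceParamOrNull(Function<String, R> handler) {
        return handler.apply(hasContentOrSourceParam() ? contentOrSourceParam() : null);
    }
}
```

An endpoint with a sensible default would use the `OrNull` variant and substitute the default when it receives `null`; an endpoint that cannot work without content would call the strict variant and let the error propagate.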
9f046f9 to
55b24bb
Compare
OK! I've got the code compiling. I've taken a bunch of shortcuts and used NamedXContentRegistry.EMPTY in a bunch of places where it probably can't stay forever. I'm now working to get the tests passing. Once they pass I'll update.
1f067ab to
be1dad4
Compare
Everything is passing locally!
imotov left a comment
Left some comments
When parsing stored metadata I will need a way to distinguish between an unknown custom element and an error parsing such an element. The former typically occurs when Elasticsearch starts after a plugin with custom metadata was removed from the cluster, and can therefore be ignored; the latter means that something went very wrong and we cannot ignore it. I think this warrants a separate exception.
Unfortunately we have to be lenient when reading cluster state with custom metadata...
(that is why we have all these lookupPrototypeSafe(...) and lookupPrototype(...) static methods on ClusterState, Metadata and IndexMetadata)
NamedXContentRegistry.EMPTY is currently used in two cases: 1) when we use XContentParser as a lexer to convert XContent from one format to another or to convert it into a map, and 2) where XContentParser feeds into a parser that doesn't use named objects at the moment. As we discussed yesterday, it would be great to distinguish between these two cases by moving all uses in the first category into helper methods.
@martijnvg and @imotov: for the unknown custom prototype stuff, would you prefer a new subclass of
@imotov I pushed some commits that add an explanation comment for every non-test usage of
@nik9000 @martijnvg the reason I asked to add another exception is that I think a missing parser is still an error, but we react to this error differently than to a corrupted xcontent stream that we cannot parse. Returning Optional would work for me, but I think it wouldn't be semantically correct, since in my mind it would indicate that the object doesn't exist (rather than that the object exists but we cannot read it).
I'm happy to make a new exception!
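The distinction discussed above, a missing parser versus a genuine parse failure, can be sketched as follows. All names are hypothetical (the thread does not name the new exception), and parsers are modeled as plain functions over strings to keep the example self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of lenient reading of custom metadata.
final class LenientCustomsSketch {
    // Hypothetical exception: thrown only when no parser is registered under the name.
    static final class UnknownNameException extends RuntimeException {
        UnknownNameException(String name) {
            super("no parser registered for [" + name + "]");
        }
    }

    static Object parseNamed(Map<String, Function<String, Object>> parsers, String name, String raw) {
        Function<String, Object> parser = parsers.get(name);
        if (parser == null) {
            throw new UnknownNameException(name);
        }
        // Any exception thrown here is a real parse failure and is not wrapped.
        return parser.apply(raw);
    }

    static Map<String, Object> readCustoms(Map<String, Function<String, Object>> parsers,
                                           Map<String, String> stored) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : stored.entrySet()) {
            try {
                result.put(e.getKey(), parseNamed(parsers, e.getKey(), e.getValue()));
            } catch (UnknownNameException ignored) {
                // Lenient case: e.g. the plugin that wrote this element was removed.
            }
        }
        return result;
    }
}
```

Because only the dedicated exception type is caught, a corrupted element for a known name still fails loudly, which an `Optional` return would not express.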
@imotov, can you have another look?
imotov left a comment
Left a couple of minor comments. Otherwise LGTM.
I am fairly confused about what's going on here. Could you add a comment or explain here?
It is in.
Yah. I'll rebase before merging and use it.
Actually, I will need the registry here, but for this PR, it's probably OK :)
EMPTY here is good though.
I think one of these nulls should be xContentRegistry. I will actually need it there.
and most likely here
xContentRegistry is not used in this method
Removed!
Introduces `XContentParser#namedObject`, which works a little like `StreamInput#readNamedWriteable`: on startup components register parsers under names and a superclass. At runtime we look up the parser and call it to parse the object.

Right now the parsers take a `context` object they use to help with the parsing, but I hope to be able to eliminate the need for this context, as most of what it is used for at this point is to move around parser registries, which should be replaced by this method eventually. I make no effort to do so in this PR because it is big enough already. This is meant to be a start down a road that allows us to remove classes like `QueryParseContext`, `AggregatorParsers`, `IndicesQueriesRegistry`, and `ParseFieldRegistry`.

The goal here is to reduce the amount of plumbing required to allow parsing pluggable things. With this you don't have to pass registries all over the place. Instead you must pass a super registry to fewer places and use it to wrap the reader. This is the same tradeoff that we use for NamedWriteable and it allows much, much simpler binary serialization. We think we want that same thing for xcontent serialization.

The only parsing actually converted to this method is parsing `ScoreFunctions` inside of `FunctionScoreQuery`. I chose this because it is relatively self-contained.
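The register-then-look-up idea can be sketched outside of Elasticsearch. This is a minimal illustration under stated assumptions: `NamedObjectSketch` is a hypothetical class, parsers are modeled as functions from a raw string to an object, and the real method's context argument is omitted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a registry keyed by (category class, name).
final class NamedObjectSketch {
    private final Map<Class<?>, Map<String, Function<String, ?>>> parsers = new HashMap<>();

    // Called at startup: register a parser under a superclass ("category") and a name.
    <T> void register(Class<T> category, String name, Function<String, ? extends T> parser) {
        Map<String, Function<String, ?>> byName = parsers.computeIfAbsent(category, c -> new HashMap<>());
        if (byName.putIfAbsent(name, parser) != null) {
            throw new IllegalArgumentException("parser [" + name + "] already registered for ["
                + category.getName() + "]");
        }
    }

    // Called at runtime: look up the parser by category and name, then run it.
    <T> T namedObject(Class<T> category, String name, String source) {
        Map<String, Function<String, ?>> byName = parsers.get(category);
        Function<String, ?> parser = byName == null ? null : byName.get(name);
        if (parser == null) {
            throw new IllegalArgumentException("unknown [" + category.getName() + "] named [" + name + "]");
        }
        // The cast is safe because register() only accepts parsers producing subtypes of T.
        return category.cast(parser.apply(source));
    }
}
```

A caller that only holds the registry can parse any registered kind of object, for example `registry.namedObject(Number.class, "int", "42")`, without being handed a dedicated per-type registry.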
09e2aec to
c7d8860
Compare
Removes `AggregatorParsers`, replacing all of its functionality with `XContentParser#namedObject`. This is the third bit of payoff from elastic#22003, one less thing to pass around the entire application.