Skip to content

Conversation

@tlrx
Copy link
Member

@tlrx tlrx commented Jun 15, 2017

This commit changes the parsing logic of DocWriteResponse, ReplicationResponse and GetResult so that it skips any unknown additional fields (for forward compatibility reasons). This affects the IndexResponse, UpdateResponse, DeleteResponse and GetResponse objects.

@tlrx tlrx added :Java High Level REST Client >test Issues or PRs that are addressing/adding tests v5.6.0 v6.0.0 labels Jun 15, 2017
@tlrx tlrx requested review from cbuescher and javanna June 15, 2017 15:26
Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM besides the suggestions I left

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: shall we be more accurate in verifying that the current token is a start_array? anyways skipChildren won't have any effect otherwise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it will avoid to execute a noop since the verification will be done anyway by Jackson

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above: shall we be more accurate in checking what the current token is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here too, same as above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we are making this parser too lenient with this change. it's ok to skip what we don't know about, but maybe we can avoid returning null as the array with at least a value is required?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this is too lenient and also breaks the current GET behavior: the high level client ignores the 404 response code and tries to parse the response body again. If the parsing is too lenient then a potential "error" is just skipped and that's wrong. I'm looking at this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the code to make it throw a ParsingException when the found field has not been parsed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we can avoid having this null check, related to comment above

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I left two minor suggestions about comments in the tests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to the great comment. I still had to look up why we cannot insert into the root GetResponse and realized its the metadata which can be any values. Can you add this little piece of info to the comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I rephrased that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: maybe also explain the last exception rult, why inserting into the nested GetResult doesn't work here.

@tlrx
Copy link
Member Author

tlrx commented Jun 16, 2017

Thanks @cbuescher @javanna for your comments. I updated the code.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few more comments, nothing major though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for adding this comment

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this is too much? we do expect each field under fields to contain an array of values I'd say. without this change parseStoredFieldsValue would throw exception if it finds something that it doesn't support, which is probably ok?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also thought looking below that we are not testing this at the moment, maybe I am wrong.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm reverting this change. The parseStoredFieldsValue() is strict but not tested and it looks like it does not handle correctly the null value that can be printed out by MappedTypeField sub classes. I'm not sure it's a real case be we should test that too. I'll add a test for parseStoredFieldsValue() as a follow up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_index, _type and _id are required too right? Shall we check them too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also add a comment on why we need to throw this exception?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we can add them all. I added a comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wrong, at this stage and when parsing an UpdateResponse the index/type/id/version fields can be missing because they are printed out by the UpdateResponse itself...

I removed the checks from this method and add them in the GetResponse.fromXContent() where I think they should be located. This way we keep the same behavior (ie throwing an exception when fields are missing) but that would impact only users of the GetResponse.fromXContent() method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which makes me wonder...shouldn't we do the same in the update api with 404? I'm afraid so... cause if the doc is missing we do return 404?

Copy link
Member Author

@tlrx tlrx Jun 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the Update API, a 404 is returned with an exception in the response body. The high level rest client doesn't ignore 404s but it's okay because the response body will always be an exception.

The Get API is a bit different, when the document doesn't exists it returns a 404 but with a response body that is not always an exception. So the client has to try to parse that doc, fails, and then parses it as an exception and got the right result.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok phew sounds good :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the path.endsWith("get") could be the path.endsWith(".get") ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed this to path.contains("get") which is equivalent to
path.contains("get.fields") || path.contains("get._source") || path.endsWith("get")

@tlrx
Copy link
Member Author

tlrx commented Jun 16, 2017

@javanna I updated again the code, moving few things around. Hopefully we're good now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the case when we get a 404 back, which can be parsed as a normal get response with found set to false, or as an elasticsearch exception. The caller of this method needs a way to figure out whether we got back a valid get response, which can be done by catching ParsingException.

Maybe some of this should be in the javadocs of this method.

Copy link
Member

@javanna javanna Jun 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

funny that I would have instead done

if (getResult.getIndex() == null && getResult.getType() == null && getResult.getId() == null)

as this seems to be a stronger condition :) what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason why we don't shuffle stuff here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to shuffle here (though we could) because we just want to check that the parsed update response, when printed out as xcontent and parsed back as a map, has the exact same output as what we expect. I hoped the comment was clear enough...

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few very minors, LGTM though

tlrx added 5 commits June 19, 2017 10:14
This commit changes the parsing logic of DocWriteResponse, ReplicationResponse
and GetResult so that it skips any unknown additional fields (for forward
compatibility reasons). This affects the IndexResponse, UpdateResponse,
DeleteResponse and GetResponse objects.
@tlrx tlrx force-pushed the relax-parsing-logic-docwriteresponses branch from 15fd2c7 to 1aec207 Compare June 19, 2017 08:14
@tlrx
Copy link
Member Author

tlrx commented Jun 19, 2017

@elasticmachine retest this please

@tlrx tlrx merged commit e4f4886 into elastic:master Jun 19, 2017
tlrx added a commit that referenced this pull request Jun 19, 2017
This commit changes the parsing logic of DocWriteResponse, ReplicationResponse
and GetResult so that it skips any unknown additional fields (for forward compatibility
reasons). This affects the IndexResponse, UpdateResponse,DeleteResponse and
GetResponse objects.
@tlrx tlrx deleted the relax-parsing-logic-docwriteresponses branch June 19, 2017 12:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>test Issues or PRs that are addressing/adding tests v5.6.0 v6.0.0-beta1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants