Member

tlrx commented May 15, 2017

This pull request adds parsing methods for the SignificantStringTerms and SignificantLongTerms aggregations.

Member

javanna left a comment

Left a few minor comments, LGTM otherwise.

Member

can this happen in practice?

Member Author

No, I should have changed that.

Member

Same as above: can this really happen? When can a key be null?

Member Author

No, we can return key.utf8ToString() directly.

Member

++

Member

is this method needed? I think it isn't used anywhere?

Member Author

It seems to be unused, but I'd prefer to leave it in core for now: I saw a mention that it might be needed by scripted aggs. I'll try to figure this out, and if it is unused I'll create a separate PR in core.

Member

sounds good.

Member Author

Created #24714 to remove the method.

Member

remove one of these empty lines? ;)

Member

I think that to make this work as part of AggregationsTests you need to rename this method to setUp and remove the @Before annotation?

Member Author

The AggregationsTests passed so I didn't catch it. I changed that, thanks.

Member

ok weird that it worked. does the format have a default value in case this method is not called?

Member Author

No... but the output does not depend on the format (or heuristic) so a null format produces a valid XContent output that is parseable.

Member

this one we set although it is never printed out as part of toXContent?

Member Author

We render the doc count which is in fact the subsetDf.

Member

cbuescher left a comment

LGTM, I left a few minor comments that might either be addressed, ignored or handled as follow ups, as you prefer.

Member

Could we reuse the constants used in InternalSignificantTerms by making them public? I think we did that elsewhere and also do this with the CommonFields.

Member Author

Sure, I changed them in core specifically for this and then forgot to use them :/ Thanks

Member

Should we lazily compute a bucketMap like in InternalMappedSignificantTerms and then be able to reuse it when this method gets called many times? I have no strong opinions on this though.

Member Author

I don't have strong opinions either... I tend to think that it can be done on the caller's side if a lot of buckets are going to be retrieved by their keys. I'll add one for the sake of consistency with the internal implementations and other parsed aggregations.

Member Author

I also added a test about this method.
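
For readers following this thread, the lazily computed bucket map discussed above could look roughly like the sketch below. It is a simplified, standalone illustration and not the code from this pull request: `BucketSketch`, `ParsedTermsSketch`, and their methods are hypothetical stand-ins for the parsed aggregation and its buckets.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a parsed bucket; only a key and a doc count.
final class BucketSketch {
    private final String key;
    private final long docCount;

    BucketSketch(String key, long docCount) {
        this.key = key;
        this.docCount = docCount;
    }

    String getKeyAsString() {
        return key;
    }

    long getDocCount() {
        return docCount;
    }
}

// Hypothetical stand-in for the parsed aggregation that owns the buckets.
final class ParsedTermsSketch {
    private final List<BucketSketch> buckets;
    // Stays null until the first lookup by key, then is reused.
    private Map<String, BucketSketch> bucketMap;

    ParsedTermsSketch(List<BucketSketch> buckets) {
        this.buckets = new ArrayList<>(buckets);
    }

    // Returns the bucket for the given term, or null if there is none.
    BucketSketch getBucketByKey(String term) {
        if (bucketMap == null) {
            Map<String, BucketSketch> map = new HashMap<>();
            for (BucketSketch bucket : buckets) {
                map.put(bucket.getKeyAsString(), bucket);
            }
            bucketMap = map;
        }
        return bucketMap.get(term);
    }
}
```

The map is only built if a caller actually looks up a bucket by key, so callers that just iterate over the buckets pay nothing extra. The sketch is not thread-safe; that is an assumption made to keep it short.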

Member

We render a subsetSize as the DOC_COUNT field in the surrounding aggregation. I'm not sure if this is equivalent to the bucket subset size, but maybe it could be used here? It would probably need some checking with somebody who knows the aggregation better though.

Member Author

I'm not sure how they are related to be honest. Since we don't render the superset size and subset size I think it's ok to throw an unsupported operation exception here.

Member Author

Discussion that gives more background around where these fields come from: #5146 (comment)

Member

I was wondering: if getSuperset/SubsetSize is part of the Bucket interface but not rendered via the REST response, should we either add rendering of these values to the bucket response or remove them from the interface, so that the transport client and the high level REST client behave equivalently here? I think this can be done in a separate issue though; maybe it's not needed at all.

Member Author

Maybe @markharwood has an opinion on this?

Member

I'm okay with the UnsupportedOperationException for now if we can track this question (whether we can reach consistency between the functionality the transport client provides via the SignificantTerms.Bucket interface and the REST response) in a separate issue.

Member Author

It took me some time to wrap my head around this, but I finally created #24865 to address it.
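
As a rough illustration of the trade-off discussed in this thread, the sketch below shows a parsed bucket that only knows what the response renders and throws for the sizes it cannot reconstruct. The interface and class names are hypothetical and heavily trimmed; this is not the actual SignificantTerms.Bucket interface or the code merged here. Returning the subsetDf as the doc count follows the earlier comment that the rendered doc count is in fact the subsetDf.

```java
// Hypothetical, trimmed-down bucket interface used only for this sketch.
interface SignificantBucketSketch {
    long getDocCount();

    long getSubsetDf();

    long getSubsetSize();

    long getSupersetSize();
}

// Hypothetical parsed bucket: it holds only the value found in the response.
final class ParsedBucketSketch implements SignificantBucketSketch {
    private final long subsetDf;

    ParsedBucketSketch(long subsetDf) {
        this.subsetDf = subsetDf;
    }

    @Override
    public long getDocCount() {
        // The rendered doc count is the subsetDf, so both getters agree.
        return subsetDf;
    }

    @Override
    public long getSubsetDf() {
        return subsetDf;
    }

    @Override
    public long getSubsetSize() {
        // Not rendered in the bucket response, so it cannot be parsed back.
        throw new UnsupportedOperationException("subset size is not available in the parsed response");
    }

    @Override
    public long getSupersetSize() {
        // Not rendered in the bucket response, so it cannot be parsed back.
        throw new UnsupportedOperationException("superset size is not available in the parsed response");
    }
}
```

Throwing makes the gap explicit to callers instead of returning a misleading default such as 0 or -1.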

Member Author

tlrx commented May 16, 2017

Thanks @javanna @cbuescher! I updated the PR a bit, would you like to have another look please?

Member

cbuescher commented May 16, 2017

Update looks good to me.

tlrx force-pushed the add-parsing-to-sig-terms-aggs branch from a861bd7 to cafdb94 May 16, 2017 12:54
tlrx merged commit d5fc520 into elastic:feature/client_aggs_parsing May 16, 2017
Member Author

tlrx commented May 16, 2017

Thanks @javanna @cbuescher !

tlrx deleted the add-parsing-to-sig-terms-aggs branch May 16, 2017 12:56
javanna pushed a commit to javanna/elasticsearch that referenced this pull request May 23, 2017
3 participants