-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-14272][ML] Add Loglikelihood in GaussianMixtureSummary #12064
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #54522 has finished for PR 12064 at commit
|
5e2aff7 to
29841d0
Compare
|
Test build #66739 has finished for PR 12064 at commit
|
|
Test build #66743 has finished for PR 12064 at commit
|
cbe92b6 to
cdd829a
Compare
|
Test build #66788 has finished for PR 12064 at commit
|
|
Test build #66806 has finished for PR 12064 at commit
|
d5b9422 to
4458a5f
Compare
|
Test build #69845 has finished for PR 12064 at commit
|
4458a5f to
8c2d529
Compare
|
Test build #71113 has finished for PR 12064 at commit
|
|
Test build #71114 has started for PR 12064 at commit |
|
Jenkins, retest this please |
|
Test build #71119 has finished for PR 12064 at commit
|
|
Test build #71121 has finished for PR 12064 at commit
|
|
ping @yanboliang |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question here, should we provide the final logLikelihood of the model in its summary as well? Since lots of users will use it to evaluate the current model, that they don't need to take another pass on data.
This will expose a public API, cc @jkbradley @sethah @srowen @MLnick to discuss the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for putting it in the summary. If you want to evaluate a new dataset, then let's add an evaluate() method which returns a summary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need to bother a separate test, you can add check for logLikelihood on the existing test(multivariate data and check againt R mvnormalmixEM) which is equivalent to what you wrote but with more reasonable dataset.
1856e59 to
41e1a57
Compare
|
Test build #71243 has finished for PR 12064 at commit
|
|
Test build #71245 has finished for PR 12064 at commit
|
|
@yanboliang Updated! Thanks for reviewing! |
|
Test build #71432 has finished for PR 12064 at commit
|
|
Test build #71436 has finished for PR 12064 at commit
|
|
ping @yanboliang |
|
|
||
| @property | ||
| @since("2.2.0") | ||
| def logLikelihood(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add this to doc test.
|
Test build #71498 has finished for PR 12064 at commit
|
|
Test build #71500 has finished for PR 12064 at commit
|
|
Test build #71550 has finished for PR 12064 at commit
|
|
jenkins, retest this please |
|
Test build #71566 has finished for PR 12064 at commit
|
|
Test build #71573 has started for PR 12064 at commit |
|
jenkins, retest this please |
|
Test build #71585 has finished for PR 12064 at commit
|
|
Test build #71598 has finished for PR 12064 at commit
|
|
Test build #71641 has started for PR 12064 at commit |
|
Jenkins, retest this please. |
|
Test build #71646 has finished for PR 12064 at commit
|
|
LGTM, merged into master. Thanks! |
## What changes were proposed in this pull request? add loglikelihood in GMM.summary ## How was this patch tested? added tests Author: Zheng RuiFeng <[email protected]> Author: Ruifeng Zheng <[email protected]> Closes apache#12064 from zhengruifeng/gmm_metric.
## What changes were proposed in this pull request? add loglikelihood in GMM.summary ## How was this patch tested? added tests Author: Zheng RuiFeng <[email protected]> Author: Ruifeng Zheng <[email protected]> Closes apache#12064 from zhengruifeng/gmm_metric.
What changes were proposed in this pull request?
add loglikelihood in GMM.summary
How was this patch tested?
added tests