[SPARK-14272][ML] Add Loglikelihood in GaussianMixtureSummary #12064

zhengruifeng · 2016-03-30T14:12:56Z

What changes were proposed in this pull request?

Add logLikelihood to GMM.summary.

How was this patch tested?

Added tests.

SparkQA · 2016-03-30T14:59:22Z

Test build #54522 has finished for PR 12064 at commit 5e2aff7.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-10-11T10:25:34Z

Test build #66739 has finished for PR 12064 at commit 29841d0.

This patch fails to build.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-10-11T14:25:15Z

Test build #66743 has finished for PR 12064 at commit cbe92b6.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-10-12T03:13:20Z

Test build #66788 has finished for PR 12064 at commit cdd829a.

This patch fails to build.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-10-12T09:38:58Z

Test build #66806 has finished for PR 12064 at commit d5b9422.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-12-08T06:33:20Z

Test build #69845 has finished for PR 12064 at commit 4458a5f.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-01-10T06:20:56Z

Test build #71113 has finished for PR 12064 at commit 8c2d529.

This patch fails to build.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-01-10T07:33:33Z

Test build #71114 has started for PR 12064 at commit 083e4f9.

zhengruifeng · 2017-01-10T08:47:28Z

Jenkins, retest this please

SparkQA · 2017-01-10T09:31:10Z

Test build #71119 has finished for PR 12064 at commit 083e4f9.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-01-10T11:27:47Z

Test build #71121 has finished for PR 12064 at commit 1856e59.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

zhengruifeng · 2017-01-11T02:59:44Z

ping @yanboliang

yanboliang · 2017-01-11T09:11:44Z

mllib/src/main/scala/org/apache/spark/ml/clustering/GaussianMixture.scala

I have a question here: should we also provide the final logLikelihood of the model in its summary? Many users will use it to evaluate the current model, and having it in the summary means they don't need to take another pass over the data.

This will expose a public API, cc @jkbradley @sethah @srowen @MLnick to discuss the API.

jkbradley · 2017-01-12

+1 for putting it in the summary. If you want to evaluate a new dataset, then let's add an evaluate() method which returns a summary.
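For readers following the thread, the quantity being discussed is presumably the total log-likelihood of the data under the fitted mixture: the sum over points of the log of the weighted component densities. The snippet below is a minimal sketch in plain Python for the univariate case only. It is not Spark's implementation, and the function names, toy data, and parameters are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


def gmm_log_likelihood(data, weights, means, variances):
    """Sum over points of log(sum_k w_k * N(x | mean_k, variance_k))."""
    total = 0.0
    for x in data:
        mixture_density = sum(
            w * gaussian_pdf(x, m, v)
            for w, m, v in zip(weights, means, variances)
        )
        total += math.log(mixture_density)
    return total


# Hypothetical toy data and parameters, for illustration only.
data = [-1.0, 0.0, 1.0]

# A "mixture" with a single standard-normal component.
single = gmm_log_likelihood(data, [1.0], [0.0], [1.0])
print(single)
```

Computing this value needs a full pass over the data, which is why returning it from the summary saves users the extra pass mentioned above.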

yanboliang · 2017-01-11T09:55:39Z

mllib/src/test/scala/org/apache/spark/ml/clustering/GaussianMixtureSuite.scala

I don't think we need a separate test. You can add a check for logLikelihood to the existing test (multivariate data, checked against R's mvnormalmixEM), which is equivalent to what you wrote but uses a more reasonable dataset.

SparkQA · 2017-01-12T03:54:48Z

Test build #71243 has finished for PR 12064 at commit 41e1a57.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-01-12T04:46:44Z

Test build #71245 has finished for PR 12064 at commit 9af6c92.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

zhengruifeng · 2017-01-13T02:41:29Z

@yanboliang Updated! Thanks for reviewing!

SparkQA · 2017-01-16T08:13:47Z

Test build #71432 has finished for PR 12064 at commit 68f72fa.

This patch fails Scala style tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-01-16T10:50:42Z

Test build #71436 has finished for PR 12064 at commit d333642.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

zhengruifeng · 2017-01-16T10:52:24Z

ping @yanboliang

yanboliang · 2017-01-17T08:54:31Z

python/pyspark/ml/clustering.py


+    @property
+    @since("2.2.0")
+    def logLikelihood(self):


Add this to doc test.
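As an illustration of what covering a property in a doc test can look like, here is a minimal, self-contained sketch using Python's standard doctest module. `ToySummary` is a hypothetical stand-in, not the PySpark summary class, and the value -3.5 is made up. The actual doc test in python/pyspark/ml/clustering.py is not shown in this thread.

```python
import doctest


class ToySummary:
    """Hypothetical stand-in for a model summary; not the PySpark class.

    >>> s = ToySummary(-3.5)
    >>> s.logLikelihood
    -3.5
    """

    def __init__(self, log_likelihood):
        self._log_likelihood = log_likelihood

    @property
    def logLikelihood(self):
        """Total log-likelihood stored on this summary."""
        return self._log_likelihood


# Parse the examples out of the class docstring and run them.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    ToySummary.__doc__, {"ToySummary": ToySummary}, "ToySummary", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

The point of such a test is that the documented example is executed, so the property stays covered whenever the docstring examples are run.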

SparkQA · 2017-01-17T11:23:32Z

Test build #71498 has finished for PR 12064 at commit 1de60b0.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-01-17T11:28:39Z

Test build #71500 has finished for PR 12064 at commit d6fa8fa.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-01-18T03:33:27Z

Test build #71550 has finished for PR 12064 at commit fd85c5d.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

zhengruifeng · 2017-01-18T03:42:07Z

jenkins, retest this please

SparkQA · 2017-01-18T05:54:38Z

Test build #71566 has finished for PR 12064 at commit fd85c5d.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-01-18T06:32:45Z

Test build #71573 has started for PR 12064 at commit eebae43.

zhengruifeng · 2017-01-18T08:11:29Z

jenkins, retest this please

SparkQA · 2017-01-18T10:21:32Z

Test build #71585 has finished for PR 12064 at commit eebae43.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-01-18T13:27:03Z

Test build #71598 has finished for PR 12064 at commit eb27bcc.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-01-19T07:28:40Z

Test build #71641 has started for PR 12064 at commit cbec946.

yanboliang · 2017-01-19T08:21:48Z

Jenkins, retest this please.

SparkQA · 2017-01-19T10:48:17Z

Test build #71646 has finished for PR 12064 at commit cbec946.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

yanboliang · 2017-01-19T11:47:02Z

LGTM, merged into master. Thanks!

## What changes were proposed in this pull request?

add loglikelihood in GMM.summary

## How was this patch tested?

added tests

Author: Zheng RuiFeng <[email protected]>
Author: Ruifeng Zheng <[email protected]>

Closes apache#12064 from zhengruifeng/gmm_metric.

zhengruifeng changed the title ~~[SPARK-14272][MLLIB] Evaluate GaussianMixtureModel with LogLooklihood~~ [SPARK-14272][MLLIB] Evaluate GaussianMixtureModel with LogLikelihood Apr 10, 2016

zhengruifeng force-pushed the gmm_metric branch from 5e2aff7 to 29841d0 Compare October 11, 2016 10:13

zhengruifeng changed the title ~~[SPARK-14272][MLLIB] Evaluate GaussianMixtureModel with LogLikelihood~~ [SPARK-14272][ML] Evaluate GaussianMixtureModel with LogLikelihood Oct 11, 2016

zhengruifeng force-pushed the gmm_metric branch from cbe92b6 to cdd829a Compare October 12, 2016 03:02

zhengruifeng force-pushed the gmm_metric branch from d5b9422 to 4458a5f Compare December 8, 2016 05:22

zhengruifeng force-pushed the gmm_metric branch from 4458a5f to 8c2d529 Compare January 10, 2017 06:09

yanboliang reviewed Jan 11, 2017

zhengruifeng force-pushed the gmm_metric branch from 1856e59 to 41e1a57 Compare January 12, 2017 03:15

zhengruifeng added 7 commits January 16, 2017 12:14

recreate pr · c596b39
update pr · 6973301
update tol · 9e316b9
add implicits · 05f60b9
fix isotonic · 45bb563
fix conflict and update version · c992a1a
fix a bug in test · 2bbdf5b
fix mima · d333642

yanboliang reviewed Jan 17, 2017

zhengruifeng added 2 commits January 17, 2017 17:11

Merge branch 'master' into gmm_metric · 1de60b0
add test · d6fa8fa
add output of summary · fd85c5d
update doc test · eebae43
update doc test · eb27bcc

zhengruifeng added 2 commits January 19, 2017 15:23

update mima · e3bc962
update pr desc · cbec946

zhengruifeng changed the title ~~[SPARK-14272][ML] Evaluate GaussianMixtureModel with LogLikelihood~~ [SPARK-14272][ML] Add Loglikelihood in GaussianMixtureSummary Jan 19, 2017

asfgit closed this in 8ccca91 Jan 19, 2017

zhengruifeng deleted the gmm_metric branch January 19, 2017 12:00

jkbradley Jan 12, 2017
