Skip to content

Conversation

@GayathriMurali
Copy link
Contributor

What changes were proposed in this pull request?

Add summary API to Gaussian Mixture

How was this patch tested?

Added unit test case to test summary information

@GayathriMurali
Copy link
Contributor Author

@wangmiao1981 @jkbradley Please help review this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PyDoc seems oddly formatted in terms of line breaks.

@GayathriMurali
Copy link
Contributor Author

@jkbradley Can you please ok to test this

@GayathriMurali
Copy link
Contributor Author

@holdenk I fixed the pydoc style issue. Can you please help review this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also seems strangely formatted - did you mean trainingSummary is None. ?

@holdenk
Copy link
Contributor

holdenk commented May 9, 2016

So did a quick look and it seems like it might not be lining up with the Scala implementation (or I just haven't had enough coffee this morning). It might be worthwhile to doublecheck against the scaladoc and also run your tests locally if you haven't done that.

@GayathriMurali
Copy link
Contributor Author

@holdenk I checked the ScalaDoc and removed the evaluate method. Thanks for pointing it out. Can you please help review

@GayathriMurali
Copy link
Contributor Author

@jkbradley @holdenk Can you please help review?

@GayathriMurali
Copy link
Contributor Author

@jkbradley This PR has been open >30days. Can you please help review?

@MLnick
Copy link
Contributor

MLnick commented Jun 17, 2016

ok to test

@SparkQA
Copy link

SparkQA commented Jun 17, 2016

Test build #60707 has finished for PR 12675 at commit 2e226ce.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 18, 2016

Test build #60739 has finished for PR 12675 at commit 7d16a23.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 18, 2016

Test build #60748 has finished for PR 12675 at commit 3cf080a.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 19, 2016

Test build #60820 has finished for PR 12675 at commit c2b1aef.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 19, 2016

Test build #60821 has finished for PR 12675 at commit 3bc75a3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


def test_gaussian_mixture_summary(self):
from pyspark.mllib.linalg import Vectors
df = self.spark.read.format("libsvm").load("data/mllib/sample_kmeans_data.txt")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let me know if its ok to load data from a file when all other test cases uses hard coded data values. I tried with a sparse vector and fit gave me an error for the data format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its ok to load a dataset which is already used for testing else where.

@GayathriMurali
Copy link
Contributor Author

@MLnick It would be great if you can help review this.

Copy link
Contributor

@holdenk holdenk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you still interested in this PR @GayathriMurali ? If you've got the time to look at some of my notes around the PyDocs that would be cool - but we can also try pinging @davies & @MLnick to see if they are available to review once those comments are looked at :)

return self._call_java("gaussiansDF")

@property
@since("2.0.0")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we've shipped 2.0 we will need to update this to 2.1 (same with the versionAdded notes)

@HyukjinKwon
Copy link
Member

Hi @GayathriMurali, it seems inactive for the review comments for few months. Should this be better closed for now if you are not able to proceed this further?

@wangmiao1981
Copy link
Contributor

@GayathriMurali If you are not able to proceed, I can take over. Thanks!

@wangmiao1981
Copy link
Contributor

#15777 has resolved this issue. We should close this one.

@wangmiao1981
Copy link
Contributor

@HyukjinKwon @srowen This should be closed. Thanks!

@HyukjinKwon
Copy link
Member

@wangmiao1981 Wow, thank you for your efforts to verify it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants