@nxwhite-str

This implements the functionality for SPARK-4749 and provides unit tests in Scala and PySpark.
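For readers unfamiliar with the issue, the idea of seeding cluster initialization can be sketched in a few lines. This is a minimal illustration in plain Python, not the code in this patch: `init_centers` and `points` are hypothetical names, and simple random sampling stands in for whatever initialization mode the library uses.

```python
import random


def init_centers(points, k, seed=None):
    """Pick k distinct points as initial cluster centers.

    Hypothetical helper for illustration only. A dedicated RNG is
    created from `seed`, so the global random state is left untouched.
    With seed=None, Python seeds the RNG from an OS entropy source
    (or the clock), so results differ between calls.
    """
    rng = random.Random(seed)
    return rng.sample(points, k)


points = [(0.0, 0.0), (1.0, 1.0), (9.0, 8.0), (8.0, 9.0), (5.0, 5.0)]

# The same seed yields the same initial centers on every call.
first = init_centers(points, k=2, seed=42)
second = init_centers(points, k=2, seed=42)
assert first == second
```

A fixed seed makes clustering runs repeatable, which is mainly useful for unit tests and for comparing runs.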

@AmplabJenkins

Can one of the admins verify this patch?

Member

Could you set the default in the one public constructor instead since that's where other defaults are set?

@jkbradley
Member

@nxwhite-str Thanks for the PR! Could you please update the title to start with "[SPARK-4749] [mllib]" to help with automated tagging?

@nxwhite-str changed the title from "SPARK-4749: Allow initializing KMeans clusters using a seed" to "[SPARK-4749] [mllib]: Allow initializing KMeans clusters using a seed" on Dec 10, 2014
Contributor

Could you move this to the beginning and make the one without seed call this?
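One way to read this request, sketched in plain Python rather than the Scala under review: the seeded entry point comes first and holds the logic, and the seedless variant only supplies a default seed and delegates. The names `train_with_seed` and `train` are hypothetical, and random sampling is a stand-in for the real training step.

```python
import random
import time


def train_with_seed(points, k, seed):
    """Seeded entry point (hypothetical name): all logic lives here."""
    rng = random.Random(seed)
    return rng.sample(points, k)


def train(points, k):
    """Seedless variant: picks a time-based seed and delegates.

    The time-based default is an assumption made for this sketch.
    """
    return train_with_seed(points, k, seed=time.time_ns())


points = [(0.0, 0.0), (1.0, 1.0), (9.0, 8.0), (8.0, 9.0)]

# Callers who need repeatable results use the seeded form directly.
assert train_with_seed(points, 2, seed=7) == train_with_seed(points, 2, seed=7)

# The seedless form still returns k centers drawn from the input.
centers = train(points, 2)
assert len(centers) == 2
```

Keeping a single implementation behind both entry points means a later change to the training logic cannot make the two variants diverge.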

@mengxr
Contributor

mengxr commented Dec 19, 2014

LGTM except minor inline comments.

@mengxr
Contributor

mengxr commented Dec 19, 2014

ok to test

@SparkQA

SparkQA commented Dec 19, 2014

Test build #24654 has started for PR 3610 at commit f8d5928.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Dec 19, 2014

Test build #24654 has finished for PR 3610 at commit f8d5928.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class SparkContext(config: SparkConf) extends Logging
    • class RandomModuleHook(object):
    • class Analyzer(catalog: Catalog, registry: FunctionRegistry, caseSensitive: Boolean)

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24654/

@jkbradley
Member

The failure is in a streaming test; retesting.

@SparkQA

SparkQA commented Dec 22, 2014

Test build #551 has started for PR 3610 at commit f8d5928.

  • This patch merges cleanly.

Member

Add parameter name: seed: System.nanoTime()

@SparkQA

SparkQA commented Dec 22, 2014

Test build #551 has finished for PR 3610 at commit f8d5928.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Contributor

mengxr commented Jan 10, 2015

@nxwhite-str There are a few minor comments left. Do you have time to update the PR?

@SparkQA

SparkQA commented Jan 21, 2015

Test build #25891 has started for PR 3610 at commit a2ebbd3.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 21, 2015

Test build #25891 has finished for PR 3610 at commit a2ebbd3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25891/

@asfgit closed this in 7450a99 on Jan 21, 2015

@mengxr
Contributor

mengxr commented Jan 21, 2015

Merged into master. Thanks!

bomeng pushed a commit to Huawei-Spark/spark that referenced this pull request Jan 22, 2015
This implements the functionality for SPARK-4749 and provides unit tests in Scala and PySpark

Author: nate.crosswhite <[email protected]>
Author: nxwhite-str <[email protected]>
Author: Xiangrui Meng <[email protected]>

Closes apache#3610 from nxwhite-str/master and squashes the following commits:

a2ebbd3 [nxwhite-str] Merge pull request #1 from mengxr/SPARK-4749-kmeans-seed
7668124 [Xiangrui Meng] minor updates
f8d5928 [nate.crosswhite] Addressing PR issues
277d367 [nate.crosswhite] Merge remote-tracking branch 'upstream/master'
9156a57 [nate.crosswhite] Merge remote-tracking branch 'upstream/master'
5d087b4 [nate.crosswhite] Adding KMeans train with seed and Scala unit test
616d111 [nate.crosswhite] Merge remote-tracking branch 'upstream/master'
35c1884 [nate.crosswhite] Add kmeans initial seed to pyspark API