[SPARK-15177.1] [R] make SparkR model params and default values consistent with MLlib #13801

mengxr · 2016-06-21T06:55:42Z

What changes were proposed in this pull request?

This PR is a subset of #13023 by @yanboliang to make SparkR model param names and default values consistent with MLlib. I tried to avoid other changes from #13023 to keep this PR minimal. I will send a follow-up PR to improve the documentation.

Main changes:

spark.glm: epsilon -> tol, maxit -> maxIter
spark.kmeans: default k -> 2, default maxIter -> 20, default initMode -> "k-means||"
spark.naiveBayes: laplace -> smoothing, default 1.0

How was this patch tested?

Existing unit tests.

mengxr · 2016-06-21T07:04:52Z

cc: @shivaram

shivaram · 2016-06-21T07:11:53Z

R/pkg/R/mllib.R

 #' @note spark.kmeans since 2.0.0
 setMethod("spark.kmeans", signature(data = "SparkDataFrame", formula = "formula"),
-          function(data, formula, k, maxIter = 10, initMode = c("random", "k-means||")) {
+          function(data, formula, k = 2, maxIter = 20, initMode = c("k-means||", "random")) {


Just to clarify: this initMode change wasn't present in #13023. Is it intended to match some Spark behavior?

Yes, the default initMode in MLlib is k-means|| instead of random. See https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala#L263.

Ah I see - change LGTM then
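
For readers wondering why reordering the `initMode` vector in the diff above changes the default: in R, a character-vector default is commonly resolved with `match.arg()`, which returns the first element when the caller omits the argument. The sketch below is plain R with no Spark dependency. It assumes `spark.kmeans` resolves `initMode` this way; the function names `new_init_mode` and `old_init_mode` are hypothetical stand-ins for illustration, not SparkR code.

```r
# Hypothetical stand-ins mirroring only the initMode part of the two signatures
# in the diff. Assumption: the real function resolves initMode with match.arg().
new_init_mode <- function(initMode = c("k-means||", "random")) {
  # When the argument is omitted, match.arg() returns the first choice.
  match.arg(initMode)
}

old_init_mode <- function(initMode = c("random", "k-means||")) {
  match.arg(initMode)
}

print(new_init_mode())          # default under the new ordering
print(old_init_mode())          # default under the old ordering
print(new_init_mode("random"))  # an explicit choice is still accepted
```

Under this assumption, only callers who relied on the implicit default see different behavior; anyone passing `initMode` explicitly is unaffected by the reordering.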

shivaram · 2016-06-21T07:14:39Z

Changes look fine given what was a part of #13023

mengxr · 2016-06-21T07:37:07Z

test this please

SparkQA · 2016-06-21T07:48:58Z

Test build #60913 has finished for PR 13801 at commit 39a4c4c.

This patch fails SparkR unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-21T08:33:52Z

Test build #60915 has finished for PR 13801 at commit 39a4c4c.

This patch fails SparkR unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2016-06-21T08:42:22Z

Test build #60917 has finished for PR 13801 at commit 0a712fe.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

mengxr · 2016-06-21T15:32:06Z

Merged into master and branch-2.0.

…istent with MLlib

## What changes were proposed in this pull request?

This PR is a subset of #13023 by yanboliang to make SparkR model param names and default values consistent with MLlib. I tried to avoid other changes from #13023 to keep this PR minimal. I will send a follow-up PR to improve the documentation.

Main changes:

* `spark.glm`: epsilon -> tol, maxit -> maxIter
* `spark.kmeans`: default k -> 2, default maxIter -> 20, default initMode -> "k-means||"
* `spark.naiveBayes`: laplace -> smoothing, default 1.0

## How was this patch tested?

Existing unit tests.

Author: Xiangrui Meng <[email protected]>

Closes #13801 from mengxr/SPARK-15177.1.

(cherry picked from commit 4f83ca1)
Signed-off-by: Xiangrui Meng <[email protected]>

make SparkR model params and default values consistent with MLlib

39a4c4c

shivaram reviewed Jun 21, 2016
fix test

0a712fe

asfgit closed this in 4f83ca1 Jun 21, 2016
