@thunterdb
Contributor

@thunterdb thunterdb commented Apr 29, 2016

What changes were proposed in this pull request?

This PR splits the MLlib algorithms into two flavors:

  • the R flavor, which tries to mimic the existing R API for these algorithms (and works as an S4 specialization for Spark dataframes)
  • the Spark flavor, which follows the same API and naming conventions as the rest of the MLlib algorithms in the other languages

In practice, the former calls the latter.
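The delegation described above can be sketched roughly as follows. This is only an illustration of the wrapper pattern, written in Python so it runs on its own. The names `spark_glm`, `glm`, and `FittedModel` are hypothetical and are not the functions added by this PR, and no model is actually fitted.

```python
from dataclasses import dataclass


@dataclass
class FittedModel:
    """Hypothetical stand-in for a fitted model handle."""
    formula: str
    family: str
    n_rows: int


def spark_glm(data, formula, family="gaussian"):
    """Hypothetical 'Spark flavor': data first, MLlib-style naming.

    A real implementation would call into the JVM; here we only record
    the arguments so the delegation is visible.
    """
    return FittedModel(formula=formula, family=family, n_rows=len(data))


def glm(formula, data, family="gaussian"):
    """Hypothetical 'R flavor': mirrors the argument order of R's glm()
    and simply forwards to the Spark flavor."""
    return spark_glm(data, formula, family=family)


rows = [{"x": 1.0, "y": 2.0}, {"x": 2.0, "y": 4.1}]
model_r = glm("y ~ x", rows)
model_spark = spark_glm(rows, "y ~ x")
```

Both entry points return the same result, so tests written against one interface can be reused against the other.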

How was this patch tested?

The tests for the various algorithms were adapted to be run against both interfaces.

@SparkQA

SparkQA commented Apr 30, 2016

Test build #57371 has finished for PR 12789 at commit 46d9d68.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@thunterdb thunterdb changed the title from "[WIP][SPARKR][SPARK-14831]Make the SparkR MLlib API more consistent with Spark" to "[SPARKR][SPARK-14831]Make the SparkR MLlib API more consistent with Spark" Apr 30, 2016
@SparkQA

SparkQA commented Apr 30, 2016

Test build #57386 has finished for PR 12789 at commit 3b176d9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 30, 2016

Test build #57407 has finished for PR 12789 at commit f57db82.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Contributor

mengxr commented Apr 30, 2016

I merged this into master. Thanks! There are some minor issues that I will address in a follow-up PR:

  • ml.save/load should be renamed to read.ml and write.ml to be consistent with read.df and write.df
  • the param data should be called df

@asfgit asfgit closed this in bc36fe6 Apr 30, 2016
asfgit pushed a commit that referenced this pull request Apr 30, 2016
## What changes were proposed in this pull request?

Continue the work of #12789 to rename ml.save/ml.load to write.ml/read.ml, which are more consistent with read.df/write.df and other methods in SparkR.

I didn't rename `data` to `df` because we still use `predict` for prediction, which uses `newData` to match the signature in R.

## How was this patch tested?

Existing unit tests.

cc: yanboliang thunterdb

Author: Xiangrui Meng <[email protected]>

Closes #12807 from mengxr/SPARK-14831.