@GayathriMurali
Contributor

What changes were proposed in this pull request?

Update the SparkR user guide to cover all changes made to the Machine Learning APIs in 2.0.

@GayathriMurali
Contributor Author

@jkbradley @MLnick I have marked this WIP because I would like your thoughts on whether the format looks OK. I can add examples for KMeans and SurvReg if the overall format looks fine.

@yanboliang
Contributor

@GayathriMurali
1. Please revert the change to ml-features.md, which is unrelated to this issue.
2. We can use the include_example tag to include the examples in examples/src/main/r/ml.R directly in the user guide. For reference, see PR #9713.

@GayathriMurali GayathriMurali changed the title from "[Spark 15129][R][DOC][WIP]R API changes in ML" to "[Spark-15129][R][DOC][WIP]R API changes in ML" on May 26, 2016
@GayathriMurali
Contributor Author

@yanboliang Thanks, that's a good idea. However, that would include only the example code and not what the output of summary() looks like. It might be useful to include that as well.

@yanboliang
Contributor

yanboliang commented May 26, 2016

I don't think it's necessary to include the summary output of every model in the user guide; we can show just the GLM summary as an example. You will need to copy the summary from the output of your own run.

@GayathriMurali GayathriMurali changed the title from "[Spark-15129][R][DOC][WIP]R API changes in ML" to "[Spark-15129][R][DOC]R API changes in ML" on May 26, 2016
@GayathriMurali
Contributor Author

@yanboliang Can you please help review?

docs/sparkr.md Outdated
Contributor

We have renamed these functions to spark.glm, spark.naiveBayes, spark.kmeans, and spark.survreg.

@yanboliang
Contributor

@GayathriMurali
1. Please revert the change to ml-features.md, which is unrelated to this issue.
2. Please also include examples for the other machine learning algorithms (spark.survreg, spark.naiveBayes, spark.kmeans).

@GayathriMurali
Contributor Author

@yanboliang I have included ml.R using include_example; wouldn't that cover all the examples?

@GayathriMurali GayathriMurali force-pushed the SPARK-15129 branch 3 times, most recently from b4e1d12 to 4e81a15, on May 31, 2016 22:49
@yanboliang
Contributor

@GayathriMurali $example on$ and $example off$ let us extract only the sections of interest for inclusion in the user guide. You can run SKIP_API=1 jekyll build in the ./docs directory to generate the HTML and check that it works.

@yanboliang
Contributor

FYI #10219

@GayathriMurali
Contributor Author

GayathriMurali commented Jun 1, 2016

@yanboliang $example on$ and $example off$ need to be added to ml.R. All the code enclosed between the on and off markers is joined, and a single code block is produced in the HTML. The markers are meant to keep comments and other cleanup code out of the examples. It is not possible to associate a label with a pair of markers and then select only that label. I could add $example on$ and $example off$ at the beginning and end of ml.R, or we would need to rewrite ml.R so that shared parts, such as creating a DataFrame, remain common to all models. What do you think?
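
To make the behaviour described above concrete, here is a minimal Python sketch of marker-based extraction. It is not the actual include_example plugin; the function name `extract_example` and the sample R lines are assumptions made up for illustration. It only shows why every marked region ends up joined into one block.

```python
def extract_example(source: str) -> str:
    """Join every region between `$example on$` and `$example off$` markers.

    Illustrative sketch only; the real plugin may handle more cases.
    """
    kept = []
    inside = False
    for line in source.splitlines():
        if "$example on$" in line:
            inside = True          # start keeping lines after this marker
        elif "$example off$" in line:
            inside = False         # stop keeping lines at this marker
        elif inside:
            kept.append(line)
    return "\n".join(kept)


# Hypothetical stand-in for ml.R; the R lines are placeholders.
sample = """\
# Header comment that should not appear in the guide
# $example on$
model <- spark.glm(df, y ~ x, family = "gaussian")
# $example off$
print("cleanup, not shown")
# $example on$
summary(model)
# $example off$
"""

snippet = extract_example(sample)
print(snippet)
```

Both marked regions come back as one snippet, so with this scheme there is no way to pick out a single model's example from a shared file.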

@GayathriMurali
Contributor Author

Also, #10219 uses include_example with different files, which is not the case here. @mengxr We either need support for tags in include_example, or we need to reformat ml.R (or split every example into a separate file) so it can be used here. I can create a JIRA and work on it if this makes sense.
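
As a rough sketch of what tag support could look like, the Python snippet below selects only the region whose markers carry a given label. The `$example on:<label>$` / `$example off:<label>$` marker syntax, the function name `extract_labeled`, and the sample R lines are all hypothetical, chosen for illustration rather than taken from the plugin.

```python
def extract_labeled(source: str, label: str) -> str:
    """Return only the lines between the on/off markers carrying `label`."""
    on_marker = f"$example on:{label}$"    # hypothetical marker syntax
    off_marker = f"$example off:{label}$"  # hypothetical marker syntax
    kept, inside = [], False
    for line in source.splitlines():
        if on_marker in line:
            inside = True
        elif off_marker in line:
            inside = False
        elif inside:
            kept.append(line)
    return "\n".join(kept)


# Hypothetical single file holding several examples, as ml.R does.
ml_r = """\
# $example on:glm$
model <- spark.glm(df, y ~ x, family = "gaussian")
# $example off:glm$
# $example on:kmeans$
clusters <- spark.kmeans(df, ~ x, k = 2)
# $example off:kmeans$
"""

glm_only = extract_labeled(ml_r, "glm")
kmeans_only = extract_labeled(ml_r, "kmeans")
print(glm_only)
print(kmeans_only)
```

With labels, one file can feed several separate code blocks in the guide, which avoids splitting ml.R into one file per model.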

Contributor

Move this line to L28. It would be better to also include the annotation lines for spark.glm and glm, which help users understand the corresponding code.

@yanboliang
Contributor

@GayathriMurali I think what is there for include_example is OK. Please see my other inline comments.

@GayathriMurali
Contributor Author

@yanboliang Please let me know if there is anything else I can do to get this merged.

@SparkQA

SparkQA commented Jun 10, 2016

Test build #3078 has finished for PR 13285 at commit 8e7c4fe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@GayathriMurali
Contributor Author

@yanboliang Please let me know if there is anything else I can do to help get this merged. Thanks!

docs/sparkr.md Outdated
Member

Organize better:

The examples below show how to build several models:
* GLM using the Gaussian and Binomial model families
* AFT survival regression model
* Naive Bayes model
* K-Means model

@jkbradley
Member

This looks good, though I had one small comment. Thanks for your patience!

@GayathriMurali
Contributor Author

@jkbradley I have addressed the review comment. Please let me know if there is anything else. Thanks!

@mengxr
Contributor

mengxr commented Jun 18, 2016

Merged into master and branch-2.0. I saw some very minor issues; I will make another pass and fix them in a follow-up PR. Thanks!

asfgit pushed a commit that referenced this pull request Jun 18, 2016
## What changes were proposed in this pull request?

Make user guide changes to SparkR documentation for all changes that happened in 2.0 to Machine Learning APIs

Author: GayathriMurali <[email protected]>

Closes #13285 from GayathriMurali/SPARK-15129.

(cherry picked from commit af2a4b0)
Signed-off-by: Xiangrui Meng <[email protected]>
@asfgit asfgit closed this in af2a4b0 on Jun 18, 2016