
Commit 6902eda

[SPARK-17315][FOLLOW-UP][SPARKR][ML] Fix print of Kolmogorov-Smirnov test summary
## What changes were proposed in this pull request?

apache#14881 added a Kolmogorov-Smirnov test wrapper to SparkR. I found that `print.summary.KSTest` was implemented incorrectly and had no effect. Running the following code for KSTest:

```r
data <- data.frame(test = c(0.1, 0.15, 0.2, 0.3, 0.25, -1, -0.5))
df <- createDataFrame(data)
testResult <- spark.kstest(df, "test", "norm")
summary(testResult)
```

Before this PR:

![image](https://cloud.githubusercontent.com/assets/1962026/18615016/b9a2823a-7d4f-11e6-934b-128beade355e.png)

After this PR:

![image](https://cloud.githubusercontent.com/assets/1962026/18615014/aafe2798-7d4f-11e6-8b99-c705bb9fe8f2.png)

The new implementation is similar to [`print.summary.GeneralizedLinearRegressionModel`](https://github.com/apache/spark/blob/master/R/pkg/R/mllib.R#L284) of SparkR and [`print.summary.glm`](https://svn.r-project.org/R/trunk/src/library/stats/R/glm.R) of native R.

BTW, I removed the comparison of the `print.summary.KSTest` output in the unit test, since it is only a wrapper around the summary output, which has already been checked. Another reason is that these comparisons print summary information to the test console, which clutters the test output.

## How was this patch tested?

Existing test.

Author: Yanbo Liang <[email protected]>

Closes apache#15139 from yanboliang/spark-17315.
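The pattern this change adopts can be sketched in plain R, without a SparkR session. As the diff reads, the earlier `summary` returned an unclassed list, so `print.summary.KSTest` was never dispatched; tagging the list with a class lets R's auto-printing find the method. The names `make_summary` and `summary.Demo` below are hypothetical stand-ins, not part of SparkR.

```r
# Hypothetical stand-in for the value returned by summary(); not SparkR code.
make_summary <- function(p_value, statistic) {
  ans <- list(p.value = p_value, statistic = statistic)
  class(ans) <- "summary.Demo"  # tag the list so print() can dispatch on it
  ans
}

# S3 print method: called whenever a "summary.Demo" object is printed.
print.summary.Demo <- function(x, ...) {
  cat("Demo test summary:\n")
  cat("statistic =", x$statistic, "\n")
  cat("pValue =", x$p.value, "\n")
  invisible(x)  # print methods conventionally return their argument invisibly
}

s <- make_summary(p_value = 0.198, statistic = 0.382)
out <- capture.output(print(s))  # printed lines as a character vector
```

Because the classed object is still a list, fields such as `s$p.value` remain accessible, which is what the existing `expect_equal` checks in the test file rely on.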
1 parent c133907 commit 6902eda


2 files changed: +11 -21 lines changed


R/pkg/R/mllib.R

Lines changed: 9 additions & 7 deletions
@@ -1398,20 +1398,22 @@ setMethod("summary", signature(object = "KSTest"),
             distParams <- unlist(callJMethod(jobj, "distParams"))
             degreesOfFreedom <- callJMethod(jobj, "degreesOfFreedom")
 
-            list(p.value = pValue, statistic = statistic, nullHypothesis = nullHypothesis,
-                 nullHypothesis.name = distName, nullHypothesis.parameters = distParams,
-                 degreesOfFreedom = degreesOfFreedom)
+            ans <- list(p.value = pValue, statistic = statistic, nullHypothesis = nullHypothesis,
+                        nullHypothesis.name = distName, nullHypothesis.parameters = distParams,
+                        degreesOfFreedom = degreesOfFreedom, jobj = jobj)
+            class(ans) <- "summary.KSTest"
+            ans
           })
 
 # Prints the summary of KSTest
 
 #' @rdname spark.kstest
-#' @param x test result object of KSTest by \code{spark.kstest}.
+#' @param x summary object of KSTest returned by \code{summary}.
 #' @export
 #' @note print.summary.KSTest since 2.1.0
 print.summary.KSTest <- function(x, ...) {
-  jobj <- x@jobj
+  jobj <- x$jobj
   summaryStr <- callJMethod(jobj, "summary")
-  cat(summaryStr)
-  invisible(summaryStr)
+  cat(summaryStr, "\n")
+  invisible(x)
 }

R/pkg/inst/tests/testthat/test_mllib.R

Lines changed: 2 additions & 14 deletions
@@ -760,13 +760,7 @@ test_that("spark.kstest", {
 
   expect_equal(stats$p.value, rStats$p.value, tolerance = 1e-4)
   expect_equal(stats$statistic, unname(rStats$statistic), tolerance = 1e-4)
-
-  printStr <- print.summary.KSTest(testResult)
-  expect_match(printStr, paste0("Kolmogorov-Smirnov test summary:\\n",
-                                "degrees of freedom = 0 \\n",
-                                "statistic = 0.38208[0-9]* \\n",
-                                "pValue = 0.19849[0-9]* \\n",
-                                ".*"), perl = TRUE)
+  expect_match(capture.output(stats)[1], "Kolmogorov-Smirnov test summary:")
 
   testResult <- spark.kstest(df, "test", "norm", -0.5)
   stats <- summary(testResult)
@@ -775,13 +769,7 @@ test_that("spark.kstest", {
 
   expect_equal(stats$p.value, rStats$p.value, tolerance = 1e-4)
   expect_equal(stats$statistic, unname(rStats$statistic), tolerance = 1e-4)
-
-  printStr <- print.summary.KSTest(testResult)
-  expect_match(printStr, paste0("Kolmogorov-Smirnov test summary:\\n",
-                                "degrees of freedom = 0 \\n",
-                                "statistic = 0.44003[0-9]* \\n",
-                                "pValue = 0.09470[0-9]* \\n",
-                                ".*"), perl = TRUE)
+  expect_match(capture.output(stats)[1], "Kolmogorov-Smirnov test summary:")
 })
 
 sparkR.session.stop()
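The replacement assertion relies on `capture.output()` evaluating its argument and auto-printing the result, which dispatches to the object's print method and returns the printed lines as a character vector. Here is a minimal base-R sketch of that check, using a hypothetical `summary.Sketch` class in place of `summary.KSTest` and plain `grepl` in place of testthat's `expect_match`:

```r
# Hypothetical print method standing in for print.summary.KSTest.
print.summary.Sketch <- function(x, ...) {
  cat("Sketch test summary:\n")
  cat("statistic =", format(x$statistic, digits = 5), "\n")
  invisible(x)
}
# Explicit registration makes dispatch work even when this code is not run at
# top level; at the top level of a script, defining the function is enough.
registerS3method("print", "summary.Sketch", print.summary.Sketch)

stats <- structure(list(statistic = 0.3820812), class = "summary.Sketch")

# capture.output() evaluates `stats`, auto-prints it through the method above,
# and returns one character element per printed line.
lines <- capture.output(stats)

# Match only the header line so the check does not depend on number formatting.
header_ok <- grepl("Sketch test summary:", lines[1], fixed = TRUE)
```

Checking only the first line keeps the test independent of how the statistic and p-value are formatted; the numeric values themselves are compared separately by the `expect_equal` calls.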
