Skip to content

Conversation

@HyukjinKwon
Copy link
Member

@HyukjinKwon HyukjinKwon commented Feb 5, 2020

What changes were proposed in this pull request?

There are currently the R test failures after upgrading testthat to 2.0.0, and R version 3.5.2 as of SPARK-23435. This PR targets to fix the tests and make the tests pass. See the explanations and causes below:

test_context.R:49: failure: Check masked functions
length(maskedCompletely) not equal to length(namesOfMaskedCompletely).
1/1 mismatches
[1] 6 - 4 == 2

test_context.R:53: failure: Check masked functions
sort(maskedCompletely, na.last = TRUE) not equal to sort(namesOfMaskedCompletely, na.last = TRUE).
5/6 mismatches
x[2]: "endsWith"
y[2]: "filter"

x[3]: "filter"
y[3]: "not"

x[4]: "not"
y[4]: "sample"

x[5]: "sample"
y[5]: NA

x[6]: "startsWith"
y[6]: NA

From my cursory look, R base and R's version are mismatched. I fixed accordingly and Jenkins will test it out.

test_includePackage.R:31: error: include inside function
package or namespace load failed for ���plyr���:
 package ���plyr��� was installed by an R version with different internals; it needs to be reinstalled for use with this R version
Seems it's a package installation issue. Looks like plyr has to be re-installed.

From my cursory look, previously installed plyr remains and it's not compatible with the new R version. I fixed accordingly and Jenkins will test it out.

test_sparkSQL.R:499: warning: SPARK-17811: can create DataFrame containing NA as date and time
Your system is mis-configured: ���/etc/localtime��� is not a symlink

Seems a env problem. I suppressed the warnings for now.

test_sparkSQL.R:499: warning: SPARK-17811: can create DataFrame containing NA as date and time
It is strongly recommended to set envionment variable TZ to ���America/Los_Angeles��� (or equivalent)

Seems a env problem. I suppressed the warnings for now.

test_sparkSQL.R:1814: error: string operators
unable to find an inherited method for function ���startsWith��� for signature ���"character"���
1: expect_true(startsWith("Hello World", "Hello")) at /home/jenkins/workspace/SparkPullRequestBuilder@2/R/pkg/tests/fulltests/test_sparkSQL.R:1814
2: quasi_label(enquo(object), label)
3: eval_bare(get_expr(quo), get_env(quo))
4: startsWith("Hello World", "Hello")
5: (function (classes, fdef, mtable) 
   {
       methods <- .findInheritedMethods(classes, fdef, mtable)
       if (length(methods) == 1L) 
           return(methods[[1L]])
       else if (length(methods) == 0L) {
           cnames <- paste0("\"", vapply(classes, as.character, ""), "\"", collapse = ", ")
           stop(gettextf("unable to find an inherited method for function %s for signature %s", 
               sQuote(fdef@generic), sQuote(cnames)), domain = NA)
       }
       else stop("Internal error in finding inherited methods; didn't return a unique method", 
           domain = NA)
   })(list("character"), new("nonstandardGenericFunction", .Data = function (x, prefix) 
   {
       standardGeneric("startsWith")
   }, generic = structure("startsWith", package = "SparkR"), package = "SparkR", group = list(), 
       valueClass = character(0), signature = c("x", "prefix"), default = NULL, skeleton = (function (x, 
           prefix) 
       stop("invalid call in method dispatch to 'startsWith' (no default method)", domain = NA))(x, 
           prefix)), <environment>)
6: stop(gettextf("unable to find an inherited method for function %s for signature %s", 
       sQuote(fdef@generic), sQuote(cnames)), domain = NA)

From my cursory look, R base and R's version are mismatched. I fixed accordingly and Jenkins will test it out.

Also, this PR causes a CRAN check failure as below:

* creating vignettes ... ERROR
Error: processing vignette 'sparkr-vignettes.Rmd' failed with diagnostics:
package ���htmltools��� was installed by an R version with different internals; it needs to be reinstalled for use with this R version

This PR disables it for now.

Why are the changes needed?

To unblock other PRs.

Does this PR introduce any user-facing change?

No. Test only and dev only.

How was this patch tested?

No. I am going to use Jenkins to test.

@HyukjinKwon
Copy link
Member Author

Please let me test this via Jenkins considering the urgency.

@HyukjinKwon
Copy link
Member Author

cc @dongjoon-hyun and @viirya

@SparkQA
Copy link

SparkQA commented Feb 5, 2020

Test build #117871 has finished for PR 27460 at commit 2b3b274.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member

viirya commented Feb 5, 2020 via email

@dongjoon-hyun
Copy link
Member

cc @felixcheung and @shivaram , too.

@shivaram
Copy link
Contributor

shivaram commented Feb 5, 2020

Thanks @HyukjinKwon -- Changes LGTM if Jenkins tests pass

@dongjoon-hyun
Copy link
Member

So far, it looks good except the following. I'm not sure why arrow test coverage is still removed. Previously, IIRC, we have this.

test_sparkSQL_arrow.R:25: skip: createDataFrame/collect Arrow optimization
arrow cannot be loaded

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. For Arrow, we can check later.
Since this PR is a HOTFIX, please merge this first.

@dongjoon-hyun
Copy link
Member

Thank you so much for quick fix, @HyukjinKwon !

@SparkQA
Copy link

SparkQA commented Feb 5, 2020

Test build #117872 has finished for PR 27460 at commit 7ab5427.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Oh, it seems that there are another instances.

test_sparkSQL.R:504: warning: SPARK-17811: can create DataFrame containing NA as date and time
Your system is mis-configured: ���/etc/localtime��� is not a symlink

test_sparkSQL.R:504: warning: SPARK-17811: can create DataFrame containing NA as date and time
It is strongly recommended to set envionment variable TZ to ���America/Los_Angeles��� (or equivalent)

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @HyukjinKwon for quick fix!

@HyukjinKwon
Copy link
Member Author

Thanks all @dongjoon-hyun, @viirya, @shivaram. I am sure now all instances are fixed. I will merge as soon as the tests pass.

@SparkQA
Copy link

SparkQA commented Feb 5, 2020

Test build #117878 has finished for PR 27460 at commit f453529.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 5, 2020

Test build #117881 has finished for PR 27460 at commit 3113c53.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Feb 5, 2020

Okay, the tests passed and warnings all are fixed.

Duration: 598.4 s

OK:       2270
Failed:   0
Warnings: 0
Skipped:  11

I think it's a script problem. let me take a look

@dongjoon-hyun
Copy link
Member

Yes, now it's blocked by vignettes error.

* creating vignettes ... ERROR
Error: processing vignette 'sparkr-vignettes.Rmd' failed with diagnostics:
package ���htmltools��� was installed by an R version with different internals; it needs to be reinstalled for use with this R version

@HyukjinKwon
Copy link
Member Author

Argh ..

@SparkQA
Copy link

SparkQA commented Feb 5, 2020

Test build #117891 has finished for PR 27460 at commit 9aa87c3.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 5, 2020

Test build #117894 has finished for PR 27460 at commit d97794d.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon HyukjinKwon changed the title [SPARK-30733][R][HOTFIX] Fix SparkR tests per testthat and R version upgrade [SPARK-30733][R][HOTFIX] Fix SparkR tests per testthat and R version upgrade, and disable CRAN Feb 5, 2020
@SparkQA
Copy link

SparkQA commented Feb 5, 2020

Test build #117904 has finished for PR 27460 at commit b5c0e08.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member Author

I am going to merge to unblock other PRs.

HyukjinKwon added a commit that referenced this pull request Feb 5, 2020
…upgrade, and disable CRAN

### What changes were proposed in this pull request?

There are currently the R test failures after upgrading `testthat` to 2.0.0, and R version 3.5.2 as of SPARK-23435. This PR targets to fix the tests and make the tests pass. See the explanations and causes below:

```
test_context.R:49: failure: Check masked functions
length(maskedCompletely) not equal to length(namesOfMaskedCompletely).
1/1 mismatches
[1] 6 - 4 == 2

test_context.R:53: failure: Check masked functions
sort(maskedCompletely, na.last = TRUE) not equal to sort(namesOfMaskedCompletely, na.last = TRUE).
5/6 mismatches
x[2]: "endsWith"
y[2]: "filter"

x[3]: "filter"
y[3]: "not"

x[4]: "not"
y[4]: "sample"

x[5]: "sample"
y[5]: NA

x[6]: "startsWith"
y[6]: NA
```

From my cursory look, R base and R's version are mismatched. I fixed accordingly and Jenkins will test it out.

```
test_includePackage.R:31: error: include inside function
package or namespace load failed for ���plyr���:
 package ���plyr��� was installed by an R version with different internals; it needs to be reinstalled for use with this R version
Seems it's a package installation issue. Looks like plyr has to be re-installed.
```

From my cursory look, previously installed `plyr` remains and it's not compatible with the new R version. I fixed accordingly and Jenkins will test it out.

```
test_sparkSQL.R:499: warning: SPARK-17811: can create DataFrame containing NA as date and time
Your system is mis-configured: ���/etc/localtime��� is not a symlink
```

Seems a env problem. I suppressed the warnings for now.

```
test_sparkSQL.R:499: warning: SPARK-17811: can create DataFrame containing NA as date and time
It is strongly recommended to set envionment variable TZ to ���America/Los_Angeles��� (or equivalent)
```

Seems a env problem. I suppressed the warnings for now.

```
test_sparkSQL.R:1814: error: string operators
unable to find an inherited method for function ���startsWith��� for signature ���"character"���
1: expect_true(startsWith("Hello World", "Hello")) at /home/jenkins/workspace/SparkPullRequestBuilder2/R/pkg/tests/fulltests/test_sparkSQL.R:1814
2: quasi_label(enquo(object), label)
3: eval_bare(get_expr(quo), get_env(quo))
4: startsWith("Hello World", "Hello")
5: (function (classes, fdef, mtable)
   {
       methods <- .findInheritedMethods(classes, fdef, mtable)
       if (length(methods) == 1L)
           return(methods[[1L]])
       else if (length(methods) == 0L) {
           cnames <- paste0("\"", vapply(classes, as.character, ""), "\"", collapse = ", ")
           stop(gettextf("unable to find an inherited method for function %s for signature %s",
               sQuote(fdefgeneric), sQuote(cnames)), domain = NA)
       }
       else stop("Internal error in finding inherited methods; didn't return a unique method",
           domain = NA)
   })(list("character"), new("nonstandardGenericFunction", .Data = function (x, prefix)
   {
       standardGeneric("startsWith")
   }, generic = structure("startsWith", package = "SparkR"), package = "SparkR", group = list(),
       valueClass = character(0), signature = c("x", "prefix"), default = NULL, skeleton = (function (x,
           prefix)
       stop("invalid call in method dispatch to 'startsWith' (no default method)", domain = NA))(x,
           prefix)), <environment>)
6: stop(gettextf("unable to find an inherited method for function %s for signature %s",
       sQuote(fdefgeneric), sQuote(cnames)), domain = NA)
```

From my cursory look, R base and R's version are mismatched. I fixed accordingly and Jenkins will test it out.

Also, this PR causes a CRAN check failure as below:

```
* creating vignettes ... ERROR
Error: processing vignette 'sparkr-vignettes.Rmd' failed with diagnostics:
package ���htmltools��� was installed by an R version with different internals; it needs to be reinstalled for use with this R version
```

This PR disables it for now.

### Why are the changes needed?

To unblock other PRs.

### Does this PR introduce any user-facing change?

No. Test only and dev only.

### How was this patch tested?

No. I am going to use Jenkins to test.

Closes #27460 from HyukjinKwon/r-test-failure.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit e2d984a)
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon added a commit that referenced this pull request Feb 5, 2020
…upgrade, and disable CRAN

### What changes were proposed in this pull request?

There are currently the R test failures after upgrading `testthat` to 2.0.0, and R version 3.5.2 as of SPARK-23435. This PR targets to fix the tests and make the tests pass. See the explanations and causes below:

```
test_context.R:49: failure: Check masked functions
length(maskedCompletely) not equal to length(namesOfMaskedCompletely).
1/1 mismatches
[1] 6 - 4 == 2

test_context.R:53: failure: Check masked functions
sort(maskedCompletely, na.last = TRUE) not equal to sort(namesOfMaskedCompletely, na.last = TRUE).
5/6 mismatches
x[2]: "endsWith"
y[2]: "filter"

x[3]: "filter"
y[3]: "not"

x[4]: "not"
y[4]: "sample"

x[5]: "sample"
y[5]: NA

x[6]: "startsWith"
y[6]: NA
```

From my cursory look, R base and R's version are mismatched. I fixed accordingly and Jenkins will test it out.

```
test_includePackage.R:31: error: include inside function
package or namespace load failed for ���plyr���:
 package ���plyr��� was installed by an R version with different internals; it needs to be reinstalled for use with this R version
Seems it's a package installation issue. Looks like plyr has to be re-installed.
```

From my cursory look, previously installed `plyr` remains and it's not compatible with the new R version. I fixed accordingly and Jenkins will test it out.

```
test_sparkSQL.R:499: warning: SPARK-17811: can create DataFrame containing NA as date and time
Your system is mis-configured: ���/etc/localtime��� is not a symlink
```

Seems a env problem. I suppressed the warnings for now.

```
test_sparkSQL.R:499: warning: SPARK-17811: can create DataFrame containing NA as date and time
It is strongly recommended to set envionment variable TZ to ���America/Los_Angeles��� (or equivalent)
```

Seems a env problem. I suppressed the warnings for now.

```
test_sparkSQL.R:1814: error: string operators
unable to find an inherited method for function ���startsWith��� for signature ���"character"���
1: expect_true(startsWith("Hello World", "Hello")) at /home/jenkins/workspace/SparkPullRequestBuilder2/R/pkg/tests/fulltests/test_sparkSQL.R:1814
2: quasi_label(enquo(object), label)
3: eval_bare(get_expr(quo), get_env(quo))
4: startsWith("Hello World", "Hello")
5: (function (classes, fdef, mtable)
   {
       methods <- .findInheritedMethods(classes, fdef, mtable)
       if (length(methods) == 1L)
           return(methods[[1L]])
       else if (length(methods) == 0L) {
           cnames <- paste0("\"", vapply(classes, as.character, ""), "\"", collapse = ", ")
           stop(gettextf("unable to find an inherited method for function %s for signature %s",
               sQuote(fdefgeneric), sQuote(cnames)), domain = NA)
       }
       else stop("Internal error in finding inherited methods; didn't return a unique method",
           domain = NA)
   })(list("character"), new("nonstandardGenericFunction", .Data = function (x, prefix)
   {
       standardGeneric("startsWith")
   }, generic = structure("startsWith", package = "SparkR"), package = "SparkR", group = list(),
       valueClass = character(0), signature = c("x", "prefix"), default = NULL, skeleton = (function (x,
           prefix)
       stop("invalid call in method dispatch to 'startsWith' (no default method)", domain = NA))(x,
           prefix)), <environment>)
6: stop(gettextf("unable to find an inherited method for function %s for signature %s",
       sQuote(fdefgeneric), sQuote(cnames)), domain = NA)
```

From my cursory look, R base and R's version are mismatched. I fixed accordingly and Jenkins will test it out.

Also, this PR causes a CRAN check failure as below:

```
* creating vignettes ... ERROR
Error: processing vignette 'sparkr-vignettes.Rmd' failed with diagnostics:
package ���htmltools��� was installed by an R version with different internals; it needs to be reinstalled for use with this R version
```

This PR disables it for now.

### Why are the changes needed?

To unblock other PRs.

### Does this PR introduce any user-facing change?

No. Test only and dev only.

### How was this patch tested?

No. I am going to use Jenkins to test.

Closes #27460 from HyukjinKwon/r-test-failure.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit e2d984a)
Signed-off-by: HyukjinKwon <[email protected]>
@HyukjinKwon
Copy link
Member Author

Merged to master, branch-3.0, and branch-2.4.

@HyukjinKwon
Copy link
Member Author

HyukjinKwon commented Feb 5, 2020

@shaneknapp to summarize what happen so far:

  • Seems R version itself bumped up correctly to 3.5.2; however, seems we should reinstall the packages installed previously, including r-base. It caused some problems with error messages such as the below (please refer the PR description).

    * creating vignettes ... ERROR
    Error: processing vignette 'sparkr-vignettes.Rmd' failed with diagnostics:
    package ���htmltools��� was installed by an R version with different internals; it needs to be 
    reinstalled for use with this R version
    

    Currently, I fixed it by making tests permissive and just skipping CRAN check but we should at least reenable CRAN back. Can we reinstall the packages previously installed? It should be able to test via manually calling ./R/check-cran.sh or reverting the changes made in this PR at R/run-tests.sh .

    Once this is done, I should reenable it here per SPARK-30737.

  • Looks like Arrow R library was unable to find after the upgrade:

    test_sparkSQL_arrow.R:25: skip: createDataFrame/collect Arrow optimization
    arrow cannot be loaded
    

    This seems causing to skip Arrow related tests in Jenkins.

  • I think this is minor but seems there are some enviornment issues assuming from:

    test_sparkSQL.R:499: warning: SPARK-17811: can create DataFrame containing NA as date and time
    Your system is mis-configured: ���/etc/localtime��� is not a symlink
    
    test_sparkSQL.R:499: warning: SPARK-17811: can create DataFrame containing NA as date and time
    It is strongly recommended to set envionment variable TZ to ���America/Los_Angeles��� (or 
    equivalent)
    

    Once this is done, I should remove this line. But this isn't really a big deal.

@HyukjinKwon
Copy link
Member Author

ah cc @zero323 too.

@zero323
Copy link
Member

zero323 commented Feb 5, 2020

Thanks for working on this @HyukjinKwon!

Side question ‒ is testthat on Jenkins really 2.0. not 2.3?

@shaneknapp
Copy link
Contributor

@shaneknapp to summarize what happen so far:

  • Seems R version itself bumped up correctly to 3.5.2; however, seems we should reinstall the packages installed previously, including r-base. It caused some problems with error messages such as the below (please refer the PR description).

yeah, upgrading testthat lead to my worst nightmare: the game of R dependency whack-a-mole.

i'm currently continuing playing right now and will continue to update packages i find.

Currently, I fixed it by making tests permissive and just skipping CRAN check but we should at least reenable CRAN back. Can we reinstall the packages previously installed? It should be able to test via manually calling ./R/check-cran.sh or reverting the changes made in this PR at R/run-tests.sh .

sounds good. i'm testing manually on a centos worker and will update this PR when i think i've got everything.

  • Looks like Arrow R library was unable to find after the upgrade:
    test_sparkSQL_arrow.R:25: skip: createDataFrame/collect Arrow optimization
    arrow cannot be loaded
    

interesting. i'll install that too.

  • I think this is minor but seems there are some enviornment issues assuming from:
    test_sparkSQL.R:499: warning: SPARK-17811: can create DataFrame containing NA as date and time
    Your system is mis-configured: ���/etc/localtime��� is not a symlink
    

that is definitely odd. will investigate.

have i mentioned how much i hate touching R? :)

@shaneknapp
Copy link
Contributor

btw debugging happening here: #27468

it appears that i'll need to open up a new PR (once testing is done) for this commit: ecbc087

@shaneknapp
Copy link
Contributor

Thanks for working on this @HyukjinKwon!

Side question ‒ is testthat on Jenkins really 2.0. not 2.3?

yes, 2.0.0:

testthat "testthat" "/usr/lib64/R/library" "2.0.0" NA

@zero323
Copy link
Member

zero323 commented Feb 5, 2020

@shaneknapp

yes, 2.0.0:

testthat "testthat" "/usr/lib64/R/library" "2.0.0" NA

I know I shouldn't ask, but is there any chance to get it up to 2.3? 2.0 is already pretty old, and considering its entanglement with tidyverse it will sooner or later block something. And it would be nice to have it synced with Windows builds.

@shaneknapp
Copy link
Contributor

@shaneknapp

yes, 2.0.0:
testthat "testthat" "/usr/lib64/R/library" "2.0.0" NA

I know I shouldn't ask, but is there any chance to get it up to 2.3? 2.0 is already pretty old, and considering its entanglement with tidyverse it will sooner or later block something. And it would be nice to have it synced with Windows builds.

as evidenced by the amount of time it took me to get the build system fixed, i'd rather wait until we absolutely MUST upgrade testthat to 2.3.x.

@shaneknapp
Copy link
Contributor

i mean, i'll do it, but not until next week. after the past few hours of playing dependency whack-a-mole, i think i need a stiff drink. ;)

@HyukjinKwon
Copy link
Member Author

Yeah, from my experience, touching R was really painful. So I tried to make the tests pass first to decouple the tests vs env issue to earn some enough time :-). Thanks, @shaneknapp.

@shaneknapp
Copy link
Contributor

and thanks @HyukjinKwon for taking care of this last night.

@HyukjinKwon
Copy link
Member Author

I opened a PR enable back at #27472

HyukjinKwon added a commit that referenced this pull request Feb 6, 2020
…ncoding to DESCRIPTION

### What changes were proposed in this pull request?

This PR proposes to reenable CRAN check disabled at #27460. Given the tests #27468, seems we should also port #23823 together.

### Why are the changes needed?
To check CRAN back.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
It was tested at #27468 and Jenkins should test it out.

Closes #27472 from HyukjinKwon/SPARK-30737.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon added a commit that referenced this pull request Feb 6, 2020
…ncoding to DESCRIPTION

### What changes were proposed in this pull request?

This PR proposes to reenable CRAN check disabled at #27460. Given the tests #27468, seems we should also port #23823 together.

### Why are the changes needed?
To check CRAN back.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
It was tested at #27468 and Jenkins should test it out.

Closes #27472 from HyukjinKwon/SPARK-30737.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit b95ccb1)
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon added a commit that referenced this pull request Feb 6, 2020
…ncoding to DESCRIPTION

### What changes were proposed in this pull request?

This PR proposes to reenable CRAN check disabled at #27460. Given the tests #27468, seems we should also port #23823 together.

### Why are the changes needed?
To check CRAN back.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
It was tested at #27468 and Jenkins should test it out.

Closes #27472 from HyukjinKwon/SPARK-30737.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
(cherry picked from commit b95ccb1)
Signed-off-by: HyukjinKwon <[email protected]>
@HyukjinKwon HyukjinKwon deleted the r-test-failure branch March 3, 2020 01:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants