
Conversation

@dongjoon-hyun dongjoon-hyun commented May 20, 2020

What changes were proposed in this pull request?

This PR aims to skip R image building and one R test during integration tests by using `--exclude-tags r`.

Why are the changes needed?

We have only one R integration test case, `Run SparkR on simple dataframe.R example`, for submission test coverage. Since this rarely changes, we can skip it and save the effort required to build the whole R image and run the single test.

```
KubernetesSuite:
...
- Run SparkR on simple dataframe.R example
Run completed in 10 minutes, 20 seconds.
Total number of tests run: 20
```

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass the K8s integration tests and run the following manually. (Note that the R test is skipped.)

```
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --deploy-mode docker-for-desktop --exclude-tags r --spark-tgz $PWD/spark-*.tgz
...
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
Run completed in 10 minutes, 23 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```


SparkQA commented May 21, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/27551/

@dongjoon-hyun

Hi, @dbtsai .
Could you review this PR?


SparkQA commented May 21, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/27551/


dbtsai commented May 21, 2020

LGTM.

@dongjoon-hyun dongjoon-hyun left a comment

Thank you so much, @dbtsai .
Merged to master/3.0.

dongjoon-hyun added a commit that referenced this pull request May 21, 2020
…ing and test

### What changes were proposed in this pull request?

This PR aims to skip R image building and one R test during integration tests by using `--exclude-tags r`.

### Why are the changes needed?

We have only one R integration test case, `Run SparkR on simple dataframe.R example`, for submission test coverage. Since this is rarely changed, we can skip this and save the efforts required for building the whole R image and running the single test.
```
KubernetesSuite:
...
- Run SparkR on simple dataframe.R example
Run completed in 10 minutes, 20 seconds.
Total number of tests run: 20
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the K8S integration test and do the following manually. (Note that R test is skipped)
```
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --deploy-mode docker-for-desktop --exclude-tags r --spark-tgz $PWD/spark-*.tgz
...
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
Run completed in 10 minutes, 23 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

Closes #28594 from dongjoon-hyun/SPARK-31780.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit a06768e)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun dongjoon-hyun deleted the SPARK-31780 branch May 21, 2020 01:34

SparkQA commented May 21, 2020

Test build #122907 has finished for PR 28594 at commit a26fb26.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented May 21, 2020

Test build #122908 has finished for PR 28594 at commit 0a37e82.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


ifilonenko commented Jul 22, 2020

Is there a reason we are skipping this test? We are doing:

```
./dev/make-distribution.sh --name ${DATE}-${REVISION} --r --pip --tgz -DzincPort=${ZINC_PORT} \
     -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
retcode=$?
```

in Jenkins, so we are already building the distribution with R support. What do we gain from skipping this test? I would have preferred to keep this test, tbh (so as to have greater coverage).

@shaneknapp

> Is there a reason we are skipping this test? We are doing:
>
> ```
> ./dev/make-distribution.sh --name ${DATE}-${REVISION} --r --pip --tgz -DzincPort=${ZINC_PORT} \
>      -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
> retcode=$?
> ```
>
> in Jenkins, so we are already building the distribution with R support. What do we gain from skipping this test? I would have preferred to keep this test, tbh (so as to have greater coverage).

i agree... more testing is better imo, especially since we have all the framework in place.

@dongjoon-hyun

Hi, @ifilonenko and @shaneknapp .
This is designed to let us run the K8s tests more selectively, like the other test tags (e.g., ExtendedYarnTest).
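The tag-and-exclude mechanism described here can be sketched outside ScalaTest. The following is a minimal, hypothetical Java model (the `TestTag` annotation and method names are illustrative; Spark's real tags are ScalaTest tags such as the `r` tag and `ExtendedYarnTest`) of how a runner can skip tests carrying an excluded tag:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class TagDemo {
    // Hypothetical runtime-retained tag annotation, analogous to a ScalaTest tag.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.METHOD, ElementType.TYPE})
    @interface TestTag {
        String value();
    }

    @TestTag("r")
    static void sparkRExample() { /* R-specific integration test body */ }

    static void sparkPiExample() { /* untagged test body */ }

    public static void main(String[] args) {
        String excluded = "r"; // corresponds to --exclude-tags r
        for (Method m : TagDemo.class.getDeclaredMethods()) {
            if (!m.getName().startsWith("spark")) continue; // skip main itself
            TestTag tag = m.getAnnotation(TestTag.class);
            boolean skip = tag != null && excluded.equals(tag.value());
            System.out.println(m.getName() + ": " + (skip ? "skipped" : "executed"));
        }
    }
}
```

By default (no exclusion), every tagged and untagged test runs; the exclusion only takes effect when the tag is explicitly named, which mirrors the opt-in behavior of `--exclude-tags r`.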


dongjoon-hyun commented Jul 23, 2020

By default (i.e., if you don't use `--exclude-tags r`), the test coverage in the community is unchanged so far.

@ifilonenko

Sure, I totally see the need for an `--exclude-tags` option, but the default (current) behavior of the integration tests now skips over the SparkR example test. This has been a side effect of this change.

@shaneknapp

> Sure, I totally see the need for an `--exclude-tags` option, but the default (current) behavior of the integration tests now skips over the SparkR example test. This has been a side effect of this change.

the specific bits to look at are towards the end of the build... right above KubernetesSuite, we build the spark-r image, and the new behavior introduced by this PR then skips the actual integration test against that image within KubernetesSuite.

> By default (=if you don't use --exclude-tags r), the test coverage is unchanged in the community so far.

@dongjoon-hyun this is the exact command used by the SparkPullRequestBuilder-K8s job to launch the integration tests. we are not using --exclude-tags.

```
resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh \
    $JAVA_MVN_FLAG --spark-tgz ${WORKSPACE}/spark-*.tgz
```


dongjoon-hyun commented Jul 23, 2020

@ifilonenko and @shaneknapp . Are you sure that is caused by this PR?

> This has been a side-effect of this change.

IIRC, at that time, this PR was tested correctly, and I verified that the R tests were executed.

I guess another PR after this commit may have caused the missing R testing.


dongjoon-hyun commented Jul 23, 2020

Anyway, since I made this option, let me take a look at what is going on there~

```
resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh \
    $JAVA_MVN_FLAG --spark-tgz ${WORKSPACE}/spark-*.tgz
```


dongjoon-hyun commented Jul 23, 2020

BTW, I also have a personal Jenkins K8s machine on the master branch, and the following is its result from last week (2020-07-16T12:14:18-07:00). Of course, I confirmed that the 2020-07-20T22:25:51-07:00 test result is missing the `Run SparkR on simple dataframe.R example` test. In short, I suspect that something happened on the master branch between July 16 and July 20.

```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
- Run SparkR on simple dataframe.R example
Run completed in 9 minutes, 11 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Spark Project Parent POM 3.1.0-SNAPSHOT:
[INFO] 
[INFO] Spark Project Parent POM ........................... SUCCESS [  3.535 s]
[INFO] Spark Project Tags ................................. SUCCESS [  9.053 s]
[INFO] Spark Project Local DB ............................. SUCCESS [  5.057 s]
[INFO] Spark Project Networking ........................... SUCCESS [  6.522 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [  4.582 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 10.181 s]
[INFO] Spark Project Launcher ............................. SUCCESS [  4.874 s]
[INFO] Spark Project Core ................................. SUCCESS [02:05 min]
[INFO] Spark Project Kubernetes Integration Tests ......... SUCCESS [12:29 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  15:20 min
[INFO] Finished at: 2020-07-16T12:14:18-07:00
[INFO] ------------------------------------------------------------------------
```

@dongjoon-hyun

Also, please see the following commit log from one week ago. It has `Run SparkR on simple dataframe.R example`.


dongjoon-hyun commented Jul 23, 2020

I found more evidence in our Apache Spark AmpLab Jenkins farm log. The failed test case is the R test.

Screenshot of the Jenkins build page (2020-07-23):

Started 6 days 0 hr ago
Took 1 hr 18 min on research-jenkins-worker-09
PR #28708: [SPARK-20629][CORE][K8S] Co...
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
- Run SparkR on simple dataframe.R example *** FAILED ***
  The code passed to eventually never returned normally. Attempted 190 times over 3.0009884667333337 minutes. Last failure message: false was not true. (KubernetesSuite.scala:386)
Run completed in 14 minutes, 46 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 18, failed 1, canceled 0, ignored 0, pending 0
*** 1 TEST FAILED ***
```
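The `eventually` failure above comes from ScalaTest's polling helper. A rough Java sketch of its semantics (illustrative only, not ScalaTest's actual implementation; class and method names here are hypothetical): retry a condition at a fixed interval until it holds or the timeout elapses, then fail with the attempt count, which is how a message like "Attempted 190 times over 3 minutes" arises.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.BooleanSupplier;

public class EventuallyDemo {
    // Poll `cond` every `interval` until it returns true or `timeout` elapses.
    static void eventually(Duration timeout, Duration interval, BooleanSupplier cond)
            throws InterruptedException {
        Instant deadline = Instant.now().plus(timeout);
        int attempts = 0;
        while (true) {
            attempts++;
            if (cond.getAsBoolean()) return; // the condition finally held
            if (Instant.now().isAfter(deadline)) {
                throw new AssertionError(
                    "The code passed to eventually never returned normally. Attempted "
                    + attempts + " times.");
            }
            Thread.sleep(interval.toMillis());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~100 ms, well inside the 2 s timeout.
        eventually(Duration.ofSeconds(2), Duration.ofMillis(20),
                   () -> System.currentTimeMillis() - start > 100);
        System.out.println("condition satisfied");
    }
}
```

In the failing K8s run, the polled condition (driver pod reaching the expected state) never became true within the configured timeout, so the suite reported the R test as failed rather than hung.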


dongjoon-hyun commented Jul 24, 2020

Since this PR was merged on May 20, and I have provided multiple recent pieces of evidence of the R testing like the above, @ifilonenko's claim is wrong.

This has been a side-effect of this change.

I'll stop my investigation here. Please take a look at more recent commits to solve your problems. Thanks.

@ifilonenko

> Of course, I confirmed that the 2020-07-20T22:25:51-07:00 test result is missing the `Run SparkR on simple dataframe.R example` test. In short, I suspect that something happened on the master branch between July 16 and July 20.

I was sure you did :) but it seemed that this was the only code path that had touched the R tests recently, and the inclusion of the RTestTag() threw me off. Thanks so much for the thorough investigation! I’ll review elsewhere, thanks!


dongjoon-hyun commented Jul 24, 2020

Thanks. I understand, @ifilonenko . :)

@shaneknapp

@dongjoon-hyun thanks for the breakdown!
