
Conversation

@dongjoon-hyun dongjoon-hyun commented May 20, 2020

What changes were proposed in this pull request?

This PR aims to skip R image building and one R test during integration tests by using `--exclude-tags r`.

Why are the changes needed?

We have only one R integration test case, `Run SparkR on simple dataframe.R example`, for submission test coverage. Since this rarely changes, we can skip it and save the effort required to build the whole R image and run the single test.

```
KubernetesSuite:
...
- Run SparkR on simple dataframe.R example
Run completed in 10 minutes, 20 seconds.
Total number of tests run: 20
```

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass the K8s integration tests and run the following manually. (Note that the R test is skipped.)

```
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --deploy-mode docker-for-desktop --exclude-tags r --spark-tgz $PWD/spark-*.tgz
...
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
Run completed in 10 minutes, 23 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```


SparkQA commented May 21, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/27551/

@dongjoon-hyun

Hi, @dbtsai .
Could you review this PR?


SparkQA commented May 21, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/27551/


dbtsai commented May 21, 2020

LGTM.

@dongjoon-hyun dongjoon-hyun left a comment

Thank you so much, @dbtsai .
Merged to master/3.0.

dongjoon-hyun added a commit that referenced this pull request May 21, 2020
…ing and test

### What changes were proposed in this pull request?

This PR aims to skip R image building and one R test during integration tests by using `--exclude-tags r`.

### Why are the changes needed?

We have only one R integration test case, `Run SparkR on simple dataframe.R example`, for submission test coverage. Since this is rarely changed, we can skip this and save the efforts required for building the whole R image and running the single test.
```
KubernetesSuite:
...
- Run SparkR on simple dataframe.R example
Run completed in 10 minutes, 20 seconds.
Total number of tests run: 20
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the K8S integration test and do the following manually. (Note that R test is skipped)
```
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --deploy-mode docker-for-desktop --exclude-tags r --spark-tgz $PWD/spark-*.tgz
...
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
Run completed in 10 minutes, 23 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

Closes #28594 from dongjoon-hyun/SPARK-31780.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit a06768e)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun dongjoon-hyun deleted the SPARK-31780 branch May 21, 2020 01:34

SparkQA commented May 21, 2020

Test build #122907 has finished for PR 28594 at commit a26fb26.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented May 21, 2020

Test build #122908 has finished for PR 28594 at commit 0a37e82.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


ifilonenko commented Jul 22, 2020

Is there a reason we are skipping this test? We are doing:

```
./dev/make-distribution.sh --name ${DATE}-${REVISION} --r --pip --tgz -DzincPort=${ZINC_PORT} \
     -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
retcode=$?
```

in Jenkins, so we are already building the distribution with R support. What do we gain from skipping this test? I would have preferred to keep this test, tbh (so as to have greater coverage).

@shaneknapp

> Is there a reason we are skipping this test? We are doing:
>
> ```
> ./dev/make-distribution.sh --name ${DATE}-${REVISION} --r --pip --tgz -DzincPort=${ZINC_PORT} \
>      -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
> retcode=$?
> ```
>
> in Jenkins, so we are already building the distribution with R support. What do we gain from skipping this test? I would have preferred to keep this test, tbh (so as to have greater coverage).

i agree... more testing is better imo, especially since we have all the framework in place.

@dongjoon-hyun

Hi, @ifilonenko and @shaneknapp .
This is designed to let us run the K8s tests more selectively, like the other test tags (e.g., ExtendedYarnTest).
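The tag-and-exclude mechanism described here can be sketched outside ScalaTest. The following is a minimal, hypothetical Java model (the `TestTag` annotation and method names are illustrative; Spark's real tags are ScalaTest tags such as the `r` tag and `ExtendedYarnTest`) of how a runner can skip tests carrying an excluded tag:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class TagDemo {
    // Hypothetical runtime-retained tag annotation, analogous to a ScalaTest tag.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.METHOD, ElementType.TYPE})
    @interface TestTag {
        String value();
    }

    @TestTag("r")
    static void sparkRExample() { /* R-specific integration test body */ }

    static void sparkPiExample() { /* untagged test body */ }

    public static void main(String[] args) {
        String excluded = "r"; // corresponds to --exclude-tags r
        for (Method m : TagDemo.class.getDeclaredMethods()) {
            if (!m.getName().startsWith("spark")) continue; // skip main itself
            TestTag tag = m.getAnnotation(TestTag.class);
            boolean skip = tag != null && excluded.equals(tag.value());
            System.out.println(m.getName() + ": " + (skip ? "skipped" : "executed"));
        }
    }
}
```

By default (no exclusion), every tagged and untagged test runs; the exclusion only takes effect when the tag is explicitly named, which mirrors the opt-in behavior of `--exclude-tags r`.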


dongjoon-hyun commented Jul 23, 2020

By default (i.e., if you don't use `--exclude-tags r`), the test coverage in the community is unchanged so far.

@ifilonenko

Sure, I totally see the need for an `--exclude-tags` option, but the default (current) behavior of the integration tests now skips over the SparkR example test. This has been a side effect of this change.

@shaneknapp

> Sure, I totally see the need for an `--exclude-tags` option, but the default (current) behavior of the integration tests now skips over the SparkR example test. This has been a side effect of this change.

the specific bits to look at are towards the end of the build... right above KubernetesSuite, we build the spark-r image, and the new behavior introduced by this PR then skips the actual integration test against that image within KubernetesSuite.

> By default (=if you don't use --exclude-tags r), the test coverage is unchanged in the community so far.

@dongjoon-hyun this is the exact command used by the SparkPullRequestBuilder-K8s job to launch the integration tests. we are not using --exclude-tags.

```
resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh \
    $JAVA_MVN_FLAG --spark-tgz ${WORKSPACE}/spark-*.tgz
```


dongjoon-hyun commented Jul 23, 2020

@ifilonenko and @shaneknapp . Are you sure that is caused by this PR?

> This has been a side-effect of this change.

IIRC, at that time, this PR was tested correctly, and I verified that the R tests were executed.

I guess another PR after this commit may have caused the missing R testing.


dongjoon-hyun commented Jul 23, 2020

Anyway, since I made this option, let me take a look at what is going on there~

```
resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh \
    $JAVA_MVN_FLAG --spark-tgz ${WORKSPACE}/spark-*.tgz
```


dongjoon-hyun commented Jul 23, 2020

BTW, I also have a personal Jenkins K8s machine on the master branch, and the following is its result from last week (2020-07-16T12:14:18-07:00). Of course, I confirmed that the 2020-07-20T22:25:51-07:00 test result is missing the `Run SparkR on simple dataframe.R example` test. In short, I suspect that something happened on the master branch between July 16 and July 20.

```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
- Run SparkR on simple dataframe.R example
Run completed in 9 minutes, 11 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Spark Project Parent POM 3.1.0-SNAPSHOT:
[INFO] 
[INFO] Spark Project Parent POM ........................... SUCCESS [  3.535 s]
[INFO] Spark Project Tags ................................. SUCCESS [  9.053 s]
[INFO] Spark Project Local DB ............................. SUCCESS [  5.057 s]
[INFO] Spark Project Networking ........................... SUCCESS [  6.522 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [  4.582 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 10.181 s]
[INFO] Spark Project Launcher ............................. SUCCESS [  4.874 s]
[INFO] Spark Project Core ................................. SUCCESS [02:05 min]
[INFO] Spark Project Kubernetes Integration Tests ......... SUCCESS [12:29 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  15:20 min
[INFO] Finished at: 2020-07-16T12:14:18-07:00
[INFO] ------------------------------------------------------------------------
```

@dongjoon-hyun

Also, please see the following commit log from one week ago. It has `Run SparkR on simple dataframe.R example`.


dongjoon-hyun commented Jul 23, 2020

I found more evidence in our Apache Spark AmpLab Jenkins farm log. The failed test case is the R test.

Screenshot of the Jenkins build page (2020-07-23):

Started 6 days 0 hr ago
Took 1 hr 18 min on research-jenkins-worker-09
PR #28708: [SPARK-20629][CORE][K8S] Co...
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
- Run SparkR on simple dataframe.R example *** FAILED ***
  The code passed to eventually never returned normally. Attempted 190 times over 3.0009884667333337 minutes. Last failure message: false was not true. (KubernetesSuite.scala:386)
Run completed in 14 minutes, 46 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 18, failed 1, canceled 0, ignored 0, pending 0
*** 1 TEST FAILED ***
```
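The `eventually` failure above comes from ScalaTest's polling helper. A rough Java sketch of its semantics (illustrative only, not ScalaTest's actual implementation; class and method names here are hypothetical): retry a condition at a fixed interval until it holds or the timeout elapses, then fail with the attempt count, which is how a message like "Attempted 190 times over 3 minutes" arises.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.BooleanSupplier;

public class EventuallyDemo {
    // Poll `cond` every `interval` until it returns true or `timeout` elapses.
    static void eventually(Duration timeout, Duration interval, BooleanSupplier cond)
            throws InterruptedException {
        Instant deadline = Instant.now().plus(timeout);
        int attempts = 0;
        while (true) {
            attempts++;
            if (cond.getAsBoolean()) return; // the condition finally held
            if (Instant.now().isAfter(deadline)) {
                throw new AssertionError(
                    "The code passed to eventually never returned normally. Attempted "
                    + attempts + " times.");
            }
            Thread.sleep(interval.toMillis());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~100 ms, well inside the 2 s timeout.
        eventually(Duration.ofSeconds(2), Duration.ofMillis(20),
                   () -> System.currentTimeMillis() - start > 100);
        System.out.println("condition satisfied");
    }
}
```

In the failing K8s run, the polled condition (driver pod reaching the expected state) never became true within the configured timeout, so the suite reported the R test as failed rather than hung.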


dongjoon-hyun commented Jul 24, 2020

Since this PR was merged on May 20, and I have provided multiple recent pieces of evidence of the R testing like the above, @ifilonenko's claim is wrong.

This has been a side-effect of this change.

I'll stop my investigation here. Please take a look at more recent commits to solve your problems. Thanks.

@ifilonenko

> Of course, I confirmed that the 2020-07-20T22:25:51-07:00 test result is missing the `Run SparkR on simple dataframe.R example` test. In short, I suspect that something happened on the master branch between July 16 and July 20.

I was sure you did :) but it seemed that this was the only code path that had touched the R tests recently, and the inclusion of the RTestTag() threw me off. Thanks so much for the thorough investigation! I’ll review elsewhere, thanks!


dongjoon-hyun commented Jul 24, 2020

Thanks. I understand, @ifilonenko . :)

@shaneknapp

@dongjoon-hyun thanks for the breakdown!
