Various improvements to integration tests #6
Conversation
Minor changes to old pom - creating a directory that needs to exist
kimoonkim left a comment
In your example command line, I see you used custom docker images like gcr.io/my-image/driver:latest. Are you assuming those docker images are pre-built? Is that a requirement for using the cloud?
I understand we won't be able to use dockerd inside minikube. But can we use the dockerd that comes with the cloud, so we can still build docker images off a distro tarball we got?
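For illustration, building images off an unpacked distro with a cloud node's dockerd might look roughly like this (the Dockerfile path, project, and tag are placeholders, not paths confirmed in this repo):
$ cd spark-distro
$ docker build -t gcr.io/my-project/spark-driver:latest -f dockerfiles/driver/Dockerfile .
$ docker push gcr.io/my-project/spark-driver:latest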
integration-test/pom.xml (Outdated)
      <arguments>
        <argument>-c</argument>
-       <argument>rm -rf spark-distro; mkdir spark-distro-tmp; cd spark-distro-tmp; tar xfz ${spark-distro-tgz}; mv * ../spark-distro; cd ..; rm -rf spark-distro-tmp</argument>
+       <argument>rm -rf spark-distro; mkdir spark-distro; mkdir spark-distro-tmp; cd spark-distro-tmp; tar xfz ${spark-distro-tgz}; mv * ../spark-distro; cd ..; rm -rf spark-distro-tmp</argument>
Hmm. I see this adding mkdir spark-distro. But later we go inside a tmp dir, untar the distro and do mv * ../spark-distro.
The unpacked tarball has a top level dir, like spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9. Wouldn't the mv command create a subdir hierarchy we don't want, like spark-distro/spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9?
In my usage, I was actually seeing a failure without this create step. My understanding was that mv * ../spark-distro expects the target directory to already be created and present.
The intent was that the mv statement would rename the unpacked top-level dir to be the ../spark-distro dir. I guess your tarball is not like my tarball :-) Can you check if your tarball creates a top-level dir?
Here's mine:
$ tar tvf spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9.tgz | head
drwxr-xr-x kimoonkim/staff 0 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/
drwxr-xr-x kimoonkim/staff 0 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/
-rwxr-xr-x kimoonkim/staff 1089 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/beeline
-rw-r--r-- kimoonkim/staff 1064 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/beeline.cmd
-rwxr-xr-x kimoonkim/staff 1933 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/find-spark-home
-rw-r--r-- kimoonkim/staff 2681 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/find-spark-home.cmd
-rw-r--r-- kimoonkim/staff 1892 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/load-spark-env.cmd
-rw-r--r-- kimoonkim/staff 2025 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/load-spark-env.sh
-rwxr-xr-x kimoonkim/staff 2989 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/pyspark
-rw-r--r-- kimoonkim/staff 1170 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/pyspark.cmd
Ah, forgot to mention that the unpacked top-level dir is the only top-level entry. That's the precondition for mv * ../spark-distro to work:
~/Tmp/spark-distro-tmp$ tar xfz ../spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9.tgz
~/Tmp/spark-distro-tmp$ ls
spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9
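To make the mkdir concern concrete, here is a quick sketch of the two mv behaviors being discussed (spark-x.y.z is just a stand-in for the unpacked top-level dir, not a real build):
$ mkdir -p spark-x.y.z/bin           # stand-in for the unpacked distro
$ mv spark-x.y.z spark-distro        # target absent: mv renames the dir
$ ls spark-distro
bin
$ mkdir -p spark-x.y.z/bin           # unpack again; spark-distro now already exists
$ mv spark-x.y.z spark-distro        # target present: mv moves the dir inside it
$ ls spark-distro
bin  spark-x.y.z
So with an extra mkdir spark-distro before the mv, the unpacked dir ends up nested one level deeper, which is the hierarchy concern raised above; the rename-style flow only works when the tarball has a single top-level dir.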
integration-test/pom.xml (Outdated)
      <profiles>
        <profile>
          <id>v2</id>
Can you explain why we need this v2 profile with duplicate plugin config?
Cloud environments will break from the minikube flow here and don't need the pre-integration-test phase to run at all. It's not strictly needed, and I'm happy to change it if we have a way to invoke the integration-test step without running the pre-integration-test step.
cc @echarles
I think we can use dockerd in our test infrastructure. My intent there was to have the flexibility to use a different step for building those docker images. If we're specifying the repo and the docker image anyway, it could, but doesn't necessarily have to, be built by maven, correct?
If we want to skip image building, we can set -Dspark.docker.test.skipBuildImages=true, which is already supported. That and -Dspark.docker.test.*Image will allow people to use pre-built images. But this should not be a requirement for using the cloud, IMO. It's still nice to be able to build Docker images as part of the integration test automation, especially in CI, to erase the doubt of "Was I using the right images for this test?" when the test fails.
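For example, a run against pre-built images might look roughly like this (the exact *Image property names and image coordinates are illustrative, not confirmed here):
$ mvn clean integration-test \
    -Dspark.docker.test.skipBuildImages=true \
    -Dspark.docker.test.driverImage=gcr.io/my-project/driver:latest \
    -Dspark.docker.test.executorImage=gcr.io/my-project/executor:latest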
Ah, I see your point. Building docker images in maven is fine even on cloud. I'll try and implement it that way. But the minikube download step does become unnecessary, so we do need a way to separate that out.
Yes, we want to avoid the minikube download step for the cloud. The download plugin seems to support a skip option. Can you try using that instead of the profile?
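A rough sketch of the wiring that suggestion implies, assuming the download plugin's goal accepts a skip parameter (the plugin coordinates, parameter, and property name below are assumptions to verify against the plugin docs):
<plugin>
  <groupId>com.googlecode.maven-download-plugin</groupId>
  <artifactId>download-maven-plugin</artifactId>
  <configuration>
    <!-- Hypothetical: skip the minikube download on cloud runs via -Dskip.minikube.download=true -->
    <skip>${skip.minikube.download}</skip>
  </configuration>
</plugin>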
Works as expected. I'm going to try again without the changes to the pom, with the skip enabled.
@kimoonkim, I'm getting the following when trying to unpack the tar.gz. I'm running:
In #6 (comment), I had an error in the way I built the tar.gz distro. It works as expected now.
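For reference, a distro tarball like the ones discussed above is typically produced from a Spark checkout with something like this (the name and profile flags are illustrative; pick the ones matching your build):
$ ./dev/make-distribution.sh --name my-test-build --tgz -Phadoop-2.7 -Pkubernetes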
Side question: is this repo aimed to replace the current integration tests? From what I understand, the answer is yes. I find the creation or download of the If not, we could think about a way to have faster iteration with integration tests in the spark repo, and to have the full download... in this repo with code reuse (think about a spark-integration-test.jar)?
Removed the MINIKUBE_TEST_BACKEND requirement for the SparkPi tests and some deprecated info.
Verified running on cloud with:
cc/ @kimoonkim @mccheah @liyinan926