
Conversation

@foxish (Member) commented Dec 20, 2017

Removed the MINIKUBE_TEST_BACKEND requirement for the SparkPi tests and some deprecated info.

Verified running on cloud with:

mvn clean -Ddownload.plugin.skip=true integration-test  \
-Dspark-distro-tgz=/home/ramanathana/go-workspace/src/apache-spark-on-k8s/release/spark/dist/spark.tar.gz  \
-Dspark-dockerfiles-dir=/home/ramanathana/go-workspace/src/apache-spark-on-k8s/release/spark/dist/kubernetes/dockerfiles \
-DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://... -Dspark.docker.test.driverImage=spark-driver -Dspark.docker.test.executorImage=spark-executor"

cc/ @kimoonkim @mccheah @liyinan926

Minor changes to old pom - creating a directory that needs to exist
@kimoonkim (Member) left a comment


In your example command line, I see you used custom docker images like gcr.io/my-image/driver:latest. Are you assuming those docker images are pre-built? Is that a requirement to use the cloud?

I understand we won't be able to use dockerd inside minikube. But can we use the dockerd that comes with the cloud, so we can still build docker images off the distro tarball we got?

<arguments>
  <argument>-c</argument>
- <argument>rm -rf spark-distro; mkdir spark-distro-tmp; cd spark-distro-tmp; tar xfz ${spark-distro-tgz}; mv * ../spark-distro; cd ..; rm -rf spark-distro-tmp</argument>
+ <argument>rm -rf spark-distro; mkdir spark-distro; mkdir spark-distro-tmp; cd spark-distro-tmp; tar xfz ${spark-distro-tgz}; mv * ../spark-distro; cd ..; rm -rf spark-distro-tmp</argument>
@kimoonkim (Member) commented:

Hmm. I see this adding mkdir spark-distro. But later we go inside a tmp dir, untar the distro and do mv * ../spark-distro.

The unpacked tarball has a top level dir, like spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9. Wouldn't the mv command create a subdir hierarchy we don't want, like spark-distro/spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9?
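(For illustration, a minimal shell sketch of the three mv behaviors in play in this thread; the directory names are hypothetical:)

mkdir demo && cd demo

# 1. Single source, target absent: mv renames the unpacked dir.
mkdir unpacked-dir
mv unpacked-dir spark-distro        # spark-distro IS the unpacked dir now
rm -rf spark-distro

# 2. Single source, target already a dir: mv nests inside it,
#    producing the unwanted spark-distro/unpacked-dir hierarchy.
mkdir unpacked-dir spark-distro
mv unpacked-dir spark-distro
rm -rf spark-distro

# 3. Multiple sources, target absent: mv fails outright with
#    "mv: target 'spark-distro' is not a directory" -- the error
#    reported further down this PR.
touch a b
mv a b spark-distro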

@foxish (Member Author) commented:

In my usage, I was actually seeing a failure without this create step. My understanding was that mv * ../spark-distro expects the target directory to already exist.

@kimoonkim (Member) commented:

The intent was that the mv statement would rename the unpacked top-level dir to be the ../spark-distro dir. I guess your tarball is not like my tarball :-) Can you check if your tarball creates a top-level dir?

Here's mine:

$ tar tvf spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9.tgz | head
drwxr-xr-x kimoonkim/staff   0 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/
drwxr-xr-x kimoonkim/staff   0 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/
-rwxr-xr-x kimoonkim/staff 1089 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/beeline
-rw-r--r-- kimoonkim/staff 1064 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/beeline.cmd
-rwxr-xr-x kimoonkim/staff 1933 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/find-spark-home
-rw-r--r-- kimoonkim/staff 2681 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/find-spark-home.cmd
-rw-r--r-- kimoonkim/staff 1892 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/load-spark-env.cmd
-rw-r--r-- kimoonkim/staff 2025 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/load-spark-env.sh
-rwxr-xr-x kimoonkim/staff 2989 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/pyspark
-rw-r--r-- kimoonkim/staff 1170 2017-12-18 13:09 spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9/bin/pyspark.cmd

@kimoonkim (Member) commented:

Ah, forgot to mention that the unpacked top-level dir is the only top-level entry. That's the precondition for mv * ../spark-distro to work:

~/Tmp/spark-distro-tmp$ tar xfz ../spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9.tgz
~/Tmp/spark-distro-tmp$ ls
spark-2.3.0-SNAPSHOT-bin-20171218-772e4648d9
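(As an aside, assuming GNU tar, the single-top-level-entry precondition could be sidestepped by stripping that dir during extraction; a sketch, with the tarball path as a placeholder:)

# Extract directly into spark-distro, dropping the tarball's top-level dir.
# Assumes GNU tar (--strip-components) and exactly one top-level directory.
rm -rf spark-distro && mkdir spark-distro
tar xzf spark.tgz --strip-components=1 -C spark-distro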


<profiles>
<profile>
<id>v2</id>
@kimoonkim (Member) commented:

Can you explain why we need this v2 profile with duplicate plugin config?

@foxish (Member Author) commented:

Cloud environments will diverge from the minikube flow here and won't need the pre-integration-test phase to run at all. It's not strictly needed, and I'm happy to change it if we have a way to invoke the integration-test step without running the pre-integration-test step.

cc @echarles
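(For context, a profile like this would be selected by id on the command line; a hypothetical invocation:)

# Activate the v2 profile so the build uses its plugin bindings instead
# of the default minikube-oriented pre-integration-test flow.
mvn clean integration-test -Pv2 \
  -Dspark-distro-tgz=/path/to/spark.tar.gz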

@foxish (Member Author) commented Dec 20, 2017

In your example command line, I see you used custom docker images like gcr.io/my-image/driver:latest. Are you assuming those docker images are pre-built? Is that a requirement to use the cloud?
I understand we won't be able to use dockerd inside mini-kube. But can we use dockerd that comes with the cloud? So we can still build docker images off a distro tarball we got?

I think we can use dockerd in our test infrastructure. My intent there was to have the flexibility to use a different step for building those docker images. If we're specifying the repo and the docker image anyway, it could, but doesn't necessarily have to, be built by Maven, correct?

@kimoonkim (Member) commented:

I think we can use dockerd in our test infrastructure. My intent there was to have the flexibility to use a different step for building those docker images. If we're specifying the repo and the docker image anyway, it could but doesn't necessarily have to be built by maven correct?

If we want to skip image building, we can set -Dspark.docker.test.skipBuildImages=true, which is already supported. That and -Dspark.docker.test.*Image will allow people to use pre-built images.

But this should not be a requirement for using the cloud, IMO. It's still nice to be able to build Docker images as part of the integration test automation, especially in CI, to erase doubts like "Was I using the right images for this test?" when a test fails.
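(Putting the two together, a cloud run against pre-built images might look like the sketch below; the master URL and image names are placeholders:)

# Skip the minikube download and the in-test image build, and point the
# suite at pre-built driver/executor images.
mvn clean integration-test \
  -Ddownload.plugin.skip=true \
  -Dspark-distro-tgz=/path/to/spark.tar.gz \
  -DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://... \
    -Dspark.docker.test.skipBuildImages=true \
    -Dspark.docker.test.driverImage=spark-driver \
    -Dspark.docker.test.executorImage=spark-executor"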

@foxish (Member Author) commented Dec 20, 2017 via email

@kimoonkim (Member) commented:

Yes, we want to avoid the minikube download step for the cloud. The download plugin seems to support a skip option. Can you try using that instead of the profile?

mvn help:describe -Dplugin=com.googlecode.maven-download-plugin:download-maven-plugin -Ddetail | grep -A 5 skip | head -5
    skip (Default: false)
      User property: download.plugin.skip
      Whether to skip execution of Mojo

@foxish (Member Author) commented Dec 21, 2017

Works as expected. I'm going to try again without the changes to the pom, with skip enabled.

@foxish (Member Author) commented Dec 21, 2017

@kimoonkim, I'm getting the following when trying to unpack the tar.gz.

[INFO] --- exec-maven-plugin:1.4.0:exec (unpack-spark-distro) @ spark-kubernetes-integration-tests_2.11 ---
mv: target ‘../spark-distro’ is not a directory

I'm running:

mvn clean -Ddownload.plugin.skip=true integration-test  \
-Dspark-distro-tgz=/home/ramanathana/go-workspace/src/apache-spark-on-k8s/release/spark/dist/spark.tar.gz  \
-Dspark-dockerfiles-dir=/home/ramanathana/go-workspace/src/apache-spark-on-k8s/release/spark/dist/kubernetes/dockerfiles \
-DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://... -Dspark.docker.test.driverImage=spark-driver -Dspark.docker.test.executorImage=spark-executor"

@foxish merged commit f8a9dec into apache-spark-on-k8s:master on Dec 22, 2017
@foxish deleted the minor-fixes-and-new-phase branch on December 22, 2017 at 00:21
@foxish (Member Author) commented Dec 22, 2017

In #6 (comment), I had an error in the way I built the tar.gz distro. It works as expected now.

@echarles (Member) commented:

Side question: Is this repo aimed to replace the current integration-tests?

From what I understand, the answer is yes. I find the creation or download of the tgz distribution a step that slows developer productivity (think about making a code change, building the dist, building the docker image, and running the tests... fine if the result is green, but if a test fails, it becomes IMHO counter-productive).

If not, we could think about a way to have faster iteration with the integration tests in the spark repo, and to have the full download... in this repo with code reuse (think about a spark-integration-test.jar)?

