[SPARK-8126] [BUILD] Use custom temp directory during build. #6674
Even with all the efforts to clean up the temp directories created by unit tests, Spark leaves a lot of garbage in /tmp after a test run. This change overrides java.io.tmpdir to place those files under the build directory instead.

After an sbt full unit test run, I was left with > 400 MB of temp files. Since they're now under the build dir, it's much easier to clean them up.

Also make a slight change to a unit test to make it not pollute the source directory with test data.

Author: Marcelo Vanzin <[email protected]>

Closes apache#6653 from vanzin/unit-test-tmp and squashes the following commits:

31e2dd5 [Marcelo Vanzin] Fix tests that depend on each other.
aa92944 [Marcelo Vanzin] [minor] [build] Use custom temp directory during build.
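For readers unfamiliar with the mechanism: java.io.tmpdir is a standard JVM system property, and the usual way to override it for a forked test JVM is a -D flag on its command line. The sketch below only shows how such a flag could be derived from a build directory. The class name, the helper name, and the "tmp" subdirectory are assumptions for illustration, not the code from this patch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class TmpDirArg {
    // Builds the JVM flag that points java.io.tmpdir at <buildDir>/tmp.
    // The "tmp" subdirectory name is an assumption made for this sketch.
    static String tmpDirJvmArg(Path buildDir) {
        Path tmp = buildDir.resolve("tmp").toAbsolutePath().normalize();
        return "-Djava.io.tmpdir=" + tmp;
    }

    public static void main(String[] args) {
        // "target" stands in for the build directory here.
        System.out.println(tmpDirJvmArg(Paths.get("target")));
    }
}
```

The flag has to be passed when the test JVM starts. The JDK's temp-file helpers read the property once and cache it, so changing it later with System.setProperty is not a reliable substitute.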
|
This reverts the revert and fixes the issue the original change caused. |
|
Can you explain what the root problem was? Why did this depend on test ordering? |
|
The root problem was that the tmp directory was not created (I added code to both builds to explicitly do it). I guess depending on which tests run first the directory might be there (some tests use guava's |
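The failure mode described in this comment can be reproduced outside the build. The JVM does not create the directory named by java.io.tmpdir, so creating a temp file fails until something creates that directory. The sketch below passes the directory explicitly to Files.createTempFile instead of relying on the system property. All class and method names are hypothetical, and this is not the code added to the sbt and Maven builds.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class EnsureTmpDir {
    // Returns true if a temp file can be created (and removed) inside dir.
    static boolean canCreateTempFile(Path dir) {
        try {
            Path f = Files.createTempFile(dir, "spark-test", ".tmp");
            Files.delete(f);
            return true;
        } catch (IOException e) {
            // A missing parent directory surfaces here as an IOException.
            return false;
        }
    }

    // Creates dir and any missing parents; does nothing if it already exists.
    static Path ensureExists(Path dir) {
        try {
            return Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns {result before the directory exists, result after creating it}.
    static boolean[] demo() {
        try {
            Path base = Files.createTempDirectory("build-root");
            Path tmp = base.resolve("target").resolve("tmp");
            boolean before = canCreateTempFile(tmp);
            ensureExists(tmp);
            boolean after = canCreateTempFile(tmp);
            // Remove the scratch directories created by this demo.
            Files.delete(tmp);
            Files.delete(tmp.getParent());
            Files.delete(base);
            return new boolean[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("before mkdir: " + r[0] + ", after mkdir: " + r[1]);
    }
}
```

This also suggests why test ordering could matter: any earlier step that happens to create the directory would hide the problem for the tests that follow.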
|
Test build #34297 timed out for PR 6674 at commit |
|
Jenkins, retest this please. |
cc @marmbrus
The problem with doing it this way is that buildLocation is just using the current directory as the build root, which is not always a valid assumption. We should instead remove that val and use projectRoot.value anyplace that we need the root spark directory.
That should also affect the val spark =... line above, right? So my particular change shouldn't have made anything worse than before.
Yeah, sorry I should have been more clear. This was a general comment about how we are doing things in the SBT build and not something that should have blocked merging this PR.
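The concern about buildLocation in this thread is a general one about relative paths: a path resolved against the current working directory points somewhere different whenever the build is started from another directory. The sketch below is a plain-Java analogy, not sbt code. The class and method names, and the "spark-checkout" and "target/tmp" paths, are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class RootResolve {
    // Resolves against the JVM's working directory, wherever the build was launched.
    static Path fromWorkingDir(String relative) {
        return Paths.get(relative).toAbsolutePath().normalize();
    }

    // Resolves against an explicitly supplied project root.
    static Path fromProjectRoot(Path projectRoot, String relative) {
        return projectRoot.toAbsolutePath().normalize().resolve(relative).normalize();
    }

    public static void main(String[] args) {
        Path root = Paths.get("spark-checkout"); // hypothetical checkout location
        System.out.println(fromWorkingDir("target/tmp"));
        System.out.println(fromProjectRoot(root, "target/tmp"));
    }
}
```

The second form stays under the project root no matter where the process was started, which is the property the review comment asks for.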
|
Test build #34317 has finished for PR 6674 at commit
|
|
Looks good to me. Do we need any additional testing before trying to merge it again? I could merge to master and then later backport if it seems OK for a few days. |
|
I've triggered a few more test runs for this patch. Hopefully this time it's not flaky. |
|
Test build #885 has finished for PR 6674 at commit
|
|
Test build #889 has finished for PR 6674 at commit
|
|
Test build #888 timed out for PR 6674 at commit |
|
Test build #887 timed out for PR 6674 at commit |
|
Test build #886 timed out for PR 6674 at commit |
|
Jenkins, retest this please |
|
Test build #34397 has finished for PR 6674 at commit
|
|
I'm going to merge this into master and see how it goes, then to 1.4 and see what happens, then 1.3. |
Even with all the efforts to clean up the temp directories created by unit tests, Spark leaves a lot of garbage in /tmp after a test run. This change overrides java.io.tmpdir to place those files under the build directory instead.

After an sbt full unit test run, I was left with > 400 MB of temp files. Since they're now under the build dir, it's much easier to clean them up.

Also make a slight change to a unit test to make it not pollute the source directory with test data.

Author: Marcelo Vanzin <[email protected]>

Closes #6674 from vanzin/SPARK-8126 and squashes the following commits:

0f8ad41 [Marcelo Vanzin] Make sure tmp dir exists when tests run.
643e916 [Marcelo Vanzin] [MINOR] [BUILD] Use custom temp directory during build.
|
|
|
All of the 1.4 builds have succeeded since this patch, some a few times. The exception is: https://amplab.cs.berkeley.edu/jenkins/job/Spark-1.4-Maven-with-YARN/ That job succeeded after the patch, then failed, and the failure in the Kafka suite looks unrelated since it doesn't involve a temp file. I'm declaring victory and moving to 1.3. |