Skip to content

Conversation

@vanzin
Copy link
Contributor

@vanzin vanzin commented Jun 5, 2015

Even with all the efforts to cleanup the temp directories created by
unit tests, Spark leaves a lot of garbage in /tmp after a test run.
This change overrides java.io.tmpdir to place those files under the
build directory instead.

After an sbt full unit test run, I was left with > 400 MB of temp
files. Since they're now under the build dir, it's much easier to
clean them up.

Also make a slight change to a unit test to make it not pollute the
source directory with test data.

Marcelo Vanzin added 2 commits June 5, 2015 11:02
Even with all the efforts to cleanup the temp directories created by
unit tests, Spark leaves a lot of garbage in /tmp after a test run.
This change overrides java.io.tmpdir to place those files under the
build directory instead.

After an sbt full unit test run, I was left with > 400 MB of temp
files. Since they're now under the build dir, it's much easier to
clean them up.

Also make a slight change to a unit test to make it not pollute the
source directory with test data.

Author: Marcelo Vanzin <[email protected]>

Closes apache#6653 from vanzin/unit-test-tmp and squashes the following commits:

31e2dd5 [Marcelo Vanzin] Fix tests that depend on each other.
aa92944 [Marcelo Vanzin] [minor] [build] Use custom temp directory during build.
@vanzin
Copy link
Contributor Author

vanzin commented Jun 5, 2015

This reverts the revert and fixes the issue the original change caused.

@JoshRosen
Copy link
Contributor

Can you explain what the root problem was? Why did this depend on test ordering?

@vanzin
Copy link
Contributor Author

vanzin commented Jun 5, 2015

The root problem was that the tmp directory was not created (I added code to both builds to explicitly do it). I guess depending on which tests run first the directory might be there (some tests use guava's Files.createTempDirectory, maybe that one creates the dir if it isn't there).

@SparkQA
Copy link

SparkQA commented Jun 5, 2015

Test build #34297 timed out for PR 6674 at commit 0f8ad41 after a configured wait of 175m.

@vanzin
Copy link
Contributor Author

vanzin commented Jun 5, 2015

Jenkins, retest this please.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with doing it this way is that buildLocation is just using the current directory as the build root, which is not always a valid assumption. We should instead remove that val and use projectRoot.value anyplace that we need the root spark directory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should affect also the val spark =... line above, right? So my particular change shouldn't have made anything worse than before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sorry I should have been more clear. This was a general comment about how we are doing things in the SBT build and not something that should have blocked merging this PR.

@SparkQA
Copy link

SparkQA commented Jun 5, 2015

Test build #34317 has finished for PR 6674 at commit 0f8ad41.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Jun 6, 2015

Looks good to me. Do we need any additional testing before trying to merge it again? I could merge to master and then later backport if it seems OK for a few days.

@andrewor14
Copy link
Contributor

I've triggered a few more test runs for this patch. Hopefully this time it's not flaky.

@SparkQA
Copy link

SparkQA commented Jun 6, 2015

Test build #885 has finished for PR 6674 at commit 0f8ad41.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 7, 2015

Test build #889 has finished for PR 6674 at commit 0f8ad41.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 7, 2015

Test build #888 timed out for PR 6674 at commit 0f8ad41 after a configured wait of 175m.

@SparkQA
Copy link

SparkQA commented Jun 7, 2015

Test build #887 timed out for PR 6674 at commit 0f8ad41 after a configured wait of 175m.

@SparkQA
Copy link

SparkQA commented Jun 7, 2015

Test build #886 timed out for PR 6674 at commit 0f8ad41 after a configured wait of 175m.

@srowen
Copy link
Member

srowen commented Jun 7, 2015

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented Jun 7, 2015

Test build #34397 has finished for PR 6674 at commit 0f8ad41.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Jun 8, 2015

I'm going to merge this into master and see how it goes, then to 1.4 and see what happens, then 1.3.

@asfgit asfgit closed this in a1d9e5c Jun 8, 2015
@vanzin vanzin deleted the SPARK-8126 branch June 8, 2015 18:16
asfgit pushed a commit that referenced this pull request Jun 8, 2015
Even with all the efforts to cleanup the temp directories created by
unit tests, Spark leaves a lot of garbage in /tmp after a test run.
This change overrides java.io.tmpdir to place those files under the
build directory instead.

After an sbt full unit test run, I was left with > 400 MB of temp
files. Since they're now under the build dir, it's much easier to
clean them up.

Also make a slight change to a unit test to make it not pollute the
source directory with test data.

Author: Marcelo Vanzin <[email protected]>

Closes #6674 from vanzin/SPARK-8126 and squashes the following commits:

0f8ad41 [Marcelo Vanzin] Make sure tmp dir exists when tests run.
643e916 [Marcelo Vanzin] [MINOR] [BUILD] Use custom temp directory during build.
@srowen
Copy link
Member

srowen commented Jun 8, 2015

master looks OK and has all day. Moving on to 1.4.

@srowen
Copy link
Member

srowen commented Jun 9, 2015

All of the 1.4 builds have succeeded since this patch, some a few times. The exception is: https://amplab.cs.berkeley.edu/jenkins/job/Spark-1.4-Maven-with-YARN/ This succeeded after, then failed, and the failure in the Kafka suite looks unrelated since it doesn't involve a temp file. I'm declaring victory and moving to 1.3.

asfgit pushed a commit that referenced this pull request Jun 9, 2015
Even with all the efforts to cleanup the temp directories created by
unit tests, Spark leaves a lot of garbage in /tmp after a test run.
This change overrides java.io.tmpdir to place those files under the
build directory instead.

After an sbt full unit test run, I was left with > 400 MB of temp
files. Since they're now under the build dir, it's much easier to
clean them up.

Also make a slight change to a unit test to make it not pollute the
source directory with test data.

Author: Marcelo Vanzin <[email protected]>

Closes #6674 from vanzin/SPARK-8126 and squashes the following commits:

0f8ad41 [Marcelo Vanzin] Make sure tmp dir exists when tests run.
643e916 [Marcelo Vanzin] [MINOR] [BUILD] Use custom temp directory during build.
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
Even with all the efforts to cleanup the temp directories created by
unit tests, Spark leaves a lot of garbage in /tmp after a test run.
This change overrides java.io.tmpdir to place those files under the
build directory instead.

After an sbt full unit test run, I was left with > 400 MB of temp
files. Since they're now under the build dir, it's much easier to
clean them up.

Also make a slight change to a unit test to make it not pollute the
source directory with test data.

Author: Marcelo Vanzin <[email protected]>

Closes apache#6674 from vanzin/SPARK-8126 and squashes the following commits:

0f8ad41 [Marcelo Vanzin] Make sure tmp dir exists when tests run.
643e916 [Marcelo Vanzin] [MINOR] [BUILD] Use custom temp directory during build.
mtbrandy pushed a commit to ibmsoe/spark that referenced this pull request Jul 2, 2015
Even with all the efforts to cleanup the temp directories created by
unit tests, Spark leaves a lot of garbage in /tmp after a test run.
This change overrides java.io.tmpdir to place those files under the
build directory instead.

After an sbt full unit test run, I was left with > 400 MB of temp
files. Since they're now under the build dir, it's much easier to
clean them up.

Also make a slight change to a unit test to make it not pollute the
source directory with test data.

Author: Marcelo Vanzin <[email protected]>

Closes apache#6674 from vanzin/SPARK-8126 and squashes the following commits:

0f8ad41 [Marcelo Vanzin] Make sure tmp dir exists when tests run.
643e916 [Marcelo Vanzin] [MINOR] [BUILD] Use custom temp directory during build.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants