@andrewor14
Contributor

These tests were ignored because they were incorrectly written: they didn't actually trigger stage retries, which is what they are meant to test. They are now rewritten to induce stage retries through fetch failures.

Note: there were 2 tests before and now there's only 1. What happened? It turns out that the case where we resubmit only a subset of the original missing partitions is very difficult to simulate in tests without potentially introducing flakiness. This is because the DAGScheduler removes all map outputs associated with a given executor when a fetch failure occurs, so we would need multiple executors to trigger this case, and sometimes the scheduler still removes map outputs from all executors.

Andrew Or added 4 commits January 27, 2016 16:39
This commit actually removes one of the tests, which tests stage
retries when only a subset of the original partitions is
submitted. This scenario is difficult to simulate in tests
without introducing flakiness.
There's a chance that we don't actually wait until we finish
calling the callback since currently we don't wait for it to
happen. This commit adds a way to do that.
@andrewor14 changed the title from "[SPARK-13053] Unignore tests in InternalAccumulatorSuite" to "[SPARK-13053] [TEST] Unignore tests in InternalAccumulatorSuite" Jan 28, 2016
@andrewor14
Contributor Author

@JoshRosen

@SparkQA

SparkQA commented Jan 28, 2016

Test build #50286 has finished for PR 10969 at commit b0a1f51.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Contributor Author

retest this please

Events are posted asynchronously so we may not have called
onStageCompleted or onJobEnd etc. before asserting things.
We should wait until all events have been processed before
proceeding.
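
To illustrate the race described in the commit message above, here is a minimal sketch of the general pattern, written in Python as a stand-in. It is not Spark's listener bus and not the code from this patch: `AsyncListenerBus`, `post`, and `wait_until_empty` are hypothetical names chosen for the example. Events are delivered on a background thread, so a test must block until the queue has drained before asserting on anything a listener recorded.

```python
import queue
import threading


class AsyncListenerBus:
    """Toy event bus (hypothetical): post() returns immediately and a
    background worker thread delivers each event to the listeners."""

    def __init__(self):
        self._events = queue.Queue()
        self._listeners = []
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def add_listener(self, callback):
        self._listeners.append(callback)

    def post(self, event):
        # Enqueue only; the callbacks run later on the worker thread.
        self._events.put(event)

    def _run(self):
        while True:
            event = self._events.get()
            try:
                for callback in self._listeners:
                    callback(event)
            finally:
                # Mark the event as handled only after every callback ran.
                self._events.task_done()

    def wait_until_empty(self):
        # Blocks until every posted event has been fully processed.
        self._events.join()


bus = AsyncListenerBus()
seen = []
bus.add_listener(seen.append)

for name in ["onStageCompleted", "onJobEnd"]:
    bus.post(name)

# Without this call, the assertion below could run before the worker
# thread has delivered the events.
bus.wait_until_empty()
assert seen == ["onStageCompleted", "onJobEnd"]
```

The design point is that "processed" is signalled after the callbacks return rather than when the event is dequeued, so waiting on the queue also waits for the callbacks themselves.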
@SparkQA

SparkQA commented Jan 29, 2016

Test build #50315 has finished for PR 10969 at commit 6322d3a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 29, 2016

Test build #50317 has finished for PR 10969 at commit 1372766.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 29, 2016

Test build #50328 has finished for PR 10969 at commit 1372766.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Contributor Author

retest this please

@SparkQA

SparkQA commented Jan 29, 2016

Test build #50392 has finished for PR 10969 at commit 1372766.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Contributor Author

Merging into master.

@asfgit asfgit closed this in 15205da Feb 4, 2016
@andrewor14 andrewor14 deleted the unignore-accum-test branch February 4, 2016 18:46