@WangTaoTheTonic
Contributor

@SparkQA

SparkQA commented Sep 29, 2014

QA tests have started for PR 2567 at commit 9cc3f7a.

  • This patch merges cleanly.

@marmbrus
Contributor

Thanks! I've merged this to master.

guavuslabs-builder pushed a commit to ThalesGroup/spark that referenced this pull request Sep 29, 2014
https://issues.apache.org/jira/browse/SPARK-3715

Author: WangTaoTheTonic <[email protected]>

Closes apache#2567 from WangTaoTheTonic/minortypo and squashes the following commits:

9cc3f7a [WangTaoTheTonic] minor typo

(cherry picked from commit 1f13a40)
Signed-off-by: Michael Armbrust <[email protected]>
@asfgit asfgit closed this in 1f13a40 Sep 29, 2014
@SparkQA

SparkQA commented Sep 29, 2014

QA tests have finished for PR 2567 at commit 9cc3f7a.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20954/

tdas added a commit to tdas/spark that referenced this pull request Jun 4, 2018
…batch

**This PR is not for merging, only for looking at the change. Consider only the changes to the file MicroBatchExecution.scala.**

The error occurs when we are recovering from a failure in a no-data batch (say X) that has been planned (i.e. written to the offset log) but not executed (i.e. not written to the commit log). Upon recovery, the following sequence of events occurs.

1. `MicroBatchExecution.populateStartOffsets` sets `currentBatchId` to X. Since there was no data in the batch, `availableOffsets` is the same as `committedOffsets`, so `isNewDataAvailable` is `false`.
2. When `MicroBatchExecution.constructNextBatch` is called, ideally it should immediately return true because the next batch has already been constructed. However, the check of whether the batch has been constructed was `if (isNewDataAvailable) return true`. Since the planned batch is a no-data batch, it escaped this check and proceeded to plan the same batch X *once again*.

The correct solution is to check the offset log to see whether `currentBatchId` is the latest planned batch or not. This is the fix below.
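
For readers who want to see the failure mode in isolation, here is a toy Python model of the recovery scenario described above. It is not the actual patch and does not use Spark's API: the `ToyStream` class, its fields, and the method names `construct_next_batch_buggy` / `construct_next_batch_fixed` are all hypothetical stand-ins, and offsets are reduced to a single integer. It only illustrates why checking `isNewDataAvailable` misses an already-planned no-data batch, while checking the offset log does not.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStream:
    """Hypothetical stand-in for the state discussed above; not Spark's classes."""
    offset_log: dict = field(default_factory=dict)    # batch id -> planned offsets
    commit_log: set = field(default_factory=set)      # batch ids fully executed
    current_batch_id: int = 0
    available_offsets: int = 0
    committed_offsets: int = 0
    plans_written: list = field(default_factory=list)  # batches planned after recovery

    def is_new_data_available(self) -> bool:
        return self.available_offsets != self.committed_offsets

    def _plan_current_batch(self) -> None:
        # Planning a batch means writing it to the offset log.
        self.offset_log[self.current_batch_id] = self.available_offsets
        self.plans_written.append(self.current_batch_id)

    def construct_next_batch_buggy(self) -> bool:
        # Old-style check: "already constructed" is inferred from new data.
        if self.is_new_data_available():
            return True
        self._plan_current_batch()  # a planned no-data batch falls through here
        return True

    def construct_next_batch_fixed(self) -> bool:
        # Assumed shape of the fix: consult the offset log directly.
        latest_planned = max(self.offset_log) if self.offset_log else None
        if latest_planned == self.current_batch_id:
            return True  # batch X was already planned; do not plan it again
        self._plan_current_batch()
        return True


def recovered_state() -> ToyStream:
    """Batch X=3 is in the offset log but not in the commit log, with no new data."""
    return ToyStream(
        offset_log={2: 10, 3: 10},
        commit_log={2},
        current_batch_id=3,
        available_offsets=10,
        committed_offsets=10,
    )


buggy = recovered_state()
buggy.construct_next_batch_buggy()
print("buggy re-planned:", buggy.plans_written)

fixed = recovered_state()
fixed.construct_next_batch_fixed()
print("fixed re-planned:", fixed.plans_written)
```

In this model the buggy variant plans batch 3 a second time, whereas the variant that consults the offset log leaves it alone. The real change in `MicroBatchExecution.scala` may differ in structure; this sketch only mirrors the reasoning in the description.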

TODO

Author: Tathagata Das <[email protected]>

Closes apache#2567 from tdas/SC-11085.