-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-23438][DSTREAMS] Fix DStreams data loss with WAL when driver crashes #20620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@jose-torres @tdas @zsxwing could you take a look at this please? |
|
ok to test |
|
Test build #87489 has finished for PR 20620 at commit
|
|
Test build #87494 has finished for PR 20620 at commit
|
| s"${allocatedBlocks.streamIdToAllocatedBlocks}") | ||
| streamIdToUnallocatedBlockQueues.values.foreach { _.clear() } | ||
| allocatedBlocks.streamIdToAllocatedBlocks.foreach { | ||
| case (streamId, allocatedBlocks) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can we use another name (maybe allocatedBlocksInStream?) other than allocatedBlocks to avoid confusion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, fixed.
|
Test build #87519 has finished for PR 20620 at commit
|
|
Seems like unrelated issue. |
|
retest this please. |
|
Test build #87522 has finished for PR 20620 at commit
|
|
LGTM. |
|
Merging to master, will try back to 2.0. |
|
(Argh, the wifi here is horrible, I'll need to manually merge things, so hang on a sec...) |
…rashes There is a race condition introduced in SPARK-11141 which could cause data loss. The problem is that ReceivedBlockTracker.insertAllocatedBatch function assumes that all blocks from streamIdToUnallocatedBlockQueues allocated to the batch and clears the queue. In this PR only the allocated blocks will be removed from the queue which will prevent data loss. Additional unit test + manually. Author: Gabor Somogyi <[email protected]> Closes #20620 from gaborgsomogyi/SPARK-23438. (cherry picked from commit b308182) Signed-off-by: Marcelo Vanzin <[email protected]>
…rashes There is a race condition introduced in SPARK-11141 which could cause data loss. The problem is that ReceivedBlockTracker.insertAllocatedBatch function assumes that all blocks from streamIdToUnallocatedBlockQueues allocated to the batch and clears the queue. In this PR only the allocated blocks will be removed from the queue which will prevent data loss. Additional unit test + manually. Author: Gabor Somogyi <[email protected]> Closes #20620 from gaborgsomogyi/SPARK-23438. (cherry picked from commit b308182) Signed-off-by: Marcelo Vanzin <[email protected]>
…rashes There is a race condition introduced in SPARK-11141 which could cause data loss. The problem is that ReceivedBlockTracker.insertAllocatedBatch function assumes that all blocks from streamIdToUnallocatedBlockQueues allocated to the batch and clears the queue. In this PR only the allocated blocks will be removed from the queue which will prevent data loss. Additional unit test + manually. Author: Gabor Somogyi <[email protected]> Closes #20620 from gaborgsomogyi/SPARK-23438. (cherry picked from commit b308182) Signed-off-by: Marcelo Vanzin <[email protected]>
…rashes There is a race condition introduced in SPARK-11141 which could cause data loss. The problem is that ReceivedBlockTracker.insertAllocatedBatch function assumes that all blocks from streamIdToUnallocatedBlockQueues allocated to the batch and clears the queue. In this PR only the allocated blocks will be removed from the queue which will prevent data loss. Additional unit test + manually. Author: Gabor Somogyi <[email protected]> Closes #20620 from gaborgsomogyi/SPARK-23438. (cherry picked from commit b308182) Signed-off-by: Marcelo Vanzin <[email protected]>
…rashes There is a race condition introduced in SPARK-11141 which could cause data loss. The problem is that ReceivedBlockTracker.insertAllocatedBatch function assumes that all blocks from streamIdToUnallocatedBlockQueues allocated to the batch and clears the queue. In this PR only the allocated blocks will be removed from the queue which will prevent data loss. Additional unit test + manually. Author: Gabor Somogyi <[email protected]> Closes apache#20620 from gaborgsomogyi/SPARK-23438. (cherry picked from commit b308182) Signed-off-by: Marcelo Vanzin <[email protected]>
…rashes There is a race condition introduced in SPARK-11141 which could cause data loss. The problem is that ReceivedBlockTracker.insertAllocatedBatch function assumes that all blocks from streamIdToUnallocatedBlockQueues allocated to the batch and clears the queue. In this PR only the allocated blocks will be removed from the queue which will prevent data loss. Additional unit test + manually. Author: Gabor Somogyi <[email protected]> Closes apache#20620 from gaborgsomogyi/SPARK-23438. (cherry picked from commit b308182) Signed-off-by: Marcelo Vanzin <[email protected]>
What changes were proposed in this pull request?
There is a race condition introduced in SPARK-11141 which could cause data loss.
The problem is that ReceivedBlockTracker.insertAllocatedBatch function assumes that all blocks from streamIdToUnallocatedBlockQueues allocated to the batch and clears the queue.
In this PR only the allocated blocks will be removed from the queue which will prevent data loss.
How was this patch tested?
Additional unit test + manually.