[SPARK-8404][Streaming][Tests] Use thread-safe collections to make the tests more reliable #6852
cc @tdas
When would this be run concurrently though?
In all cases
The code in foreachRDD runs in
Yes I understand that argument, though have you observed a failure as a result?
For these tests, there is no memory barrier because the checking code is called after
Sorry. There are not enough entries in these tests to trigger this issue. The issue happens in
Test build #35028 has finished for PR 6852 at commit
retest this please
Are you worried about writes being visible or a correctness issue? Yes I see the potential correctness issue -- it's not I think you'd find there are definitely memory barriers in, for example, merely submitting a task. So I don't think there's a possibility that the writes never turn up in the reading thread. However that point is moot since I think the change is needed for the reason above anyway.
Writes being visible is what I'm concerned about.
OK, that's fine, but the problems you mentioned above aren't a symptom of that.
Test build #35032 timed out for PR 6852 at commit
Jenkins, retest this please
Test build #35050 has finished for PR 6852 at commit
@srowen We have been seeing some flakiness in this test in our daily master builds in Jenkins. So these fixes LGTM.
Merging this in Spark 1.4 and master
…the tests more reliable

KafkaStreamSuite, DirectKafkaStreamSuite, JavaKafkaStreamSuite and JavaDirectKafkaStreamSuite use non-thread-safe collections to collect data in one thread and check it in another thread. It may fail the tests.

This PR changes them to thread-safe collections.

Note: I cannot reproduce the test failures in my environment. But at least, this PR should make the tests more reliable.

Author: zsxwing <[email protected]>

Closes #6852 from zsxwing/fix-KafkaStreamSuite and squashes the following commits:

d464211 [zsxwing] Use thread-safe collections to make the tests more reliable

(cherry picked from commit a06d9c8)
Signed-off-by: Tathagata Das <[email protected]>
@tdas +1 yep #6852 (comment) is the good catch here; was just questioning the other motivation
@srowen Sorry. The issue I mentioned in #6852 (comment) won't happen. I put
KafkaStreamSuite, DirectKafkaStreamSuite, JavaKafkaStreamSuite and JavaDirectKafkaStreamSuite use non-thread-safe collections to collect data in one thread and check it in another thread. This may cause the tests to fail.
This PR changes them to use thread-safe collections.
Note: I cannot reproduce the test failures in my environment, but this PR should at least make the tests more reliable.
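
For illustration only, here is a minimal Java sketch of the pattern described above: worker threads fill a shared map and the test thread checks it afterwards. It is not the code from this PR. The class name `ThreadSafeCollectSketch`, the `collect` helper, and the use of an `ExecutorService` in place of Spark's `foreachRDD` job threads are all assumptions made to keep the example self-contained. It wraps the map with `Collections.synchronizedMap`, one of the standard JDK ways to get a thread-safe collection.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a test suite; not part of Spark.
class ThreadSafeCollectSketch {

    // Writer threads (standing in for the threads that run foreachRDD bodies)
    // put entries into a shared map; the calling thread reads it afterwards.
    static Map<String, Long> collect(int writers, int keysPerWriter)
            throws InterruptedException {
        // A plain HashMap is not safe for concurrent put() calls.
        // The synchronized wrapper serializes every access on one lock.
        final Map<String, Long> result =
            Collections.synchronizedMap(new HashMap<String, Long>());

        ExecutorService pool = Executors.newFixedThreadPool(writers);
        for (int w = 0; w < writers; w++) {
            final int id = w;
            pool.submit(() -> {
                for (int k = 0; k < keysPerWriter; k++) {
                    // Each writer uses its own key prefix, so all keys are distinct.
                    result.put("w" + id + "-k" + k, 1L);
                }
            });
        }

        // Stop accepting tasks and wait for the writers before checking.
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("writers did not finish in time");
        }
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Long> result = collect(4, 1000);
        System.out.println(result.size()); // prints 4000
    }
}
```

With an unsynchronized `HashMap`, concurrent `put` calls can lose entries or corrupt the table during a resize, which is the correctness concern raised in the conversation. The synchronized wrapper also gives the reading thread a happens-before edge on each access, which covers the visibility concern.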