
Conversation

@andrewor14
Contributor

Original poster is @zsxwing, who reported this bug in #516.

Much of SparkListenerSuite relies on LiveListenerBus's `waitUntilEmpty()` method. As the name suggests, this waits until the event queue is empty. However, the following race condition could happen:

(1) We dequeue an event
(2) The queue is empty, we return true (even though the event has not been processed)
(3) The test asserts something assuming that all listeners have finished executing (and fails)
(4) The listeners receive and process the event

This PR makes (1) and (4) atomic by synchronizing around them. To do that, however, we must avoid using `eventQueue.take`, which blocks and would deadlock if we synchronized around it. As a workaround, we use the non-blocking `eventQueue.poll` plus a semaphore to provide the same semantics.
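The poll-plus-semaphore pattern can be sketched as follows. This is a minimal, hypothetical sketch (class and method names are invented for illustration; the actual fix lives in LiveListenerBus): the semaphore's permit count tracks queued events, so the consumer blocks on `acquire` rather than on the queue, and the dequeue-plus-process steps can sit inside a `synchronized` block without ever blocking while holding the lock.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

// Hypothetical sketch of the fixed dequeue loop; names are illustrative.
public class ListenerBusSketch {
    private final ConcurrentLinkedQueue<String> eventQueue = new ConcurrentLinkedQueue<>();
    private final Semaphore eventLock = new Semaphore(0); // one permit per queued event
    private final StringBuilder processed = new StringBuilder();

    /** Producer side: enqueue, and signal the consumer only on success. */
    public boolean post(String event) {
        boolean eventAdded = eventQueue.offer(event);
        if (eventAdded) {
            eventLock.release();
        }
        return eventAdded;
    }

    /** Consumer side: block on the semaphore, not on the queue, so the
     *  dequeue + process steps can run inside a synchronized block. */
    public void processOneEvent() {
        eventLock.acquireUninterruptibly(); // wait for an event without holding the lock
        synchronized (this) {
            String event = eventQueue.poll(); // non-blocking, safe under the lock
            if (event != null) {
                processed.append(event); // stand-in for notifying all listeners
            }
        }
    }

    /** Under the same monitor, "empty" now implies "fully processed". */
    public synchronized boolean queueIsEmpty() {
        return eventQueue.isEmpty();
    }

    public String processedEvents() {
        return processed.toString();
    }
}
```

Because `queueIsEmpty()` and the dequeue-and-process block share the same monitor, a `waitUntilEmpty()`-style check can no longer observe the window between steps (1) and (4).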

This has been a possible race condition for a long time, but for some reason we've never run into it.

This guards against the race condition in which we (1) dequeue an event,
and (2) check for queue emptiness before (3) actually processing the
event in all attached listeners.

The solution is to make steps (1) and (3) atomic relative to (2).
@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14462/

Contributor


Maybe we should only send `release` if `eventAdded` is true. It's a minor detail, but it would also mean that `event` should never be null, which should (slightly) help the bus catch up with the workload.
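The suggested guard can be sketched like this (the helper below is hypothetical, assuming a bounded queue whose `offer` returns false when full; the point is that a permit is released only when an event was actually enqueued):

```java
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Semaphore;

public class OfferGuard {
    // Hypothetical helper: release a permit only if the event was enqueued,
    // so the semaphore count never exceeds the number of queued events and
    // the consumer's poll() after acquire() never sees null.
    public static boolean post(Queue<String> queue, Semaphore sem, String event) {
        boolean eventAdded = queue.offer(event);
        if (eventAdded) {
            sem.release();
        }
        return eventAdded;
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayBlockingQueue<>(1); // capacity 1 forces a failed offer
        Semaphore sem = new Semaphore(0);
        post(queue, sem, "a"); // enqueued, permit released
        post(queue, sem, "b"); // queue full, no permit released
        System.out.println(sem.availablePermits()); // prints 1, not 2
    }
}
```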

Contributor Author


Yes, I didn't realize `offer` also returns a boolean.

@aarondav
Contributor

LGTM, the semantics and performance of this seem quite reasonable. The synchronized block should be virtually cost-free in normal operation, and I trust java.util.concurrent's implementation of Semaphore.

@zsxwing
Member

zsxwing commented Apr 25, 2014

Looks great.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14470/

@pwendell
Contributor

Thanks - merged.

@asfgit asfgit closed this in ee6f7e2 Apr 25, 2014
asfgit pushed a commit that referenced this pull request Apr 25, 2014

Author: Andrew Or <[email protected]>

Closes #544 from andrewor14/stage-info-test-fix and squashes the following commits:

3cbe40c [Andrew Or] Merge github.com:apache/spark into stage-info-test-fix
56dbbcb [Andrew Or] Check if event is actually added before releasing semaphore
eb486ae [Andrew Or] Synchronize accesses to the LiveListenerBus' event queue
(cherry picked from commit ee6f7e2)

Signed-off-by: Patrick Wendell <[email protected]>
@andrewor14 andrewor14 deleted the stage-info-test-fix branch April 29, 2014 21:40
pwendell pushed a commit to pwendell/spark that referenced this pull request May 12, 2014
…loses apache#544.

Fixed warnings in test compilation.

This commit fixes two problems: a redundant import, and a
deprecated function.

Author: Kay Ousterhout <[email protected]>

== Merge branch commits ==

commit da9d2e13ee4102bc58888df0559c65cb26232a82
Author: Kay Ousterhout <[email protected]>
Date:   Wed Feb 5 11:41:51 2014 -0800

    Fixed warnings in test compilation.

    This commit fixes two problems: a redundant import, and a
    deprecated function.
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
asfgit pushed a commit that referenced this pull request Sep 5, 2014
yifeih added a commit to yifeih/spark that referenced this pull request May 10, 2019
Fix the stubbing of the reader benchmark tests
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
* Change log dir

Change log dir

* Update run.yaml

* Update run.yaml

* Update run.yaml
arjunshroff pushed a commit to arjunshroff/spark that referenced this pull request Nov 24, 2020
LuciferYang added a commit that referenced this pull request Aug 21, 2023
…te` for Java 21

### What changes were proposed in this pull request?

SPARK-44507 (#42130) updated `try_arithmetic.sql.out` and `numeric.sql.out`, and SPARK-44868 (#42534) updated `datetime-formatting.sql.out`, but those PRs didn't account for test health on Java 21. This PR therefore regenerates the golden files `try_arithmetic.sql.out.java21`, `numeric.sql.out.java21`, and `datetime-formatting.sql.out.java21` of `SQLQueryTestSuite` so that `SQLQueryTestSuite` can be tested with Java 21.

### Why are the changes needed?
Restores the ability to run `SQLQueryTestSuite` with Java 21.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions
- Manually checked:

```
java -version
openjdk version "21-ea" 2023-09-19
OpenJDK Runtime Environment Zulu21+69-CA (build 21-ea+28)
OpenJDK 64-Bit Server VM Zulu21+69-CA (build 21-ea+28, mixed mode, sharing)
```

```
SPARK_GENERATE_GOLDEN_FILES=0 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
```

**Before**

```
...
[info] - datetime-formatting.sql *** FAILED *** (316 milliseconds)
[info]   datetime-formatting.sql
[info]   Array("-- Automatically generated by SQLQueryTestSuite
[info]   ", "create temporary view v as select col from values
[info]    (timestamp '1582-06-01 11:33:33.123UTC+080000'),
[info]    (timestamp '1970-01-01 00:00:00.000Europe/Paris'),
[info]    (timestamp '1970-12-31 23:59:59.999Asia/Srednekolymsk'),
[info]    (timestamp '1996-04-01 00:33:33.123Australia/Darwin'),
[info]    (timestamp '2018-11-17 13:33:33.123Z'),
[info]    (timestamp '2020-01-01 01:33:33.123Asia/Shanghai'),
[info]    (timestamp '2100-01-01 01:33:33.123America/Los_Angeles') t(col)
[info]   ", "struct<>
[info]   ", "
[info]
[info]
[info]   ", "select col, date_format(col, 'G GG GGG GGGG') from v
[info]   ", "struct<col:timestamp,date_format(col, G GG GGG GGGG):string>
[info]   ", "1582-05-31 19:40:35.123	AD AD AD Anno Domini
[info]   1969-12-31 15:00:00	AD AD AD Anno Domini
[info]   1970-12-31 04:59:59.999	AD AD AD Anno Domini
[info]   1996-03-31 07:03:33.123	AD AD AD Anno Domini
[info]   2018-11-17 05:33:33.123	AD AD AD Anno Domini
[info]   2019-12-31 09:33:33.123	AD AD AD Anno Domini
[info]   2100-01-01 01:33:33.123	AD AD AD Anno Domini
[info]
[info]
[info]   ", "select col, date_format(col, 'y yy yyy yyyy yyyyy yyyyyy') from v
[info]   ", "struct<col:timestamp,date_format(col, y yy yyy yyyy yyyyy yyyyyy):string>
[info]   ", "1582-05-31 19:40:35.123	1582 82 1582 1582 01582 001582
[info]   1969-12-31 15:00:00	1969 69 1969 1969 01969 001969
[info]   1970-12-31 04:59:59.999	1970 70 1970 1970 01970 001970
[info]   1996-03-31 07:03:33.123	1996 96 1996 1996 01996 001996
[info]   2018-11-17 05:33:33.123	2018 18 2018 2018 02018 002018
[info]   2019-12-31 09:33:33.123	2019 19 2019 2019 02019 002019
[info]   2100-01-01 01:33:33.123	2100 00 2100 2100 02100 002100
[info]
...
[info] - postgreSQL/numeric.sql *** FAILED *** (35 seconds, 848 milliseconds)
[info]   postgreSQL/numeric.sql
[info]   Expected "...rg.apache.spark.sql.[]AnalysisException
[info]   {
[info]   ...", but got "...rg.apache.spark.sql.[catalyst.Extended]AnalysisException
[info]   {
[info]   ..." Result did not match for query #544
[info]   SELECT '' AS to_number_2,  to_number('-34,338,492.654,878', '99G999G999D999G999') (SQLQueryTestSuite.scala:876)
[info]   org.scalatest.exceptions.TestFailedException:
...
[info] - try_arithmetic.sql *** FAILED *** (314 milliseconds)
[info]   try_arithmetic.sql
[info]   Expected "...rg.apache.spark.sql.[]AnalysisException
[info]   {
[info]   ...", but got "...rg.apache.spark.sql.[catalyst.Extended]AnalysisException
[info]   {
[info]   ..." Result did not match for query #20
[info]   SELECT try_add(interval 2 year, interval 2 second) (SQLQueryTestSuite.scala:876)
[info]   org.scalatest.exceptions.TestFailedException:
```

**After**
```
[info] Run completed in 9 minutes, 10 seconds.
[info] Total number of tests run: 572
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 572, failed 0, canceled 0, ignored 59, pending 0
[info] All tests passed.
```

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #42580 from LuciferYang/SPARK-44888.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
valentinp17 pushed a commit to valentinp17/spark that referenced this pull request Aug 24, 2023
ragnarok56 pushed a commit to ragnarok56/spark that referenced this pull request Mar 2, 2024
