Skip to content

Conversation

@alismess-db
Copy link
Contributor

What changes were proposed in this pull request?

The update methods in HiveThriftServer2Listener now check if the parameter operation/session ID actually exist in the sessionList and executionList respectively. This prevents NullPointerExceptions if the operation or session ID is unknown. Instead, a warning is written to the log.

Also, in HiveSessionImpl.close(), we catch any exception thrown by operationManager.closeOperation. If for any reason this throws an exception, other operations are not prevented from being closed.

Why are the changes needed?

The listener's update methods would throw an exception if the operation or session ID is unknown. In Spark 2, where the listener is called directly, this hampers with the caller's control flow. In Spark 3, the exception is caught by the ListenerBus but results in an uninformative NullPointerException.

In HiveSessionImpl.close(), if an exception is thrown when closing an operation, all following operations are not closed.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests

@juliuszsompolski
Copy link
Contributor

juliuszsompolski commented Apr 8, 2020

cc @wangyum

In Spark 2, where the listener is called directly, this hampers with the caller's control flow.

This exception could happen if an operation in ERROR or CANCELED state that was never closed by the client was already evicted from the listener state (as not-active) when the session was later closed.

@juliuszsompolski
Copy link
Contributor

It is more serious in Spark 2, where the exception from the listener would bubble up all the way to session, but IMO we could also catch it in Spark 3 listener, as this PR proposes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/apache/spark/pull/28155/checks?check_run_id=571648817

[ERROR] [Error] /home/runner/work/spark/spark/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/service/cli/session/HiveSessionImplSuite.scala:22: object rpc is not a member of package org.apache.hive.service
[ERROR] [Error] /home/runner/work/spark/spark/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/service/cli/session/HiveSessionImplSuite.scala:37: not found: value TProtocolVersion
[ERROR] two errors found
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:4.3.0:testCompile (scala-test-compile-first) on project spark-hive-thriftserver_2.12: Execution scala-test-compile-first of goal net.alchim31.maven:scala-maven-plugin:4.3.0:testCompile failed.: CompileFailed -> [Help 1]

You can use ThriftserverShimUtils.testedProtocolVersions(0) to make it work with both hive-1.2 and hive-2.3 profile.
You must import TProtocolVersion from

@alismess-db alismess-db force-pushed the hive-thriftserver-listener-update-safer branch from e65424e to 3a842b6 Compare May 9, 2020 14:25
@juliuszsompolski
Copy link
Contributor

@wangyum @gatorsmile could you take a look?
It improves the robustness of Thriftserver session / Thriftserver listener against uncaught exceptions.
In Spark 2.4, when the listener was running in the same thread as execution, this could cause some sessions to not be properly cleaned up if an exception was thrown from one of the operations in the session during session close.

Copy link
Contributor

@juliuszsompolski juliuszsompolski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gatorsmile
Copy link
Member

ok to test

@SparkQA
Copy link

SparkQA commented May 11, 2020

Test build #122514 has finished for PR 28155 at commit 3a842b6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class HiveSessionImplSuite extends SparkFunSuite

@gatorsmile
Copy link
Member

LGTM

Thanks! Merged to master/3.0

gatorsmile pushed a commit that referenced this pull request May 12, 2020
…2Listener

### What changes were proposed in this pull request?

The update methods in HiveThriftServer2Listener now check if the parameter operation/session ID actually exist in the `sessionList` and `executionList` respectively. This prevents NullPointerExceptions if the operation or session ID is unknown. Instead, a warning is written to the log.

Also, in HiveSessionImpl.close(), we catch any exception thrown by `operationManager.closeOperation`. If for any reason this throws an exception, other operations are not prevented from being closed.

### Why are the changes needed?

The listener's update methods would throw an exception if the operation or session ID is unknown. In Spark 2, where the listener is called directly, this hampers with the caller's control flow. In Spark 3, the exception is caught by the ListenerBus but results in an uninformative NullPointerException.

In HiveSessionImpl.close(), if an exception is thrown when closing an operation, all following operations are not closed.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Unit tests

Closes #28155 from alismess-db/hive-thriftserver-listener-update-safer.

Authored-by: Ali Smesseim <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
(cherry picked from commit 6994c64)
Signed-off-by: gatorsmile <[email protected]>
@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented May 14, 2020

@dongjoon-hyun
Copy link
Member

Ur, I found that this breaks hive-2.3, too. In short, this seems to break all Maven tests in branch-3.0.

Screen Shot 2020-05-14 at 11 49 34 AM

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented May 14, 2020

Ur, it turns out that master branch Maven jobs are broken, too.
Screen Shot 2020-05-14 at 11 59 43 AM

I'll revert this patch from master/3.0. Please make another PR and test with Maven since Maven is our official build system.

alismess-db added a commit to alismess-db/spark that referenced this pull request May 15, 2020
…2Listener

### What changes were proposed in this pull request?

The update methods in HiveThriftServer2Listener now check if the parameter operation/session ID actually exist in the `sessionList` and `executionList` respectively. This prevents NullPointerExceptions if the operation or session ID is unknown. Instead, a warning is written to the log.

Also, in HiveSessionImpl.close(), we catch any exception thrown by `operationManager.closeOperation`. If for any reason this throws an exception, other operations are not prevented from being closed.

### Why are the changes needed?

The listener's update methods would throw an exception if the operation or session ID is unknown. In Spark 2, where the listener is called directly, this hampers with the caller's control flow. In Spark 3, the exception is caught by the ListenerBus but results in an uninformative NullPointerException.

In HiveSessionImpl.close(), if an exception is thrown when closing an operation, all following operations are not closed.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Unit tests

Closes apache#28155 from alismess-db/hive-thriftserver-listener-update-safer.

Authored-by: Ali Smesseim <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
(cherry picked from commit 6994c64)
Signed-off-by: Ali Smesseim <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request May 20, 2020
…erver2Listener

### What changes were proposed in this pull request?

This is a recreation of #28155, which was reverted due to causing test failures.

The update methods in HiveThriftServer2Listener now check if the parameter operation/session ID actually exist in the `sessionList` and `executionList` respectively. This prevents NullPointerExceptions if the operation or session ID is unknown. Instead, a warning is written to the log.

To improve robustness, we also make the following changes in HiveSessionImpl.close():

- Catch any exception thrown by `operationManager.closeOperation`. If for any reason this throws an exception, other operations are not prevented from being closed.
- Handle not being able to access the scratch directory. When closing, all `.pipeout` files are removed from the scratch directory, which would have resulted in an NPE if the directory does not exist.

### Why are the changes needed?

The listener's update methods would throw an exception if the operation or session ID is unknown. In Spark 2, where the listener is called directly, this changes the caller's control flow. In Spark 3, the exception is caught by the ListenerBus but results in an uninformative NullPointerException.

In HiveSessionImpl.close(), if an exception is thrown when closing an operation, all following operations are not closed.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Unit tests

Closes #28544 from alismess-db/hive-thriftserver-listener-update-safer-2.

Authored-by: Ali Smesseim <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request May 20, 2020
…erver2Listener

### What changes were proposed in this pull request?

This is a recreation of #28155, which was reverted due to causing test failures.

The update methods in HiveThriftServer2Listener now check if the parameter operation/session ID actually exist in the `sessionList` and `executionList` respectively. This prevents NullPointerExceptions if the operation or session ID is unknown. Instead, a warning is written to the log.

To improve robustness, we also make the following changes in HiveSessionImpl.close():

- Catch any exception thrown by `operationManager.closeOperation`. If for any reason this throws an exception, other operations are not prevented from being closed.
- Handle not being able to access the scratch directory. When closing, all `.pipeout` files are removed from the scratch directory, which would have resulted in an NPE if the directory does not exist.

### Why are the changes needed?

The listener's update methods would throw an exception if the operation or session ID is unknown. In Spark 2, where the listener is called directly, this changes the caller's control flow. In Spark 3, the exception is caught by the ListenerBus but results in an uninformative NullPointerException.

In HiveSessionImpl.close(), if an exception is thrown when closing an operation, all following operations are not closed.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Unit tests

Closes #28544 from alismess-db/hive-thriftserver-listener-update-safer-2.

Authored-by: Ali Smesseim <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit d40ecfa)
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants