@MaxGekk MaxGekk commented Dec 10, 2020

What changes were proposed in this pull request?

Throw `PartitionsAlreadyExistException` from `createPartitions()` in the Hive external catalog when a partition already exists. Currently, `HiveExternalCatalog.createPartitions()` throws `AlreadyExistsException` wrapped in `AnalysisException`.

In this PR, I propose to catch `AlreadyExistsException` in `HiveClientImpl` and replace it with `PartitionsAlreadyExistException`.
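
For illustration only, the proposed translation can be sketched as a wrapper that catches the backend failure and rethrows it as the catalog-level exception. This is a minimal sketch under stated assumptions, not the code from the PR: `ExceptionTranslationSketch`, `BackendAlreadyExistsException`, `PartitionsAlreadyExist` and the shape of `createPartitions` are all hypothetical stand-ins, chosen so that the example runs without Spark or Hive on the classpath. The sketch matches on the exception type directly, which is fine here because nothing is loaded by an isolated classloader.

```scala
object ExceptionTranslationSketch {
  // Hypothetical stand-in for the "already exists" failure raised by the metastore client.
  final class BackendAlreadyExistsException(msg: String) extends Exception(msg)

  // Hypothetical stand-in for the catalog-level exception; keeps the original failure as its cause.
  final class PartitionsAlreadyExist(
      val table: String,
      val specs: Seq[Map[String, String]],
      cause: Throwable)
    extends Exception(s"Partitions already exist in table '$table': ${specs.mkString(", ")}", cause)

  /** Runs `backendCall`; a backend "already exists" failure is rethrown as the catalog-level exception. */
  def createPartitions(table: String, specs: Seq[Map[String, String]])(backendCall: => Unit): Unit = {
    try {
      backendCall
    } catch {
      case e: BackendAlreadyExistsException =>
        throw new PartitionsAlreadyExist(table, specs, e)
    }
  }
}
```

Any other exception thrown by `backendCall` propagates unchanged, so only the "already exists" case is translated.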

Why are the changes needed?

The behaviour of the Hive external catalog deviates from the V1/V2 in-memory catalogs, which throw `PartitionsAlreadyExistException`. To improve the user experience with Spark SQL, the Hive external catalog should throw the same exception.

Does this PR introduce any user-facing change?

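Yes. When a partition already exists, the Hive external catalog now throws `PartitionsAlreadyExistException` instead of `AlreadyExistsException` wrapped in `AnalysisException`.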

How was this patch tested?

By running existing test suites:

$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *AlterTableAddPartitionSuite"

@github-actions github-actions bot added the SQL label Dec 10, 2020
    var cause = e
    var found = false
    while (!found && depth < maxDepth && cause != null) {
      found = cause.getClass.getCanonicalName == classOf[AlreadyExistsException].getCanonicalName
Member Author

I had to compare the exceptions by their names because pattern matching and `isInstanceOf` don't work as one would expect here. See the comment:

* Due to classloader isolation issues, pattern matching won't work here so we need
* to compare the canonical names of the exceptions, which we assume to be stable.
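
For illustration, the name-based check described above can be sketched as a bounded walk over the cause chain. This is a minimal sketch, not the code from the PR: the object name `CauseChainSketch` and the method `hasCauseNamed` are hypothetical, and `java.lang.IllegalStateException` stands in for Hive's `AlreadyExistsException` in the usage below so that the example runs without Hive on the classpath.

```scala
object CauseChainSketch {
  /**
   * Returns true if `e`, or one of its nested causes, has the given canonical class name.
   * At most `maxDepth` throwables are inspected, starting with `e` itself.
   * Comparing names instead of types still works when the same class has been
   * loaded by two different classloaders.
   */
  def hasCauseNamed(e: Throwable, canonicalName: String, maxDepth: Int = 4): Boolean = {
    var depth = 0
    var cause = e
    var found = false
    while (!found && depth < maxDepth && cause != null) {
      found = cause.getClass.getCanonicalName == canonicalName
      cause = cause.getCause
      depth += 1
    }
    found
  }
}
```

With the default `maxDepth` of 4, `CauseChainSketch.hasCauseNamed(e, "java.lang.IllegalStateException")` returns `true` only when `e` itself or one of its first three nested causes has that class name. The depth limit also guards against cyclic cause chains.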

  }

  private def isAlreadyExistsException(e: Throwable): Boolean = {
    val maxDepth = 4
Contributor

Did you test with different Hive versions? One idea is to re-throw the exception in `HiveClientImpl`, so that we can test it in `VersionsSuite`.

Member Author

Done. Added a test to `VersionsSuite`.

SparkQA commented Dec 10, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37188/

SparkQA commented Dec 10, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37188/

SparkQA commented Dec 10, 2020

Test build #132583 has finished for PR 30711 at commit e7685a0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Dec 10, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37195/

SparkQA commented Dec 10, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37195/

SparkQA commented Dec 11, 2020

Test build #132590 has finished for PR 30711 at commit f834246.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Retest this please

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.
Merged to master for Apache Spark 3.2.0.
Could you open backport PRs for the older release branches and make sure they pass the tests?

SparkQA commented Dec 11, 2020

Test build #132606 has finished for PR 30711 at commit f834246.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

MaxGekk added a commit to MaxGekk/spark that referenced this pull request Dec 11, 2020
…ernalCatalog.createPartitions()

Throw `PartitionsAlreadyExistException` from `createPartitions()` in Hive external catalog when a partition exists. Currently, `HiveExternalCatalog.createPartitions()` throws `AlreadyExistsException` wrapped by `AnalysisException`.

In the PR, I propose to catch `AlreadyExistsException` in `HiveClientImpl` and replace it by `PartitionsAlreadyExistException`.

The behaviour of Hive external catalog deviates from V1/V2 in-memory catalogs that throw `PartitionsAlreadyExistException`. To improve user experience with Spark SQL, it would be better to throw the same exception.

Yes

By running existing test suites:
```
$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *AlterTableAddPartitionSuite"
```

Closes apache#30711 from MaxGekk/hive-partition-exception.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit fab2995)
Signed-off-by: Max Gekk <[email protected]>
MaxGekk added a commit to MaxGekk/spark that referenced this pull request Dec 11, 2020
…ernalCatalog.createPartitions()

Closes apache#30711 from MaxGekk/hive-partition-exception.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit fab2995)
Signed-off-by: Max Gekk <[email protected]>
(cherry picked from commit b284ea3)
Signed-off-by: Max Gekk <[email protected]>
MaxGekk added a commit to MaxGekk/spark that referenced this pull request Dec 11, 2020
…ernalCatalog.createPartitions()

Closes apache#30711 from MaxGekk/hive-partition-exception.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit fab2995)
Signed-off-by: Max Gekk <[email protected]>
(cherry picked from commit b284ea3)
Signed-off-by: Max Gekk <[email protected]>
(cherry picked from commit 0226cfd)
Signed-off-by: Max Gekk <[email protected]>
HyukjinKwon pushed a commit that referenced this pull request Dec 27, 2020
… in `HiveClientImpl`

### What changes were proposed in this pull request?
Update the SQL migration guide about the changes made by:
- #30778
- #30711
- #30866

### Why are the changes needed?
To inform users about the recent changes in the upcoming releases.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
N/A

Closes #30925 from MaxGekk/sql-migr-guide-hiveclientimpl.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon pushed a commit that referenced this pull request Dec 27, 2020
…anges in `HiveClientImpl`

### What changes were proposed in this pull request?
Update the SQL migration guide about the changes made by:
- #30778
- #30711

### Why are the changes needed?
To inform users about the recent changes in the upcoming releases.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
N/A

Closes #30931 from MaxGekk/sql-migr-guide-hiveclientimpl-3.1.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon pushed a commit that referenced this pull request Dec 27, 2020
…anges in `HiveClientImpl`

### What changes were proposed in this pull request?
Update the SQL migration guide about the changes made by:
- #30778
- #30711

### Why are the changes needed?
To inform users about the recent changes in the upcoming releases.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
N/A

Closes #30932 from MaxGekk/sql-migr-guide-hiveclientimpl-3.0.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
HyukjinKwon pushed a commit that referenced this pull request Dec 27, 2020
…anges in `HiveClientImpl`

### What changes were proposed in this pull request?
Update the SQL migration guide about the changes made by:
- #30778
- #30711

### Why are the changes needed?
To inform users about the recent changes in the upcoming releases.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
N/A

Closes #30933 from MaxGekk/sql-migr-guide-hiveclientimpl-2.4.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
@MaxGekk MaxGekk deleted the hive-partition-exception branch February 19, 2021 15:04