Skip to content

Conversation

@MaxGekk
Copy link
Member

@MaxGekk MaxGekk commented Dec 20, 2020

What changes were proposed in this pull request?

Throw PartitionAlreadyExistsException from ALTER TABLE .. RENAME TO PARTITION for a table from Hive V1 External Catalog in the case when the target partition already exists.

Why are the changes needed?

  1. To have the same behavior of V1 In-Memory and Hive External Catalog.
  2. To not propagate internal Hive's exceptions to users.

Does this PR introduce any user-facing change?

Yes. After the changes, the partition renaming command throws PartitionAlreadyExistsException for tables from the Hive catalog.

How was this patch tested?

Added new UT:

$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *HiveCatalogedDDLSuite"

@github-actions github-actions bot added the SQL label Dec 20, 2020
@MaxGekk MaxGekk changed the title [SPARK-33862][SQL] Throw PartitionAlreadyExistsException if the target partition existing while renaming [SPARK-33862][SQL] Throw PartitionAlreadyExistsException if the target partition exists while renaming Dec 20, 2020
@SparkQA
Copy link

SparkQA commented Dec 21, 2020

Test build #133115 has finished for PR 30866 at commit e14e8f3.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, do you propose a behavior change from AnalysisException due to org.apache.hadoop.hive.ql.metadata.HiveException to PartitionAlreadyExistsException (extends AnalysisException) at Apache Spark 3.2.0?

Do you think we need a migration document, @MaxGekk ?

scala> sql("alter table t partition (b='a') rename to partition (b='a')").show
org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename partition. Partition already exists:default.t.[a];

cc @gatorsmile

@MaxGekk
Copy link
Member Author

MaxGekk commented Dec 21, 2020

So, do you propose a behavior change from AnalysisException .. to PartitionAlreadyExistsException

Yes, I do. This will be consistent with In-Memory catalog:

  private def requirePartitionsNotExist(
      db: String,
      table: String,
      specs: Seq[TablePartitionSpec]): Unit = {
    specs.foreach { s =>
      if (partitionExists(db, table, s)) {
        throw new PartitionAlreadyExistsException(db = db, table = table, spec = s)
      }
    }
  }

Do you think we need a migration document, @MaxGekk ?

As you mentioned PartitionAlreadyExistsException is a sub-class of AnalysisException, so, if users expect AnalysisException, we will not break their code. In any case, we could update the SQL migration guide... but if we do that for the current changes, we should update the guide for: #30778 and #30711

…adyExistsException

# Conflicts:
#	sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala
@SparkQA
Copy link

SparkQA commented Dec 21, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37735/

@SparkQA
Copy link

SparkQA commented Dec 21, 2020

Test build #133136 has finished for PR 30866 at commit 4361fdf.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class AlterTableRenamePartition(

@SparkQA
Copy link

SparkQA commented Dec 21, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37735/

@SparkQA
Copy link

SparkQA commented Dec 21, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37741/

@SparkQA
Copy link

SparkQA commented Dec 21, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37741/

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am good with this change but will leave it to @dongjoon-hyun since he's still reviewing it.

@SparkQA
Copy link

SparkQA commented Dec 21, 2020

Test build #133142 has finished for PR 30866 at commit 1c478e8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM for Apache Spark 3.2.
Thank you, @MaxGekk and @HyukjinKwon .

For the following, please file a JIRA ID and proceed as a documentation PR.

if we do that for the current changes, we should update the guide for: #30778 and #30711

@MaxGekk
Copy link
Member Author

MaxGekk commented Dec 25, 2020

For the following, please file a JIRA ID and proceed as a documentation PR.

@dongjoon-hyun Here is the PR for that: #30925

HyukjinKwon pushed a commit that referenced this pull request Dec 27, 2020
… in `HiveClientImpl`

### What changes were proposed in this pull request?
Update the SQL migration guide about the changes made by:
- #30778
- #30711
- #30866

### Why are the changes needed?
To inform users about the recent changes in the upcoming releases.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
N/A

Closes #30925 from MaxGekk/sql-migr-guide-hiveclientimpl.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
@MaxGekk MaxGekk deleted the throw-PartitionAlreadyExistsException branch February 19, 2021 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants