Skip to content

Conversation

@planga82
Copy link
Contributor

What changes were proposed in this pull request?

In the case of V2SessionCatalog.alterTable, SessionCatalog.alterTable is called for any type of modification. In case HiveExternalCatalog is configured, the alterTable method does not support schema modification. For schema modifications it is necessary to use alterTableDataSchema. For this particular case, operations like adding columns or changing columns are not working.

If we analyze the commands that perform column modifications in SessionCatalog (AlterTableChangeColumnCommand, AlterTableAddColumnsCommand) they use the alterTableDataSchema method for schema modification and not alterTable.

In this PR it is proposed that the modifications of the schema in V2sessionCatalog are done through the alterTableDataSchema method of the SessionCatalog.

Why are the changes needed?

To fix this bug

Does this PR introduce any user-facing change?

No

How was this patch tested?

A HiveExternalV2SessionCatalogSuite is introduced to launch the same tests as V2SessionCatalogTableSuite but with the hive catalog configured. Despite the improvements introduced not everything is supported for HiveExternalCatalog so some tests are excluded.

@github-actions github-actions bot added the SQL label Jan 31, 2023
@planga82
Copy link
Contributor Author

CC @cloud-fan @rdblue thanks!

Comment on lines +156 to +157
// Not supported changes in alterTableDataSchema
!change.isInstanceOf[RenameColumn] && !change.isInstanceOf[DeleteColumn]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems awkward. In SessionCatalog I see there is a comment "not supporting dropping columns yet" (emphasis added), should we instead make the change to allow this within alterTableDataSchema? It seems that it can make this much cleaner and have a clear separation of schema changes from alterTableDataSchema vs other changes in alterTable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or another somewhat related question, is the right approach to instead make HiveExternalCatalog.alterTable() support modifying the schema? It seems like this would be more in line with other implementations of alterTable. I see this comment was added as part of PR #14155:

   * Note: As of now, this doesn't support altering table schema, partition column names and bucket
   * specification. We will ignore them even if users do specify different values for these fields.

It's not clear to me if this was intentional, or something that was intended to be built on top of / fixed. @cloud-fan , thoughts here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand what you say about RenameColumn and DeleteColumn, I am not clear which is the best option, in this case as it is not supported I considered that it was better to keep the current behavior.

@github-actions
Copy link

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label May 13, 2023
@github-actions github-actions bot closed this May 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants