feat: sql catalog support update table #862
liurenjie1024
left a comment
Thanks @Li0k for this PR, but I have concerns about introducing update table at this moment, as many features are still missing, such as conflict detection and commit retry.
    /// Returns snapshot references.
    #[inline]
    pub fn snapshot_refs(&self) -> &HashMap<String, SnapshotReference> {
Why do we add this method? We already have a lookup method for snapshots.
    /// TableCommit represents the commit of a table in the catalog.
    #[derive(Debug, TypedBuilder)]
    - #[builder(build_method(vis = "pub(crate)"))]
    + #[builder(build_method(vis = "pub"))]
The reason we make TableCommit crate-only is that we don't want to allow users to build it manually; all table commit construction should go through the transaction API.
…/catalog_sql_update_table
8dbbf3b to
8d0f168
    update_table_metadata_builder = table_update.apply(update_table_metadata_builder)?;
    }

    for table_requirement in requirements {
Shouldn't the requirements be checked in a transaction (that executes the update statement)? Otherwise a conflicting concurrent commit can update first and we end up in a broken table state.
The table metadata that's used to validate the requirements would also need to be loaded within the transaction.
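The point above can be sketched without a database. In this minimal illustration, a mutex stands in for the SQL transaction: the current row is loaded, the requirement is checked, and the new metadata location is written while the same lock is held. `InMemoryCatalog`, `TableRow`, `CommitError`, and the single "expected metadata location" requirement are hypothetical names for illustration only; they are not the types used by this PR or by the SQL catalog.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for one row of the catalog table.
#[derive(Debug, Clone, PartialEq)]
struct TableRow {
    metadata_location: String,
}

#[derive(Debug, PartialEq)]
enum CommitError {
    TableNotFound,
    /// The requirement does not hold against the row as it is now.
    RequirementFailed { expected: String, found: String },
}

/// Hypothetical in-memory catalog; the mutex plays the role of a transaction.
struct InMemoryCatalog {
    rows: Mutex<HashMap<String, TableRow>>,
}

impl InMemoryCatalog {
    fn new() -> Self {
        Self { rows: Mutex::new(HashMap::new()) }
    }

    fn create_table(&self, name: &str, location: &str) {
        let row = TableRow { metadata_location: location.to_string() };
        self.rows.lock().unwrap().insert(name.to_string(), row);
    }

    fn current_location(&self, name: &str) -> Option<String> {
        let rows = self.rows.lock().unwrap();
        rows.get(name).map(|row| row.metadata_location.clone())
    }

    /// Load the row, check the requirement, and write, all under one lock.
    fn update_table(
        &self,
        name: &str,
        expected_location: &str,
        new_location: &str,
    ) -> Result<(), CommitError> {
        let mut rows = self.rows.lock().unwrap();
        // The row is loaded inside the critical section, not before it.
        let row = rows.get_mut(name).ok_or(CommitError::TableNotFound)?;
        if row.metadata_location != expected_location {
            return Err(CommitError::RequirementFailed {
                expected: expected_location.to_string(),
                found: row.metadata_location.clone(),
            });
        }
        row.metadata_location = new_location.to_string();
        Ok(())
    }
}

fn main() {
    let catalog = InMemoryCatalog::new();
    catalog.create_table("db.t", "v1.metadata.json");
    // Two writers both prepared their commit on top of v1.
    let first = catalog.update_table("db.t", "v1.metadata.json", "v2.metadata.json");
    let second = catalog.update_table("db.t", "v1.metadata.json", "v2-other.metadata.json");
    println!("first: {:?}", first);
    println!("second: {:?}", second);
}
```

Because the check and the write cannot be separated by another writer, the second commit is rejected instead of overwriting the first. A real SQL catalog would need the database transaction to provide the same guarantee.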
I believe it would also make sense to explicitly set a transaction isolation level of repeatable read. Postgres, for example, defaults to read committed, which can similarly get us into a broken table state:
Read committed allows us to see different versions of the same row between the SELECT statement (that we use to validate the commit requirements) and the UPDATE statement. Effectively, a conflicting update operation that runs concurrently and commits between our SELECT and UPDATE will still allow our UPDATE to succeed. We only checked the requirements against the old table state and never re-checked them against the new one, so we end up in a broken state.
With repeatable read on the other hand, the UPDATE should safely fail with a serialization error.
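The interleaving described above can be replayed deterministically in plain Rust. This sketch does not use a database or an isolation level; as a swapped-in illustration it contrasts an unconditional update with a guarded one that writes only when the row is unchanged since it was read, and reports zero affected rows otherwise. The `Row` struct, its `version` counter, and both update functions are hypothetical and are not part of the SQL catalog schema.

```rust
/// Hypothetical catalog row: a metadata location plus a version counter.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    version: u64,
    metadata_location: String,
}

/// Models an UPDATE with no guard: it always writes. Returns rows affected.
fn unguarded_update(row: &mut Row, new_location: &str) -> u64 {
    row.metadata_location = new_location.to_string();
    row.version += 1;
    1
}

/// Models an UPDATE guarded by the version that was read earlier:
/// it writes only if the row is unchanged. Returns rows affected.
fn guarded_update(row: &mut Row, expected_version: u64, new_location: &str) -> u64 {
    if row.version != expected_version {
        return 0;
    }
    row.metadata_location = new_location.to_string();
    row.version += 1;
    1
}

fn initial_row() -> Row {
    Row { version: 1, metadata_location: "v1.metadata.json".to_string() }
}

/// Writer A reads, writer B commits in between, then A writes without a guard.
fn stale_writer_unguarded() -> Row {
    let mut row = initial_row();
    let seen_by_a = row.clone(); // A's SELECT
    unguarded_update(&mut row, "b.metadata.json"); // B commits first
    // A's requirements pass, but only against its stale snapshot.
    assert_eq!(seen_by_a.version, 1);
    unguarded_update(&mut row, "a.metadata.json"); // B's commit is overwritten
    row
}

/// Same interleaving, but both writers guard their update on the version they read.
fn stale_writer_guarded() -> (Row, u64) {
    let mut row = initial_row();
    let seen_by_a = row.clone(); // A's SELECT
    let seen_by_b = row.clone(); // B's SELECT
    guarded_update(&mut row, seen_by_b.version, "b.metadata.json"); // B commits first
    let affected = guarded_update(&mut row, seen_by_a.version, "a.metadata.json");
    (row, affected)
}

fn main() {
    println!("unguarded: {:?}", stale_writer_unguarded());
    let (row, affected) = stale_writer_guarded();
    println!("guarded: {:?}, rows affected by A: {}", row, affected);
}
```

In the unguarded run, writer A's stale commit replaces writer B's. In the guarded run, A's update affects no rows, which the caller can treat as a commit conflict, much as it would treat a serialization error under repeatable read.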
This PR adds support for the update_table interface in the SQL catalog.
Other PRs for reference:
After these PRs have been merged, we can use a SQL database as the catalog backend.