-
Notifications
You must be signed in to change notification settings - Fork 343
feat: Support metadata table "Metadata Log Entries" #846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
bdb6372 to
067ada8
Compare
crates/iceberg/src/metadata_scan.rs
Outdated
| fn snapshot_id_as_of_time( | ||
| table_metadata: &TableMetadataRef, | ||
| timestamp_ms_inclusive: i64, | ||
| ) -> Option<&SnapshotRef> { | ||
| let mut snapshot_id = None; | ||
| // The table metadata snapshot log is chronological | ||
| for log_entry in table_metadata.history() { | ||
| if log_entry.timestamp_ms <= timestamp_ms_inclusive { | ||
| snapshot_id = Some(log_entry.snapshot_id); | ||
| } | ||
| } | ||
| snapshot_id.and_then(|id| table_metadata.snapshot_by_id(id)) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assumes that history() is chronologically ordered because that gets asserted TableMetadata#try_normalize.
067ada8 to
8d76183
Compare
crates/iceberg/src/metadata_scan.rs
Outdated
| // Include the current metadata locaction and modification time in the table. This matches | ||
| // the Java implementation. Unlike the Java implementation, a current metadata location is | ||
| // optional here. In that case, we omit current metadata from the metadata log table. | ||
| if let Some(current_metadata_location) = &scan.metadata_location { | ||
| append_metadata_log_entry( | ||
| scan.metadata_ref.last_updated_ms(), | ||
| current_metadata_location, | ||
| ); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| .metadata_location(table_metadata1_location.as_os_str().to_str().unwrap()) | ||
| .metadata_location(template_json_location) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this change, the location in the metadata log and the current location would both be ../metadata/v1.json.
I wanted to have a distinction so we can assert that metadata_log_entries includes the current metadata location, even if not in the metadata log.
|
#822 has been merged, let's move on! |
8d76183 to
395ff1f
Compare
|
Thank you, @Xuanwo. This is rebased and ready for review. |
76c2bf2 to
439c529
Compare
crates/iceberg/src/metadata_table.rs
Outdated
| -- validity: | ||
| -- validity: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was failing for me locally.
| pub mod spec; | ||
|
|
||
| pub mod metadata_scan; | ||
| pub mod metadata_table; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow-up to this comment: #822 (comment)
Don't need to do in this PR. Up to you.
| /// Creates a metadata table which provides table-like APIs for inspecting metadata. | ||
| /// See [`MetadataTable`] for more details. | ||
| pub fn metadata_table(self) -> MetadataTable { | ||
| pub fn metadata_table(&self) -> MetadataTable<'_> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow-up to this comment: #822 (comment)
Also not necessary to do in this PR.
|
Let's get #863 merge first. |
+1. |
Re #823. This adds support for the Metadata Log Entries metadata table.
This is building on @xxchan's unmerged #822. I'll update and rebase this PR when #822 merges.✅Metadata Log Entries is the metadata log with the latest snapshot per metadata file.
Reference implementations: