@dentiny
Contributor

@dentiny dentiny commented May 15, 2025

Which issue does this PR close?

TLDR:

  • I want to add customized metadata (kept within 200 B) to each snapshot; summary properties are the natural place for it.
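
To make the 200 B budget concrete, here is a minimal sketch of how a caller might check custom summary properties before attaching them to a snapshot. The function name `check_summary_budget`, the constant `MAX_CUSTOM_SUMMARY_BYTES`, and the choice to count the UTF-8 bytes of keys plus values are all assumptions for illustration; none of them is part of this PR or of iceberg-rust.

```rust
use std::collections::HashMap;

/// Hypothetical budget for custom summary entries, taken from the
/// "within 200 B" goal stated above. Not an iceberg-rust constant.
const MAX_CUSTOM_SUMMARY_BYTES: usize = 200;

/// Returns the total UTF-8 byte size of all keys and values, or an error
/// message if that total exceeds the budget. Illustrative only.
fn check_summary_budget(props: &HashMap<String, String>) -> Result<usize, String> {
    // `String::len` is the length in bytes, not in characters.
    let total: usize = props.iter().map(|(k, v)| k.len() + v.len()).sum();
    if total > MAX_CUSTOM_SUMMARY_BYTES {
        Err(format!(
            "custom summary properties use {} bytes, budget is {}",
            total, MAX_CUSTOM_SUMMARY_BYTES
        ))
    } else {
        Ok(total)
    }
}
```

A caller would run this check on its own map before handing the properties to whatever action produces the snapshot, so an oversized payload is rejected before commit.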

What changes are included in this PR?

This PR adds a new table update action for snapshot summary properties.

Are these changes tested?

Yes, unit tests are added.

@dentiny dentiny marked this pull request as draft May 15, 2025 17:45
@dentiny dentiny changed the title [WIP] feat: Introduce snapshot summary properties Introduce snapshot summary properties May 15, 2025
@dentiny dentiny changed the title Introduce snapshot summary properties feat: Introduce snapshot summary properties May 15, 2025
@dentiny dentiny marked this pull request as ready for review May 15, 2025 18:49
@dentiny dentiny requested a review from jonathanc-n May 17, 2025 23:04
Contributor

@jonathanc-n jonathanc-n left a comment

Looks good to me, sorry for the misunderstanding. Thanks for this PR, @dentiny!

@dentiny dentiny force-pushed the hjiang/add-snapshot-summary-properties branch from d6d554b to e268be7 Compare May 23, 2025 11:51
Xuanwo previously approved these changes May 23, 2025
Member

@Xuanwo Xuanwo left a comment

Nice change, thank you!

@Xuanwo
Member

Xuanwo commented May 23, 2025

One thing left here is to format the code.

@dentiny
Contributor Author

dentiny commented May 23, 2025

Hi @Xuanwo, I think the linting issue has been fixed, and I've also merged in the main branch. Could you please take another look when you have some time? Thank you!

@dentiny
Contributor Author

dentiny commented May 26, 2025

@Xuanwo friendly ping :)

Member

@Xuanwo Xuanwo left a comment

Nice change, thank you @dentiny!

@Xuanwo Xuanwo merged commit 8bb9a88 into apache:main May 28, 2025
17 checks passed
    },
    /// Add snapshot summary properties.
    #[serde(rename_all = "kebab-case")]
    AddSnapshotSummaryProperties {
@liurenjie1024
Contributor

Hi @Xuanwo @dentiny, we need to revert this PR: the newly added update doesn't exist in the OpenAPI spec: https://github.com/apache/iceberg/blob/cab0decbb0e32bf314039e30807eb033c50665d5/open-api/rest-catalog-open-api.yaml#L3058

@dentiny
Contributor Author

dentiny commented May 28, 2025

> Hi @Xuanwo @dentiny, we need to revert this PR: the newly added update doesn't exist in the OpenAPI spec: https://github.com/apache/iceberg/blob/cab0decbb0e32bf314039e30807eb033c50665d5/open-api/rest-catalog-open-api.yaml#L3058

Hi @liurenjie1024, there's a SnapshotUpdate action in the Java implementation, which supports setting summary properties:
https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/SnapshotUpdate.java

@liurenjie1024
Contributor

Also, I don't think we should allow users to update snapshot summaries alone; it's dangerous.

@dentiny
Contributor Author

dentiny commented May 28, 2025

> Also, I don't think we should allow users to update snapshot summaries alone; it's dangerous.

Hi @liurenjie1024, curious if you have other suggestions for updating the snapshot summary?
My feature request is to write customized metadata for a snapshot.

The Java impl I linked above seems to fulfill the requirement, and is exposed as public?
Would like to know your thoughts :)

@liurenjie1024
Contributor

> > Also, I don't think we should allow users to update snapshot summaries alone; it's dangerous.
>
> Hi @liurenjie1024, curious if you have other suggestions for updating the snapshot summary? My feature request is to write customized metadata for a snapshot.
>
> The Java impl I linked above seems to fulfill the requirement, and is exposed as public? Would like to know your thoughts :)

SnapshotUpdate is a super interface of the tx actions that can produce a new snapshot, so in theory you could use any tx action that produces a new snapshot to do that. Currently, you could use FastAppend.
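
The relationship described here can be sketched in Rust as a trait that every snapshot-producing action implements. This is only an illustration: `SnapshotUpdateSketch` and `FastAppendSketch` are hypothetical names, not iceberg-rust types; the `set` method mirrors the summary-property setter on the Java `SnapshotUpdate` interface; and the rule that action-owned keys take precedence over user-supplied ones is an assumption made for the sketch, not documented behavior.

```rust
use std::collections::HashMap;

/// Hypothetical analogue of the Java `SnapshotUpdate` super interface:
/// anything that produces a new snapshot can carry summary properties.
trait SnapshotUpdateSketch: Sized {
    /// Record a summary property for the snapshot this action will produce.
    fn set(self, property: &str, value: &str) -> Self;

    /// Build the summary of the snapshot this action would commit.
    fn summary(&self) -> HashMap<String, String>;
}

/// Hypothetical stand-in for a fast-append transaction action.
#[derive(Default)]
struct FastAppendSketch {
    added_files: Vec<String>,
    snapshot_properties: HashMap<String, String>,
}

impl FastAppendSketch {
    /// Queue a data file to be appended by this action.
    fn add_data_file(mut self, path: &str) -> Self {
        self.added_files.push(path.to_string());
        self
    }
}

impl SnapshotUpdateSketch for FastAppendSketch {
    fn set(mut self, property: &str, value: &str) -> Self {
        self.snapshot_properties
            .insert(property.to_string(), value.to_string());
        self
    }

    fn summary(&self) -> HashMap<String, String> {
        // Start from the user-supplied properties, then write the fields the
        // action itself owns, so callers cannot overwrite them (assumption).
        let mut summary = self.snapshot_properties.clone();
        summary.insert("operation".to_string(), "append".to_string());
        summary.insert(
            "added-data-files".to_string(),
            self.added_files.len().to_string(),
        );
        summary
    }
}
```

Because the properties travel with an action that also produces the snapshot, the summary can never be edited in isolation, which is the concern raised earlier in this thread.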

liurenjie1024 added a commit to liurenjie1024/iceberg-rust that referenced this pull request May 28, 2025
@dentiny
Contributor Author

dentiny commented May 28, 2025

> SnapshotUpdate is a super interface of the tx actions that can produce a new snapshot, so in theory you could use any tx action that produces a new snapshot to do that. Currently, you could use FastAppend.

Yeah, I actually checked the existing interface before I made this PR, and I didn't find any interface exposing the capability to set / update the snapshot summary:

    #[allow(clippy::too_many_arguments)]
    pub(crate) fn new(
        tx: Transaction<'a>,
        snapshot_id: i64,
        commit_uuid: Uuid,
        key_metadata: Vec<u8>,
        snapshot_properties: HashMap<String, String>,
    ) -> Result<Self> {
        Ok(Self {
            snapshot_produce_action: SnapshotProduceAction::new(
                tx,
                snapshot_id,
                key_metadata,
                commit_uuid,
                snapshot_properties,
            )?,
            check_duplicate: true,
        })
    }

liurenjie1024 pushed a commit that referenced this pull request May 29, 2025
## Which issue does this PR close?

- Closes #1329

## What changes are included in this PR?

This PR tries to do the same thing as the reverted
[PR](#1336), which adds the
capability to set snapshot summary properties.
Following up on the thread in
#1336 (comment),
this PR takes a different approach: it exposes the interface within the fast
append action.

## Are these changes tested?

Yes, I added new unit tests.
@liurenjie1024
Contributor

> > SnapshotUpdate is a super interface of the tx actions that can produce a new snapshot, so in theory you could use any tx action that produces a new snapshot to do that. Currently, you could use FastAppend.
>
> Yeah, I actually checked the existing interface before I made this PR, and I didn't find any interface exposing the capability to set / update the snapshot summary:
>
>     #[allow(clippy::too_many_arguments)]
>     pub(crate) fn new(
>         tx: Transaction<'a>,
>         snapshot_id: i64,
>         commit_uuid: Uuid,
>         key_metadata: Vec<u8>,
>         snapshot_properties: HashMap<String, String>,
>     ) -> Result<Self> {
>         Ok(Self {
>             snapshot_produce_action: SnapshotProduceAction::new(
>                 tx,
>                 snapshot_id,
>                 key_metadata,
>                 commit_uuid,
>                 snapshot_properties,
>             )?,
>             check_duplicate: true,
>         })
>     }

I think you found the correct way to do it in #1391.

Development

Successfully merging this pull request may close these issues.

Feature request: support to update snapshot summary

4 participants