Skip to content

Conversation

noam-sol
Copy link

@noam-sol noam-sol commented Sep 28, 2025

Description

Before the fix, we lose the granularity between Beginning and offset 0 which
affects the output checkpoint_delta.

How was this PR tested?

Added a UT

@noam-sol noam-sol force-pushed the suppress-log-on-beginning branch 2 times, most recently from dbcc250 to 9d1868c Compare September 28, 2025 14:04
@guilload
Copy link
Member

This is not the right fix. It's a bug in the checkpoint delta logic in the source. If you print the batch checkpoint deltas in that unit test, you will see something like:

for batch in batches {
            println!("batch checkpoint delta: {:?}", batch.checkpoint_delta);
        }
batch checkpoint delta: ∆(file:///var/folders/cj/3nfybfc572b9s2qvd18qzh_m0000gp/T/.tmpDz99t1:(00000000000000000000..~00000000000000000350])
batch checkpoint delta: ∆(file:///var/folders/cj/3nfybfc572b9s2qvd18qzh_m0000gp/T/.tmp6o6nbz:(00000000000000000070..~00000000000000000350])

The first checkpoint delta should be (beginning..~00000000000000000350].

Do you want to take a stab at it?

@noam-sol
Copy link
Author

ahh I see, on this line -

we lose the granularity between Beginning and offset 0; however I don't feel like properly fixing it is worth the risk of breaking something

@guilload
Copy link
Member

This is the right fix and we're not going to break anything. I will not merge this PR as is.

I will remove is_eof on a next commit, so remove a usage of it.
Before the fix, we lose the granularity between Beginning and offset 0 which
affects the output checkpoint_delta.
@noam-sol noam-sol force-pushed the suppress-log-on-beginning branch from 9d1868c to dcc9cec Compare September 30, 2025 15:11
@noam-sol
Copy link
Author

@guilload I updated the branch to fix the issue as you suggested, and added a UT as well. Thanks for the feedback! Do you have any other ideas on how to test this?

@noam-sol noam-sol changed the title metastore: skip logging checkpoint on position::beginning source: preserve Position arg in ObjectUriBatchReader Sep 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants