jackkleeman
Contributor

No description provided.

@jackkleeman jackkleeman requested a review from gvdongen June 3, 2025 12:52
Collaborator

@gvdongen gvdongen left a comment

Thanks Jack. I was just wondering why this isn't structured more like this:

  • An object per bucket, keyed as "my-key/last-millis", where last-millis is the bucket end (e.g. 1749024660000). Each object stores just a single integer count in state.
  • A proxy service that forwards incoming requests, e.g. {key: blabla, timestamp: 174902466055555}, to the correct VO by calculating the end-millis of the bucket. The VO just does count + 1.

Then you don't bump into the 1 MB state limit and you don't need to always send the entire state along. And you can decommission objects much later.
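
As a rough sketch of the routing step described above, the proxy could derive the target object key from the request timestamp as follows. All names here (`CountRequest`, `bucketEndMillis`, `bucketObjectKey`) are hypothetical and not taken from the PR, and the sketch assumes the bucket end is exclusive, so a timestamp that falls exactly on a boundary lands in the next bucket.

```typescript
// Hypothetical request shape, following the {key, timestamp} example above.
interface CountRequest {
  key: string;
  timestamp: number; // epoch milliseconds
}

// End (exclusive) of the tumbling bucket that contains `timestamp`.
function bucketEndMillis(timestamp: number, bucketMillis: number): number {
  return (Math.floor(timestamp / bucketMillis) + 1) * bucketMillis;
}

// Key of the per-bucket object, in the "my-key/last-millis" form.
function bucketObjectKey(req: CountRequest, bucketMillis: number): string {
  return `${req.key}/${bucketEndMillis(req.timestamp, bucketMillis)}`;
}

// With one-minute buckets, a timestamp inside the minute that ends at
// 1749024660000 maps to that bucket end.
const exampleKey = bucketObjectKey(
  { key: "my-key", timestamp: 1749024655555 },
  60_000,
);
console.log(exampleKey); // my-key/1749024660000
```

The proxy would then send an increment to the object with that key; how that call is made depends on the SDK and is left out here.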

Would this be a possibility or am I missing some nuance? :)


// find the first entry that is greater than or equal to the start
const startIndex =
  entries.findLastIndex((entry) => entry[0] < startBucket) + 1;
Collaborator

The comment mentions "greater than or equal" but the code uses <.

Contributor Author

Yes, it's accurate though, because of the + 1: we find the last entry that is less than the start bucket and add one, so we have found the first entry that is greater than or equal to the start bucket.
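
To make the "+ 1" argument concrete, here is a small self-contained sketch. It assumes `entries` is sorted ascending by bucket, which the argument relies on. The local `findLastIndex` helper is a stand-in for `Array.prototype.findLastIndex` (ES2023) so the snippet compiles on older targets, and `firstIndexAtOrAfter` is a hypothetical name, not one from the PR.

```typescript
type Entry = [number, number]; // [bucket, count]

// Stand-in for Array.prototype.findLastIndex: index of the last item
// matching `pred`, or -1 if none matches.
function findLastIndex<T>(items: T[], pred: (item: T) => boolean): number {
  for (let i = items.length - 1; i >= 0; i--) {
    if (pred(items[i])) return i;
  }
  return -1;
}

// Index of the first entry whose bucket is >= startBucket.
// The last entry that is < startBucket sits directly before it, hence + 1.
function firstIndexAtOrAfter(entries: Entry[], startBucket: number): number {
  return findLastIndex(entries, (entry) => entry[0] < startBucket) + 1;
}

const sample: Entry[] = [[10, 1], [20, 4], [30, 2]];
console.log(firstIndexAtOrAfter(sample, 20)); // 1 (bucket 20 itself)
console.log(firstIndexAtOrAfter(sample, 25)); // 2 (bucket 30)
console.log(firstIndexAtOrAfter(sample, 5));  // 0 (no match: -1 + 1)
console.log(firstIndexAtOrAfter(sample, 40)); // 3 (past the end)
```

The two edge cases also work out: when nothing is smaller than the start, -1 + 1 gives 0, and when everything is smaller, the result equals the array length, which is a valid exclusive end for `slice`.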

// find the first entry that is greater than or equal to the end
// the entry will not be included
const endIndex = endBucket
  ? entries.findLastIndex((entry) => entry[0] < endBucket) + 1
Collaborator

The comment mentions "greater than or equal" but the code uses <.

Contributor Author

As above, the comment is accurate because of the + 1.

@jackkleeman
Contributor Author

Wouldn't the downside of such an approach be that, if I have 1-second precision and want to range over 5 minutes, I have to make 300 concurrent invocations to read a count? I don't know which buckets have a value, so I have to check them all. In practice buckets are probably pretty sparse. My goal was to make the read path pretty cheap.
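
For illustration, the fan-out on the read path with one object per bucket can be sketched by listing every bucket a range query would have to touch. `bucketEndsInRange` is a hypothetical helper, and the sketch assumes buckets are identified by their exclusive end in epoch milliseconds.

```typescript
// Ends of all buckets that overlap the half-open range
// [startMillis, endMillis). Each one would need its own read.
function bucketEndsInRange(
  startMillis: number,
  endMillis: number,
  bucketMillis: number,
): number[] {
  const ends: number[] = [];
  // End of the bucket that contains startMillis.
  let end = (Math.floor(startMillis / bucketMillis) + 1) * bucketMillis;
  // Keep going while the bucket's start is still inside the range.
  for (; end - bucketMillis < endMillis; end += bucketMillis) {
    ends.push(end);
  }
  return ends;
}

// 1-second buckets over a 5-minute range.
const rangeStart = 1749024600000;
const rangeEnd = rangeStart + 5 * 60 * 1000;
const toRead = bucketEndsInRange(rangeStart, rangeEnd, 1000);
console.log(toRead.length); // 300
```

Every one of those 300 keys has to be queried even if most buckets are empty, because the caller cannot tell which buckets hold a value.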

@jackkleeman jackkleeman force-pushed the fixedwindowcounter branch from 2d6b9e3 to ecd808d Compare June 9, 2025 08:38
@gvdongen
Collaborator

@jackkleeman I guess it depends on the use case and what you are trying to optimize (read or write). Executing a set of parallel invocations that only do a read is also not that expensive, I would think. If you don't need such fine precision, you can just use bigger bucket intervals. My feeling is that having an object per bucket would be a better fit for the majority of these stream-processing-like use cases (similar to tumbling windows in Flink). And you can use a delayed call to retire them after a specified interval. Or is that not the type of use case we want to show here?

This also reminds me of this stream processor Stephan once implemented with Restate: https://github.com/restatedev/demos-private/blob/main/streamprocessing/src/streamapi.ts
