@jgallagher
Contributor

Builds on #7713, and is a follow-up to #7713 (comment). In #7652 I changed all the executor substeps to take iterators instead of &BTreeMap references that no longer existed, but that introduced a weird split where the top-level caller had to filter the blueprint down to just the items that the inner functions expected. @smklein pointed out one place where the inner code was being extra defensive, which was more confusing than helpful.

This PR removes that split: the top-level executor now always passes a full &Blueprint down, and the inner modules are responsible for doing their own filtering as appropriate. To ease testing, I kept the versions that take an iterator of already-filtered items as private *_impl functions; the new functions that take a full &Blueprint call them too.
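
The shape described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `Disposition`, `ZoneConfig`, `Blueprint`, `all_zones`, `clean_up_expunged_zones`, and `example_blueprint` are hypothetical, simplified stand-ins for the real types and functions, and the "cleanup" just collects zone IDs.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified stand-ins for the real blueprint types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Disposition {
    InService,
    Expunged { ready_for_cleanup: bool },
}

impl Disposition {
    fn is_ready_for_cleanup(self) -> bool {
        matches!(self, Disposition::Expunged { ready_for_cleanup: true })
    }
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct ZoneConfig {
    id: u32,
    disposition: Disposition,
}

struct Blueprint {
    zones: BTreeMap<u32, ZoneConfig>,
}

impl Blueprint {
    // Iterate over the zones whose disposition passes `filter`.
    fn all_zones<'a>(
        &'a self,
        filter: fn(Disposition) -> bool,
    ) -> impl Iterator<Item = &'a ZoneConfig> + 'a {
        self.zones.values().filter(move |z| filter(z.disposition))
    }
}

// Entry point called by the top-level executor: it receives the whole
// blueprint and does its own filtering before delegating.
fn clean_up_expunged_zones(blueprint: &Blueprint) -> Vec<u32> {
    clean_up_expunged_zones_impl(
        blueprint.all_zones(Disposition::is_ready_for_cleanup),
    )
}

// Private helper: assumes every zone it is handed is already ready for
// cleanup, so it does not re-check the disposition. Tests can call it
// directly with a hand-built iterator.
fn clean_up_expunged_zones_impl<'a>(
    zones: impl Iterator<Item = &'a ZoneConfig>,
) -> Vec<u32> {
    zones.map(|zone| zone.id).collect()
}

// Small fixture: one in-service zone and two expunged zones, only one of
// which is ready for cleanup.
fn example_blueprint() -> Blueprint {
    let zones = vec![
        ZoneConfig { id: 1, disposition: Disposition::InService },
        ZoneConfig {
            id: 2,
            disposition: Disposition::Expunged { ready_for_cleanup: false },
        },
        ZoneConfig {
            id: 3,
            disposition: Disposition::Expunged { ready_for_cleanup: true },
        },
    ];
    Blueprint { zones: zones.into_iter().map(|z| (z.id, z)).collect() }
}
```

With this split, the caller never needs to know which disposition filter a given substep wants, and the `_impl` helper stays easy to exercise in tests without constructing a full blueprint.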

Comment on lines -110 to -113
// We expect to only be called with expunged zones that are ready
// for cleanup; skip any with a different disposition.
if !config.disposition.is_ready_for_cleanup() {
return None;
Contributor Author

This was the offender Sean pointed out. I think it's now reasonable to just remove this entirely: the private _impl function is fine to assume it's only called with zones that are ready for cleanup, and it's obvious that that's true in its single caller (other than tests).

Contributor

@andrewjstone left a comment

Makes sense to me.

) -> Result<(), Vec<anyhow::Error>> {
deploy_nodes_impl(
opctx,
blueprint.all_omicron_zones(BlueprintZoneDisposition::any),
Collaborator

I know you're just moving this - but should this disposition be is_in_service? Why is any acceptable here?

Contributor

> I know you're just moving this - but should this disposition be is_in_service? Why is any acceptable here?

John kept this in because the keeper code was doing its own filtering, and the goal was not to change any behavior. But looking at this execution code again, it does seem wrong. There's no need to send updates to expunged zones. I was worried that the keeper config would be incorrect, but this doesn't change that config; it only changes which keepers we send the config to.
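
To make the difference between the two filters concrete, here is a rough sketch of how the choice of disposition predicate changes which zones would be contacted. Everything in it is a hypothetical simplification: `Disposition`, `Zone`, `config_recipients`, and `example_zones` are illustrative names rather than the real keeper deployment code, and it models only recipient selection, not the contents of the config being sent.

```rust
// Hypothetical, simplified disposition with two filter predicates.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Disposition {
    InService,
    Expunged,
}

impl Disposition {
    // Accepts every zone regardless of disposition.
    fn any(self) -> bool {
        true
    }

    // Accepts only zones that are still in service.
    fn is_in_service(self) -> bool {
        matches!(self, Disposition::InService)
    }
}

struct Zone {
    name: &'static str,
    disposition: Disposition,
}

// Names of the zones that would receive a config update under `filter`.
fn config_recipients(
    zones: &[Zone],
    filter: fn(Disposition) -> bool,
) -> Vec<&'static str> {
    zones
        .iter()
        .filter(|zone| filter(zone.disposition))
        .map(|zone| zone.name)
        .collect()
}

// Fixture: three keeper zones, the second of which has been expunged.
fn example_zones() -> Vec<Zone> {
    vec![
        Zone { name: "keeper-1", disposition: Disposition::InService },
        Zone { name: "keeper-2", disposition: Disposition::Expunged },
        Zone { name: "keeper-3", disposition: Disposition::InService },
    ]
}
```

In this sketch, passing `Disposition::any` selects all three zones, including the expunged `keeper-2`, while `Disposition::is_in_service` selects only `keeper-1` and `keeper-3`.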

Contributor Author

It's not obvious to me that this set of zones is only used for which sleds to talk to - maybe it is? E.g., it looks like the IPs of these zones end up sent to everyone - maybe that's also wrong if they're expunged, but I know there was a lot of nitpicky stuff about how much can be changed at once for clickhouse. Do you mind if I file an issue and let you or Karen take a look at changing this?

Contributor

Yes, please file an issue and I'll take care of it. Thanks!

Contributor

> but I know there was a lot of nitpicky stuff about how much can be changed at once for clickhouse.

This is definitely true, but I believe only for the keeper's raft config, which looks like it doesn't change. I will have to review all that code, and possibly add another test. None of this is trivial, so agree it warrants a deeper look.

Contributor Author

👍 filed #7724

Base automatically changed from john/execution-ready-for-cleanup to main March 4, 2025 18:12
@jgallagher force-pushed the john/execution-iter-cleanup branch from a80b39f to 21953e0 March 4, 2025 18:13
@jgallagher merged commit 07a3a50 into main Mar 4, 2025
16 checks passed
@jgallagher deleted the john/execution-iter-cleanup branch March 4, 2025 19:38
4 participants