[reconfigurator] Executor internal cleanup around iterators #7722

jgallagher · 2025-03-04T16:50:25Z

Builds on #7713, and is a follow-up to #7713 (comment). In #7652 I changed all the executor substeps to take iterators instead of &BTreeMap references that no longer existed, but that introduced a weird split where the top-level caller had to filter the blueprint down to just the items that the inner functions expected. @smklein pointed out one place where the inner code was being extra defensive, which was more confusing than helpful.

This PR removes that split: the top-level executor now always passes a full &Blueprint down, and the inner modules are responsible for doing their own filtering as appropriate. To ease testing, I kept the versions that take an iterator of already-filtered items as private *_impl functions, which the new functions that take a full &Blueprint call in turn.
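A minimal sketch of the wrapper/_impl shape described above. It is not the code from this PR: `Blueprint`, `Zone`, `Disposition`, `all_zones`, and both `clean_up_expunged_zones*` functions are simplified, hypothetical stand-ins, and the real executor functions are async and return richer error types.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified stand-ins for the real blueprint types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Disposition {
    InService,
    Expunged { ready_for_cleanup: bool },
}

impl Disposition {
    fn is_ready_for_cleanup(self) -> bool {
        matches!(self, Disposition::Expunged { ready_for_cleanup: true })
    }
}

struct Zone {
    id: u32,
    disposition: Disposition,
}

struct Blueprint {
    // Sled ID -> zones on that sled (assumed layout, for illustration only).
    zones: BTreeMap<u32, Vec<Zone>>,
}

impl Blueprint {
    // Yields (sled ID, zone) pairs whose disposition passes `filter`.
    fn all_zones(
        &self,
        filter: fn(Disposition) -> bool,
    ) -> impl Iterator<Item = (u32, &Zone)> + '_ {
        self.zones.iter().flat_map(move |(sled_id, zones)| {
            zones
                .iter()
                .filter(move |z| filter(z.disposition))
                .map(move |z| (*sled_id, z))
        })
    }
}

// Entry point used by the top-level executor: takes the whole blueprint and
// does its own filtering before delegating.
fn clean_up_expunged_zones(blueprint: &Blueprint) -> Vec<u32> {
    clean_up_expunged_zones_impl(
        blueprint.all_zones(Disposition::is_ready_for_cleanup),
    )
}

// Private helper: assumes its input is already filtered, so tests can feed
// it hand-built zones without constructing a full blueprint.
fn clean_up_expunged_zones_impl<'a>(
    zones: impl Iterator<Item = (u32, &'a Zone)>,
) -> Vec<u32> {
    // Stand-in for the real cleanup work: report which zone IDs were handled.
    zones.map(|(_sled_id, zone)| zone.id).collect()
}

fn example_blueprint() -> Blueprint {
    let mut zones = BTreeMap::new();
    zones.insert(
        1,
        vec![
            Zone { id: 1, disposition: Disposition::InService },
            Zone {
                id: 2,
                disposition: Disposition::Expunged { ready_for_cleanup: false },
            },
        ],
    );
    zones.insert(
        2,
        vec![Zone {
            id: 3,
            disposition: Disposition::Expunged { ready_for_cleanup: true },
        }],
    );
    Blueprint { zones }
}

fn main() {
    let blueprint = example_blueprint();
    // Only the zone that is expunged and ready for cleanup is handled.
    assert_eq!(clean_up_expunged_zones(&blueprint), vec![3]);
}
```

Because the filter lives in exactly one place (the wrapper), the _impl function does not need to re-check dispositions defensively.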

jgallagher · 2025-03-04T16:51:37Z

nexus/reconfigurator/execution/src/omicron_zones.rs

-            // We expect to only be called with expunged zones that are ready
-            // for cleanup; skip any with a different disposition.
-            if !config.disposition.is_ready_for_cleanup() {
-                return None;


This was the offender Sean pointed out. I think it's now reasonable to just remove this entirely: the private _impl function is fine to assume it's only called with zones that are ready for cleanup, and it's obvious that that's true in its single caller (other than tests).

andrewjstone

Makes sense to me.

smklein · 2025-03-04T17:30:23Z

nexus/reconfigurator/execution/src/clickhouse.rs

+) -> Result<(), Vec<anyhow::Error>> {
+    deploy_nodes_impl(
+        opctx,
+        blueprint.all_omicron_zones(BlueprintZoneDisposition::any),


I know you're just moving this - but should this disposition be is_in_service? Why is any acceptable here?

I know you're just moving this - but should this disposition be is_in_service? Why is any acceptable here?

John kept this in because the keeper code was doing its own filtering, and the goal was not to change any behavior. But looking at this execution code again, it does seem wrong. There's no need to send updates to expunged zones. I was worried that the keeper config would be incorrect, but this doesn't change that config; it only changes which keepers we send the config to.

It's not obvious to me that this set of zones is only used for which sleds to talk to - maybe it is? E.g., it looks like the IPs of these zones end up sent to everyone - maybe that's also wrong if they're expunged, but I know there was a lot of nitpicky stuff about how much can be changed at once for clickhouse. Do you mind if I file an issue and let you or Karen take a look at changing this?

Yes, please file an issue and I'll take care of it. Thanks!

but I know there was a lot of nitpicky stuff about how much can be changed at once for clickhouse.

This is definitely true, but I believe only for the keeper's raft config, which looks like it doesn't change. I will have to review all that code, and possibly add another test. None of this is trivial, so agree it warrants a deeper look.

👍 filed #7724
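For readers following the thread, the question is which zones each disposition filter admits. The sketch below uses hypothetical stand-in types (`Disposition`, `Zone`, `select`, `example_zones`), not the real `BlueprintZoneDisposition` API, and it says nothing about how the selected zones are used downstream, which is the part deferred to #7724.

```rust
// Hypothetical, simplified disposition type for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Disposition {
    InService,
    Expunged,
}

impl Disposition {
    // Admits every zone, including expunged ones.
    fn any(self) -> bool {
        true
    }

    // Admits only zones that are still in service.
    fn is_in_service(self) -> bool {
        matches!(self, Disposition::InService)
    }
}

struct Zone {
    name: &'static str,
    disposition: Disposition,
}

// Returns the names of the zones admitted by `filter`.
fn select(zones: &[Zone], filter: fn(Disposition) -> bool) -> Vec<&'static str> {
    zones
        .iter()
        .filter(|z| filter(z.disposition))
        .map(|z| z.name)
        .collect()
}

fn example_zones() -> Vec<Zone> {
    vec![
        Zone { name: "keeper-1", disposition: Disposition::InService },
        Zone { name: "keeper-2", disposition: Disposition::Expunged },
    ]
}

fn main() {
    let zones = example_zones();
    // `any` includes the expunged keeper; `is_in_service` does not.
    assert_eq!(select(&zones, Disposition::any), vec!["keeper-1", "keeper-2"]);
    assert_eq!(select(&zones, Disposition::is_in_service), vec!["keeper-1"]);
}
```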

jgallagher requested review from andrewjstone and smklein March 4, 2025 16:50

andrewjstone approved these changes Mar 4, 2025

smklein approved these changes Mar 4, 2025

smklein reviewed Mar 4, 2025

Base automatically changed from john/execution-ready-for-cleanup to main March 4, 2025 18:12

jgallagher added 5 commits March 4, 2025 13:12

iterator cleanup: zones

61b6c4c

iterator cleanup: decommissioned disks

2e7ca1c

iterator cleanup: clickhouse

8810816

iterator cleanup: sled decommissioning

f4c3cb2

iterator cleanup: datasets

21953e0

jgallagher force-pushed the john/execution-iter-cleanup branch from a80b39f to 21953e0 Compare March 4, 2025 18:13

jgallagher mentioned this pull request Mar 4, 2025

Blueprint executor: Multinode clickhouse deployment acts on expunged zones #7724

Open

jgallagher merged commit 07a3a50 into main Mar 4, 2025
16 checks passed

jgallagher deleted the john/execution-iter-cleanup branch March 4, 2025 19:38
