
Conversation

@jgallagher
Contributor

This is staged on top of #6972 but isn't really dependent on it (it's just to reduce minor diff churn).

Almost all of the diff in this PR is code moving around:

  • BlueprintDisksBuilder, BlueprintDatasetsBuilder, and BlueprintSledDatasetsBuilder moved from builder.rs to the new disks_editor.rs
  • BlueprintBuilder::sled_ensure_{disks,datasets}() are now methods on the new BlueprintDisksEditor type (also in disks_editor.rs)
  • The two unit tests that hit this related code also moved
  • BlueprintBuilder::build() now takes an argument (a BlueprintDisksEditor), since the builder itself doesn't hold the changed disks/datasets any more
  • Planner now holds the BlueprintDisksEditor and calls methods on it instead of where it used to call the same methods on BlueprintBuilder

There were minor changes in the moved code, but nothing should be substantial or behavioral. BlueprintDisksEditor is the only new thing, but it's just a tiny wrapper around the existing code that moved.
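For orientation, here's a rough sketch of the resulting shape. All types below are local stand-ins and the signatures are simplified assumptions, not the actual diff; the point is just who owns the disk/dataset edits now.

```rust
// Stand-ins for the real omicron types.
struct SledUuid;
struct SledResources;
struct Blueprint;
struct Error;

// Wraps the moved BlueprintDisksBuilder / BlueprintDatasetsBuilder state.
struct BlueprintDisksEditor;

impl BlueprintDisksEditor {
    // Formerly BlueprintBuilder::sled_ensure_disks().
    fn sled_ensure_disks(
        &mut self,
        _sled_id: &SledUuid,
        _resources: &SledResources,
    ) -> Result<(), Error> {
        Ok(())
    }

    // Formerly BlueprintBuilder::sled_ensure_datasets().
    fn sled_ensure_datasets(
        &mut self,
        _sled_id: &SledUuid,
        _resources: &SledResources,
    ) -> Result<(), Error> {
        Ok(())
    }
}

struct BlueprintBuilder;

impl BlueprintBuilder {
    // build() now takes the editor, since the builder itself no longer
    // holds the changed disks/datasets.
    fn build(self, _disks: BlueprintDisksEditor) -> Blueprint {
        Blueprint
    }
}
```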

Collaborator

@smklein left a comment


LGTM, the new error variants make sense, and eyeballing this, the move seems mostly mechanical.

```rust
pub fn all_sleds(
    &self,
    filter: SledFilter,
) -> impl Iterator<Item = (SledUuid, &SledDetails)> + Clone + '_ {
```
Collaborator

huh, what made this Clone requirement show up?

Contributor Author

https://github.com/oxidecomputer/omicron/pull/6973/files#diff-3d648e59b20ff006cd1642fd0572f90c55a3106cebb8288eab9581b19ff38e78R338-R346

If github doesn't like that link: the DisksEditor::build() method calls into_*_map for both disks and datasets, both of which want to iterate over the sled IDs.
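A minimal self-contained illustration of why that iterator picked up a Clone bound: build() has to walk the sled IDs twice, once per map. (Stand-in types and a hypothetical simplified signature, not the real code.)

```rust
use std::collections::BTreeMap;

type SledUuid = u64; // stand-in for the real UUID type

fn build(
    sled_ids: impl Iterator<Item = SledUuid> + Clone,
) -> (BTreeMap<SledUuid, ()>, BTreeMap<SledUuid, ()>) {
    // into_disks_map equivalent: first pass over the sled IDs...
    let disks = sled_ids.clone().map(|id| (id, ())).collect();
    // ...into_datasets_map equivalent: second pass needs the iterator again.
    let datasets = sled_ids.map(|id| (id, ())).collect();
    (disks, datasets)
}
```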

@jgallagher
Contributor Author

So the CI failure is legit, and looks harder to untangle than I expected: reconfigurator-cli and the test_nexus_add_remove live test both use BlueprintBuilder without going through the planner, so they no longer compile now that build() takes a DisksEditor. But maybe that reveals that both of them are actually wrong today? Neither of them calls builder.sled_ensure_datasets() despite making changes to the zones - is that incorrect?

@smklein
Collaborator

smklein commented Oct 31, 2024

So the CI failure is legit, and looks harder to untangle than I expected: reconfigurator-cli and the test_nexus_add_remove live test both use BlueprintBuilder without going through the planner, so they no longer compile now that build() takes a DisksEditor. But maybe that reveals that both of them are actually wrong today? Neither of them calls builder.sled_ensure_datasets() despite making changes to the zones - is that incorrect?

I think you're right; this must have been an oversight for both use cases. I was focused on the planner-based workflow, so I must not have caught these cases.

As we discussed offline, I think this is actually an old bug, because sled_ensure_disks was also not being invoked by either of these uses. But with the datasets transitioning to blueprints, this is certainly more visible.

Base automatically changed from john/planner-cleanup-1 to main November 1, 2024 14:11
@jgallagher
Contributor Author

Ok, I think I fixed up reconfigurator-cli and the live tests in 987d4d5: both now call sled_ensure_disks and sled_ensure_datasets surrounding the edits they otherwise make.

I less-accurately fixed up some DataStore tests in 6d70107 and blueprint execution tests in d75619f, both of which use BlueprintBuilder directly without going through the planner. In these cases I did not bother adding sled_ensure_{datasets,disks} calls, but the tests still seem to pass? I'm not sure if that's good or bad.
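A hedged sketch of the sequence those fixes add, reusing the stand-in types from the sketch in the PR description (the real calls take omicron's PlanningInput/SledResources, and the ordering is the important part):

```rust
fn edit_blueprint(
    builder: BlueprintBuilder,
    mut editor: BlueprintDisksEditor,
    sleds: &[(SledUuid, SledResources)],
) -> Result<Blueprint, Error> {
    // 1. Ensure disks first, before touching any zones.
    for (sled_id, resources) in sleds {
        editor.sled_ensure_disks(sled_id, resources)?;
    }

    // 2. ...make whatever zone edits the tool was already making...

    // 3. Ensure datasets last, so zones added in step 2 get their datasets.
    for (sled_id, resources) in sleds {
        editor.sled_ensure_datasets(sled_id, resources)?;
    }

    Ok(builder.build(editor))
}
```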

@jgallagher
Contributor Author

I think this PR is probably still a step in the right direction, in that it makes some of these interactions more obvious, but this DisksEditor / BlueprintBuilder split still doesn't seem right; it's too easy to use incorrectly (e.g., the datastore tests as of this PR). Rereading sled_ensure_{datasets,disks} - is this understanding correct?

  1. sled_ensure_disks adds any disks that are new in PlanningInput but are not present in the blueprint (i.e., disks that have been added to sleds since the last blueprint).
  2. sled_ensure_disks removes any disks that have been expunged in PlanningInput but are still present in the blueprint. It does not have access to zones, so doesn't expunge any zones that were on those disks; that's the responsibility of the planner.
  3. sled_ensure_disks must be called before sled_ensure_datasets, before the planner does its "expungement planning" phase, and before the planner does its "add zones" phase.
  4. sled_ensure_datasets will add a Debug and ZoneRoot dataset to every disk that doesn't have one in the blueprint already.
  5. sled_ensure_datasets will add filesystem and durable datasets for any control plane zones that need them and that aren't in the blueprint already.
  6. sled_ensure_datasets will mark any datasets that are no longer needed (I assume in practice this is "the zone/disk has been expunged"?) as expunged.
  7. sled_ensure_datasets must be called after the planner does its expunge/add zone phases, as it needs the zones after those phases to do the previous two points.

Item 2 seems to be problematic for clients other than Planner: This PR fixes up calling sled_ensure_{datasets,disks} at more or less the right time in reconfigurator-cli and live-tests, but doesn't expunge zones on newly-expunged disks.

Do items 4-6 typically only make changes relevant to other changes made in this planning run (e.g., item 4 will only add Debug/ZoneRoot datasets to disks that were newly added in this planning run, because other disks would have already gotten them from the planning runs in which they were added)? (With the major exception of "the first planning run after these functions were added", during which all the disks/datasets will be newly added.)

I wonder if that points to a path where this could be made less error prone if the builder did this dataset work as it added disks / zones? That would mean moving this work back into the builder, which doesn't seem quite right either.

I'm not really sure what I would expect reconfigurator-cli's behavior to be in the "a disk has been expunged in PlanningInput but is still present in the blueprint being edited" case. Should it automatically expunge that disk (which it does as of this PR) and any zones present on it (which it does not do as of this PR)? Or should it warn and give the operator hooks to edit the disks more manually? Similarly on sled_ensure_datasets: it seems like reconfigurator-cli should certainly add datasets for zones it adds, but I'm less sure about automatically adding/expunging other datasets that aren't being explicitly modified.

@smklein
Collaborator

smklein commented Nov 1, 2024

is this understanding correct?

...

Yeah, this tracks. There's documentation to this effect in planner.rs - which is where I thought this was primarily being used:

  • The callsite to sled_ensure_disks has a comment saying "do this before trying to allocate any zones".
  • The callsite to sled_ensure_datasets (via do_plan_datasets) has a comment saying "do this after adding all disks and zones".

Item 2 seems to be problematic for clients other than Planner: This PR fixes up calling sled_ensure_{datasets,disks} at more or less the right time in reconfigurator-cli and live-tests, but doesn't expunge zones on newly-expunged disks.

The decision to expunge zones due to disk expungement comes from zone_needs_expungement, which explicitly returns ZoneExpungeReason::DiskExpunged.

The callstack is as follows, jumping back and forth between the planner and the builder:

  • do_plan ->
  • do_plan_expunge ->
  • expunge_zones_for_sled ->
  • zone_needs_expungement
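A simplified, self-contained sketch of the final check in that callstack; the real zone_needs_expungement in planner.rs takes omicron types, and this just illustrates the DiskExpunged path:

```rust
enum ZoneExpungeReason {
    DiskExpunged,
}

struct Zone {
    // Pool backing this zone's durable dataset, if it has one.
    durable_pool: Option<String>,
}

fn zone_needs_expungement(
    in_service_pools: &[String],
    zone: &Zone,
) -> Option<ZoneExpungeReason> {
    // A zone whose durable dataset lives on a pool that's no longer
    // in service must be expunged along with the disk.
    let pool = zone.durable_pool.as_ref()?;
    if !in_service_pools.contains(pool) {
        return Some(ZoneExpungeReason::DiskExpunged);
    }
    None
}
```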

I think this aligns with the conversation we had yesterday:

[screenshot of the offline conversation referenced above]

"If you use the planner", we do the right thing. "If you use the builder directly", then, well, YOLO. You're on the hook for completing all steps correctly, so this seems like a pathway we should discourage, since you're right -- we could, as one example of many, forget to expunge zones on expunged disks.

Do items 4-6 typically only make changes relevant to other changes made in this planning run (e.g., item 4 will only add Debug/ZoneRoot datasets to disks that were newly added in this planning run, because other disks would have already gotten them from the planning runs in which they were added)? (With the major exception of "the first planning run after these functions were added", during which all the disks/datasets will be newly added.)

I think this is correct, but it's hard to fully predict the future. The current construction of the builder "conforms" disks to this format, regardless of whether they:

  • Currently have no datasets in the blueprint
  • Have these datasets in the blueprint
  • Some future third thing, where we want to alter the datasets but they already exist

I wonder if that points to a path where this could be made less error prone if the builder did this dataset work as it added disks / zones? That would mean moving this work back into the builder, which doesn't seem quite right either.

I'm not sure I'm on-board with this pathway - let's compare it to zones on sleds.

We could provision all zones to sleds as they become in-service, but the codebase doesn't do that. It walks through all sleds, makes sure they have the zones they need, and also ensures that "rack/fleet"-wide zones also exist on some number of sleds.

To me, datasets are equivalent -- sure, there are some datasets that are 1:1 with sleds/zones, but calling sled_ensure_datasets gives the reconfigurator the ability to create, destroy, and alter the set of all datasets with the same codepath.
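As a self-contained illustration of that "conform" behavior, here's a sketch (all names are stand-ins) of a pass that ensures every disk has its Debug/ZoneRoot datasets regardless of how or when the disk arrived:

```rust
use std::collections::BTreeMap;

#[derive(PartialEq)]
enum DatasetKind {
    Debug,
    ZoneRoot,
}

fn sled_ensure_datasets(datasets_by_disk: &mut BTreeMap<String, Vec<DatasetKind>>) {
    for (_disk, datasets) in datasets_by_disk.iter_mut() {
        // Idempotent: a disk that already has these datasets is untouched;
        // a brand-new disk (or a pre-dataset-blueprint disk) gets them added.
        for kind in [DatasetKind::Debug, DatasetKind::ZoneRoot] {
            if !datasets.contains(&kind) {
                datasets.push(kind);
            }
        }
    }
}
```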

I'm not really sure what I would expect reconfigurator-cli's behavior to be in the "a disk has been expunged in PlanningInput but is still present in the blueprint being edited" case. Should it automatically expunge that disk (which it does as of this PR) and any zones present on it (which it does not do as of this PR)? Or should it warn and give the operator hooks to edit the disks more manually? Similarly on sled_ensure_datasets: it seems like reconfigurator-cli should certainly add datasets for zones it adds, but I'm less sure about automatically adding/expunging other datasets that aren't being explicitly modified.

This is a good question, and I think it can be phrased more generally as:

"If the reconfigurator-cli wants to take some action that's different from what the planner would do, how do we handle that?"

I wonder if, for the reconfigurator-cli case, we should run the actual planner's do_plan logic alongside whatever operations the user is trying to perform, to let them know how their constructed blueprint might diverge from the more cautiously constructed system.
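A hypothetical sketch of that idea; everything here is a stand-in, and the real Planner and blueprint-diff APIs in omicron look different:

```rust
struct Blueprint;
struct PlanningInput;

fn run_real_planner(_parent: &Blueprint, _input: &PlanningInput) -> Blueprint {
    Blueprint // stand-in for an actual planner run from the same parent/input
}

fn diff_blueprints(_planned: &Blueprint, _manual: &Blueprint) -> String {
    String::new() // stand-in for the existing blueprint diff output
}

// Shadow the user's manual edits with a real planner run and report
// where the two blueprints diverge.
fn shadow_plan(parent: &Blueprint, manual: &Blueprint, input: &PlanningInput) -> String {
    let planned = run_real_planner(parent, input);
    diff_blueprints(&planned, manual)
}
```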

@smklein
Collaborator

smklein commented Nov 1, 2024

We discussed this a bit in chats, and to summarize:

Pain

  • The planner vs. builder split is causing pain right now because the planner does more work to ensure the system is valid, while the builder gives more raw control over altering the blueprint.
  • The planner is used by Nexus internally, but the raw builder APIs are useful for direct blueprint editing via tools like reconfigurator-cli.
  • This is painful because the guarantees the planner is trying to uphold might not be upheld by the blueprint builder alone.

A way out

  • One proposed solution: creating a "blueprint checker" -- for example, see the verify_blueprint function used by tests in nexus/reconfigurator/planning/src/blueprint_builder/builder.rs -- to validate the state of a blueprint, regardless of where it originated from. These checks could be enumerated separately, and enabled via a configuration option (see the sketch after this list).
  • Utilities like reconfigurator-cli could use this checker, hopefully opting into all checks by default, but giving options for overrides if explicitly requested.
  • As a neat benefit - we could use this same checker in tests and in the reconfigurator-cli tool.
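A rough sketch of what that checker could look like. verify_blueprint is real (it's used by tests in blueprint_builder/builder.rs); the enumerated checks and configuration knob below are hypothetical:

```rust
use std::collections::BTreeSet;

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Check {
    DisksMatchPlanningInput,
    ZonesHaveDatasets,
    NoZonesOnExpungedDisks,
}

struct BlueprintChecker {
    enabled: BTreeSet<Check>,
}

impl BlueprintChecker {
    // Default: everything on, as reconfigurator-cli would want.
    fn all() -> Self {
        Self {
            enabled: BTreeSet::from([
                Check::DisksMatchPlanningInput,
                Check::ZonesHaveDatasets,
                Check::NoZonesOnExpungedDisks,
            ]),
        }
    }

    // Explicit opt-out for tools that want raw builder access anyway.
    fn without(mut self, check: Check) -> Self {
        self.enabled.remove(&check);
        self
    }
}
```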

@jgallagher
Contributor Author

Closing this after our discussion yesterday; will reopen something similar but in a different direction related to #6990

@jgallagher jgallagher closed this Nov 5, 2024
@jgallagher jgallagher deleted the john/planner-cleanup-2 branch February 21, 2025 19:56