-
Notifications
You must be signed in to change notification settings - Fork 62
[reconfigurator] Introduce a combined SledEditor inside BlueprintBuilder
#7204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
andrewjstone
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite an improvement to the current state of affairs. I believe it will help me cleanup my branch for making disk expungement and decommissioning explicit in blueprints.
You can take or leave my suggestions. I just want to merge this in and use it immediately.
| self.state = new_state; | ||
| } | ||
|
|
||
| pub fn disks( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is lovely. Thank you.
| self.datasets.ensure(zone_root); | ||
| } | ||
|
|
||
| pub fn expunge_disk( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so nice to read and be exposed at a high level. Thank you.
| .build(datasets, rng) | ||
| }); | ||
| let durable_dataset = zone.zone_type.durable_dataset().map(|dataset| { | ||
| let address = match &zone.zone_type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take it crucible is the only durable dataset with an IP address. Maybe add a comment about why this is abnormal. That information should really be held outside the Dataset IMO, but no changes needed right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed and certainly; added in c6e2dcb
| /// 2. Is the dataset in `preexisting_database_ids`? If so, use that ID. | ||
| /// 3. Generate a new random ID. | ||
| #[derive(Debug)] | ||
| pub(crate) struct PreexistingDatasetIds( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Can we make reference in the name that these dataset IDs are coming from the database?
Maybe something like DatasetIdsFromDatabase or DatasetIdsFromDbButNotInBlueprint. I know the latter is wordy, but it's also quite descriptive :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed in 73045a7
| } | ||
|
|
||
| #[derive(Debug)] | ||
| pub(crate) struct PartialDatasetConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please put a comment describing the purpose of this type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added along with the rework in bd66165
| sled_state.entry(sled_id).or_insert(details.state); | ||
| // Helper to build a `PreexistingDatasetIds` for a given sled. This will | ||
| // go away with https://github.com/oxidecomputer/omicron/issues/6645. | ||
| let build_preexisting_dataset_ids = |sled_id| -> anyhow::Result< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great change. Thanks.
| { | ||
| Some(state) => state, | ||
| None => { | ||
| // If we have zones but no state for a sled, we assume |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something it would really be nice to encode in the type system some how. A sled can only be moved to decommissioned if all it's zones, disks, and datasets are expunged. I'm not sure how feasible this is (certainly not in this PR), but it would be nice to get rid of checks like this and many others throughout the planner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried putting a check in SledEditor::set_state(); i.e., "if you're try to set the state to Decommissioned, the editor will fail if ... insert whatever checks we want", and it caused some test failures. Almost certainly those tests are taking invalid shortcuts; I'd like to take this as a followup.
| ); | ||
| } | ||
| } | ||
| // Preserving backwards compatibility, for now: prune sled_state of any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this backwards compatibility is going to have to change when decommissioning disks based on whether a sled is decommissioned, which is presumably why you wrote for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, definitely. I'd like to remove all of these "preserving backwards compatibility" blocks, as they all result in the four maps (potentially) having different sets of keys, which is confusing at best.
| "for sled {sled_id}, error computing zones to expunge" | ||
| ))) | ||
| })?; | ||
| // TODO-john we lost the check for "are we expunging a zone we |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I"m not sure I really understand why that was there. If we mark a zone expunged in this iteration, why wouldn't we set the disposition to expunged in the editor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check was something else; it was saying "if we're about to mark a zone as expunged, first check that we didn't already edit this zone in some way in this same planning pass". I think it was to guard against the planner adding a zone and then immediately expunging it, or changing the parameters of a zone and then expunging it, or other nonsense operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that makes sense. I think it's probably fine to remove the check. We know when we want to expunge something which should trump everything else.
| // https://github.com/oxidecomputer/omicron/issues/6645; for now, it | ||
| // confirms we maintain the compatibility layer it needs. | ||
| #[test] | ||
| fn test_backcompat_reuse_existing_database_dataset_ids() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great test!
…uiring callers to look up IDs
* added doc comment * build() is gone * DatasetsEditor now takes a PartialDatasetConfig when adding/updating config so it can determine IDs on its own
|
Thanks for all the cleanup @jgallagher. Looking good. |
#7234) This is a followup from #7204 (comment) and makes two changes, neither of which should affect behavior: * `SledEditor` will now fail if a caller attempts to make changes to a decommissioned sled (internally, this is statically enforced by a type state enum - a sled in the decommissioned state does not have any methods that support editing, so we're forced to return an error) * `SledEditor::set_state()` is now `SledEditor::decommission()`, and it performs some checks that the sled looks decommissionable The second bullet is more questionable than I expected it to be: 1. There are some arguments that `SledEditor` shouldn't do any checks here; in particular, it doesn't have the full context (e.g., any checks on "should we decommission this sled" that depend on the `PlanningInput` can't be performed here, because `SledEditor` intentionally doesn't have access to `PlanningInput`). 2. I wanted to check zones + disks + datasets, but in practice it can only check zones today; I left a comment (and the commented-out disks + datasets checks we should do) about why. I think we will eventually be able to turn these on; the current behavior of removing disks/datasets from the blueprint for expunged sleds will have to change to fix #7078, at which point these checks should be valid. I don't feel super strongly about the checks in `decommission()` or even this PR as a whole; if this doesn't look like a useful direction, I'd be fine with discarding it. Please review with a pretty critical eye.
This is a step on the road to #7078. The current
BlueprintBuilderinternals would not at all be amenable to squishing the disparate maps inBlueprinttogether, so this PR tries to rework those internals.BlueprintBuildernow does its own map squishing (insidenew_based_on()), combining state+zones+disks+datasets into one helper (SledEditor). All modifications are preformed via those editors, thenBlueprintBuilder::build()breaks the combined editors' results back out into four maps forBlueprint.There should be only one functional change in this PR (marked with a
TODO-john): previously, expunging a zone checked that that zone had not been modified in the current planning cycle. I'm not sure that's really necessary; it felt like some defensiveness born out of the complexity of the builder itself, maybe? I think we could put it back (insideZonesEditor, presumably) if folks think it would be good to keep.My intention was to change the tests as little as possible to ensure I didn't break anything as I was moving functionality around, so
new_based_on()andbuild()have some arguably-pretty-gross concessions to behave exactly the way the builder did before these changes. I'd like to remove those concessions, but those will be nontrivial behavioral changes that I don't want to try to roll in with all of this cleanup. I think I landed this pretty well; there are only a few expectorate changes (due to the slightly reworkedZonesEditorproducing a different ordering for a few tests), and none inomdb(which is where I've often seen incidental planner changes bubble out).Apologies for the size of the PR. I'd lightly recommend that the new
blueprint_editor/module and its subcomponents should be pretty quickly reviewable and should be looked at first; they're supposed to be simple and obvious, and are largely ported over from the prior storage / disks / datasets / zones editors. I did rework how we're handling #6645 backwards compat to try to reduce how much we need to passSledResourcesaround, so that complicates things inDatasetsEditorsome, but not too bad IMO. (I also added a test for this, since I changed it and I don't think we had one before.) TheBlueprintBuilterchanges should then be more or less the natural way one would use a collection ofSledEditors without changing its API or behavior (yet!).