Commit 3ee1546

WIP FOR A GARFIELD MONDAY
1 parent 4910000 commit 3ee1546

2 files changed: 38 additions, 0 deletions

nexus/db-queries/src/db/datastore/physical_disk.rs

Lines changed: 19 additions & 0 deletions
@@ -281,6 +281,25 @@ impl DataStore {
     }

     /// Decommissions all expunged disks.
+    //
+    // TODO: This is not safe to do.
+    //
+    // We have multiple nexuses running, and one can be lagging behind the others.
+    // It may think it knows which disks have been expunged at the current sled,
+    // but new disks could have been added in the `ExpungedButActive` state
+    // that will now get decommissioned prematurely.
+    //
+    // The planner needs to explicitly decide which disks to decommission based
+    // on the omicron_physical_disks_config field in sled_agent inventory. If
+    // the sled the disk is on has already been expunged, the planner can use
+    // that instead of the sled-agent field in inventory to decide that it's safe
+    // to decommission the disk. It needs to mark these as decommissioned in the
+    // blueprint and the executor needs to then perform the decommissioning for
+    // all disk ids.
+    //
+    // The planner on the next round will then go ahead and see that the given
+    // disks are decommissioned in the planning input (database) and remove the
+    // disks to decommission from the new blueprint.
     pub async fn physical_disk_decommission_all_expunged(
         &self,
         opctx: &OpContext,
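
The multi-round flow described in the TODO above (planner records a decision in the blueprint, executor acts on it, the next planning round drops it once the database reflects it) can be modelled roughly as follows. This is only a sketch under simplifying assumptions: `Blueprint`, `Database`, `plan`, `execute`, and the `disks_to_decommission` field are hypothetical names invented for illustration, not the real omicron types or APIs.

```rust
use std::collections::BTreeSet;

// Hypothetical, simplified model; none of these names are real omicron APIs.
type DiskId = u32;

#[derive(Default)]
struct Blueprint {
    // Disks the planner has explicitly decided to decommission.
    disks_to_decommission: BTreeSet<DiskId>,
}

#[derive(Default)]
struct Database {
    // Disks already recorded as decommissioned (the planning input).
    decommissioned: BTreeSet<DiskId>,
}

/// Planner: carry forward the parent's pending decisions plus any newly safe
/// disks, dropping every id the database already records as decommissioned.
fn plan(
    parent: &Blueprint,
    db: &Database,
    newly_safe: &BTreeSet<DiskId>,
) -> Blueprint {
    let disks_to_decommission = parent
        .disks_to_decommission
        .union(newly_safe)
        .filter(|id| !db.decommissioned.contains(*id))
        .copied()
        .collect();
    Blueprint { disks_to_decommission }
}

/// Executor: decommission exactly the disk ids named in the blueprint,
/// rather than "everything that currently looks expunged".
fn execute(bp: &Blueprint, db: &mut Database) {
    db.decommissioned.extend(bp.disks_to_decommission.iter().copied());
}
```

The point of the sketch is that the executor never derives the set of disks on its own; a lagging Nexus can only act on ids that some planner explicitly wrote into a blueprint.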

nexus/reconfigurator/planning/src/planner.rs

Lines changed: 19 additions & 0 deletions
@@ -106,7 +106,26 @@ impl<'a> Planner<'a> {
         Ok(())
     }

+    /// Decommission any `ExpungedButActive` disks that the sled-agent knows
+    /// are expunged.
+    ///
+    /// We need to find the set of all `ExpungedButActive` disks that no
+    /// longer exist in `parent_blueprint.blueprint_disks`. Then we need to
+    /// look at the inventory and see if the corresponding sled-agent has seen
+    /// the omicron_physical_disks_generation from the parent_blueprint.
+    /// Alternatively, we can mark a disk decommissioned if the sled it's on
+    /// has been expunged. If it has then the disks can be decommissioned at
+    /// that sled-agent.
+    fn do_plan_decommission_disks(&mut self) -> Result<(), Error> {
+        todo!()
+    }
+
     fn do_plan_decommission(&mut self) -> Result<(), Error> {
+        self.do_plan_decommission_disks()?;
+        self.do_plan_decommission_sleds()
+    }
+
+    fn do_plan_decommission_sleds(&mut self) -> Result<(), Error> {
         // Check for any sleds that are currently commissioned but can be
         // decommissioned. Our gates for decommissioning are:
         //
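
The commit leaves `do_plan_decommission_disks` as `todo!()`. One possible reading of its doc comment, written as a standalone predicate, is sketched below. Everything here is an assumption made for illustration: `SledView`, `SledState`, and `disks_to_decommission` are hypothetical stand-ins rather than omicron types, "has seen the generation" is interpreted as the inventory generation being at least the blueprint generation, and disks on unknown sleds are conservatively skipped.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-ins, not the real omicron types.
type DiskId = u32;
type SledId = u32;

#[derive(Clone, Copy, PartialEq, Eq)]
enum SledState {
    Active,
    Expunged,
}

struct SledView {
    state: SledState,
    // Disks-config generation recorded in the parent blueprint.
    blueprint_generation: u64,
    // Generation the sled-agent last reported in inventory, if any.
    inventory_generation: Option<u64>,
    // Disks still present for this sled in the parent blueprint.
    blueprint_disks: BTreeSet<DiskId>,
}

/// Returns the `ExpungedButActive` disks that look safe to decommission.
fn disks_to_decommission(
    expunged_but_active: &BTreeMap<DiskId, SledId>,
    sleds: &BTreeMap<SledId, SledView>,
) -> BTreeSet<DiskId> {
    expunged_but_active
        .iter()
        .filter(|&(disk, sled_id)| {
            // Assumption: skip disks whose sled we know nothing about.
            let Some(sled) = sleds.get(sled_id) else { return false };
            // An expunged sled makes all of its disks safe to decommission.
            if sled.state == SledState::Expunged {
                return true;
            }
            // Otherwise the disk must be gone from the parent blueprint and
            // the sled-agent must have seen that blueprint's generation.
            let gone = !sled.blueprint_disks.contains(disk);
            let seen = sled
                .inventory_generation
                .is_some_and(|g| g >= sled.blueprint_generation);
            gone && seen
        })
        .map(|(disk, _)| *disk)
        .collect()
}
```

Keeping the rule as a pure function over the parent blueprint and inventory would make it straightforward to unit test before wiring it into the planner.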
