blueprint-execution has three cleanup steps that depend on the success of the earlier PUT /omicron-zones step:
- zone cleanup
- saga reassignment
- failed support bundle cleanup
All of these steps assume that the zones they are cleaning up after are no longer running, but that's only true today because PUT /omicron-zones is synchronous (i.e., sled-agent only returns success if it has already stopped any zones that shouldn't be running) and because execution is stopped if the PUT /omicron-zones step fails. We definitely want to change the second of those (this is #6999), and longer term we probably want to change the first one too (converting sled-agent into more of an "accept and return the new config, then make it real via a reconciler loop in the background" model).
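To make the current coupling concrete, here is a minimal sketch of the gating described above. Every name in it (`DeployZonesResult`, `CleanupStep`, `cleanup_steps_to_run`) is hypothetical and chosen for illustration; none of it is taken from the actual blueprint-execution code.

```rust
/// Hypothetical outcome of the step that PUTs zone configs to sled-agent.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeployZonesResult {
    /// sled-agent returned success, so (today) unwanted zones are already stopped.
    Success,
    /// The PUT failed, so we know nothing about which zones are still running.
    Failed,
}

/// Hypothetical names for the cleanup steps listed in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CleanupStep {
    ZoneCleanup,
    SagaReassignment,
    FailedSupportBundleCleanup,
}

/// Returns the cleanup steps that are safe to run given the PUT outcome.
/// This models today's behavior: all cleanup is skipped if the PUT failed.
pub fn cleanup_steps_to_run(deploy: DeployZonesResult) -> Vec<CleanupStep> {
    match deploy {
        DeployZonesResult::Success => vec![
            CleanupStep::ZoneCleanup,
            CleanupStep::SagaReassignment,
            CleanupStep::FailedSupportBundleCleanup,
        ],
        DeployZonesResult::Failed => Vec::new(),
    }
}
```

The sketch only holds while the PUT is synchronous: if sled-agent starts returning success before zones are actually stopped, `Success` no longer implies that cleanup is safe.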
#7524 is a small PR that makes these dependencies explicit in code. We should break this dependency somehow:
- The planner could confirm that a zone is gone and indicate it's ready for cleanup via some property in the blueprint (similar to the treatment disks got in "Expunge and Decommission disks in planner", #7286)
- Is it possible for the executor to know this on its own? (I don't think so but maybe?)
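As a rough illustration of the first option, the planner could record readiness in the blueprint and the executor could act only on zones that carry that mark. This is a sketch under assumptions: the `ZoneDisposition` and `ZoneConfig` types, the `ready_for_cleanup` field, and the `u32` zone IDs are all hypothetical and do not reflect the real blueprint schema.

```rust
/// Hypothetical per-zone disposition stored in the blueprint.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ZoneDisposition {
    InService,
    /// `ready_for_cleanup` would be set by the planner only after it has
    /// confirmed (e.g. from inventory) that the zone is no longer running.
    Expunged { ready_for_cleanup: bool },
}

/// Hypothetical, heavily simplified zone entry in a blueprint.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ZoneConfig {
    pub id: u32,
    pub disposition: ZoneDisposition,
}

/// Returns the IDs of zones the executor may clean up after.
/// The result depends only on the blueprint, not on whether an earlier
/// PUT /omicron-zones step in the same execution succeeded.
pub fn zones_ready_for_cleanup(zones: &[ZoneConfig]) -> Vec<u32> {
    zones
        .iter()
        .filter(|z| {
            matches!(
                z.disposition,
                ZoneDisposition::Expunged { ready_for_cleanup: true }
            )
        })
        .map(|z| z.id)
        .collect()
}
```

With this shape, a zone that is expunged but not yet confirmed gone is simply skipped until a later blueprint marks it ready.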