[sled-agent] Refactor service management out of StorageManager
#2946
Conversation
…zones, zone_name -> zone_type, config -> ledger
This is a nice cleanup! I'm happy to get the `StorageManager` out of the business of launching zones.
  // pretty tight; we should consider merging them together.
- let storage_manager =
-     StorageManager::new(&log, underlay_etherstub.clone()).await;
+ let storage_manager = StorageManager::new(&log).await;
Woohoo!
sled-agent/src/services.rs (Outdated)
        .join(STORAGE_SERVICES_CONFIG_FILENAME)
}

// TODO(ideas):
Looks like this first part is implemented.
I actually implemented this fully within #2972
Comment removed in 541f68d
sled-agent/src/services.rs (Outdated)
// - ... Writer which *knows the type* to be serialized, so can direct it to the
//   appropriate output path.
//
// - TODO: later: Can also make the path writing safer, by...
Maybe move this technique out into an issue?
See: #2972 - I just finished implementing this.
Oh. Excellent!
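For context, here's a minimal sketch of what such a type-aware writer could look like. The `Ledgerable` trait and `write_ledger` helper are hypothetical illustrations, not the actual #2972 implementation, and assume the `serde`, `toml`, and `tokio` crates:

```rust
use serde::Serialize;
use std::path::{Path, PathBuf};

/// Hypothetical trait: each ledgered type knows its own output path, so a
/// caller can never serialize it to the wrong file.
trait Ledgerable: Serialize {
    fn ledger_path(data_dir: &Path) -> PathBuf;
}

/// Write a ledgerable value to its type-determined path. Writing to a
/// temporary file and renaming it keeps the update atomic on a single
/// filesystem.
async fn write_ledger<T: Ledgerable>(data_dir: &Path, value: &T) -> std::io::Result<()> {
    let path = T::ledger_path(data_dir);
    let serialized = toml::to_string(value)
        .map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?;
    let tmp = path.with_extension("tmp");
    tokio::fs::write(&tmp, serialized).await?;
    tokio::fs::rename(&tmp, &path).await?;
    Ok(())
}
```

The key property is that the output path comes from the type itself, so "which file does this ledger go to?" is decided in exactly one place.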
sled-agent/src/services.rs (Outdated)
        );
    }
    self.load_non_storage_services().await?;
    // TODO: These will fail if the disks aren't attached.
I'm having a hard time reasoning about when we'd want to retry, since it's unclear to me when a disk would become attached within a retry period. My admittedly somewhat uninformed take at this moment is that we shouldn't retry.
I think the distinction between these calls not retrying and NTP retrying is twofold:
- NTP retries even on successful startup, since we are waiting for time sync
- NTP is a barrier to starting other services. If it fails later services will not work.
This definitely deserves a follow-up bug - it's only relevant in the "cold boot" case, but that does matter!
Here's the deal:
- When we boot the sled agent, it's possible that not all disks have been parsed (e.g., suppose there's a U.2 that's slow to bind a driver).
- When we call this function, we'll read the service ledger from the M.2s, see what services should be started, and try starting them.
- If any of those services...
- ... have datasets in the U.2s
- ... have zone filesystems in the U.2s
- ... then we'd fail to start them.
But that doesn't mean we should never launch the service - if it's in the M.2 service ledger, either RSS or Nexus provisioned that service, so we should keep giving it a shot. "The driver did bind, but it just took a while" and "The U.2 was unplugged, but someone put it back" are both cases where we should be able to launch these services, even if we might initially fail.
If, after a long enough period of time, we determine that we cannot launch that service (if the U.2 is detached, fails, etc), then we'd have an opportunity to do some notification through the fault tolerance system (we either tell Nexus that the service isn't booting, or Nexus notices on its own somehow).
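A rough sketch of that "keep trying, then escalate" policy (the types, helpers, and intervals below are hypothetical stand-ins, not the sled-agent's actual API):

```rust
use std::time::{Duration, Instant};

// Hypothetical stand-ins for the sled-agent's real request/error types.
#[derive(Debug)]
struct ServiceRequest {
    name: String,
}

#[derive(Debug)]
struct StartError(String);

async fn try_start_service(req: &ServiceRequest) -> Result<(), StartError> {
    // Placeholder: the real code would set up the zone filesystem/dataset and
    // boot the zone, failing if the backing disk isn't present yet.
    let _ = req;
    Ok(())
}

async fn report_fault(req: &ServiceRequest, err: &StartError) {
    // Placeholder: the real code would surface this through fault management
    // (e.g. telling Nexus that the service isn't booting).
    eprintln!("giving up on {}: {:?}", req.name, err);
}

/// Keep trying to start every service recorded in the ledger. Transient
/// failures (e.g. a U.2 whose driver is slow to bind) are retried; only after
/// a deadline do we escalate instead of retrying further.
async fn start_ledgered_services(ledger: &[ServiceRequest]) {
    const RETRY_INTERVAL: Duration = Duration::from_secs(5);
    const GIVE_UP_AFTER: Duration = Duration::from_secs(10 * 60);

    for request in ledger {
        let deadline = Instant::now() + GIVE_UP_AFTER;
        loop {
            match try_start_service(request).await {
                Ok(()) => break,
                Err(err) if Instant::now() < deadline => {
                    // The backing disk may simply not be attached yet.
                    eprintln!("failed to start {}: {:?}; retrying", request.name, err);
                    tokio::time::sleep(RETRY_INTERVAL).await;
                }
                Err(err) => {
                    report_fault(request, &err).await;
                    break;
                }
            }
        }
    }
}
```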
Filed #2973, will mention it in this PR
Mentioned in ed20fff
Thanks for the explanation. Those justifications make sense to me.
- pub all_svcs_config_path: PathBuf,
+ // The path for the ServiceManager to store information about
+ // all running services.
+ all_svcs_ledger_path: PathBuf,
Why the change from `config` to `ledger` here?
Also, if we are going to change the names, we may also want to change the constants to refer to `LEDGER` instead of `CONFIG`.
I think I got confused between the following overloaded uses of the term "config":
- We use "config" to describe parameters for a variety of modules within the sled agent, describing tweaks to how they should be executed. For example, the sled agent has a "config", the bootstrap agent has a "config", and the storage manager also has a "config".
- Totally separately, we're storing the list of services that the sled manages. I had previously called this "config", but I think that name was not totally accurate, so I'm updating it to "ledger". After all, it's just a ledger of "what am I responsible for running".
RE: the constants, will do!
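To make that distinction concrete, here's a small illustrative sketch; the type and field names are hypothetical rather than the actual sled-agent definitions, and it assumes `serde` with the derive feature:

```rust
use serde::{Deserialize, Serialize};

/// "Config" in the first sense: knobs controlling how a module itself runs,
/// handed to it at startup. (Hypothetical field.)
#[derive(Deserialize)]
pub struct ServiceManagerConfig {
    pub log_level: String,
}

/// The "ledger" sense: a persistent record of what this sled has been asked
/// to run (by RSS or Nexus), re-read on boot to relaunch those services.
#[derive(Serialize, Deserialize)]
pub struct ServiceLedger {
    pub requests: Vec<ServiceRequest>,
}

/// One entry in the ledger: a zone/service this sled is responsible for.
#[derive(Serialize, Deserialize)]
pub struct ServiceRequest {
    pub zone_type: String,
    /// Optional backing dataset, for storage-flavored services.
    pub dataset: Option<String>,
}
```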
Done in 5d59951
Thanks for the explanations and cleanup. Ship it!
History
The Sled Agent has historically had two different "managers" responsible for Zones:
- `ServiceManager`, which presided over zones that do not operate on Datasets
- `StorageManager`, which manages disks, but also manages zones which operate on those disks

This separation is even reflected in the sled agent API exposed to Nexus - the Sled Agent exposes:

- `PUT /services`
- `PUT /filesystem`

For "add a service (within a zone) to this sled" vs "add a dataset (and corresponding zone) to this sled within a particular zpool".
This has been kinda handy for Nexus, since "provision CRDB on this dataset" and "start the CRDB service on that dataset" don't need to be separate operations. Within the Sled Agent, however, it has been a pain-in-the-butt from the perspective of diverging implementations. The `StorageManager` and `ServiceManager` have evolved their own mechanisms for storing configs, identifying filesystems on which to place zones, etc., even though their responsibilities (managing running zones) overlap quite a lot.

This PR
This PR migrates the responsibility for "service management" entirely into the `ServiceManager`, leaving the `StorageManager` responsible for monitoring disks.

In detail, this means:
- Service management code moves from `storage_manager.rs` into `services.rs`
- `StorageManager` no longer requires an Etherstub device during construction
- `ServiceZoneRequest` can operate on an optional `dataset` argument
- Ledgers of services are stored at `/var/oxide/services.toml` and `/var/oxide/storage-services.toml` for each group
- `filesystem_ensure` - which previously asked the `StorageManager` to format a dataset and also launch a zone - now asks the `StorageManager` to format a dataset, and separately asks the `ServiceManager` to launch a zone (sketched below).
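As a sketch of that last bullet's split, here is roughly the new division of labor; all type and method names below are illustrative stand-ins, not the actual sled-agent code:

```rust
// Illustrative stand-ins for the sled-agent's managers and request types.
struct StorageManager;
struct ServiceManager;

#[derive(Debug)]
struct Error(String);

struct DatasetRequest {
    zpool: String,
    dataset_name: String,
}

struct ZoneRequest {
    zone_type: String,
    // The dataset is optional: non-storage services have no backing dataset.
    dataset: Option<String>,
}

impl StorageManager {
    /// Format (or find) the requested dataset; no zone work happens here.
    async fn upsert_filesystem(&self, req: &DatasetRequest) -> Result<(), Error> {
        println!("ensuring dataset {}/{}", req.zpool, req.dataset_name);
        Ok(())
    }
}

impl ServiceManager {
    /// Launch the zone that operates on the (already prepared) dataset.
    async fn ensure_storage_service(&self, req: ZoneRequest) -> Result<(), Error> {
        println!("launching {} zone (dataset: {:?})", req.zone_type, req.dataset);
        Ok(())
    }
}

/// The new shape of the flow: ask the StorageManager to format the dataset,
/// then separately ask the ServiceManager to launch the zone.
async fn filesystem_ensure(
    storage: &StorageManager,
    services: &ServiceManager,
    dataset: DatasetRequest,
    zone: ZoneRequest,
) -> Result<(), Error> {
    storage.upsert_filesystem(&dataset).await?;
    services.ensure_storage_service(zone).await?;
    Ok(())
}
```

The point of the split is that dataset preparation and zone launching become independent steps, each owned by the manager actually responsible for that resource.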