Skip to content

Conversation

@jmpesp
Copy link
Contributor

@jmpesp jmpesp commented Oct 20, 2025

To implement RFD 590, Propolis will use a delegated ZVOL as a file backed NVMe device. This PR creates a new "local_storage" dataset that can be used for this purpose on every zpool in the rack. This is created under the crypt parent so that the delegated storage is encrypted at rest.

This is done with the planner instead of adding to U2_EXPECTED_DATASETS, and this required changing a bunch of tests that were assuming a specific number of datasets per pool, along with a whole whack of tests that use expectorate.

To implement RFD 590, Propolis will use a delegated ZVOL as a file
backed NVMe device. This PR creates a new "local_storage" dataset that
can be used for this purpose on every zpool in the rack. This is created
under the `crypt` parent so that the delegated storage is encrypted at
rest.

This is done with the planner instead of adding to U2_EXPECTED_DATASETS,
and this required changing a bunch of tests that were assuming a
specific number of datasets per pool, along with a whole whack of tests
that use expectorate.
@jmpesp jmpesp requested a review from jgallagher October 20, 2025 20:10
Copy link
Contributor

@jgallagher jgallagher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't gone over all the expectorate output yet, but just a few comments from the code. (I assume all of the expectorate output is pretty noisy and not useful - there's now an extra dataset for every disk, which shifts all the output and IDs around. But I'd still like to skim through it.)

// LocalStorage isn't in the U2_EXPECTED_DATASETS list, add it
// here. XXX does RSS blueprint have to be blippy clean? Isn't
// the plan to RSS only enough to run Nexus and have it use
// blueprints to make the rest true?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XXX does RSS blueprint have to be blippy clean?

Yes - blippy should only complain about blueprints that should not exist and may lead to some kind of internal inconsistencies, I think?

Isn't the plan to RSS only enough to run Nexus and have it use blueprints to make the rest true?

That is one possible plan for a future where we've rewritten RSS. 😅 Today, we expect the Nexus takeover after RSS to find that it does not need to make any changes to the current target blueprint (and I think we even have a test for that somewhere - @davepacheco wrote it IIRC).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 rewrote this comment in e8b0d00

) WHERE
time_deleted IS NULL;

CREATE TABLE IF NOT EXISTS omicron.public.local_storage_dataset (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Should we prefix this table name with rendezvous_? The existing ones dataset-related rendezvous tables are rendezvous_debug_dataset and crucible_dataset; IIRC the latter doesn't have a rendezvous prefix because it really predates the idea, and it seemed silly to rename it just for that.

Less of a nit: I would naively expect the structure of this table to look slightly more like rendezvous_debug_dataset instead of crucible_dataset; the former has more goop for handling deletion, for example, and doesn't have an rcgen (which I don't think makes sense for reconfigurator-managed resources like this?). I could also imagine a version of this PR that doesn't add this table at all, I think? Add it to the blueprint -> sled-agent path, but not do any bookkeeping in Nexus, and add that in separately later? (Not suggesting you actually do this necessarily, unless it would make things easier.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, done in 70ccf19

use nexus_db_schema::schema::crucible_dataset::dsl;
let zpool_id = dataset.pool_id();
Zpool::insert_resource(
let _: CrucibleDataset = Zpool::insert_resource(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we skipping inserting the local dataset records here because we expect blueprint execution to come along and do it for us? (I'm 100% on board with that, if it's right - just making sure I understand the intent.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in keeping with the idea that rendezvous tables are only managed through blueprint execution. I think CrucibleDataset might be an exception to this? I can't imagine removing this insert would result in them not being populated later automatically but didn't want to remove it with everything else going on in this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I left them here when we did the initial rendezvous table rework for similar reasoning - it's probably not necessary to insert them during RSS handoff, but also is harmless, and testing that it's really okay to remove them would be tedious.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants