@papertigers
Contributor

Fixes #8197

Created using spr 1.3.6-beta.1
@papertigers papertigers requested a review from smklein June 24, 2025 19:44
@papertigers
Contributor Author

papertigers commented Jun 24, 2025

@smklein I will test bundle collection on a racklette once CI kicks out a TUF repo for me. The idea here is that we now use the U.2 debug datasets and attempt to pick the dataset with the most available storage.

Edit

Bundle collection was successful

root@oxz_switch1:~# omdb nexus sb inspect | tail
note: Nexus URL not specified.  Will pick one from DNS.
note: using DNS server for subnet fd00:1122:3344::/48
note: (if this is not right, use --dns-server to specify an alternate DNS server)
note: using Nexus URL http://[fd00:1122:3344:102::4]:12221
Inspecting bundle c0ae8c12-368d-4b82-a152-408a1d29b71d from 2025-06-24 21:40:28.543670 UTC
{"msg":"request completed","v":0,"name":"SledAgent","level":30,"time":"2025-06-24T21:41:04.390130639Z","hostname":"BRM42220062","pid":644,"uri":"/support/pstack-info","method":"GET","req_id":"71d82e81-ea12-4754-8d4d-4756ce35fb7b","remote_addr":"[fd00:1122:3344:102::4]:58264","local_addr":"[fd00:1122:3344:103::1]:12345","component":"dropshot (SledAgent)","file":"/home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dropshot-0.16.2/src/server.rs:867","latency_us":1598803,"response_code":200}
{"msg":"request completed","v":0,"name":"SledAgent","level":30,"time":"2025-06-24T21:41:04.836322793Z","hostname":"BRM42220062","pid":644,"uri":"/support/pargs-info","method":"GET","req_id":"5ce88a5d-30e3-4755-8f12-744bfcfb5436","remote_addr":"[fd00:1122:3344:102::4]:64474","local_addr":"[fd00:1122:3344:103::1]:12345","component":"dropshot (SledAgent)","file":"/home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dropshot-0.16.2/src/server.rs:867","latency_us":2044536,"response_code":200}
{"msg":"request completed","v":0,"name":"SledAgent","level":30,"time":"2025-06-24T21:41:05.192827848Z","hostname":"BRM42220062","pid":644,"uri":"/support/logs/zones","method":"GET","req_id":"e3f0e353-1e32-4913-8aa3-3883368fc2d5","remote_addr":"[fd00:1122:3344:102::4]:58264","local_addr":"[fd00:1122:3344:103::1]:12345","component":"dropshot (SledAgent)","file":"/home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dropshot-0.16.2/src/server.rs:867","latency_us":3383,"response_code":200}
{"msg":"accepted connection","v":0,"name":"SledAgent","level":30,"time":"2025-06-24T21:41:05.2439313Z","hostname":"BRM42220062","pid":644,"local_addr":"[fd00:1122:3344:103::1]:12345","component":"dropshot (SledAgent)","file":"/home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dropshot-0.16.2/src/server.rs:1025","remote_addr":"[fd00:1122:3344:102::4]:45490"}
{"msg":"accepted connection","v":0,"name":"SledAgent","level":30,"time":"2025-06-24T21:41:05.28395166Z","hostname":"BRM42220062","pid":644,"local_addr":"[fd00:1122:3344:103::1]:12345","component":"dropshot (SledAgent)","file":"/home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dropshot-0.16.2/src/server.rs:1025","remote_addr":"[fd00:1122:3344:102::4]:44415"}
{"msg":"accepted connection","v":0,"name":"SledAgent","level":30,"time":"2025-06-24T21:41:05.284006855Z","hostname":"BRM42220062","pid":644,"local_addr":"[fd00:1122:3344:103::1]:12345","component":"dropshot (SledAgent)","file":"/home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dropshot-0.16.2/src/server.rs:1025","remote_addr":"[fd00:1122:3344:102::4]:33256"}
{"msg":"accepted connection","v":0,"name":"SledAgent","level":30,"time":"2025-06-24T21:41:05.304467518Z","hostname":"BRM42220062","pid":644,"local_addr":"[fd00:1122:3344:103::1]:12345","component":"dropshot (SledAgent)","file":"/home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dropshot-0.16.2/src/server.rs:1025","remote_addr":"[fd00:1122:3344:102::4]:53804"}
{"msg":"accepted connection","v":0,"name":"SledAgent","level":30,"time":"2025-06-24T21:41:05.315533544Z","hostname":"BRM42220062","pid":644,"local_addr":"[fd00:1122:3344:103::1]:12345","component":"dropshot (SledAgent)","file":"/home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dropshot-0.16.2/src/server.rs:1025","remote_addr":"[fd00:1122:3344:102::4]:38626"}
{"msg":"accepted connection","v":0,"name":"SledAgent","level":30,"time":"2025-06-24T21:41:05.315558402Z","hostname":"BRM42220062","pid":644,"local_addr":"[fd00:1122:3344:103::1]:12345","component":"dropshot (SledAgent)","file":"/home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dropshot-0.16.2/src/server.rs:1025","remote_addr":"[fd00:1122:3344:102::4]:60043"}
{"msg":"accepted connection","v":0,"name":"SledAgent","level":30,"time":"2025-06-24T21:41:05.315575384Z","hostname":"BRM42220062","pid":644,"local_addr":"[fd00:1122:3344:103::1]:12345","component":"dropshot (SledAgent)","file":"/home/build/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/dropshot-0.16.2/src/server.rs:1025","remote_addr":"[fd00:1122:3344:102::4]:62584"}

Comment on lines 93 to 97
match illumos_utils::zfs::Zfs::get_value(
path.as_str(),
"available",
true,
)
Contributor Author

This function was expanded to include a parsable bool so that we get bytes rather than a value like "100G". I made it an option so that we don't change the behavior of any existing code.

Contributor

Having to pass a bare boolean at all the call sites is a little gnarly. How much do the existing callers depend on the returned value not being parsable?

Contributor Author

Yeah, I agree that it's not ideal; I was mostly trying to minimize any unexpected behavior. The man page states "-p Display numbers in parsable (exact) values.", and all of the call sites should be in this PR. Looking through the call sites, I think everything would be okay, so we could likely just always pass -p.
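
For illustration, here is a minimal sketch of why -p matters to a caller that wants to compare sizes. It is not the illumos_utils API: the names parse_available_bytes and available_bytes are hypothetical, and the sketch assumes a zfs binary on PATH. With -p the value is an exact byte count that parses as an integer, while a human-readable value such as "100G" does not.

```rust
use std::process::Command;

/// Parse the stdout of `zfs get -Hp -o value available <dataset>`.
/// With `-p` the value is an exact byte count; anything that is not a
/// plain integer (for example "100G" or "-") yields `None`.
fn parse_available_bytes(stdout: &str) -> Option<u64> {
    stdout.trim().parse::<u64>().ok()
}

/// Hypothetical helper, not part of illumos_utils: ask ZFS how many bytes
/// are available on `dataset`. Assumes `zfs` is on PATH.
/// -H: scripted output without headers, -p: parsable (exact) numbers,
/// -o value: print only the value column.
fn available_bytes(dataset: &str) -> std::io::Result<Option<u64>> {
    let output = Command::new("zfs")
        .args(["get", "-Hp", "-o", "value", "available", dataset])
        .output()?;
    if !output.status.success() {
        return Ok(None);
    }
    Ok(parse_available_bytes(&String::from_utf8_lossy(&output.stdout)))
}
```

Keeping the parsing separate from the command invocation means the parsing can be unit tested without a ZFS pool.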

Contributor Author

Changed in 0fabc0a

Created using spr 1.3.6-beta.1
@papertigers papertigers requested a review from jgallagher June 25, 2025 19:28
Comment on lines +83 to +85
// Attempt to find a U.2 device with the most available free space
// for temporary storage to assemble a zip file made up of all of the
// discovered zone's logs.
Contributor

Style nit I don't feel strongly about: I'd maybe move this chunk of code to a separate method, dataset_for_temporary_storage() or something?
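
A rough sketch of what such a helper might look like, under stated assumptions: it uses std PathBuf rather than the crate's Utf8PathBuf to stay self-contained, and it takes the available-byte counts as already-looked-up input (None when the lookup failed) instead of calling into ZFS itself. The signature is hypothetical and is not the code that landed in the follow-up.

```rust
use std::path::PathBuf;

/// Hypothetical helper: given candidate directories paired with their
/// available bytes (`None` when the lookup failed), return the directory
/// with the most free space. Candidates without a known size are skipped.
fn dataset_for_temporary_storage(
    candidates: impl IntoIterator<Item = (PathBuf, Option<u64>)>,
) -> Option<PathBuf> {
    candidates
        .into_iter()
        // Drop candidates whose available space could not be determined.
        .filter_map(|(path, avail)| avail.map(|bytes| (path, bytes)))
        // Keep the candidate with the largest byte count.
        .max_by_key(|(_, bytes)| *bytes)
        .map(|(path, _)| path)
}
```

Returning an Option leaves it to the caller to decide what to do when no candidate has a known size.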

Contributor Author

Sorry, I saw the approval on mobile when I stepped away from the laptop and marked it for auto-merge. I didn't catch this nit in time, so here's the follow-up: #8454


/// Return the directories that can be used for temporary sled-diagnostics
/// file storage.
pub fn all_sled_diagnostics_directories(&self) -> Vec<Utf8PathBuf> {
Contributor

This reminds me we should really prune a whole bunch of stuff from this crate now that the config reconciler has landed... A lot of it is only used in a couple tests (and I should update those tests to use the config reconciler instead!)

No objection to pruning this method though.

@papertigers papertigers enabled auto-merge (squash) June 25, 2025 21:02
@papertigers papertigers merged commit 2627bf8 into main Jun 25, 2025
16 checks passed
@papertigers papertigers deleted the spr/papertigers/sled-diagnostics-log-collection-should-happen-on-u2s branch June 25, 2025 21:16
papertigers added a commit that referenced this pull request Jun 26, 2025
This fixes a missed style nit from #8438
Development

Successfully merging this pull request may close these issues.

[sled-diagnostics] log collection should happen on u.2s

3 participants