@hawkw hawkw commented Aug 7, 2025

Depends on #2181.

it works:

eliza@lurch ~ $ pfexec humility -t cosmo-sp pmbus -r V12_SYS_A2 -w VOUT_COMMAND=8 && sleep 1 && pfexec humility -t cosmo-sp pmbus -r V12_SYS_A2 -w VOUT_COMMAND=12
humility: WARNING: archive in environment variable overriding archive in environment file
humility: attached to 0483:3754:003F00164741500920383733 via ST-Link V3
humility: I2C4, port F, dev 0x67: successfully wrote VOUT_COMMAND
humility: WARNING: archive in environment variable overriding archive in environment file
humility: attached to 0483:3754:003F00164741500920383733 via ST-Link V3
humility: I2C4, port F, dev 0x67: successfully wrote VOUT_COMMAND
eliza@lurch ~ $ pfexec humility -t cosmo-sp ringbuf cosmo_seq
humility: WARNING: archive in environment variable overriding archive in environment file
humility: attached to 0483:3754:003F00164741500920383733 via ST-Link V3
humility: ring buffer drv_cosmo_seq_server::__RINGBUF in cosmo_seq:
   TOTAL VARIANT
    3373 RegStateValues
     520 ContinueBitstreamLoad
       2 SetState(InitialPowerOn)
       1 FpgaInit
       1 WaitForDone
       1 Programmed
       1 Startup
       1 SequencerInterrupt
       1 Coretype
 NDX LINE      GEN    COUNT PAYLOAD
  82  332        3        1 ContinueBitstreamLoad(0xa7)
  
# ...

  65  332        4        1 ContinueBitstreamLoad(0xf)
  66  353        4        1 WaitForDone
  67  355        4        1 Programmed
  68  379        4        1 Startup { early_power_rdbks: EarlyPowerRdbksView { fan_fail: false, fan_hsc_east_pg: true, fan_hsc_central_pg: true, fan_hsc_west_pg: true, fan_hsc_east_disable: false, fan_hsc_central_disable: false, fan_hsc_west_disable: false } }
  69  382        4        1 SetState { prev: None, next: A2, why: InitialPowerOn, now: 0x2ee3 }
  70  436        4        1 SetState { prev: Some(A2), next: A0, why: InitialPowerOn, now: 0x2ee3 }
  71  410        4       92 RegStateValues { seq_api_status: SeqApiStatusView { a0_sm: Ok(EnableGrpA) }, seq_raw_status: SeqRawStatusView { hw_sm: 0x3 }, nic_api_status: NicApiStatusView { nic_sm: Ok(Idle) }, nic_raw_status: NicRawStatusView { hw_sm: 0x0 } }
  72  410        4       20 RegStateValues { seq_api_status: SeqApiStatusView { a0_sm: Ok(EnableGrpA) }, seq_raw_status: SeqRawStatusView { hw_sm: 0x5 }, nic_api_status: NicApiStatusView { nic_sm: Ok(Idle) }, nic_raw_status: NicRawStatusView { hw_sm: 0x0 } }
  73  410        4        2 RegStateValues { seq_api_status: SeqApiStatusView { a0_sm: Ok(Sp5FinalCheckpoint) }, seq_raw_status: SeqRawStatusView { hw_sm: 0xc }, nic_api_status: NicApiStatusView { nic_sm: Ok(Idle) }, nic_raw_status: NicRawStatusView { hw_sm: 0x0 } }
  74  410        4        1 RegStateValues { seq_api_status: SeqApiStatusView { a0_sm: Ok(Done) }, seq_raw_status: SeqRawStatusView { hw_sm: 0xe }, nic_api_status: NicApiStatusView { nic_sm: Ok(EnablePower) }, nic_raw_status: NicRawStatusView { hw_sm: 0x2 } }
  75  495        4        1 Coretype { coretype0: true, coretype1: false, coretype2: true, sp5r1: true, sp5r2: false, sp5r3: false, sp5r4: false }
  76  410        4        7 RegStateValues { seq_api_status: SeqApiStatusView { a0_sm: Ok(Done) }, seq_raw_status: SeqRawStatusView { hw_sm: 0xe }, nic_api_status: NicApiStatusView { nic_sm: Ok(EnablePower) }, nic_raw_status: NicRawStatusView { hw_sm: 0x2 } }
  77  410        4        2 RegStateValues { seq_api_status: SeqApiStatusView { a0_sm: Ok(Done) }, seq_raw_status: SeqRawStatusView { hw_sm: 0xe }, nic_api_status: NicApiStatusView { nic_sm: Ok(NicReset) }, nic_raw_status: NicRawStatusView { hw_sm: 0x3 } }
  78  410        4        2 RegStateValues { seq_api_status: SeqApiStatusView { a0_sm: Ok(Done) }, seq_raw_status: SeqRawStatusView { hw_sm: 0xe }, nic_api_status: NicApiStatusView { nic_sm: Ok(NicReset) }, nic_raw_status: NicRawStatusView { hw_sm: 0x4 } }
  79  410        4     2592 RegStateValues { seq_api_status: SeqApiStatusView { a0_sm: Ok(Done) }, seq_raw_status: SeqRawStatusView { hw_sm: 0xe }, nic_api_status: NicApiStatusView { nic_sm: Ok(NicReset) }, nic_raw_status: NicRawStatusView { hw_sm: 0x6 } }
  80  667        4        1 SequencerInterrupt { our_state: A0, seq_state: Ok(Done), ifr: IfrView { fanfault: false, thermtrip: false, smerr_assert: false, a0mapo: false, nicmapo: false, amd_pwrok_fedge: false, amd_rstn_fedge: false, fan_central_hsc_alert: false, fan_east_hsc_alert: false, fan_west_hsc_alert: false, ibc_alert: false, m2_hsc_alert: false, nic_hsc_alert: false, v12_ddr5_abcdef_hsc_alert: false, v12_ddr5_ghijkl_hsc_alert: false, v12_mcio_a0hp_hsc_alert: false, main_hsc_alert: false, vr_v1p8_sys_to_fpga1_alert: true, vr_v3p3_sys_to_fpga1_alert: true, vr_v5p0_sys_to_fpga1_alert: true, v0p96_nic_to_fpga1_alert: false, pwr_cont1_to_fpga1_alert: true, pwr_cont2_to_fpga1_alert: false, pwr_cont3_to_fpga1_alert: false } }
  81  410        4      655 RegStateValues { seq_api_status: SeqApiStatusView { a0_sm: Ok(Done) }, seq_raw_status: SeqRawStatusView { hw_sm: 0xe }, nic_api_status: NicApiStatusView { nic_sm: Ok(NicReset) }, nic_raw_status: NicRawStatusView { hw_sm: 0x6 } }
humility: ring buffer drv_cosmo_seq_server::vcore::__RINGBUF in cosmo_seq:
 NDX LINE      GEN    COUNT PAYLOAD
   0  122        1        1 Initializing
   1  131        1        1 LimitsLoaded
   2  160        1        1 FaultsCleared(Rails { vddcr_cpu0: true, vddcr_cpu1: true })
   3  142        1        1 Initialized
   4  166        1        1 Pmalert { timestamp: 0x9985, faulted: Rails { vddcr_cpu0: true, vddcr_cpu1: false } }
   5  187        1        1 VinFault(VddcrCpu0)
   6  228        1        1 Status(VddcrCpu0, PmbusStatus { status_word: 0x2001, status_iout: 0x0, status_vout: 0x0, status_input: 0x20, status_temperature: 0x0, status_cml: 0x0 })
   7  285        1        1 Reading { timestamp: 0x998b, vddcr_cpu0_vin: Volts(11.290001), vddcr_cpu1_vin: Volts(11.260001) }
   8  285        1        1 Reading { timestamp: 0x998c, vddcr_cpu0_vin: Volts(11.120001), vddcr_cpu1_vin: Volts(11.090001) }
   9  285        1        1 Reading { timestamp: 0x998e, vddcr_cpu0_vin: Volts(10.970001), vddcr_cpu1_vin: Volts(10.940001) }
  10  285        1        1 Reading { timestamp: 0x9990, vddcr_cpu0_vin: Volts(10.81), vddcr_cpu1_vin: Volts(10.81) }
  11  285        1        1 Reading { timestamp: 0x9991, vddcr_cpu0_vin: Volts(10.670001), vddcr_cpu1_vin: Volts(10.650001) }
  12  285        1        1 Reading { timestamp: 0x9993, vddcr_cpu0_vin: Volts(10.500001), vddcr_cpu1_vin: Volts(10.47) }
  13  285        1        1 Reading { timestamp: 0x9995, vddcr_cpu0_vin: Volts(10.370001), vddcr_cpu1_vin: Volts(10.330001) }
  14  285        1        1 Reading { timestamp: 0x9996, vddcr_cpu0_vin: Volts(10.2300005), vddcr_cpu1_vin: Volts(10.160001) }
  15  285        1        1 Reading { timestamp: 0x9998, vddcr_cpu0_vin: Volts(10.06), vddcr_cpu1_vin: Volts(10.02) }
  16  285        1        1 Reading { timestamp: 0x9999, vddcr_cpu0_vin: Volts(9.900001), vddcr_cpu1_vin: Volts(9.880001) }
  17  285        1        1 Reading { timestamp: 0x999b, vddcr_cpu0_vin: Volts(9.740001), vddcr_cpu1_vin: Volts(9.710001) }
  18  285        1        1 Reading { timestamp: 0x999d, vddcr_cpu0_vin: Volts(9.580001), vddcr_cpu1_vin: Volts(9.570001) }
  19  285        1        1 Reading { timestamp: 0x999e, vddcr_cpu0_vin: Volts(9.440001), vddcr_cpu1_vin: Volts(9.43) }
  20  285        1        1 Reading { timestamp: 0x99a0, vddcr_cpu0_vin: Volts(9.280001), vddcr_cpu1_vin: Volts(9.27) }
  21  285        1        1 Reading { timestamp: 0x99a2, vddcr_cpu0_vin: Volts(9.130001), vddcr_cpu1_vin: Volts(9.130001) }
  22  285        1        1 Reading { timestamp: 0x99a3, vddcr_cpu0_vin: Volts(8.9800005), vddcr_cpu1_vin: Volts(8.97) }
  23  285        1        1 Reading { timestamp: 0x99a5, vddcr_cpu0_vin: Volts(8.860001), vddcr_cpu1_vin: Volts(8.76) }
  24  285        1        1 Reading { timestamp: 0x99a8, vddcr_cpu0_vin: Volts(8.6), vddcr_cpu1_vin: Volts(8.51) }
  25  285        1        1 Reading { timestamp: 0x99ab, vddcr_cpu0_vin: Volts(8.35), vddcr_cpu1_vin: Volts(8.27) }
  26  285        1        1 Reading { timestamp: 0x99af, vddcr_cpu0_vin: Volts(8.1), vddcr_cpu1_vin: Volts(7.9900007) }
  27  285        1        1 Reading { timestamp: 0x99b1, vddcr_cpu0_vin: Volts(7.9700007), vddcr_cpu1_vin: Volts(7.9900007) }
  28  285        1        1 Reading { timestamp: 0x99b4, vddcr_cpu0_vin: Volts(7.9700007), vddcr_cpu1_vin: Volts(8.000001) }
  29  285        1        1 Reading { timestamp: 0x99b6, vddcr_cpu0_vin: Volts(7.9800005), vddcr_cpu1_vin: Volts(7.9800005) }
  30  285        1        1 Reading { timestamp: 0x99b8, vddcr_cpu0_vin: Volts(7.9600005), vddcr_cpu1_vin: Volts(8.000001) }
  31  285        1        1 Reading { timestamp: 0x99b9, vddcr_cpu0_vin: Volts(7.9600005), vddcr_cpu1_vin: Volts(8.01) }
  32  160        1        1 FaultsCleared(Rails { vddcr_cpu0: true, vddcr_cpu1: false })
humility: ring buffer drv_oxide_vpd::__RINGBUF in cosmo_seq:
humility: ring buffer drv_packrat_vpd_loader::__RINGBUF in cosmo_seq:
eliza@lurch ~ $
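
For anyone reading the `Status(VddcrCpu0, PmbusStatus { .. })` entry above, here is a minimal sketch of decoding the two non-zero registers. The bit names are transcribed from my reading of the PMBus specification (Part II), and the `set_bits` helper and both tables are hypothetical names invented for this example, not Hubris's or humility's decoding code.

```rust
/// STATUS_WORD bit names, indexed by bit position (bit 0 first).
/// Transcribed from the PMBus spec as an assumption; check against the spec.
const STATUS_WORD_BITS: [&str; 16] = [
    "NONE_OF_THE_ABOVE",
    "CML",
    "TEMPERATURE",
    "VIN_UV_FAULT",
    "IOUT_OC_FAULT",
    "VOUT_OV_FAULT",
    "OFF",
    "BUSY",
    "UNKNOWN",
    "OTHER",
    "FANS",
    "POWER_GOOD#",
    "MFR_SPECIFIC",
    "INPUT",
    "IOUT/POUT",
    "VOUT",
];

/// STATUS_INPUT bit names, indexed by bit position (bit 0 first).
const STATUS_INPUT_BITS: [&str; 8] = [
    "PIN_OP_WARNING",
    "IIN_OC_WARNING",
    "IIN_OC_FAULT",
    "UNIT_OFF_FOR_LOW_INPUT",
    "VIN_UV_FAULT",
    "VIN_UV_WARNING",
    "VIN_OV_WARNING",
    "VIN_OV_FAULT",
];

/// Returns the names of all bits set in `value`, lowest bit first.
/// `names` must have at most 16 entries.
fn set_bits(value: u16, names: &[&'static str]) -> Vec<&'static str> {
    names
        .iter()
        .enumerate()
        .filter(|&(bit, _)| value & (1u16 << bit) != 0)
        .map(|(_, name)| *name)
        .collect()
}

fn main() {
    // Register values copied from the ringbuf entry at NDX 6 above.
    println!("STATUS_WORD  0x2001: {:?}", set_bits(0x2001, &STATUS_WORD_BITS));
    println!("STATUS_INPUT 0x20:   {:?}", set_bits(0x20, &STATUS_INPUT_BITS));
}
```

If the tables are right, `0x2001` decodes to `NONE_OF_THE_ABOVE` plus `INPUT`, and `0x20` decodes to `VIN_UV_WARNING`, which is consistent with the `V12_SYS_A2` rail having been commanded down to 8 V.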

hawkw commented Aug 7, 2025

This will become an ereport a la #2184 once the ereport stuff is merged.

@hawkw hawkw mentioned this pull request Aug 7, 2025
@labbott labbott force-pushed the cosmo_seq_interrupt branch 2 times, most recently from 2f0f151 to da335b8 Compare August 19, 2025 13:32
Base automatically changed from cosmo_seq_interrupt to master August 19, 2025 16:22
@hawkw hawkw force-pushed the eliza/cosmo-vcore-ereport branch from ebed722 to 641d712 Compare August 27, 2025 17:54
@hawkw hawkw closed this in #2184 Sep 2, 2025
pull bot pushed a commit to AKJUS/hubris that referenced this pull request Sep 2, 2025
This branch implements ereports for PMBus input fault and warning alerts
in the Cosmo and Gimlet sequencers. There's kind of a lot going on here.
Essentially, I've hacked up @bcantrill's `drv_gimlet_seq_server::vcore`
module to produce an ereport when we detect a PMBus input alert. These
ereports contain the values of the PMBus status registers read from the VRM,
whether or not the VRM claims `POWER_GOOD` has been deasserted, and
identifying information (the device ID and power rail). On
Cosmo, I've added similar code on top of the sequencer interrupt
handling code @labbott added in oxidecomputer#2181. It's a bit more complex on Cosmo,
as there are two RAA22960A VRMs driving the `VDDCR_CPU0` and
`VDDCR_CPU1` rails, rather than the single regulator driving SP3's
`VDD_VCORE` rail.

This branch closes oxidecomputer#2188, which has been merged into this PR.

Closes oxidecomputer#2141
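
As a rough illustration of the shape described in the commit message (one report per faulted rail, carrying the PMBus status, the `POWER_GOOD` state, and the device and rail identity), here is a minimal sketch. Everything in it is an assumption made for the example: the `InputAlertReport` type, the `build_reports` function, the placeholder device IDs, and deriving `POWER_GOOD` from bit 11 of `STATUS_WORD` are all hypothetical and are not taken from the code in #2184.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Rail {
    VddcrCpu0,
    VddcrCpu1,
}

/// Subset of the status registers shown in the ringbuf output.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PmbusStatus {
    status_word: u16,
    status_input: u8,
}

/// Hypothetical report payload; field names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct InputAlertReport {
    rail: Rail,
    device_id: &'static str,
    status: PmbusStatus,
    power_good_deasserted: bool,
}

/// Which rails raised the alert, mirroring the `Rails { .. }` log entries.
#[derive(Debug, Clone, Copy)]
struct Rails {
    vddcr_cpu0: bool,
    vddcr_cpu1: bool,
}

/// POWER_GOOD# bit of STATUS_WORD (assumed position: bit 11).
const POWER_GOOD_N: u16 = 1 << 11;

/// Builds one report per faulted rail. `read_status` stands in for the
/// PMBus reads that the real driver would perform over I2C.
fn build_reports(
    faulted: Rails,
    mut read_status: impl FnMut(Rail) -> PmbusStatus,
) -> Vec<InputAlertReport> {
    // Device IDs here are placeholders, not real identifiers.
    let candidates = [
        (Rail::VddcrCpu0, "vrm0", faulted.vddcr_cpu0),
        (Rail::VddcrCpu1, "vrm1", faulted.vddcr_cpu1),
    ];
    candidates
        .iter()
        .copied()
        .filter(|&(_, _, is_faulted)| is_faulted)
        .map(|(rail, device_id, _)| {
            let status = read_status(rail);
            InputAlertReport {
                rail,
                device_id,
                status,
                power_good_deasserted: (status.status_word & POWER_GOOD_N) != 0,
            }
        })
        .collect()
}

fn main() {
    // Mirrors the `Pmalert` entry in the log: only VDDCR_CPU0 faulted.
    let faulted = Rails { vddcr_cpu0: true, vddcr_cpu1: false };
    let reports = build_reports(faulted, |_rail| PmbusStatus {
        status_word: 0x2001,
        status_input: 0x20,
    });
    for report in &reports {
        println!("{:?}", report);
    }
}
```

Passing the register read in as a closure keeps the sketch runnable without hardware; with two VRMs on Cosmo, the same loop produces zero, one, or two reports per alert.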