Sled Agent should always use the "A" M.2 slot for ledgered data #8654

@sunshowers

Description

Currently, Sled Agent tries to read from and write to both M.2 slots:

  • sometimes scanning both disks and looking at the highest generation across them (Ledgerable trait)
  • sometimes considering the boot slot as authoritative (e.g. mupdate override data)

That is incorrect in some pretty important ways:

  • if one of the disks disappears temporarily, implementations of the Ledgerable trait might read old data from the disk that remains
  • if the disks couldn't be synced and the boot disk slot changes, we'll suddenly be making decisions based on outdated information
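
To make the first failure mode concrete, here is a minimal sketch of a "highest generation wins" read. It is not Sled Agent's code: `Ledger` and `read_highest_generation` are hypothetical names, and a missing disk is modeled simply as `None`.

```rust
/// Hypothetical stand-in for a ledgered record; only the generation matters here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ledger {
    pub generation: u64,
    pub payload: String,
}

/// Returns whichever visible copy has the highest generation.
/// `None` for a slot models an M.2 disk that is currently missing.
pub fn read_highest_generation<'a>(
    slot_a: Option<&'a Ledger>,
    slot_b: Option<&'a Ledger>,
) -> Option<&'a Ledger> {
    slot_a
        .into_iter()
        .chain(slot_b)
        .max_by_key(|ledger| ledger.generation)
}
```

If the disk holding generation 5 drops out while the other disk still holds generation 4, this function returns generation 4 without any signal that the result is stale.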

If we had an odd number of disks we could potentially address this through majority consensus. But we have two disks, and it's not really feasible to keep them in sync at all times. For this kind of data we should just always pick a specific slot -- between the A and the B slots, the natural one to pick is A.
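
A rough sketch of what pinning reads to one slot could look like. All names here (`M2Slot`, `LedgerCopy`, `ReadError`, `AUTHORITATIVE_SLOT`, `read_pinned`) are hypothetical and not taken from Sled Agent. Returning an error when slot A is absent is an assumption of this sketch; what should happen in that case is not decided here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum M2Slot {
    A,
    B,
}

/// Hypothetical ledgered record found on one M.2 disk.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LedgerCopy {
    pub generation: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ReadError {
    /// The authoritative slot's disk is not present.
    AuthoritativeSlotMissing,
}

/// The single slot treated as the source of truth for ledgered data.
pub const AUTHORITATIVE_SLOT: M2Slot = M2Slot::A;

/// Reads only from the authoritative slot, ignoring generations on the
/// other slot and ignoring which slot the sled booted from.
pub fn read_pinned(copies: &[(M2Slot, LedgerCopy)]) -> Result<&LedgerCopy, ReadError> {
    copies
        .iter()
        .find(|(slot, _)| *slot == AUTHORITATIVE_SLOT)
        .map(|(_, copy)| copy)
        // Assumption: fail loudly instead of silently falling back to slot B.
        .ok_or(ReadError::AuthoritativeSlotMissing)
}
```

With this shape, a higher generation on slot B never wins, and a missing slot A surfaces as an explicit error rather than a quiet read of possibly outdated data.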
