DNS zones should provision storage in a dataset explicitly #3018

@smklein

How Omicron manages storage

Zones generally store data in one of two ways:

  • "Anywhere within their zone filesystem" - this is the case for data such as logs, tmp directories within zones, etc
  • "A durable dataset mounted at a well-known location" - this is the case for zones like Crucible, CockroachDB, and Clickhouse, which explicitly manage storage that should be durable across reboots
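The two approaches above can be modeled roughly as follows. This is only an illustrative sketch: `ZoneStorage`, its variants, and the `is_durable` helper are hypothetical names and are not taken from the Omicron codebase.

```rust
/// Hypothetical model of the two ways a zone can store data.
/// None of these names come from Omicron; they exist only for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ZoneStorage {
    /// Data written anywhere within the zone's own filesystem
    /// (logs, tmp directories, and so on).
    ZoneFilesystem { path: String },
    /// Data written to a durable dataset mounted at a well-known location
    /// (the approach described above for Crucible, CockroachDB, Clickhouse).
    DurableDataset { dataset: String, mountpoint: String },
}

impl ZoneStorage {
    /// True only for storage that is explicitly managed as durable.
    pub fn is_durable(&self) -> bool {
        matches!(self, ZoneStorage::DurableDataset { .. })
    }

    /// The path, as seen from inside the zone, where data is written.
    pub fn path(&self) -> &str {
        match self {
            ZoneStorage::ZoneFilesystem { path } => path.as_str(),
            ZoneStorage::DurableDataset { mountpoint, .. } => mountpoint.as_str(),
        }
    }
}
```

In this model, a service that writes to a path inside its zone filesystem would be represented by the `ZoneFilesystem` variant, so `is_durable()` returns `false` for it.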

How DNS manages storage

The internal and external DNS services are configured to use the "anywhere within their zone filesystem" storage.

storage_path = "/var/oxide/dns"

In general, it is fine for this data to act as a cache of what Nexus holds. After all, Nexus periodically updates these values, bumps a generation number, and should ensure that a (redundant!) number of DNS servers are storing this information.

However, in the case of a cold boot, it's pretty important that at least one internal DNS server boots, to provide the necessary machinery for:

  • The CockroachDB instances to boot up and find each other, and
  • Nexus to be able to find CockroachDB

Why would this be a problem?

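@jclulow has identified that the "service model" for zones means that we should be able to delete and recreate zone filesystems arbitrarily, recreating each zone with its persistent dataset, to get back to a stable state after cold boot. Under that model, DNS data stored at "/var/oxide/dns" inside the zone filesystem would not survive the recreation, yet Nexus (which would repopulate it) depends on internal DNS to find CockroachDB.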

Personally, I'm on board with this, as it means there's less "reconstruction" logic the Sled Agent needs to perform after reboot (e.g., parsing of vnics, IP interfaces, etc). We can just set everything as temporary, and recreate what we need, relying only on datasets to be durable.
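To make the consequence concrete, here is a toy in-memory simulation of "zone filesystems are temporary, datasets are durable". It is a sketch under stated assumptions: the `Sled` type, its methods, and the zone, dataset, and file names used with it are all hypothetical, and it does not reflect how the Sled Agent is actually implemented.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory model of a sled. Zone filesystems are treated as
/// temporary; datasets are treated as durable. Not an Omicron type.
#[derive(Default)]
pub struct Sled {
    /// zone name -> (path -> contents)
    zone_files: HashMap<String, HashMap<String, String>>,
    /// dataset name -> (path -> contents)
    datasets: HashMap<String, HashMap<String, String>>,
}

impl Sled {
    /// Write a file somewhere inside a zone's own filesystem.
    pub fn write_zone_file(&mut self, zone: &str, path: &str, contents: &str) {
        self.zone_files
            .entry(zone.to_string())
            .or_default()
            .insert(path.to_string(), contents.to_string());
    }

    /// Write a file into a durable dataset.
    pub fn write_dataset_file(&mut self, dataset: &str, path: &str, contents: &str) {
        self.datasets
            .entry(dataset.to_string())
            .or_default()
            .insert(path.to_string(), contents.to_string());
    }

    /// Delete and recreate a zone: its filesystem starts out empty again,
    /// while every dataset is left untouched.
    pub fn recreate_zone(&mut self, zone: &str) {
        self.zone_files.insert(zone.to_string(), HashMap::new());
    }

    pub fn read_zone_file(&self, zone: &str, path: &str) -> Option<&str> {
        self.zone_files.get(zone)?.get(path).map(String::as_str)
    }

    pub fn read_dataset_file(&self, dataset: &str, path: &str) -> Option<&str> {
        self.datasets.get(dataset)?.get(path).map(String::as_str)
    }
}
```

In this model, anything written with `write_zone_file` is gone after `recreate_zone`, while anything written with `write_dataset_file` is still readable, which is the property the DNS zones would need for their records.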
