Skip to content

Conversation

@swagatbora90
Copy link
Contributor

This PR adds support for configuring DNS settings globally through the nerdctl configuration file (nerdctl.toml), allowing users to set default DNS servers, search domains, and options that apply to all containers unless explicitly overridden by command-line flags.

This is for a usecase where I want all containers to use a specific dns server instead of system default. Currently, I have to use --dns flag with every container create/run commands. This functionality is available in docker which allows setting global dns config value in the daemon config file.

Testing

Added a end to end test TestDNSWithGlobalConfig() that checks 4 scenarios:

  1. Global DNS settings used when no command-line options provided
  2. Command-line DNS options override global config (precedence test)
  3. Global DNS settings apply with host networking
  4. Global DNS settings apply with none networking

@apostasie
Copy link
Contributor

Hey @swagatbora90

Noting that:

docker run -ti --dns="" debian cat /etc/resolv.conf
nerdctl run -ti --dns="" debian cat /etc/resolv.conf

Different behaviors on this ^.
(also same difference on dns-search="")

Appreciate your PR is not about that, but maybe, if you wish, this PR could fix that discrepancy as well since it is touching that part?

nerdtest.Setup()

testCase := &test.Case{
Env: map[string]string{"NERDCTL_TOML": tomlPath},
Copy link
Contributor

@apostasie apostasie Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tigron has a helper for that:

Just add this to your testCase instead of the Env:

			Config:      test.WithConfig(nerdtest.NerdctlToml, configContent),

(also takes care of writing the file in a temp location for you, and make it easier to retrieve)

{
Description: "Global DNS settings are used when command line options are not provided",
Command: func(data test.Data, helpers test.Helpers) test.TestableCommand {
nerdctlTomlPath := os.Getenv("NERDCTL_TOML")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to work as you think?
os.Getenv points to global env, not the test / commands env.

If you use the config from above ^^^

helpers.Read(nerdtest.NerdctlToml)

Should give you what you want?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read it correctly, the Env var should be passed to all Subtest. I need all the Subtest to use the global env, so it does work.

However, your suggestion to use Config should work too. Let me change it like you suggested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, testCase.Env is propagated to subtests, but this is not the global os.Environ (otherwise, you would pollute other unrelated tests), so os.Getenv will not give you the test environment...

Copy link
Contributor

@apostasie apostasie Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not completely sure I was clear, so, in a shell:

  • do not rely on os.Getenv or os.Setenv - that will cross pollute other tests and is specifically frowned-upon
  • the testCase Env is private to the test, but it is inherited by subtests
  • every test has a custom nerdctl.toml, regardless of the fact you want one explicitly or not (this is to guarantee that we are isolated from a possible, pre-existing, host nerdctl.toml)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see your point. I did check that the NERDCTL_TOML config values are correct when running the test. Anyways, I have changed to use the Config attribute like you suggested.

cmd := helpers.Command("run", "--rm", testutil.CommonImage, "cat", "/etc/resolv.conf")
return cmd
},
Expected: func(data test.Data, helpers test.Helpers) *test.Expected {
Copy link
Contributor

@apostasie apostasie Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More compact, using the test.Expects helper and comparators instead of naked asserts (better debugging info):

Expected: test.Expects(expect.ExitCodeSuccess, nil, expect.All(
   expect.Contains("foo"),
   expect.Contains("bla"),
)),

@apostasie
Copy link
Contributor

apostasie commented Jun 26, 2025

@swagatbora90 left a few comments on tests.

Tigron testing has some (semblance of) documentation here: https://github.com/containerd/nerdctl/blob/main/docs/testing/tools.md

Very much work in progress, so, it is what it is... but then, am always happy to cosplay "testing-helper-AI-bot" until the doc and API are settled :-) (also, feedback on Tigron is always welcome)

@swagatbora90
Copy link
Contributor Author

Hey @swagatbora90

Noting that:

docker run -ti --dns="" debian cat /etc/resolv.conf
nerdctl run -ti --dns="" debian cat /etc/resolv.conf

Different behaviors on this ^. (also same difference on dns-search="")

Appreciate your PR is not about that, but maybe, if you wish, this PR could fix that discrepancy as well since it is touching that part?

Looks like there is no validation today on the DNS servers and domain names that are being passed. I can add a follow up PR to add validation, I would prefer to keep this PR only for adding the global config.

@apostasie
Copy link
Contributor

Looks like there is no validation today on the DNS servers and domain names that are being passed. I can add a follow up PR to add validation, I would prefer to keep this PR only for adding the global config.

Fair.

@apostasie
Copy link
Contributor

@swagatbora90 re-run the CI? (eg: rebase or close/open)

@apostasie
Copy link
Contributor

Failures:

  • soci (fixed upstream)
  • TestLogs (possibly a similar issue to the recent IO fixes that got merged)
  • 2x the infamous content digest not found (the good news part on this one is that it now shows a lot more, which should make it easier to repro)
  • [Flaky] TestComposeExecWithIndex #4373 - somehow, ip addr show dev eth0 returns nothing - could be a case of failing to wait on task IO

All unrelated to this PR ^

@apostasie
Copy link
Contributor

Note: #4392 just got in - curious is this is going to help TestComposeExecWithIndex

@swagatbora90 swagatbora90 reopened this Jul 1, 2025
@swagatbora90 swagatbora90 force-pushed the add-global-dns-config branch from 1cb05ec to 2bc1aa9 Compare July 1, 2025 22:47
@swagatbora90
Copy link
Contributor Author

New set of failing tests:

  1. TestIPFSCompNoBuild - Is this known?
  2. TestKubeCommitSave
  3. Almalinux TestPush/plain_http_with_localhost - I have seen this consistently failing across many test runs
  4. Rootful canary TestImagePullSoci - This is known and currently being worked on.

@apostasie
Copy link
Contributor

Number 4 yeah ack

Number 3 Content digest not found (which you have seen on TestPush) has a long history of pain.
In a shell: containerd does garbage collect layers it should not (either because nerdctl does not hold correctly on references, or because containerd GC has issues - very likely on convert, which is clearly wrong).

Number 2: #4106

Number 1: is one of these legacy tests - we should start by rewriting the test with the new test toolkit and ensure the test itself is not broken - that being said, IPFS overall is not stable IMHO, and our compose implementation (shelling out) is a serious problem - so, I would not expect miracles out of the test rewrite. Also, IPFS might involve number 3 as well.

@apostasie
Copy link
Contributor

Content digest issue maybe starts here: #4062 (and then a bunch of linked issues)
We did fix a number of cases, but clearly there are more.
While there is still clearly a containerd issue on convert that is unfixed (PR upstream was not accepted), it seemed like we did have a workaround, but apparently not...

@apostasie
Copy link
Contributor

@swagatbora90 generally speaking, here is the central collection point for all CI evil: #4120

@apostasie
Copy link
Contributor

Correction: kube issue is #4364

@Shubhranshu153
Copy link
Contributor

@AkihiroSuda This one too if we can add to milestone 2.1.3.
Thanks

@Shubhranshu153 Shubhranshu153 reopened this Jul 2, 2025
@AkihiroSuda AkihiroSuda added this to the v2.1.3 milestone Jul 2, 2025
@swagatbora90 swagatbora90 force-pushed the add-global-dns-config branch from 2bc1aa9 to 1ac8bb4 Compare July 2, 2025 16:48
@swagatbora90 swagatbora90 reopened this Jul 2, 2025
@swagatbora90
Copy link
Contributor Author

The failed test does not appear to be related to the changes

=== Failed
=== FAIL: cmd/nerdctl/image TestPush/with_hosts_dir,_with_login (0.87s)
    image_push_linux_test.go:282: 

and is related to #4062

Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@AkihiroSuda AkihiroSuda merged commit 14a59a2 into containerd:main Jul 3, 2025
110 of 116 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants