@apostasie
Contributor

CI fails regularly because of transient errors from third-party services.

Most typically, Debian or Ubuntu servers, or Docker Hub, return a 500 during the build phase.

This PR is an experimental proposal to alleviate some of the pain by adding a local proxy that retries backend requests on such failures.
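To make the retry idea concrete, here is a minimal sketch in Python. It is not the code from this PR: `fetch_with_retry`, the `(status, body)` response shape, the attempt count, and the exponential backoff are all assumptions chosen for illustration. It only retries on 5xx statuses and does not handle connection errors or timeouts, which a real proxy would probably also need.

```python
import time
from typing import Callable, Tuple

# Hypothetical response shape for this sketch: (status_code, body).
Response = Tuple[int, bytes]


def fetch_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-5xx status or attempts run out.

    The last response is returned even if it is still a 5xx, so the
    caller sees the real upstream failure once retries are exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for attempt in range(1, max_attempts):
        status, _ = response
        if status < 500:
            break
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
        response = send()
    return response
```

The `sleep` parameter is injectable only so the backoff can be exercised without actually waiting.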

The proxy also provisionally caches responses for Debian and Ubuntu domains. Currently, this is not going to do much, although it is good practice IMHO to minimize hammering Debian servers.
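The per-domain caching could look roughly like the sketch below. Again, this is an illustration and not the PR's implementation: `CACHED_DOMAINS`, `is_cacheable`, and `CachingFetcher` are hypothetical names, and the cache is a plain in-memory dict with no expiry, which would be too naive for apt metadata that changes over time.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit

# Assumption: these suffixes stand in for "debian and ubuntu domains".
CACHED_DOMAINS = ("debian.org", "ubuntu.com")


def is_cacheable(url: str) -> bool:
    """Return True if the URL's host is one of the cached domains or a subdomain."""
    host = urlsplit(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in CACHED_DOMAINS)


class CachingFetcher:
    """Wrap a fetch function and memoize responses for cacheable hosts only."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        if not is_cacheable(url):
            # Everything else (Docker Hub, for instance) goes straight upstream.
            return self._fetch(url)
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]
```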

Besides this PR, we might want to rethink our strategy with building the test image though.
Right now, we build everything once per target.
This happens in parallel, but it (obviously) significantly increases the chances of failures against these services.

Note that this proxy will intercept requests made from the host and during the build phase, but not requests made by the tests themselves.

@AkihiroSuda
Member

AkihiroSuda commented Oct 21, 2024

> Besides this PR, we might want to rethink our strategy with building the test image though.

Can we just use https://docs.docker.com/build/cache/backends/gha/ ?
Then probably no need to set up a proxy

@apostasie
Contributor Author

apostasie commented Oct 21, 2024

> > Besides this PR, we might want to rethink our strategy with building the test image though.

> Can we just use https://docs.docker.com/build/cache/backends/gha/ ? Then probably no need to set up a proxy

Will definitely give it a try. I am not optimistic that we will fit within the limits (especially if we try mode=max), but let's see.

Notes for later:

  • 10G is the cache size limit - entries get evicted once we hit it
  • a PR can use cached entries from the main branch, but not from other PRs
  • with mode=max, the image cache is about 2G compressed (2.9G uncompressed); with 8 images currently (rootless, rootful, containerd/ubuntu versions), this will not fit, so we have to rewrite the image build logic and cache a common target instead
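
A quick back-of-the-envelope check of the notes above, assuming the ~2G compressed figure is per image (the notes do not say so explicitly) and that nothing else is stored in the cache:

```python
CACHE_LIMIT_GB = 10.0      # limit quoted in the notes
IMAGE_COUNT = 8            # image variants quoted in the notes
CACHE_PER_IMAGE_GB = 2.0   # assumption: ~2G compressed is a per-image figure

total_gb = IMAGE_COUNT * CACHE_PER_IMAGE_GB
fits = total_gb <= CACHE_LIMIT_GB
max_images = int(CACHE_LIMIT_GB // CACHE_PER_IMAGE_GB)

print(f"total: {total_gb:.0f}G, fits: {fits}, images that would fit: {max_images}")
```

Under that assumption the eight caches together exceed the limit, which is why caching a single common target looks like the only workable option.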

@apostasie
Contributor Author

Dropping this for now in favor of the GitHub Actions build cache.
