Allow cache TTL override for --find-links urls #8109

@wisechengyi

What's the problem this feature will solve?

Our wheels (thousands of them) are hosted in a flat directory. For example, https://myhost.com/wheels/ contains:

...
django_auth_ldap-1.6.1-py2.py3-none-any.whl
django_bootstrap_toolkit-2.15.0-py2-none-any.whl
django_bootstrap_toolkit-2.15.0-py3-none-any.whl
django_braces-1.14.0-py2.py3-none-any.whl
django_common_helpers-0.9.2-py2-none-any.whl
django_common_helpers-0.9.2-py3-none-any.whl
django_configurations-2.2-py2.py3-none-any.whl
django_crispy_forms-1.8.1-py2.py3-none-any.whl
django_cron-0.5.1-py2-none-any.whl
django_cron-0.5.1-py3-none-any.whl
...

With --no-index --find-links https://myhost.com/wheels/, pip queries that page for every artifact it needs, including transitive dependencies. When a large number of builds run pip concurrently, the wheel server can be overwhelmed.

Describe the solution you'd like

We'd like some mechanism to force the cache TTL for the index page. Something to the effect of:

diff --git a/src/pip/_vendor/cachecontrol/controller.py b/src/pip/_vendor/cachecontrol/controller.py
index dafe55c..f0066a7 100644
--- a/src/pip/_vendor/cachecontrol/controller.py
+++ b/src/pip/_vendor/cachecontrol/controller.py
@@ -84,7 +84,9 @@ class CacheController(object):
 
         retval = {}
 
+        cc_headers = 'max-age=8'
         for cc_directive in cc_headers.split(","):
             if not cc_directive.strip():
                 continue

In this case I was hardcoding the cache TTL to be 8 seconds.
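To make the intent of the hack a bit more concrete, here is a standalone sketch of the same idea outside of pip. It is not the vendored cachecontrol code: `parse_cache_control` and its `forced_max_age` parameter are hypothetical names, and the header lookup assumes a plain dict with lowercase keys.

```python
def parse_cache_control(headers, forced_max_age=None):
    """Parse Cache-Control directives into a dict.

    Hypothetical helper: when forced_max_age is given, the server's
    Cache-Control value is ignored and replaced by max-age=<forced_max_age>.
    """
    cc_headers = headers.get("cache-control", "")
    if forced_max_age is not None:
        # Same effect as the hardcoded line in the diff, but configurable.
        cc_headers = "max-age={}".format(forced_max_age)

    retval = {}
    for cc_directive in cc_headers.split(","):
        if not cc_directive.strip():
            continue
        # Split "name=value"; directives without a value map to None.
        name, _, value = cc_directive.strip().partition("=")
        name = name.strip().lower()
        value = value.strip()
        retval[name] = int(value) if value.isdigit() else (value or None)
    return retval
```

With `forced_max_age=8`, a response that carries `no-cache` would be treated as if the server had sent `max-age=8`, which is what would let concurrent builds reuse the cached index page for a short window.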

It could be plumbed through an option, e.g. by adding --force-index-cache-ttl=<seconds> to the options below:

Package Index Options:
  -i, --index-url <url>       Base URL of the Python Package Index (default https://pypi.org/simple). This should point to a repository compliant with PEP 503 (the
                              simple repository API) or a local directory laid out in the same format.
  --extra-index-url <url>     Extra URLs of package indexes to use in addition to --index-url. Should follow the same rules as --index-url.
  --no-index                  Ignore package index (only looking at --find-links URLs instead).
  -f, --find-links <url>      If a URL or path to an html file, then parse for links to archives such as sdist (.tar.gz) or wheel (.whl) files. If a local path or
                              file:// URL that's a directory,  then look for archives in the directory listing. Links to VCS project URLs are not supported.
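As a rough illustration of what the proposed option could look like from the command-line side, here is a minimal argparse sketch. It does not use pip's own option machinery; `build_parser`, `non_negative_seconds`, and the `pip-sketch` program name are made up for this example, and `--force-index-cache-ttl` is only the flag proposed in this issue, not an existing pip option.

```python
import argparse


def non_negative_seconds(text):
    """Convert the option value to an int and reject negative TTLs."""
    value = int(text)
    if value < 0:
        raise argparse.ArgumentTypeError("TTL must be >= 0 seconds")
    return value


def build_parser():
    # Hypothetical stand-in for the "Package Index Options" group.
    parser = argparse.ArgumentParser(prog="pip-sketch")
    parser.add_argument("--no-index", action="store_true")
    parser.add_argument("-f", "--find-links", action="append",
                        default=[], metavar="<url>")
    # Proposed option: None means "respect the server's headers".
    parser.add_argument("--force-index-cache-ttl", type=non_negative_seconds,
                        default=None, metavar="<seconds>")
    return parser
```

Leaving the default as `None` would keep today's behavior unless the flag is passed explicitly.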

Please kindly let me know if the concept is acceptable or if there's a better approach to this.

Thanks!
