Repeatable installs via hashing #3137
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -14,5 +14,4 @@ Reference Guide | |
| pip_show | ||
| pip_search | ||
| pip_wheel | ||
|
|
||
|
|
||
| pip_hash | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| .. _`pip hash`: | ||
|
|
||
| pip hash | ||
| ------------ | ||
|
|
||
| .. contents:: | ||
|
|
||
| Usage | ||
| ***** | ||
|
|
||
| .. pip-command-usage:: hash | ||
|
|
||
|
|
||
| Description | ||
| *********** | ||
|
|
||
| .. pip-command-description:: hash | ||
|
|
||
|
|
||
| Overview | ||
| ++++++++ | ||
| ``pip hash`` is a convenient way to get a hash digest for use with | ||
| :ref:`hash-checking mode`, especially for packages with multiple archives. The | ||
| error message from ``pip install --require-hashes ...`` will give you one | ||
| hash, but, if there are multiple archives (like source and binary ones), you | ||
| will need to manually download and compute a hash for the others. Otherwise, a | ||
| spurious hash mismatch could occur when :ref:`pip install` is passed a | ||
| different set of options, like :ref:`--no-binary <install_--no-binary>`. | ||
|
|
||
|
|
||
| Options | ||
| ******* | ||
|
|
||
| .. pip-command-options:: hash | ||
|
|
||
|
|
||
| Example | ||
| ******** | ||
|
|
||
| Compute the hash of a downloaded archive:: | ||
|
|
||
| $ pip download SomePackage | ||
| Collecting SomePackage | ||
| Downloading SomePackage-2.2.tar.gz | ||
| Saved ./pip_downloads/SomePackage-2.2.tar.gz | ||
| Successfully downloaded SomePackage | ||
| $ pip hash ./pip_downloads/SomePackage-2.2.tar.gz | ||
| ./pip_downloads/SomePackage-2.2.tar.gz: | ||
| --hash=sha256:93e62e05c7ad3da1a233def6731e8285156701e3419a5fe279017c429ec67ce0 |
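The digest shown above can be reproduced with Python's ``hashlib``. The following is a minimal sketch, not pip's own implementation; the helper name ``hash_option`` and its parameters are hypothetical.

```python
import hashlib


def hash_option(path, algorithm="sha256", chunk_size=8192):
    """Return a ``--hash=<algorithm>:<hexdigest>`` string for the file at *path*."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as archive:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: archive.read(chunk_size), b""):
            digest.update(chunk)
    return "--hash={}:{}".format(algorithm, digest.hexdigest())
```

Calling ``hash_option("./pip_downloads/SomePackage-2.2.tar.gz")`` would return a string in the same form as the last line of the example output.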
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -101,14 +101,15 @@ and the newline following it is effectively ignored. | |
|
|
||
| Comments are stripped *before* line continuations are processed. | ||
|
|
||
| Additionally, the following Package Index Options are supported: | ||
| The following options are supported: | ||
|
|
||
| * :ref:`-i, --index-url <--index-url>` | ||
| * :ref:`--extra-index-url <--extra-index-url>` | ||
| * :ref:`--no-index <--no-index>` | ||
| * :ref:`-f, --find-links <--find-links>` | ||
| * :ref:`--no-binary <install_--no-binary>` | ||
| * :ref:`--only-binary <install_--only-binary>` | ||
| * :ref:`--require-hashes <--require-hashes>` | ||
|
|
||
| For example, to specify :ref:`--no-index <--no-index>` and 2 :ref:`--find-links <--find-links>` locations: | ||
|
|
||
|
|
@@ -380,16 +381,16 @@ See the :ref:`pip install Examples<pip install Examples>`. | |
| SSL Certificate Verification | ||
| ++++++++++++++++++++++++++++ | ||
|
|
||
| Starting with v1.3, pip provides SSL certificate verification over https, for the purpose | ||
| of providing secure, certified downloads from PyPI. | ||
| Starting with v1.3, pip provides SSL certificate verification over https, to | ||
| prevent man-in-the-middle attacks against PyPI downloads. | ||
|
|
||
|
|
||
| .. _`Caching`: | ||
|
|
||
| Caching | ||
| +++++++ | ||
|
|
||
| Starting with v6.0, pip provides an on by default cache which functions | ||
| Starting with v6.0, pip provides an on-by-default cache which functions | ||
| similarly to that of a web browser. While the cache is on by default and is | ||
| designed to do the right thing by default, you can disable the cache and always | ||
| access PyPI by utilizing the ``--no-cache-dir`` option. | ||
|
|
@@ -425,14 +426,14 @@ Windows | |
|
|
||
| .. _`Wheel cache`: | ||
|
|
||
| Wheel cache | ||
| *********** | ||
| Wheel Cache | ||
| ~~~~~~~~~~~ | ||
|
|
||
| Pip will read from the subdirectory ``wheels`` within the pip cache dir and use | ||
| any packages found there. This is disabled via the same ``no-cache-dir`` option | ||
| that disables the HTTP cache. The internal structure of that cache is not part | ||
| of the pip API. As of 7.0 pip uses a subdirectory per sdist that wheels were | ||
| built from, and wheels within that subdirectory. | ||
| Pip will read from the subdirectory ``wheels`` within the pip cache directory | ||
| and use any packages found there. This is disabled via the same | ||
| ``--no-cache-dir`` option that disables the HTTP cache. The internal structure | ||
| of that cache is not part of the pip API. As of 7.0, pip makes a subdirectory for | ||
| each sdist that wheels are built from and places the resulting wheels inside. | ||
|
|
||
| Pip attempts to choose the best wheels from those built in preference to | ||
| building a new wheel. Note that this means when a package has both optional | ||
|
|
@@ -445,19 +446,123 @@ When no wheels are found for an sdist, pip will attempt to build a wheel | |
| automatically and insert it into the wheel cache. | ||
|
|
||
|
|
||
| Hash Verification | ||
| +++++++++++++++++ | ||
|
|
||
| PyPI provides md5 hashes in the hash fragment of package download urls. | ||
| .. _`hash-checking mode`: | ||
|
|
||
| pip supports checking this, as well as any of the | ||
| guaranteed hashlib algorithms (sha1, sha224, sha384, sha256, sha512, md5). | ||
|
|
||
| The hash fragment is case sensitive (i.e. sha1 not SHA1). | ||
| Hash-Checking Mode | ||
| ++++++++++++++++++ | ||
|
|
||
| This check is only intended to provide basic download corruption protection. | ||
| It is not intended to provide security against tampering. For that, | ||
| see :ref:`SSL Certificate Verification` | ||
| Since version 8.0, pip can check downloaded package archives against local | ||
| hashes to protect against remote tampering. To verify a package against one or | ||
| more hashes, add them to the end of the line:: | ||
|
|
||
| FooProject == 1.2 --hash=sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 \ | ||
| --hash=sha256:486ea46224d1bb4fb680f34f7c9ad96a8f24ec88be73ea8e5a6c65260e9cb8a7 | ||
|
|
||
| (The ability to use multiple hashes is important when a package has both | ||
| binary and source distributions or when it offers binary distributions for a | ||
| variety of platforms.) | ||
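The idea of accepting any one of several hashes can be sketched as follows. This illustrates the concept only and is not pip's code; ``matches_any`` and the shape of ``allowed`` are assumptions made for the example.

```python
import hashlib


def matches_any(data, allowed):
    """Return True if *data* hashes to one of the allowed digests.

    *allowed* maps an algorithm name to a collection of hex digests,
    for example one digest per sdist or wheel of the same release.
    """
    for algorithm, digests in allowed.items():
        if hashlib.new(algorithm, data).hexdigest() in digests:
            return True
    return False
```

An archive is accepted when its digest equals any listed value, so a source distribution and a platform-specific wheel can each satisfy the same requirement line.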
|
|
||
| The recommended hash algorithm at the moment is sha256, but stronger ones are | ||
| allowed, including all those supported by ``hashlib``. However, weaker ones | ||
| such as md5, sha1, and sha224 are excluded to avoid giving a false sense of | ||
| security. | ||
|
|
||
| Hash verification is an all-or-nothing proposition. Specifying a ``--hash`` | ||
| against any requirement not only checks that hash but also activates a global | ||
| *hash-checking mode*, which imposes several other security restrictions: | ||
|
|
||
| * Hashes are required for all requirements. This is because a partially-hashed | ||
| requirements file is of little use and thus likely an error: a malicious | ||
| actor could slip bad code into the installation via one of the unhashed | ||
| requirements. Note that hashes embedded in URL-style requirements via the | ||
| ``#md5=...`` syntax suffice to satisfy this rule (regardless of hash | ||
| strength, for legacy reasons), though you should use a stronger | ||
| hash like sha256 whenever possible. | ||
| * Hashes are required for all dependencies. An error results if there is a | ||
| dependency that is not spelled out and hashed in the requirements file. | ||
| * Requirements that take the form of project names (rather than URLs or local | ||
| filesystem paths) must be pinned to a specific version using ``==``. This | ||
| prevents a surprising hash mismatch upon the release of a new version | ||
| that matches the requirement specifier. | ||
| * ``--egg`` is disallowed, because it delegates installation of dependencies | ||
| to setuptools, giving up pip's ability to enforce any of the above. | ||
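Two of the rules above (a hash for every requirement, and ``==`` pinning for name-style requirements) can be approximated by a small pre-flight check. This is a rough sketch under simplifying assumptions: ``check_requirement`` is a hypothetical helper, it expects one logical line with comments and continuations already resolved, and it does not replicate pip's real parsing.

```python
import re

# URL fragments such as ``#md5=...`` also count as a hash, per the rule above.
FRAGMENT_HASH = re.compile(r"#(md5|sha1|sha224|sha256|sha384|sha512)=")


def check_requirement(line):
    """Return a list of problems for one logical requirement line."""
    problems = []
    # Everything before the first --hash option is the requirement itself.
    requirement = line.split("--hash", 1)[0].strip()
    if "--hash=" not in line and not FRAGMENT_HASH.search(requirement):
        problems.append("missing --hash")
    is_url_or_path = "://" in requirement or requirement.startswith((".", "/"))
    if not is_url_or_path and not re.search(r"==\s*[^\s=]", requirement):
        problems.append("not pinned with ==")
    return problems
```

A check like this cannot detect unlisted dependencies; only pip itself, running in hash-checking mode, discovers those during resolution.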
|
|
||
| .. _`--require-hashes`: | ||
|
|
||
| Hash-checking mode can be forced on with the ``--require-hashes`` command-line | ||
|
Member
Have you considered
Member
Hmm, I think that doesn't read quite right. We always verify hashes if they are given (either via PyPI, or in the requirements file in this PR), so you're not telling pip to verify hashes, you're mandating that there must be a hash to verify.
Contributor
Author
Well…hashes are always verified if they occur in the requirements file. All
Member
Ah, fair enough, so maybe --enforce-hashes? I just dread the term "requirement" (setup_require, install_requires, requirements.txt, RequirementSet). All kind of used inconsistently, don't you think?
Contributor
Author
I don't feel like it's really an overload; we're just using the verb "require", not redefining "requirement", which is really the technical term at issue, I believe. (Also, I can't find a better word. "Demand" is the only decent synonym I can find, and it feels impolite to me. "Enforce", to me, would suggest that pip will somehow make hashes exist for me if they aren't there.) |
||
| option:: | ||
|
|
||
| $ pip install --require-hashes -r requirements.txt | ||
| ... | ||
| Hashes are required in --require-hashes mode (implicitly on when a hash is | ||
| specified for any package). These requirements were missing hashes, | ||
| leaving them open to tampering. These are the hashes the downloaded | ||
| archives actually had. You can add lines like these to your requirements | ||
| files to prevent tampering. | ||
| pyelasticsearch==1.0 --hash=sha256:44ddfb1225054d7d6b1d02e9338e7d4809be94edbe9929a2ec0807d38df993fa | ||
| more-itertools==2.2 --hash=sha256:93e62e05c7ad3da1a233def6731e8285156701e3419a5fe279017c429ec67ce0 | ||
|
|
||
| This can be useful in deploy scripts, to ensure that the author of the | ||
| requirements file provided hashes. It is also a convenient way to bootstrap | ||
| your list of hashes, since it shows the hashes of the packages it fetched. It | ||
| fetches only the preferred archive for each package, so you may still need to | ||
| add hashes for alternative archives using :ref:`pip hash`: for instance if | ||
| there is both a binary and a source distribution. | ||
|
|
||
| The :ref:`wheel cache <Wheel cache>` is disabled in hash-checking mode to | ||
| prevent spurious hash mismatch errors. These would otherwise occur while | ||
| installing sdists that had already been automatically built into cached wheels: | ||
| those wheels would be selected for installation, but their hashes would not | ||
| match the sdist ones from the requirements file. A further complication is that | ||
| locally built wheels are nondeterministic: contemporary modification times make | ||
| their way into the archive, making hashes unpredictable across machines and | ||
| cache flushes. Compilation of C code adds further nondeterminism, as many | ||
| compilers include random-seeded values in their output. However, wheels fetched | ||
| from index servers are the same every time. They land in pip's HTTP cache, not | ||
| its wheel cache, and are used normally in hash-checking mode. The only downside | ||
| of having the wheel cache disabled is thus extra build time for sdists, and | ||
| this can be solved by making sure pre-built wheels are available from the index | ||
| server. | ||
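The effect of modification times on archive hashes can be demonstrated with the standard library alone. The sketch below builds two zip archives (the container format wheels use) with identical contents but different timestamps; ``archive_digest`` and the file name inside the archive are made up for the illustration.

```python
import hashlib
import io
import zipfile


def archive_digest(mtime):
    """Zip one fixed file with the given modification time; return its sha256."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        info = zipfile.ZipInfo("pkg/__init__.py", date_time=mtime)
        archive.writestr(info, b"VERSION = '1.0'\n")
    return hashlib.sha256(buffer.getvalue()).hexdigest()


first = archive_digest((2016, 1, 1, 12, 0, 0))
second = archive_digest((2016, 1, 1, 12, 0, 2))
print(first == second)  # False: same contents, different timestamps
```

Because the timestamp is stored in the archive, two builds of the same sdist made at different moments cannot share a hash, which is why a hash recorded for one locally built wheel is of no use on another machine.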
|
|
||
| Hash-checking mode also works with :ref:`pip download` and :ref:`pip wheel`. A | ||
| :ref:`comparison of hash-checking mode with other repeatability strategies | ||
| <Repeatability>` is available in the User Guide. | ||
|
|
||
| .. warning:: | ||
| Beware of the ``setup_requires`` keyword arg in :file:`setup.py`. The | ||
| (rare) packages that use it will cause those dependencies to be downloaded | ||
| by setuptools directly, skipping pip's hash-checking. If you need to use | ||
| such a package, see :ref:`Controlling | ||
| setup_requires<controlling-setup-requires>`. | ||
|
|
||
| .. warning:: | ||
| Be careful not to nullify all your security work when you install your | ||
| actual project by using setuptools directly: for example, by calling | ||
| ``python setup.py install``, ``python setup.py develop``, or | ||
| ``easy_install``. Setuptools will happily go out and download, unchecked, | ||
| anything you missed in your requirements file—and it’s easy to miss things | ||
| as your project evolves. To be safe, install your project using pip and | ||
| :ref:`--no-deps <install_--no-deps>`. | ||
|
|
||
| Instead of ``python setup.py develop``, use... :: | ||
|
|
||
| pip install --no-deps -e . | ||
|
|
||
| Instead of ``python setup.py install``, use... :: | ||
|
|
||
| pip install --no-deps . | ||
|
|
||
|
|
||
| Hashes from PyPI | ||
| ~~~~~~~~~~~~~~~~ | ||
|
|
||
| PyPI provides an MD5 hash in the fragment portion of each package download URL, | ||
| like ``#md5=123...``, which pip checks as a protection against download | ||
| corruption. Other hash algorithms that have guaranteed support from ``hashlib`` | ||
| are also supported here: sha1, sha224, sha384, sha256, and sha512. Since this | ||
| hash originates remotely, it is not a useful guard against tampering and thus | ||
| does not satisfy the ``--require-hashes`` demand that every package have a | ||
| local hash. | ||
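Extracting such a fragment hash from a download URL might look like the sketch below. It assumes the fragment holds a single ``algorithm=digest`` pair; ``fragment_hash`` and ``SUPPORTED`` are hypothetical names and this is not pip's actual parser.

```python
from urllib.parse import urlparse

# Algorithm names listed in the paragraph above, matched case-sensitively.
SUPPORTED = {"md5", "sha1", "sha224", "sha256", "sha384", "sha512"}


def fragment_hash(url):
    """Return (algorithm, hexdigest) from a ``#algo=digest`` fragment, or None."""
    fragment = urlparse(url).fragment
    algorithm, separator, digest = fragment.partition("=")
    if separator and algorithm in SUPPORTED and digest:
        return algorithm, digest
    return None
```

A caller would then hash the downloaded bytes with the named algorithm and compare the result to the digest, which detects corruption in transit but not tampering at the source.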
|
|
||
|
|
||
| .. _`editable-installs`: | ||
|
|
||
Did we actually support #md5= in requirements files? I thought we didn't.
Yep! If somebody bothered to provide a hash in the reqs file, it would get checked. I tried not to change existing behavior.