Commit 1da0ea1

committed
Merge pull request #3231 from dstufft/hashes2
Continuation of #3137
2 parents 78c77b3 + b160661 commit 1da0ea1

30 files changed

+1288
-466
lines changed

CHANGES.txt

Lines changed: 5 additions & 0 deletions
@@ -48,6 +48,11 @@
4848
building a package via ``setup.py``. This will alleviate concerns that
4949
projects with unusually long build times have with pip appearing to stall.
5050

51+
* Incorporate the functionality of ``peep`` into pip, allowing hashes to be baked
52+
into a requirements file and ensuring that the packages being downloaded
53+
match one of those hashes. This is an additional, opt-in security measure
54+
that, when used, removes the need to trust the repository.
55+
5156

5257
**7.1.2 (2015-08-22)**
5358

docs/reference/index.rst

Lines changed: 1 addition & 2 deletions
@@ -14,5 +14,4 @@ Reference Guide
1414
pip_show
1515
pip_search
1616
pip_wheel
17-
18-
17+
pip_hash

docs/reference/pip_hash.rst

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
.. _`pip hash`:
2+
3+
pip hash
4+
------------
5+
6+
.. contents::
7+
8+
Usage
9+
*****
10+
11+
.. pip-command-usage:: hash
12+
13+
14+
Description
15+
***********
16+
17+
.. pip-command-description:: hash
18+
19+
20+
Overview
21+
++++++++
22+
``pip hash`` is a convenient way to get a hash digest for use with
23+
:ref:`hash-checking mode`, especially for packages with multiple archives. The
24+
error message from ``pip install --require-hashes ...`` will give you one
25+
hash, but, if there are multiple archives (like source and binary ones), you
26+
will need to manually download and compute a hash for the others. Otherwise, a
27+
spurious hash mismatch could occur when :ref:`pip install` is passed a
28+
different set of options, like :ref:`--no-binary <install_--no-binary>`.
29+
30+
31+
Options
32+
*******
33+
34+
.. pip-command-options:: hash
35+
36+
37+
Example
38+
********
39+
40+
Compute the hash of a downloaded archive::
41+
42+
$ pip download SomePackage
43+
Collecting SomePackage
44+
Downloading SomePackage-2.2.tar.gz
45+
Saved ./pip_downloads/SomePackage-2.2.tar.gz
46+
Successfully downloaded SomePackage
47+
$ pip hash ./pip_downloads/SomePackage-2.2.tar.gz
48+
./pip_downloads/SomePackage-2.2.tar.gz:
49+
--hash=sha256:93e62e05c7ad3da1a233def6731e8285156701e3419a5fe279017c429ec67ce0
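
As an illustration only, the digest in the ``--hash=<algorithm>:<digest>`` form shown above can be reproduced with Python's standard ``hashlib`` module. This is a minimal sketch, not pip's implementation; the function name ``format_hash_option`` and its parameters are hypothetical.

```python
import hashlib


def format_hash_option(path, algorithm="sha256", chunk_size=8192):
    """Return a ``--hash=<algorithm>:<digest>`` string for the file at *path*.

    Hypothetical helper for illustration; it is not part of pip.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as archive:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: archive.read(chunk_size), b""):
            digest.update(chunk)
    return "--hash={}:{}".format(algorithm, digest.hexdigest())
```

Calling ``format_hash_option("./pip_downloads/SomePackage-2.2.tar.gz")`` would yield a string that can be appended to the matching line of a requirements file.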

docs/reference/pip_install.rst

Lines changed: 127 additions & 22 deletions
@@ -101,14 +101,15 @@ and the newline following it is effectively ignored.
101101

102102
Comments are stripped *before* line continuations are processed.
103103

104-
Additionally, the following Package Index Options are supported:
104+
The following options are supported:
105105

106106
* :ref:`-i, --index-url <--index-url>`
107107
* :ref:`--extra-index-url <--extra-index-url>`
108108
* :ref:`--no-index <--no-index>`
109109
* :ref:`-f, --find-links <--find-links>`
110110
* :ref:`--no-binary <install_--no-binary>`
111111
* :ref:`--only-binary <install_--only-binary>`
112+
* :ref:`--require-hashes <--require-hashes>`
112113

113114
For example, to specify :ref:`--no-index <--no-index>` and 2 :ref:`--find-links <--find-links>` locations:
114115

@@ -396,16 +397,16 @@ See the :ref:`pip install Examples<pip install Examples>`.
396397
SSL Certificate Verification
397398
++++++++++++++++++++++++++++
398399

399-
Starting with v1.3, pip provides SSL certificate verification over https, for the purpose
400-
of providing secure, certified downloads from PyPI.
400+
Starting with v1.3, pip provides SSL certificate verification over https, to
401+
prevent man-in-the-middle attacks against PyPI downloads.
401402

402403

403404
.. _`Caching`:
404405

405406
Caching
406407
+++++++
407408

408-
Starting with v6.0, pip provides an on by default cache which functions
409+
Starting with v6.0, pip provides an on-by-default cache which functions
409410
similarly to that of a web browser. While the cache is on by default and is
410411
designed to do the right thing by default, you can disable the cache and always
411412
access PyPI by utilizing the ``--no-cache-dir`` option.
@@ -441,14 +442,14 @@ Windows
441442

442443
.. _`Wheel cache`:
443444

444-
Wheel cache
445-
***********
445+
Wheel Cache
446+
~~~~~~~~~~~
446447

447-
Pip will read from the subdirectory ``wheels`` within the pip cache dir and use
448-
any packages found there. This is disabled via the same ``no-cache-dir`` option
449-
that disables the HTTP cache. The internal structure of that cache is not part
450-
of the pip API. As of 7.0 pip uses a subdirectory per sdist that wheels were
451-
built from, and wheels within that subdirectory.
448+
Pip will read from the subdirectory ``wheels`` within the pip cache directory
449+
and use any packages found there. This is disabled via the same
450+
``--no-cache-dir`` option that disables the HTTP cache. The internal structure
451+
of that cache is not part of the pip API. As of 7.0, pip makes a subdirectory for
452+
each sdist that wheels are built from and places the resulting wheels inside.
452453

453454
Pip attempts to choose the best wheels from those built in preference to
454455
building a new wheel. Note that this means when a package has both optional
@@ -461,19 +462,123 @@ When no wheels are found for an sdist, pip will attempt to build a wheel
461462
automatically and insert it into the wheel cache.
462463

463464

464-
Hash Verification
465-
+++++++++++++++++
466-
467-
PyPI provides md5 hashes in the hash fragment of package download urls.
465+
.. _`hash-checking mode`:
468466

469-
pip supports checking this, as well as any of the
470-
guaranteed hashlib algorithms (sha1, sha224, sha384, sha256, sha512, md5).
471-
472-
The hash fragment is case sensitive (i.e. sha1 not SHA1).
467+
Hash-Checking Mode
468+
++++++++++++++++++
473469

474-
This check is only intended to provide basic download corruption protection.
475-
It is not intended to provide security against tampering. For that,
476-
see :ref:`SSL Certificate Verification`
470+
Since version 8.0, pip can check downloaded package archives against local
471+
hashes to protect against remote tampering. To verify a package against one or
472+
more hashes, add them to the end of the line::
473+
474+
FooProject == 1.2 --hash=sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 \
475+
--hash=sha256:486ea46224d1bb4fb680f34f7c9ad96a8f24ec88be73ea8e5a6c65260e9cb8a7
476+
477+
(The ability to use multiple hashes is important when a package has both
478+
binary and source distributions or when it offers binary distributions for a
479+
variety of platforms.)
480+
481+
The recommended hash algorithm at the moment is sha256, but stronger ones are
482+
allowed, including all those supported by ``hashlib``. However, weaker ones
483+
such as md5, sha1, and sha224 are excluded to avoid giving a false sense of
484+
security.
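
The rule described above (accept an archive if it matches any one of the listed digests, and refuse weak algorithms) can be sketched as follows. This is an illustrative sketch under stated assumptions, not pip's code: the name ``matches_any_hash`` and the ``STRONG_ALGORITHMS`` allow-list are hypothetical, and the list shown is only a subset of what ``hashlib`` offers.

```python
import hashlib

# Assumption: an illustrative allow-list. md5, sha1, and sha224 are left out
# on purpose, mirroring the paragraph above.
STRONG_ALGORITHMS = ("sha256", "sha384", "sha512")


def matches_any_hash(data, allowed):
    """Return True if *data* matches at least one digest in *allowed*.

    *allowed* maps an algorithm name to a list of hex digests, for example
    ``{"sha256": ["2cf2...", "486e..."]}``.
    """
    for algorithm, digests in allowed.items():
        if algorithm not in STRONG_ALGORITHMS:
            raise ValueError("refusing weak hash algorithm: " + algorithm)
        if hashlib.new(algorithm, data).hexdigest() in digests:
            return True
    return False
```

Listing several digests per requirement is what lets one requirements line cover both a source archive and one or more binary archives.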
485+
486+
Hash verification is an all-or-nothing proposition. Specifying a ``--hash``
487+
against any requirement not only checks that hash but also activates a global
488+
*hash-checking mode*, which imposes several other security restrictions:
489+
490+
* Hashes are required for all requirements. This is because a partially-hashed
491+
requirements file is of little use and thus likely an error: a malicious
492+
actor could slip bad code into the installation via one of the unhashed
493+
requirements. Note that hashes embedded in URL-style requirements via the
494+
``#md5=...`` syntax suffice to satisfy this rule (regardless of hash
495+
strength, for legacy reasons), though you should use a stronger
496+
hash like sha256 whenever possible.
497+
* Hashes are required for all dependencies. An error results if there is a
498+
dependency that is not spelled out and hashed in the requirements file.
499+
* Requirements that take the form of project names (rather than URLs or local
500+
filesystem paths) must be pinned to a specific version using ``==``. This
501+
prevents a surprising hash mismatch upon the release of a new version
502+
that matches the requirement specifier.
503+
* ``--egg`` is disallowed, because it delegates installation of dependencies
504+
to setuptools, giving up pip's ability to enforce any of the above.
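
A rough way to picture the first and third restrictions above is a pre-flight check over requirement lines. The sketch below is an assumption-laden simplification, not how pip parses requirements: it handles only project-name requirements (no URLs or local paths), expects line continuations to be joined already, and the names ``find_violations``, ``HASH_OPTION``, and ``PINNED`` are hypothetical.

```python
import re

# Matches a single ``--hash=<algorithm>:<hex digest>`` option.
HASH_OPTION = re.compile(r"--hash=(\w+):([0-9a-fA-F]+)")
# Matches ``name == version`` with an exact version (no wildcard, no ranges).
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*\s*==\s*[A-Za-z0-9.!+_-]+$")


def find_violations(lines):
    """Return ``(specifier, problem)`` pairs for lines that break the rules."""
    violations = []
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and option-only lines such as ``--no-index``.
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        specifier = line.split("--hash", 1)[0].strip()
        if not HASH_OPTION.search(line):
            violations.append((specifier, "missing --hash"))
        if not PINNED.match(specifier):
            violations.append((specifier, "not pinned with =="))
    return violations
```

Because the check is applied to every line, a single unhashed or unpinned entry is enough to report a problem, which reflects the all-or-nothing nature of the mode.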
505+
506+
.. _`--require-hashes`:
507+
508+
Hash-checking mode can be forced on with the ``--require-hashes`` command-line
509+
option::
510+
511+
$ pip install --require-hashes -r requirements.txt
512+
...
513+
Hashes are required in --require-hashes mode (implicitly on when a hash is
514+
specified for any package). These requirements were missing hashes,
515+
leaving them open to tampering. These are the hashes the downloaded
516+
archives actually had. You can add lines like these to your requirements
517+
files to prevent tampering.
518+
pyelasticsearch==1.0 --hash=sha256:44ddfb1225054d7d6b1d02e9338e7d4809be94edbe9929a2ec0807d38df993fa
519+
more-itertools==2.2 --hash=sha256:93e62e05c7ad3da1a233def6731e8285156701e3419a5fe279017c429ec67ce0
520+
521+
This can be useful in deploy scripts, to ensure that the author of the
522+
requirements file provided hashes. It is also a convenient way to bootstrap
523+
your list of hashes, since it shows the hashes of the packages it fetched. It
524+
fetches only the preferred archive for each package, so you may still need to
525+
add hashes for alternative archives using :ref:`pip hash`: for instance, if
526+
there is both a binary and a source distribution.
527+
528+
The :ref:`wheel cache <Wheel cache>` is disabled in hash-checking mode to
529+
prevent spurious hash mismatch errors. These would otherwise occur while
530+
installing sdists that had already been automatically built into cached wheels:
531+
those wheels would be selected for installation, but their hashes would not
532+
match the sdist ones from the requirements file. A further complication is that
533+
locally built wheels are nondeterministic: contemporary modification times make
534+
their way into the archive, making hashes unpredictable across machines and
535+
cache flushes. Compilation of C code adds further nondeterminism, as many
536+
compilers include random-seeded values in their output. However, wheels fetched
537+
from index servers are the same every time. They land in pip's HTTP cache, not
538+
its wheel cache, and are used normally in hash-checking mode. The only downside
539+
of having the wheel cache disabled is thus extra build time for sdists, and
540+
this can be solved by making sure pre-built wheels are available from the index
541+
server.
542+
543+
Hash-checking mode also works with :ref:`pip download` and :ref:`pip wheel`. A
544+
:ref:`comparison of hash-checking mode with other repeatability strategies
545+
<Repeatability>` is available in the User Guide.
546+
547+
.. warning::
548+
Beware of the ``setup_requires`` keyword arg in :file:`setup.py`. The
549+
(rare) packages that use it will cause those dependencies to be downloaded
550+
by setuptools directly, skipping pip's hash-checking. If you need to use
551+
such a package, see :ref:`Controlling
552+
setup_requires<controlling-setup-requires>`.
553+
554+
.. warning::
555+
Be careful not to nullify all your security work when you install your
556+
actual project by using setuptools directly: for example, by calling
557+
``python setup.py install``, ``python setup.py develop``, or
558+
``easy_install``. Setuptools will happily go out and download, unchecked,
559+
anything you missed in your requirements file—and it’s easy to miss things
560+
as your project evolves. To be safe, install your project using pip and
561+
:ref:`--no-deps <install_--no-deps>`.
562+
563+
Instead of ``python setup.py develop``, use... ::
564+
565+
pip install --no-deps -e .
566+
567+
Instead of ``python setup.py install``, use... ::
568+
569+
pip install --no-deps .
570+
571+
572+
Hashes from PyPI
573+
~~~~~~~~~~~~~~~~
574+
575+
PyPI provides an MD5 hash in the fragment portion of each package download URL,
576+
like ``#md5=123...``, which pip checks as a protection against download
577+
corruption. Other hash algorithms that have guaranteed support from ``hashlib``
578+
are also supported here: sha1, sha224, sha384, sha256, and sha512. Since this
579+
hash originates remotely, it is not a useful guard against tampering and thus
580+
does not satisfy the ``--require-hashes`` demand that every package have a
581+
local hash.
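
To make the fragment form concrete, here is a minimal sketch of pulling an ``#md5=...`` style hash out of a download URL with the standard library. It is not pip's implementation; the function name ``fragment_hash``, the ``GUARANTEED`` tuple, and the example URL are assumptions made for illustration.

```python
from urllib.parse import urlparse

# The hashlib algorithms named in the paragraph above.
GUARANTEED = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")


def fragment_hash(url):
    """Return ``(algorithm, digest)`` from a URL fragment, or None if absent."""
    fragment = urlparse(url).fragment
    name, separator, digest = fragment.partition("=")
    if separator and name in GUARANTEED and digest:
        return name, digest
    return None
```

For a hypothetical URL such as ``https://example.invalid/SomePackage-2.2.tar.gz#md5=0123abcd``, the helper returns the algorithm name and digest. A hash obtained this way comes from the same remote source as the archive, which is why it only guards against corruption.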
477582

478583

479584
.. _`editable-installs`:

docs/user_guide.rst

Lines changed: 59 additions & 21 deletions
@@ -23,6 +23,8 @@ Specifiers`
2323

2424
For more information and examples, see the :ref:`pip install` reference.
2525

26+
.. _PyPI: http://pypi.python.org/pypi
27+
2628

2729
.. _`Requirements Files`:
2830

@@ -71,7 +73,6 @@ In practice, there are 4 common uses of Requirements files:
7173
pkg2
7274
pkg3>=1.0,<=2.0
7375

74-
7576
3. Requirements files are used to force pip to install an alternate version of a
7677
sub-dependency. For example, suppose `ProjectA` in your requirements file
7778
requires `ProjectB`, but the latest version (v1.3) has a bug, you can force
@@ -591,44 +592,81 @@ From within a real python, where ``SomePackage`` *is* installed globally, and is
591592
Ensuring Repeatability
592593
**********************
593594

594-
Four things are required to fully guarantee a repeatable installation using requirements files.
595+
pip can achieve various levels of repeatability:
596+
597+
Pinned Version Numbers
598+
----------------------
599+
600+
Pinning the versions of your dependencies in the requirements file
601+
protects you from bugs or incompatibilities in newly released versions::
602+
603+
SomePackage == 1.2.3
604+
DependencyOfSomePackage == 4.5.6
595605

596-
1. The requirements file was generated by ``pip freeze`` or you're sure it only
597-
contains requirements that specify a specific version.
606+
Using :ref:`pip freeze` to generate the requirements file will ensure that not
607+
only the top-level dependencies are included but their sub-dependencies as
608+
well, and so on. Perform the installation using :ref:`--no-deps
609+
<install_--no-deps>` for an extra dose of insurance against installing
610+
anything not explicitly listed.
598611

599-
2. The installation is performed using :ref:`--no-deps <install_--no-deps>`.
600-
This guarantees that only what is explicitly listed in the requirements file is
601-
installed.
612+
This strategy is easy to implement and works across OSes and architectures.
613+
However, it trusts PyPI and the certificate authority chain. It
614+
also relies on indices and find-links locations not allowing
615+
packages to change without a version increase. (PyPI does protect
616+
against this.)
602617

603-
3. None of the packages to be installed utilize the setup_requires keyword. See
604-
:ref:`Controlling setup_requires<controlling-setup-requires>`.
618+
Hash-checking Mode
619+
------------------
620+
621+
Beyond pinning version numbers, you can add hashes against which to verify
622+
downloaded packages::
605623

606-
4. The installation is performed against an index or find-links location that is
607-
guaranteed to *not* allow archives to be changed and updated without a
608-
version increase. While this is safe on PyPI, it may not be safe for other
609-
indices. If you are working with an unsafe index, consider the `peep project
610-
<https://pypi.python.org/pypi/peep>`_ which offers this feature on top of pip
611-
using requirements file comments.
624+
FooProject == 1.2 --hash=sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
612625

626+
This protects against a compromise of PyPI or the HTTPS
627+
certificate chain. It also guards against a package changing
628+
without its version number changing (on indexes that allow this).
629+
This approach is a good fit for automated server deployments.
613630

614-
.. _PyPI: http://pypi.python.org/pypi/
631+
Hash-checking mode is a labor-saving alternative to running a private index
632+
server containing approved packages: it removes the need to upload packages,
633+
maintain ACLs, and keep an audit trail (which a VCS gives you on the
634+
requirements file for free). It can also substitute for a vendor library,
635+
providing easier upgrades and less VCS noise. It does not, of course,
636+
provide the availability benefits of a private index or a vendor library.
615637

638+
For more, see :ref:`pip install\'s discussion of hash-checking mode <hash-checking mode>`.
616639

617640
.. _`Installation Bundle`:
618641

619-
Create an Installation Bundle with Compiled Dependencies
620-
********************************************************
642+
Installation Bundles
643+
--------------------
621644

622-
You can create a simple bundle that contains all of the dependencies you wish
623-
to install using::
645+
Using :ref:`pip wheel`, you can bundle up all of a project's dependencies, with
646+
any compilation done, into a single archive. This allows installation when
647+
index servers are unavailable and avoids time-consuming recompilation. Create
648+
an archive like this::
624649

625650
$ tempdir=$(mktemp -d /tmp/wheelhouse-XXXXX)
626651
$ pip wheel -r requirements.txt --wheel-dir=$tempdir
627652
$ cwd=`pwd`
628653
$ (cd "$tempdir"; tar -cjvf "$cwd/bundled.tar.bz2" *)
629654

630-
Once you have a bundle, you can then install it using::
655+
You can then install from the archive like this::
631656

632657
$ tempdir=$(mktemp -d /tmp/wheelhouse-XXXXX)
633658
$ (cd $tempdir; tar -xvf /path/to/bundled.tar.bz2)
634659
$ pip install --force-reinstall --ignore-installed --upgrade --no-index --no-deps $tempdir/*
660+
661+
Note that compiled packages are typically OS- and architecture-specific, so
662+
these archives are not necessarily portable across machines.
663+
664+
Hash-checking mode can be used along with this method to ensure that future
665+
archives are built with identical packages.
666+
667+
.. warning::
668+
Finally, beware of the ``setup_requires`` keyword arg in :file:`setup.py`.
669+
The (rare) packages that use it will cause those dependencies to be
670+
downloaded by setuptools directly, skipping pip's protections. If you need
671+
to use such a package, see :ref:`Controlling
672+
setup_requires<controlling-setup-requires>`.

pip/basecommand.py

Lines changed: 3 additions & 0 deletions
@@ -280,6 +280,9 @@ def populate_requirement_set(requirement_set, args, options, finder,
280280
wheel_cache=wheel_cache):
281281
found_req_in_file = True
282282
requirement_set.add_requirement(req)
283+
# If --require-hashes was a line in a requirements file, tell
284+
# RequirementSet about it:
285+
requirement_set.require_hashes = options.require_hashes
283286

284287
if not (args or options.editables or found_req_in_file):
285288
opts = {'name': name}
