pip doesn't properly parse git URL if branch name contains @ or # #10226

Open
@phdru

Description

Trying pip install git+https://example.com/repository@branch fails if the branch name contains @ or #, even when the character is percent-encoded.

Expected behavior

pip must parse percent-encoded special characters in the branch name: split the branch name from the URL, clone the repository, and check out the named branch with the special characters decoded. For example,

pip install git+https://example.com/repository@master%40test

must clone https://example.com/repository and check out the master@test branch. The same applies to the # character, percent-encoded as %23.
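
The expected behavior can be sketched in Python as follows. This is only an illustration of the request, not pip's actual code: the helper name split_repo_and_rev is hypothetical, and it assumes the revision is whatever follows the last @ in the URL path.

```python
from urllib.parse import unquote, urlsplit, urlunsplit


def split_repo_and_rev(url):
    """Hypothetical helper: split 'git+<url>@<rev>' into (repo_url, rev)."""
    parts = urlsplit(url)
    # Drop the 'git+' VCS prefix so the result is a plain clone URL.
    scheme = parts.scheme[4:] if parts.scheme.startswith("git+") else parts.scheme
    # Split on the last '@' of the still-encoded path, so an encoded
    # '%40' inside the branch name is not mistaken for the separator.
    path, sep, rev = parts.path.rpartition("@")
    if not sep:
        # No revision given: return the clone URL unchanged.
        return urlunsplit((scheme, parts.netloc, parts.path, parts.query, "")), None
    repo = urlunsplit((scheme, parts.netloc, path, parts.query, ""))
    # Decode percent-escapes only after the split has been made.
    return repo, unquote(rev)


print(split_repo_and_rev("git+https://example.com/repository@master%40test"))
print(split_repo_and_rev("git+https://example.com/repository@master%23test"))
```

The point of the sketch is the order of operations: the separator is located in the encoded URL, and decoding is applied to the revision afterwards.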

pip version

Any; tested with 21.1.3

Python version

Any; tested with Python 3.9

OS

Any; tested with Debian 10 buster

How to Reproduce

Here is a test program, test-pip-git, that creates a repository, runs pip download against it, and cleans up:

#! /bin/sh
set -e

PERCENT_ENCODING=0
while getopts p: opt; do
    case $opt in
        p ) PERCENT_ENCODING="${OPTARG:-1}" ;;
    esac
done
shift `expr $OPTIND - 1`

if [ -z "$1" ]; then
    echo "Usage: $0 [-p1|2] test_char" >&2
    exit 1
fi

TEST_CHAR1="$1"
if [ $PERCENT_ENCODING -ge 1 ]; then

    py_ver=`python -c "import sys; print(sys.version_info[0])"`
    if [ $py_ver -eq 2 ]; then
        percent_encode() {
            python -c "import urllib; print(urllib.quote('$1'))"
        }
    elif [ $py_ver -eq 3 ]; then
        percent_encode() {
            python -c "import urllib.parse; print(urllib.parse.quote('$1'))"
        }
    else
        echo "Unknown python version" >&2
        exit 1
    fi
    TEST_CHAR2=`percent_encode "$1"`
    if [ $PERCENT_ENCODING -eq 2 ]; then
        TEST_CHAR2=`percent_encode "$TEST_CHAR2"`
    fi
else
    TEST_CHAR2="$1"
fi

rm -rf test-pip-git-repo test-pip-git-spec-char-0.0.1.zip
git init test-pip-git-repo
cd test-pip-git-repo

echo test >test
git add test
git commit -m test

git branch -M master # to fixed name
git checkout -b master${TEST_CHAR1}test # new branch; name matches the output below

cat >setup.py <<EOF
#!/usr/bin/env python

from setuptools import setup

setup(
    name='test_pip_git_spec_char',
    version='0.0.1',
    description='Test pip+git+special characters',
    author='Oleg Broytman',
    author_email='[email protected]',
    keywords=['pip', 'git', '@', '!', '#', '/'],
    platforms='Any',
)
EOF

git add setup.py
git commit -m setup.py
git checkout master # make test branch non-current

cd ..
pip download git+file://`pwd`/test-pip-git-repo@master${TEST_CHAR2}test | grep '\(clone\|checkout\)' || : # ignore errors

rm -rf test-pip-git-repo test-pip-git-spec-char-0.0.1.zip
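
For reference, the -p option selects how many times the test character is percent-encoded before it is placed in the URL. A small Python sketch of the three modes (the function name encode_times is illustrative only, not part of the script):

```python
from urllib.parse import quote


def encode_times(char, times):
    """Percent-encode `char` repeatedly, mirroring -p1 and -p2 in the script."""
    for _ in range(times):
        char = quote(char)
    return char


for ch in "@#":
    # Index 0: no encoding, 1: -p1, 2: -p2 (double encoding).
    print(ch, [encode_times(ch, n) for n in range(3)])
```

With -p2 the percent sign itself is encoded again, so @ becomes %2540 and # becomes %2523.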

Output

./test-pip-git @

Running command git clone -q file:///home/phd/tmp/test-pip-git-repo@master /tmp/pip-req-build-v1v16zoe
fatal: '/home/phd/tmp/test-pip-git-repo@master' does not appear to be a git repository

pip clones the wrong repository, test-pip-git-repo@master; the repository should be test-pip-git-repo.

./test-pip-git -p1 @

Running command git clone -q file:///home/phd/tmp/test-pip-git-repo@master /tmp/pip-req-build-__0b6wh5
fatal: '/home/phd/tmp/test-pip-git-repo@master' does not appear to be a git repository

The same incorrect repo.
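
Both results are consistent with the URL being percent-decoded first and then split on the last @. That is an inference from the output above, not a statement about pip's source; the sketch below only reproduces the symptom with plain string operations, and naive_split is a hypothetical name.

```python
from urllib.parse import unquote


def naive_split(url):
    """Assumed failure mode: decode first, then split on the last '@'."""
    repo, _, rev = unquote(url).rpartition("@")
    return repo, rev


for url in (
    "file:///home/phd/tmp/test-pip-git-repo@master@test",
    "file:///home/phd/tmp/test-pip-git-repo@master%40test",
):
    # Both forms yield the repository 'test-pip-git-repo@master'.
    print(naive_split(url))
```

Under this assumption the encoded and unencoded URLs become indistinguishable before the split, which would explain why percent-encoding does not help.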

./test-pip-git \!

Running command git clone -q file:///home/phd/tmp/test-pip-git-repo /tmp/pip-req-build-fsnq5vt_
Running command git checkout -b 'master!test' --track 'origin/master!test'

This is just a control test with a less special character. pip clones the correct repository, test-pip-git-repo, and checks out the correct branch, master!test. It doesn't even require percent-encoding. Test passed!

./test-pip-git \#

Running command git clone -q file:///home/phd/tmp/test-pip-git-repo /tmp/pip-req-build-i0wy10_1
ERROR: File "setup.py" not found

pip clones the correct repository, test-pip-git-repo, but doesn't check out the branch master#test. It uses the master branch and ignores everything after the #.

./test-pip-git -p1 \#

Exactly the same problem.
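
This outcome is what one would see if everything after # were treated as a URL fragment, with percent-decoding applied before the URL is split. Again, this is an assumption drawn from the output, not a description of pip's internals; naive_parse is a hypothetical name.

```python
from urllib.parse import unquote, urlsplit


def naive_parse(url):
    """Assumed failure mode: decode, then let '#' start the fragment."""
    parts = urlsplit(unquote(url))
    path, _, rev = parts.path.rpartition("@")
    return path, rev, parts.fragment


for url in (
    "file:///home/phd/tmp/test-pip-git-repo@master#test",
    "file:///home/phd/tmp/test-pip-git-repo@master%23test",
):
    # In both cases the revision is 'master' and 'test' lands in the fragment.
    print(naive_parse(url))
```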

Labels: C: vcs (pip's interaction with version control systems like git, svn and bzr), type: bug (a confirmed bug or unintended behavior)
