Skip to content

Conversation

@mmarchetti
Copy link
Contributor

Description

This PR adds a check for virtualenvs when bundling, to prevent accidentally uploading a virtualenv.

It also adds renv to the default exclusion list for R/reticulate users.

Testing Notes / Validation Steps

Create a virtualenv in the project directory; make sure its name is not in the default exclusion list. It doesn't need to be active. Deploy and then download the bundle from Connect to verify its contents.

Copy link

@colearendt colearendt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Very elegant!

@rstub
Copy link

rstub commented Aug 6, 2021

Does this also take care of the case where the virtual environment is not in a subdirectory but in the app directory itself?, i.e. a situation like

flask-api/bin/activate
flask-api/bin/python
[...]
flask-api/lib/...
[...]
flask-api/api.py
flask-ap/requirements.txt

@mmarchetti
Copy link
Contributor Author

No, it only handles the subdirectory case, where it can exclude the entire directory. This implementation doesn't know what's in a virtualenv, other than bin/python. To handle the case you've described, it would need to partition the files and directories in the deployment directory into "app files" vs. "python environment files".

@colearendt
Copy link

colearendt commented Aug 6, 2021

One thought I had - do virtualenvs ever have bin/python3 instead of bin/python (and no bin/python)?

@mmarchetti
Copy link
Contributor Author

Virtualenv creates python, python3, and (e.g.) python3.8 symlinks.

@rstub
Copy link

rstub commented Aug 6, 2021

Ok. Maybe you could add a warning if you find bin/python in the app's directory?

@kgartland-rstudio
Copy link
Contributor

Test Case 1

  • virtualenv in the top-level directory that is being deployed.

Local directory structure:

.
├── app.py
├── bin
├── data.csv
├── include
├── lib
├── pyvenv.cfg
├── requirements.txt
└── rsconnect-python

Results:
once deployed only the following files were uploaded (we excluded the virtualenv directories /bin, /lib, /include):
Deployed bundle structure:

.
├── app.py
├── data.csv
├── manifest.json
├── pyvenv.cfg
└── requirements.txt

previously we would have uploaded all files and directories.

Test Case 2

  • virtualenv in directory bologna:

Local directory structure:

.
├── app.py
├── bologna
├── data.csv
├── requirements.txt
└── rsconnect-python

Results:
again, the virtualenv directory was omitted as expected (previously it would have been included):
Deployed bundle structure:

.
├── app.py
├── data.csv
├── manifest.json
└── requirements.txt

Test Case 3

  • data file in a sub-directory (broccoli) and deployed:

Local directory structure:

.
├── app.py
├── bologna
├── broccoli
├── requirements.txt
└── rsconnect-python

Results:
here, the directory that contained my data file (broccoli) was included while the virtualenv directory (bologna) was excluded.
Deployed bundle structure:

.
├── app.py
├── broccoli
├── manifest.json
└── requirements.txt

Test Case 4

  • data file in a sub-directory named venv and deployed:

Local directory structure:

.
├── app.py
├── bologna
├── requirements.txt
├── rsconnect-python
└── venv

Results:
we do omit the data file when it is in a folder called venv
Deployed bundle structure:

.
├── app.py
├── manifest.json
└── requirements.txt

Test Case 5

  • data file in sub-directory named venv but explicitly include ./venv/* in deploy command, i.e.
    rsconnect deploy bokeh -n local ./ ./venv/*:
    Local directory structure:
.
├── app.py
├── bologna
├── requirements.txt
├── rsconnect-python
└── venv

Results:
we include the venv folder if it is explicitly added to the deploy command:

.
├── app.py
├── manifest.json
├── requirements.txt
└── venv

@kgartland-rstudio
Copy link
Contributor

The warning message only seems to display when bin and lib folders are in the top-level directory but does not display when they are in a subdirectory. Do we want to warn in either case?
top-level:

.
├── app.py
├── bin
├── bologna
├── include
├── lib
├── pyvenv.cfg
├── requirements.txt
├── rsconnect-python
└── data

Warning is shown:

Warning: The deployment directory appears to be a python virtual environment.
             Excluding the 'bin' and 'lib' directories.

sub-directory (where the env folders are under bologna):

.
├── app.py
├── bologna
├── pyvenv.cfg
├── requirements.txt
├── rsconnect-python
└── data

No warning is produced.

Additionally in the warning we only mention bin and lib directories, but we also exclude the include directory when using virtualenv.

@mmarchetti
Copy link
Contributor Author

We don't warn when detecting an environment in a subdirectory because we can safely exclude the whole directory. When it's the root directory (like you're deploying from inside the virtual environment directory itself) we don't know what to exclude, so we do what we can and warn that it might not be complete.

@mmarchetti mmarchetti merged commit 5f28219 into master Aug 11, 2021
@mmarchetti mmarchetti deleted the mm-exclude-envs branch August 11, 2021 19:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants