Skip to content

Conversation

@sotte
Copy link
Contributor

@sotte sotte commented Oct 7, 2019

This PR is the result of #1409. @fmassa thanks for the input!

The PR adds examples of each transform class to the sphinx docs. The result looks something like this:
image

Make _sample_image() public

@fmassa said:

Also, the Grace Hopper image is available under test/assets I believe, which is not installed with torchvision, so this would mean that we would need to move the image somewhere else. My thinking is that those two functions are very specific to the examples, and thus having them live in torchvision.utils as being public functions might not be best.

I would argue that it's quite convenient to be able to get sample image and therefore sample_image() should be public. skimage for example gives you quite a few sample images which I actually tend to use when using pytorch.
(That being said I'm also fine with a private _sample_image().)

Implementation of _sample_image() public

The current implementation is just a dummy implementation that load the image from /tmp :)
The Hopper image is part of the test/ folder and is not part of the resulting egg/wheel/whatever.
Normally I would just use https://docs.python.org/3.7/library/importlib.html#module-importlib.resources to load the file, but pytorch is not 3.7 only :) So we have to decide on where to put the file and how to load it. I'm no expert when it comes to packaging and I'm more than open for suggestion.

TODOs

  • Add examples to transform classes (as suggested)
  • Add helpers to load and display images
  • Split transform.rst into transform.rst and transform_functional.rst (as suggested)
  • Decide if _sample_image() really should de private
  • Add proper implementation for _sample_image() depending on how the actual sample image is going to be shipped.
  • Fix inconsistencies of of documented transforms.
  • Document all __call__ of TensorTransforms.
  • Clean git history.

Feedback welcome!

Closes #1409

sotte added 3 commits October 14, 2019 09:01
- All "Transforms on torch.*Tensor" document their __call__ in
  a consistent manner and __call__ is shown in sphinx.
- Fixed typos, punctuation, and naming.
- Fixed wrong format of doc strings.
@sotte sotte force-pushed the add_examples_to_transform_docs branch from 824a8fd to fa9e946 Compare October 14, 2019 08:21
@codecov-io
Copy link

codecov-io commented Oct 14, 2019

Codecov Report

Merging #1426 into master will decrease coverage by 0.1%.
The diff coverage is 25%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1426      +/-   ##
==========================================
- Coverage   64.08%   63.97%   -0.11%     
==========================================
  Files          80       80              
  Lines        6328     6343      +15     
  Branches      973      975       +2     
==========================================
+ Hits         4055     4058       +3     
- Misses       1986     1998      +12     
  Partials      287      287
Impacted Files Coverage Δ
torchvision/transforms/transforms.py 81.04% <ø> (ø) ⬆️
torchvision/utils.py 55.88% <25%> (-10.16%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ed5b2dc...7d6d38c. Read the comment docs.

Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR @sotte!

I have a few comments, let me know what you think.

In particular, I'm thinking if we could just define the methods in the documentation somewhere, so that they do not live in the torchvision source.

@fmassa
Copy link
Member

fmassa commented Oct 18, 2019

Another thing, we should not forget to fix the random seeds (for both random and torch) to avoid the images being regenerated at every doc rebuild.

Report their input/output shapes.
@sotte
Copy link
Contributor Author

sotte commented Oct 19, 2019

Another thing, we should not forget to fix the random seeds (for both random and torch) to avoid the images being regenerated at every doc rebuild.

The examples are only re-executed if the example code changes, i.e. calling make html twice in a row does not rebuild the images. Therefore I'm not not sure if he even have to fix the seeds. @fmassa if only recreating the images on every make html is the issue, then we don't need to fix the seeds. What do you think.

@sotte
Copy link
Contributor Author

sotte commented Oct 19, 2019

We need to decide on the actual implementation of _example_image(), where the image is stored, and how the image should be packed into the wheel.

Just as reference: this is how skimage does it:

@fmassa
Copy link
Member

fmassa commented Oct 25, 2019

The examples are only re-executed if the example code changes, i.e. calling make html twice in a row does not rebuild the images

I might not understand how matplotlib caching works, but I would have expected that if I clean up the build folder and re-run again make html the images generated would be different? Which would mean the html would change everytime we update the documentation.

We need to decide on the actual implementation of _example_image(), where the image is stored, and how the image should be packed into the wheel.

thanks for the pointers. I'm not yet clear if we want to have those functions in the torchvision distribution, but the pointers are anyway very helpful.

I'm waiting on feedback from @jlin27 on what would be better.

@jlin27
Copy link

jlin27 commented Oct 25, 2019

Hi @fmassa @sotte

My vote is to keep _sample_image() private, given as you've all discussed, that it's specific for this example use case. Helps signal to users that they probably shouldn't be using it outside of this situation.

As for where to store the images, I'd suggest a static folder with a descriptive name. For example, in pytorch/tutorials, images are stored /static/_img (https://github.com/pytorch/tutorials/tree/master/_static/images), and just make sure to double-check that they are pulled in properly during build.

@sotte
Copy link
Contributor Author

sotte commented Oct 27, 2019

Seeds
@fmassa I think we were talking about different things. When you remove the build/ folder of the docs, of course the docs will be rebuild from scratch and the output might be different depending on the RNG. If you don't remove the build/ folder then sphinx is not going to re-execute the examples.
I set the random seed now and the examples should stay constant between complete rebuilds.

Private functions
I explicitly mention that the functions are not to be used.

Loading and packaging the image / things that should be simple but are quite tricky
@jlin27 thanks for the input! I think bundling the images is a bit more complex as in the pytorch tutorials case. We have to ship the assets.
In general I would add the image to a newly created torchvision.assets module so that it gets shipped with torchvision and so that I can access the file similar to how it's done with skimage. Something like this:

def _sample_image():
    """Private helper function to load a sample PIL image.

    This function might change and/or break. Don't depend on it.
    """
    import os.path
    from PIL import Image

    data_dir = os.path.abspath(os.path.dirname(__file__))
    return Image.open(os.path.join(data_dir, "assets", "grace_hopper_517x606.jpg"))

Then of course we have to actually include the folder

# MANIFEST.in
include README.rst
include LICENSE
include torchvision/assets/*

recursive-exclude * __pycache__
recursive-exclude * *.py[co]

and change the

# setup.py
setup(
    ...
    zip_safe=False,  # not zip safe because we load the image via a path
    include_package_data=True,
)

but after doing so I get the following error:

➤ python setup.py install 
Building wheel torchvision-0.5.0a0+5108c1e
running install
running bdist_egg
running egg_info
writing torchvision.egg-info/PKG-INFO
writing dependency_links to torchvision.egg-info/dependency_links.txt
writing requirements to torchvision.egg-info/requires.txt
writing top-level names to torchvision.egg-info/top_level.txt
reading manifest file 'torchvision.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no previously-included files matching '__pycache__' found under directory '*'
warning: no previously-included files matching '*.py[co]' found under directory '*'
writing manifest file 'torchvision.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying torchvision/version.py -> build/lib.linux-x86_64-3.7/torchvision
error: Error: setup script specifies an absolute path:

    /home/stefan/coding/torchvision/test/test_models.cpp

setup() arguments must *always* be /-separated paths relative to the
setup.py directory, *never* absolute paths.

I'm also not sure if we break things by making torchvision not zip_safe.

The packaging problem is still not solved and I'm more than open for input!

Ref
See the "note" box: https://python-packaging.readthedocs.io/en/latest/non-code-files.html

@fmassa
Copy link
Member

fmassa commented Nov 4, 2019

Hi @sotte ,

Sorry for the delay in replying, I had to fix a few issues with torchvision lately (including making it zip_safe=False :-), see #1536 )

About the error you are facing, from a quick search it might indicate that you might need to do a clean build for torchvision?

Apart from that, I think that the approach you are following is the right one, with the os.path.abspath(os.path.dirname(__file__)) and setting include_package_data.

I don't have much more experience with setuptools than that though

@sotte
Copy link
Contributor Author

sotte commented Nov 4, 2019

@fmassa no worries and thanks for the pointer. I'll give it a try and get back to you.

@sotte
Copy link
Contributor Author

sotte commented Nov 4, 2019

It's getting tricky :) Starting with a fresh build did not help.

Here is a log of what I did:

# Add the following to setup.py
#     zip_safe=False,
#     include_package_data=True,
python setup.py clean
python setup.py install

This yields the error:

...
warning: no previously-included files matching '__pycache__' found under directory '*'
warning: no previously-included files matching '*.py[co]' found under directory '*'
writing manifest file 'torchvision.egg-info/SOURCES.txt'
error: Error: setup script specifies an absolute path:

    /home/stefan/coding/torchvision/test/test_models.cpp

setup() arguments must *always* be /-separated paths relative to the
setup.py directory, *never* absolute paths.

Checking torchvision.egg-info/SOURCES.txt tells me that all *.cpp files are included with their absolute path:

$ cat torchvision.egg-info/SOURCES.txt | grep cpp                                                      
/home/stefan/coding/torchvision/test/test_models.cpp
/home/stefan/coding/torchvision/torchvision/csrc/vision.cpp
/home/stefan/coding/torchvision/torchvision/csrc/cpu/ROIAlign_cpu.cpp
/home/stefan/coding/torchvision/torchvision/csrc/cpu/ROIPool_cpu.cpp
/home/stefan/coding/torchvision/torchvision/csrc/cpu/nms_cpu.cpp
...

get_extensions() in setup.py seems to create absolute paths (line 80). I haven't had time to look into it more thoroughly.

@sotte
Copy link
Contributor Author

sotte commented Nov 4, 2019

Just to verify that I did not break anything in my branch: on master cd17484 when adding include_package_data=True I get the same error when calling python setup.py build.

The SOURCES.txt contained absolute paths which is not valid if you
include_package_data. Therefore we switched to relpath.
@sotte
Copy link
Contributor Author

sotte commented Nov 10, 2019

@fmassa So I had to change the setup.py quite a bit (and I don't think this should be necessary) to be able to build torchvision when include_package_data=True. Before my commit 7d6d38c the SOURCES.txt contained absolute paths which should not be the case, but seemingly only produces errors when include_package_data=True. I'm quite confident that I broke the build with that commit though :)

Please, anybody who is more knowledgeable with python packaging take a look at this!

That being said, the PR should be complete now. Feel free to build the docs yourself and let me know if it's ok.

@sotte sotte changed the title [WIP] Add examples to transform docs Add examples to transform docs Jan 9, 2020
@sotte
Copy link
Contributor Author

sotte commented Jan 9, 2020

@fmassa Please verify that it works on you end. If so I think we can merge this one (after rebasing).

@fmassa
Copy link
Member

fmassa commented Jan 9, 2020

Hi @sotte

Very sorry for the delay in coming back to you.

I'll get this reviewed and merged tomorrow, thanks a lot!

@sotte
Copy link
Contributor Author

sotte commented Jan 10, 2020

Absolutely no problem. Just let me know if you need anything else.

@sotte
Copy link
Contributor Author

sotte commented Mar 4, 2020

@fmassa friendly ping :)

Let me know what I can do to get this into masteri

@fmassa
Copy link
Member

fmassa commented Mar 30, 2020

Very sorry for the delay @sotte! Was very busy working on a paper submission.

I'm putting some time aside to review this tomorrow, thanks for the patience!

Copy link
Contributor

@vincentqb vincentqb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks good to me. The functions should definitely be private for now, since we can always decide to make them public at a later time.

The tests are failing to build though.

Building wheels for collected packages: torchvision
  Building wheel for torchvision (setup.py) ... error
  ERROR: Command errored out with exit status 1:

@sotte
Copy link
Contributor Author

sotte commented Jun 8, 2020

@fmassa I was not able to build when using abspath and (I think) I was getting error messages similar to the current ones of circleCI: https://app.circleci.com/pipelines/github/pytorch/vision/2750/workflows/978a424b-8d15-4271-9ff5-df6cc9fa1360/jobs/151904/steps

setup() arguments must *always* be /-separated paths relative to the
setup.py directory, *never* absolute paths.

@NicolasHug
Copy link
Member

Closing since #3652 has been merged.

Thanks a lot for the proposal and for initiating the work on this PR @sotte !

I opened #3688 as a follow-up

@NicolasHug NicolasHug closed this Apr 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add examples of transformation to docs

7 participants