Skip to content

Conversation

@stephenyan1231
Copy link
Contributor

@stephenyan1231 stephenyan1231 commented Sep 8, 2019

  • In __init__() method of class VideoClips, two arguments frame_rate and _precomputed_metadata are not yet exposed in dataset classes for hmdb51, ucf101 and kinetics400. This PR updates dataset classes to expose those 2 arguments.
  • Add a class VisionVideoDataset to abstract out API get_metadata(). The caller can retrieve video dataset metadata and decide how to cache it. This is dependent on a previous PR ([video reader] inception commit #1303)

Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

I've made some inline comments, let me know what you think


def __init__(self, root, annotation_path, frames_per_clip, step_between_clips=1,
fold=1, train=True, transform=None):
frame_rate=None, precomputed_metadata=None, precomputed_metadata_filepath=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be ok exposing the _precomputed_metadata as a private argument in the constructor for now, and also expose the frame_rate, but I think that the precomputed_metadata_filepath and save_metadata_filepath should live outside of the dataset and be handled by the training code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. @fmassa

@codecov-io
Copy link

codecov-io commented Sep 17, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@04f70c1). Click here to learn what that means.
The diff coverage is 50%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1310   +/-   ##
=========================================
  Coverage          ?   65.69%           
=========================================
  Files             ?       75           
  Lines             ?     5801           
  Branches          ?      888           
=========================================
  Hits              ?     3811           
  Misses            ?     1723           
  Partials          ?      267
Impacted Files Coverage Δ
torchvision/datasets/vision.py 46.87% <42.85%> (ø)
torchvision/datasets/ucf101.py 24.44% <50%> (ø)
torchvision/datasets/hmdb51.py 27.08% <50%> (ø)
torchvision/datasets/kinetics.py 33.33% <60%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 04f70c1...caa4a18. Read the comment docs.

@fmassa
Copy link
Member

fmassa commented Sep 17, 2019

One thing that bothers me with the current approach is that there is no validation whatsoever that the precomputed_metadata and the samples (that have been computed from the files) do actually match anyhow.

I see a few possibilities:

  • enrich the metadata field from Dataset to contain samples as well (or some function of it), and assert that they have the same number of elements as video_clips (and make it start with an underscore, so _precomputed_metadata instead of precomputed_metadata
  • entirely pickle the Dataset instead of adding support for it to construct from metadata. leave to the training code to validate that they actually match
  • add a @classmethod called from_metadata that constructs dataset from the metadata. This keeps the constructor simpler, but will also require that we store more information in the metadata, probably the root, classes and samples features. In this case, the from_metadata classmethod would take as arguments metadata, transforms and maybe frames_per_clip and step_between_clips.

Thoughts?

@stephenyan1231
Copy link
Contributor Author

Since both PR 1303 and current PR 1310 modify common files such as video_utils.py, I close current PR 1310, and merge changes into PR 1303 for easier develop and code review. @fmassa

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants