
Conversation

@stephenyan1231 (Contributor)

Summary

For video model evaluation, we sample N clips from a video, and average clip predictions to get a video-level prediction.
Assume we sample 2 clips per video. The test dataset, which has 4 videos {A, B, C, D}, is illustrated below.

[A_0, A_1, B_0, B_1, C_0, C_1, D_0, D_1]

Assume we have 2 GPUs. The existing DistributedSampler will distribute clips from the same video to different GPUs, making it difficult to average clip predictions.

GPU 0: 

       [A_0, B_0, C_0, D_0]

GPU 1: 

       [A_1, B_1, C_1, D_1]
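
For reference, a minimal sketch of the round-robin assignment that produces this split (the clip names are illustrative; this is not the sampler's actual code):

```python
# Round-robin sharding: index i goes to rank i % num_replicas,
# so the two clips of each video end up on different GPUs.
clips = ["A_0", "A_1", "B_0", "B_1", "C_0", "C_1", "D_0", "D_1"]
num_replicas = 2  # number of GPUs

for rank in range(num_replicas):
    shard = clips[rank::num_replicas]
    print(f"GPU {rank}: {shard}")
# GPU 0: ['A_0', 'B_0', 'C_0', 'D_0']
# GPU 1: ['A_1', 'B_1', 'C_1', 'D_1']
```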

We extend DistributedSampler to support an optional argument group_size. With group_size=2, clips are sharded as below.

GPU 0: 

        [A_0, A_1, B_0, B_1]

GPU 1: 

        [C_0, C_1, D_0, D_1]

This facilitates the averaging of clip predictions.
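
The idea behind group_size can be sketched as sharding at the granularity of whole groups (videos) rather than individual clips. The following is an illustrative sketch that reproduces the split above, not the library's implementation:

```python
# Group-aware sharding: split the clip list into per-video groups,
# give each rank a contiguous block of groups, then flatten back to clips.
clips = ["A_0", "A_1", "B_0", "B_1", "C_0", "C_1", "D_0", "D_1"]
num_replicas = 2
group_size = 2  # clips sampled per video

groups = [clips[i:i + group_size] for i in range(0, len(clips), group_size)]
groups_per_rank = len(groups) // num_replicas
for rank in range(num_replicas):
    block = groups[rank * groups_per_rank:(rank + 1) * groups_per_rank]
    shard = [clip for group in block for clip in group]
    print(f"GPU {rank}: {shard}")
# GPU 0: ['A_0', 'A_1', 'B_0', 'B_1']
# GPU 1: ['C_0', 'C_1', 'D_0', 'D_1']
```

With all clips of a video on one rank, the video-level score becomes a purely local average over group_size consecutive clip predictions, e.g. something like `clip_logits.view(-1, group_size, num_classes).mean(dim=1)`.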

Unit test

python test/test_datasets_samplers.py
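
Assuming the new argument lands as group_size on the DistributedSampler defined in torchvision/datasets/samplers/clip_sampler.py (the file touched by this PR), usage for the toy dataset above would look roughly like the sketch below; the import path and keyword names are inferred from the commit message and may not match the merged signature exactly.

```python
from torchvision.datasets.samplers import DistributedSampler  # import path assumed

# Toy dataset standing in for 4 videos x 2 clips each.
clips = ["A_0", "A_1", "B_0", "B_1", "C_0", "C_1", "D_0", "D_1"]

for rank in range(2):
    sampler = DistributedSampler(clips, num_replicas=2, rank=rank,
                                 shuffle=False, group_size=2)
    print(f"GPU {rank}: {[clips[i] for i in sampler]}")
```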

@codecov-io commented Oct 22, 2019

Codecov Report

Merging #1512 into master will increase coverage by 0.31%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master    #1512      +/-   ##
==========================================
+ Coverage   64.34%   64.66%   +0.31%     
==========================================
  Files          83       83              
  Lines        6454     6461       +7     
  Branches      992      992              
==========================================
+ Hits         4153     4178      +25     
+ Misses       2006     1984      -22     
- Partials      295      299       +4
Impacted Files                                   Coverage Δ
torchvision/datasets/samplers/clip_sampler.py    79.54% <100%> (+23.98%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 937c83a...916801c.

@fmassa (Member) left a comment


LGTM, thanks a lot for the PR Zhicheng!

@fmassa fmassa merged commit 355e9d2 into pytorch:master Oct 22, 2019
@fmassa fmassa mentioned this pull request Oct 31, 2019
fmassa pushed a commit that referenced this pull request Oct 31, 2019
* extend DistributedSampler to support group_size

* Fix lint
