Skip to content

Conversation

@NicolasHug
Copy link
Member

This PR ports the ImageNet tests to the new test infrastructure.

Addresses part of #3531

@NicolasHug NicolasHug changed the title WIP New tests for ImageNet dataset New tests for ImageNet dataset Mar 11, 2021
@NicolasHug
Copy link
Member Author

Thanks for the feedback @pmeier , I think the failures are unrelated and this is ready for reviews

Copy link
Contributor

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Three minor comments. Otherwise LGTM!

special_kwargs, other_kwargs = self._split_kwargs(kwargs)
if "download" in self._HAS_SPECIAL_KWARG:
special_kwargs["download"] = False
special_kwargs["download"] = None if self.DATASET_CLASS.__name__ == 'ImageNet' else False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? With download=False the dataset will emit a warning, but should not behave anything different. Given that this warning is emitted for a long time now, I wonder if we can remove the download flag from ImageNet all together.

If we really want to avoid, I suggest we change L316 to honor a explicitly passed download

special_kwargs.setdefault("download", False)

and overwrite create_dataset in the ImageNetTestCase to always pass download=None

@contextlib.contextmanager
def create_dataset(self, *args, **kwargs):
    kwargs.setdefault("download", None)
    with super().create_dataset(*args, **kwargs) as (dataset, info):
        yield dataset, info

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

It's to avoid the warning

Given that this warning is emitted for a long time now, I wonder if we can remove the download flag from ImageNet all together.

Agreed. Thoughs @fmassa? The warning has been introduced in #1457 in v0.5

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and overwrite create_dataset in the ImageNetTestCase to always pass download=None

I went for a simpler solution which is to only override the default if download exists and if its default is True-y. This way, when the default is False or None, it doesn't get overridden.

This still avoids hardcoding the 'ImageNet' name which I believe was the main issue here

root=tmpdir,
name=tmpdir / 'train' / wnid / wnid,
file_name_fn=lambda image_idx: f"{wnid}_{image_idx}.JPEG",
num_examples=1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Instead of hard coding it here, maybe set num_examples within each branch and simply return this. Bonus: If you change the number of examples dependent on the split you get a little better "coverage" with little cost.


class ImageNetTestCase(datasets_utils.ImageDatasetTestCase):
DATASET_CLASS = datasets.ImageNet
REQUIRED_PACKAGES = ['scipy']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Although this should not make a difference here, one should always avoid mutable class attributes unless this explicitly needed.

@NicolasHug NicolasHug merged commit 240792d into pytorch:master Mar 11, 2021
@NicolasHug NicolasHug deleted the imagenet branch March 11, 2021 17:36
@NicolasHug
Copy link
Member Author

Thanks for the review!

facebook-github-bot pushed a commit that referenced this pull request Mar 18, 2021
Reviewed By: fmassa

Differential Revision: D27127989

fbshipit-source-id: c21ba8a29c71a4bb9efa4bb1ab8713c3a9809842
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants