New tests for ImageNet dataset #3543
Thanks for the feedback @pmeier, I think the failures are unrelated and this is ready for review.
pmeier
left a comment
Three minor comments. Otherwise LGTM!
test/datasets_utils.py
Outdated
  special_kwargs, other_kwargs = self._split_kwargs(kwargs)
  if "download" in self._HAS_SPECIAL_KWARG:
-     special_kwargs["download"] = False
+     special_kwargs["download"] = None if self.DATASET_CLASS.__name__ == 'ImageNet' else False
Why this change? With download=False the dataset will emit a warning, but should not behave any differently. Given that this warning has been emitted for a long time now, I wonder if we can remove the download flag from ImageNet altogether.
If we really want to avoid the warning, I suggest we change L316 to honor an explicitly passed download

    special_kwargs.setdefault("download", False)

and override create_dataset in the ImageNetTestCase to always pass download=None

    @contextlib.contextmanager
    def create_dataset(self, *args, **kwargs):
        kwargs.setdefault("download", None)
        with super().create_dataset(*args, **kwargs) as (dataset, info):
            yield dataset, info
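To make the interplay of the two setdefault calls concrete, here is a minimal, self-contained sketch. BaseTestCase, ImageNetLikeTestCase, and seen_download are hypothetical stand-ins, not the actual classes in test/datasets_utils.py, and the "dataset" is just the kwargs dict so the effective download value can be inspected.

```python
import contextlib


class BaseTestCase:
    """Hypothetical stand-in for the shared dataset test-case base class."""

    @contextlib.contextmanager
    def create_dataset(self, **kwargs):
        # setdefault only fills in a value if the caller did not pass one,
        # so an explicitly passed download is honored.
        kwargs.setdefault("download", False)
        # A real helper would construct the dataset; here we echo the kwargs.
        yield kwargs, {"num_examples": 0}


class ImageNetLikeTestCase(BaseTestCase):
    """Hypothetical subclass that prefers download=None by default."""

    @contextlib.contextmanager
    def create_dataset(self, **kwargs):
        # This runs before the base class, so None wins over the base default.
        kwargs.setdefault("download", None)
        with super().create_dataset(**kwargs) as (dataset, info):
            yield dataset, info


def seen_download(test_case, **kwargs):
    """Return the download value that reaches the (fake) dataset."""
    with test_case.create_dataset(**kwargs) as (dataset, _):
        return dataset["download"]
```

With this layout, the base class defaults to False, the subclass defaults to None, and a value passed by the test itself is never overridden by either.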
and overwrite create_dataset in the ImageNetTestCase to always pass download=None
I went for a simpler solution, which is to only override the default if download exists and if its default is truthy. This way, when the default is False or None, it doesn't get overridden.
This still avoids hardcoding the 'ImageNet' name, which I believe was the main issue here.
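One way such a check could look is sketched below. This is not necessarily what the PR ended up doing: download_override and the three dummy classes are hypothetical, and the sketch assumes the default can be read from the signature of __init__ via inspect.

```python
import inspect


def download_override(dataset_cls):
    """Return {"download": False} only if the class defaults to a truthy download."""
    param = inspect.signature(dataset_cls.__init__).parameters.get("download")
    # No download parameter, or one without a default: nothing to override.
    # Parameter.empty must be checked explicitly because it is itself truthy.
    if param is None or param.default is inspect.Parameter.empty:
        return {}
    # Falsy defaults (False, None) are left untouched.
    return {"download": False} if param.default else {}


class DownloadsByDefault:
    def __init__(self, root, download=True):
        self.root, self.download = root, download


class NoneByDefault:
    def __init__(self, root, download=None):
        self.root, self.download = root, download


class NoDownloadFlag:
    def __init__(self, root):
        self.root = root
```

Only DownloadsByDefault gets an override here; a class whose default is already False or None, or which has no download parameter at all, is passed through unchanged, so no dataset name needs to be hardcoded.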
test/test_datasets.py
Outdated
    root=tmpdir,
    name=tmpdir / 'train' / wnid / wnid,
    file_name_fn=lambda image_idx: f"{wnid}_{image_idx}.JPEG",
    num_examples=1,
Nit: Instead of hard-coding it here, maybe set num_examples within each branch and simply return it. Bonus: If you change the number of examples depending on the split, you get slightly better "coverage" at little cost.
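A rough sketch of what this nit suggests, with the count set per branch and returned at the end. plan_fake_data, the folder layout, and the WordNet ID are made up for illustration and do not mirror the actual inject_fake_data in test/test_datasets.py.

```python
def plan_fake_data(split):
    """Return (relative_folder, num_examples) for a split (hypothetical layout)."""
    wnid = "n01234567"  # made-up WordNet ID, for illustration only
    if split == "train":
        num_examples = 3
        folder = f"train/{wnid}"
    elif split == "val":
        # A different count per split exercises slightly more of the code.
        num_examples = 2
        folder = "val"
    else:
        raise ValueError(f"unknown split: {split!r}")
    # The caller creates num_examples files in folder and returns num_examples.
    return folder, num_examples
```

Because each branch owns its count, the value returned to the test always matches the number of files actually created for that split.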
test/test_datasets.py
Outdated
class ImageNetTestCase(datasets_utils.ImageDatasetTestCase):
    DATASET_CLASS = datasets.ImageNet
    REQUIRED_PACKAGES = ['scipy']
Nit: Although this should not make a difference here, one should always avoid mutable class attributes unless this is explicitly needed.
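The concern can be illustrated with a small sketch. WithList and WithTuple are hypothetical classes, not part of the test suite; the point is only that a list stored on the class is a single object shared by every instance.

```python
class WithList:
    # One list object, shared by the class and all of its instances.
    REQUIRED_PACKAGES = ["scipy"]


class WithTuple:
    # A tuple cannot be mutated in place, so it cannot leak state.
    REQUIRED_PACKAGES = ("scipy",)


first, second = WithList(), WithList()
# Mutating through one instance changes the shared class attribute...
first.REQUIRED_PACKAGES.append("numpy")
# ...so the other instance sees the extra entry as well.
leaked = second.REQUIRED_PACKAGES
```

Switching the attribute to a tuple, as in WithTuple, rules out this kind of accidental cross-test mutation.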
Thanks for the review!
Reviewed By: fmassa
Differential Revision: D27127989
fbshipit-source-id: c21ba8a29c71a4bb9efa4bb1ab8713c3a9809842
This PR ports the ImageNet tests to the new test infrastructure.
Addresses part of #3531