-
Notifications
You must be signed in to change notification settings - Fork 7.2k
fix padding for degenerate segmentation masks #6542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| def pad_segmentation_mask( | ||
| segmentation_mask: torch.Tensor, padding: Union[int, List[int]], padding_mode: str = "constant" | ||
| ) -> torch.Tensor: | ||
| num_masks, height, width = segmentation_mask.shape[-3:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why we replicated pad_image_tensor here if the only difference is that we set a fixed fill=0.
This reverts commit 18c5e4f.
pmeier
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from the ones I fixed, there are more operators that use the .view(-1, ...) idiom.
img = img.view(-1, num_channels, height, width) img = img.view(-1, num_channels, height, width)
I've xfailed some tests for them. Plus, there are more bounding box ops that do this. But we don't need to worry about them, because .view(-1, ...) only fails if there is a 0 in the ... part. For bounding boxes we only do
>>> t = torch.empty(0, 4)
>>> t.view(-1, 4)
tensor([], size=(0, 4))The ambiguity only arises for segmentation masks, which use the channel dimension as number of objects and that can be 0.
>>> t = torch.empty(4, 0, 16, 16)
>>> t.view(-1, 0, 16, 16)
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0, 16, 16] because the unspecified dimension size -1 can be any value and is ambiguousThat might be a good reason to have something in the kernel other than just calling the image kernel. But we can discuss this later.
| # transforms.RandomRotation(degrees=(-45, 45)), | ||
| # transforms.RandomAffine(degrees=(-45, 45)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They don't work since the new input data generation also adds degenerate shapes and that is not supported yet by the underlying kernels.
| masks = make_segmentation_mask((32, 24)) | ||
| ohe_masks = features.SegmentationMask(torch.randint(0, 2, size=(6, 32, 24))) | ||
| sample = [image, bboxes, label, ohe_label, masks, ohe_masks] | ||
| masks = make_segmentation_mask((32, 24), num_objects=6) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We currently only work with object detection masks. They were called ohe_masks here before. I've renamed them and removed the tests for what was masks before.
| size = size or torch.randint(16, 33, (2,)).tolist() | ||
| shape = (*extra_dims, 1, *size) | ||
| data = make_tensor(shape, low=0, high=num_categories, dtype=dtype) | ||
| def make_segmentation_mask(size=None, *, num_objects=None, extra_dims=(), dtype=torch.uint8): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The functionality generated masks for segmentation tasks, which we currently don't support.
| incorrect_expected_segmentation_mask_setup = pytest.mark.xfail( | ||
| reason="This test fails because the expected result computation is wrong. Fix ASAP.", | ||
| strict=False, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to get the CI green for now. As the text implies we should get to this ASAP. But doing it now would slow down the reference training.
vfdev-5
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK to me
| padded_image = _FT.pad( | ||
| img=img.view(-1, num_channels, height, width), padding=padding, fill=fill, padding_mode=padding_mode | ||
| ) | ||
| left, right, top, bottom = _FT._parse_pad_padding(padding) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this part of code can be put into else part to avoid parsing twice for normal case
datumbox
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I'm pretty sure we don't need to clone but let's check this separately.
Summary: * fix padding for degenerate segmentation masks * extend test data degeneration to degenerate inputs * add even more degenerate shapes * simplify kernel * [SKIP CI] only GHA * add more degenerate segmentation masks * fix segmentation mask generation * xfail some tests * Revert "simplify kernel" This reverts commit 18c5e4f. * fix resize for degenerate inputs * [SKIP CI] CircleCI * fix RandomIoUCrop test * [SKIP CI] CircleCI * cleanup * [SKIP CI] CircleCI * add perf TODO comments * [SKIP CI] CircleCI Reviewed By: YosuaMichael Differential Revision: D39381957 fbshipit-source-id: 05a0da7ead77f33bc7ec7423271693a9aef6ad7e
Before this PR, padding degenerate segmentation masks failed:
I fixed the kernel and extended the kernel tests for all input types (images, segmentation masks, bounding boxes, ...) to also check the degenerate case.