
Conversation

Nayef211 (Contributor) commented on Feb 4, 2022

Reference Issue: #1493

Summary

  • Added mocked unit test for SQuAD2 dataset
  • Parameterized both SQuAD tests within a single test class

Test

```
pytest test/datasets/test_squad.py
```
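As a rough illustration of the parameterization described in the summary above (this is not the code from the PR): the `TestSQuADs` class name, the `_mock_squad_samples` helper, and the sample tuples below are hypothetical stand-ins, and the real test builds mocked SQuAD data files rather than comparing a helper's output against itself. It assumes the third-party `parameterized` package is installed.

```python
import unittest

from parameterized import parameterized  # third-party helper for expanding one test over many cases


def _mock_squad_samples(dataset_name, split):
    # Hypothetical stand-in for the PR's mocked-data helper: returns the
    # (context, question, answers, answer_start) tuples a SQuAD split should yield.
    return [("mock context", "mock question", ["mock answer"], [0])]


class TestSQuADs(unittest.TestCase):
    # One test class covers both SQuAD1 and SQuAD2 by expanding over
    # (dataset_name, split) pairs instead of duplicating the test body.
    @parameterized.expand([
        ("SQuAD1", "train"), ("SQuAD1", "dev"),
        ("SQuAD2", "train"), ("SQuAD2", "dev"),
    ])
    def test_squad(self, dataset_name, split):
        expected_samples = _mock_squad_samples(dataset_name, split)
        # In the real test the dataset is pointed at the mocked files; here the
        # produced samples are simulated the same way to keep the sketch
        # self-contained and runnable.
        produced_samples = _mock_squad_samples(dataset_name, split)
        for produced, expected in zip(produced_samples, expected_samples):
            self.assertEqual(produced, expected)


if __name__ == "__main__":
    unittest.main()
```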

Nayef211 (Contributor, author) commented on Feb 4, 2022

Note this test is almost identical to #1574, so we can just review and resolve all comments on that PR first.

parmeet (Contributor) commented on Feb 4, 2022

> Note this test is almost identical to #1574, so we can just review and resolve all comments on that PR first.

Overall LGTM! I wonder, to avoid repetition, if we could just parameterize the test written in #1574 to work with both datasets?

Nayef211 (Contributor, author) commented on Feb 4, 2022

> > Note this test is almost identical to #1574, so we can just review and resolve all comments on that PR first.
>
> Overall LGTM! I wonder, to avoid repetition, if we could just parameterize the test written in #1574 to work with both datasets?

I think that's a good idea. Let me try to do that in this PR!

Nayef211 (Contributor, author) commented on Feb 7, 2022

> > Note this test is almost identical to #1574, so we can just review and resolve all comments on that PR first.
>
> Overall LGTM! I wonder, to avoid repetition, if we could just parameterize the test written in #1574 to work with both datasets?

@parmeet lmk how this looks!

parmeet (Contributor) left a review comment

LGTM!

parmeet (Contributor) commented on Feb 7, 2022

> > Overall LGTM! I wonder, to avoid repetition, if we could just parameterize the test written in #1574 to work with both datasets?
>
> @parmeet lmk how this looks!

This looks good @Nayef211. I think we could follow a similar approach for other datasets (AmazonReview, YelpReview) which are semantically the same but differ a bit in their content (which we are mocking anyway).

Nayef211 (Contributor, author) commented on Feb 7, 2022

> This looks good @Nayef211. I think we could follow a similar approach for other datasets (AmazonReview, YelpReview) which are semantically the same but differ a bit in their content (which we are mocking anyway).

I think that's a good point. My only concern then is that it would make the organization of tests a bit more difficult. Right now, looking at the file/class name of the test is enough to tell you what the file/class is testing. If we do semantically group datasets for testing, the previous statement would not hold true. Wdyt?

Nayef211 merged commit d8a0df4 into pytorch:main on Feb 7, 2022
Nayef211 deleted the test/mock_squad2 branch on February 7, 2022 at 19:21
parmeet (Contributor) commented on Feb 7, 2022

> If we do semantically group datasets for testing, the previous statement would not hold true. Wdyt?

It's a good point. I guess I didn't fully think it through. So the question is whether we want to maintain one test file per dataset, or whether it is OK to break this norm when the datasets differ only in certain ways (SQuAD1 and SQuAD2 differ only in version; AmazonReviewFull and AmazonReviewPolarity differ only in num-classes and data points). I am not sure I have good answers to that, I was coming more from a code-duplication standpoint :). I think it is fine to keep the status quo for now and treat this as an improvement topic once all the tests are in.

Nayef211 (Contributor, author) commented on Feb 7, 2022

> It's a good point. I guess I didn't fully think it through. So the question is whether we want to maintain one test file per dataset, or whether it is OK to break this norm when the datasets differ only in certain ways (SQuAD1 and SQuAD2 differ only in version; AmazonReviewFull and AmazonReviewPolarity differ only in num-classes and data points). I am not sure I have good answers to that, I was coming more from a code-duplication standpoint :). I think it is fine to keep the status quo for now and treat this as an improvement topic once all the tests are in.

Gotcha, I think what you're suggesting also makes sense. As a compromise, maybe we can group datasets that are very similar (e.g. AmazonReviewFull and AmazonReviewPolarity) but not group all datasets that are semantically the same (e.g. don't group YelpReviewFull with AmazonReviewFull). This way the file/class names will still be representative of what we are testing, even if one file contains parameterized tests for multiple datasets. Let me add this as a follow-up item.

parmeet (Contributor) commented on Feb 7, 2022

> Gotcha, I think what you're suggesting also makes sense. As a compromise, maybe we can group datasets that are very similar (e.g. AmazonReviewFull and AmazonReviewPolarity) but not group all datasets that are semantically the same (e.g. don't group YelpReviewFull with AmazonReviewFull). This way the file/class names will still be representative of what we are testing, even if one file contains parameterized tests for multiple datasets. Let me add this as a follow-up item.

This sounds like a good middle ground for now :)
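As a rough sketch of the grouping compromise discussed above (again, not code from the repository): the class name and mocked samples are hypothetical, and only the label counts reflect the real datasets (AmazonReviewFull has 5 label classes, AmazonReviewPolarity has 2). It assumes the same `parameterized`-style expansion as the SQuAD sketch earlier.

```python
import unittest

from parameterized import parameterized


class TestAmazonReviews(unittest.TestCase):
    # AmazonReviewFull and AmazonReviewPolarity differ mainly in the number of
    # label classes, so one parameterized test can cover both while the file
    # and class names still say which datasets are being tested.
    @parameterized.expand([
        ("AmazonReviewFull", 5),
        ("AmazonReviewPolarity", 2),
    ])
    def test_amazon_review(self, dataset_name, num_classes):
        # Hypothetical mocked samples: (label, review_text) pairs with labels
        # in 1..num_classes, mirroring the real datasets' row format.
        mocked_samples = [(label, f"mock review {label}") for label in range(1, num_classes + 1)]
        for label, text in mocked_samples:
            self.assertTrue(1 <= label <= num_classes)
            self.assertIsInstance(text, str)


if __name__ == "__main__":
    unittest.main()
```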

vcm2114 added a commit to vcm2114/text that referenced this pull request Feb 8, 2022