
Conversation

Nayef211 (Contributor) commented on Feb 4, 2022

Reference Issue: #1493

Summary

  • Added mocked unit test for SQuAD2 dataset
  • Parameterized both SQuAD tests within a single test class

Test

```
pytest test/datasets/test_squad.py
```
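As a rough illustration of the parameterization described in the summary above (this is not the code from the PR): the `TestSQuADs` class name, the `_mock_squad_samples` helper, and the sample tuples below are hypothetical stand-ins, and the real test builds mocked SQuAD data files rather than comparing a helper's output against itself. It assumes the third-party `parameterized` package is installed.

```python
import unittest

from parameterized import parameterized  # third-party helper for expanding one test over many cases


def _mock_squad_samples(dataset_name, split):
    # Hypothetical stand-in for the PR's mocked-data helper: returns the
    # (context, question, answers, answer_start) tuples a SQuAD split should yield.
    return [("mock context", "mock question", ["mock answer"], [0])]


class TestSQuADs(unittest.TestCase):
    # One test class covers both SQuAD1 and SQuAD2 by expanding over
    # (dataset_name, split) pairs instead of duplicating the test body.
    @parameterized.expand([
        ("SQuAD1", "train"), ("SQuAD1", "dev"),
        ("SQuAD2", "train"), ("SQuAD2", "dev"),
    ])
    def test_squad(self, dataset_name, split):
        expected_samples = _mock_squad_samples(dataset_name, split)
        # In the real test the dataset is pointed at the mocked files; here the
        # produced samples are simulated the same way to keep the sketch
        # self-contained and runnable.
        produced_samples = _mock_squad_samples(dataset_name, split)
        for produced, expected in zip(produced_samples, expected_samples):
            self.assertEqual(produced, expected)


if __name__ == "__main__":
    unittest.main()
```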

Nayef211 (Contributor, author) commented on Feb 4, 2022

Note this test is almost identical to #1574, so we can just review and resolve all comments on that PR first.

parmeet (Contributor) commented on Feb 4, 2022

> Note this test is almost identical to #1574, so we can just review and resolve all comments on that PR first.

Overall LGTM! I wonder, to avoid repetition, if we could just parameterize the test written in #1574 to work with both datasets?

Nayef211 (Contributor, author) commented on Feb 4, 2022

> > Note this test is almost identical to #1574, so we can just review and resolve all comments on that PR first.
>
> Overall LGTM! I wonder, to avoid repetition, if we could just parameterize the test written in #1574 to work with both datasets?

I think that's a good idea. Let me try to do that in this PR!

Nayef211 (Contributor, author) commented on Feb 7, 2022

> > Note this test is almost identical to #1574, so we can just review and resolve all comments on that PR first.
>
> Overall LGTM! I wonder, to avoid repetition, if we could just parameterize the test written in #1574 to work with both datasets?

@parmeet lmk how this looks!

parmeet (Contributor) left a review comment

LGTM!

parmeet (Contributor) commented on Feb 7, 2022

> > Overall LGTM! I wonder, to avoid repetition, if we could just parameterize the test written in #1574 to work with both datasets?
>
> @parmeet lmk how this looks!

This looks good @Nayef211. I think we could follow a similar approach for other datasets (AmazonReview, YelpReview) which are semantically the same but differ a bit in their content (which we are mocking anyway).

Nayef211 (Contributor, author) commented on Feb 7, 2022

> This looks good @Nayef211. I think we could follow a similar approach for other datasets (AmazonReview, YelpReview) which are semantically the same but differ a bit in their content (which we are mocking anyway).

I think that's a good point. My only concern then is that it would make the organization of tests a bit more difficult. Right now, looking at the file/class name of the test is enough to tell you what the file/class is testing. If we do semantically group datasets for testing, the previous statement would not hold true. Wdyt?

Nayef211 merged commit d8a0df4 into pytorch:main on Feb 7, 2022
Nayef211 deleted the test/mock_squad2 branch on February 7, 2022 at 19:21
parmeet (Contributor) commented on Feb 7, 2022

> If we do semantically group datasets for testing, the previous statement would not hold true. Wdyt?

It's a good point. I guess I didn't fully think it through. So the question is whether we want to maintain one test file per dataset, or whether it is OK to break this norm when the datasets differ only in certain ways (SQuAD1 and SQuAD2 differ only in version; AmazonReviewFull and AmazonReviewPolarity differ only in num-classes and data points). I am not sure I have good answers to that, I was coming more from a code-duplication standpoint :). I think it is fine to keep the status quo for now and treat this as an improvement topic once all the tests are in.

Nayef211 (Contributor, author) commented on Feb 7, 2022

> It's a good point. I guess I didn't fully think it through. So the question is whether we want to maintain one test file per dataset, or whether it is OK to break this norm when the datasets differ only in certain ways (SQuAD1 and SQuAD2 differ only in version; AmazonReviewFull and AmazonReviewPolarity differ only in num-classes and data points). I am not sure I have good answers to that, I was coming more from a code-duplication standpoint :). I think it is fine to keep the status quo for now and treat this as an improvement topic once all the tests are in.

Gotcha, I think what you're suggesting also makes sense. As a compromise, maybe we can group datasets that are very similar (e.g. AmazonReviewFull and AmazonReviewPolarity) but not group all datasets that are semantically the same (e.g. don't group YelpReviewFull with AmazonReviewFull). This way the file/class names will still be representative of what we are testing, even if one file contains parameterized tests for multiple datasets. Let me add this as a follow-up item.

parmeet (Contributor) commented on Feb 7, 2022

> Gotcha, I think what you're suggesting also makes sense. As a compromise, maybe we can group datasets that are very similar (e.g. AmazonReviewFull and AmazonReviewPolarity) but not group all datasets that are semantically the same (e.g. don't group YelpReviewFull with AmazonReviewFull). This way the file/class names will still be representative of what we are testing, even if one file contains parameterized tests for multiple datasets. Let me add this as a follow-up item.

This sounds like a good middle ground for now :)
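As a rough sketch of the grouping compromise discussed above (again, not code from the repository): the class name and mocked samples are hypothetical, and only the label counts reflect the real datasets (AmazonReviewFull has 5 label classes, AmazonReviewPolarity has 2). It assumes the same `parameterized`-style expansion as the SQuAD sketch earlier.

```python
import unittest

from parameterized import parameterized


class TestAmazonReviews(unittest.TestCase):
    # AmazonReviewFull and AmazonReviewPolarity differ mainly in the number of
    # label classes, so one parameterized test can cover both while the file
    # and class names still say which datasets are being tested.
    @parameterized.expand([
        ("AmazonReviewFull", 5),
        ("AmazonReviewPolarity", 2),
    ])
    def test_amazon_review(self, dataset_name, num_classes):
        # Hypothetical mocked samples: (label, review_text) pairs with labels
        # in 1..num_classes, mirroring the real datasets' row format.
        mocked_samples = [(label, f"mock review {label}") for label in range(1, num_classes + 1)]
        for label, text in mocked_samples:
            self.assertTrue(1 <= label <= num_classes)
            self.assertIsInstance(text, str)


if __name__ == "__main__":
    unittest.main()
```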

vcm2114 added a commit to vcm2114/text that referenced this pull request Feb 8, 2022