[BugFix] Fix het lazy stack ops #416
Merged
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_common_ops | 1.0499ms | 1.0220ms | 978.4535 Ops/s | 999.1627 Ops/s | |
test_creation | 4.7999μs | 4.3883μs | 227.8811 KOps/s | 232.6772 KOps/s | |
test_creation_empty | 11.7058μs | 11.3125μs | 88.3977 KOps/s | 90.4767 KOps/s | |
test_creation_nested_1 | 21.1327μs | 20.5429μs | 48.6787 KOps/s | 49.3337 KOps/s | |
test_creation_nested_2 | 22.0027μs | 21.6266μs | 46.2394 KOps/s | 47.2263 KOps/s | |
test_clone | 28.4546μs | 27.4990μs | 36.3649 KOps/s | 37.5471 KOps/s | |
test_getitem[int] | 33.0192μs | 32.4615μs | 30.8057 KOps/s | 30.8907 KOps/s | |
test_getitem[slice_int] | 73.1895μs | 68.3819μs | 14.6238 KOps/s | 14.5562 KOps/s | |
test_getitem[range] | 66.4809μs | 65.8291μs | 15.1909 KOps/s | 15.2812 KOps/s | |
test_getitem[tuple] | 64.0397μs | 63.1665μs | 15.8312 KOps/s | 15.9300 KOps/s | |
test_getitem[list] | 72.5791μs | 56.6256μs | 17.6599 KOps/s | 17.7459 KOps/s | |
test_setitem_dim[int] | 73.1990μs | 47.1737μs | 21.1983 KOps/s | 21.8904 KOps/s | |
test_setitem_dim[slice_int] | 0.1142ms | 85.6788μs | 11.6715 KOps/s | 11.7827 KOps/s | |
test_setitem_dim[range] | 0.1356ms | 77.7719μs | 12.8581 KOps/s | 13.1120 KOps/s | |
test_setitem_dim[tuple] | 0.1192ms | 79.5123μs | 12.5767 KOps/s | 12.8940 KOps/s | |
test_setitem | 33.5835μs | 32.6713μs | 30.6079 KOps/s | 31.7410 KOps/s | |
test_set | 77.7109μs | 38.2407μs | 26.1502 KOps/s | 32.4972 KOps/s | |
test_set_shared | 0.1424ms | 0.1399ms | 7.1475 KOps/s | 7.0815 KOps/s | |
test_update | 35.7365μs | 34.1496μs | 29.2829 KOps/s | 29.7834 KOps/s | |
test_update_nested | 51.8783μs | 50.5533μs | 19.7811 KOps/s | 19.5251 KOps/s | |
test_set_nested | 42.5744μs | 41.6940μs | 23.9843 KOps/s | 24.7513 KOps/s | |
test_set_nested_new | 65.5511μs | 59.6680μs | 16.7594 KOps/s | 17.2075 KOps/s | |
test_select | 0.1008ms | 0.1002ms | 9.9768 KOps/s | 10.2213 KOps/s | |
test_creation[device0] | 1.1224ms | 0.4249ms | 2.3533 KOps/s | 2.3774 KOps/s | |
test_creation_from_tensor | 0.5498ms | 0.4027ms | 2.4830 KOps/s | 2.1453 KOps/s | |
test_add_one[memmap_tensor0] | 43.9334μs | 30.4702μs | 32.8190 KOps/s | 34.0314 KOps/s | |
test_contiguous[memmap_tensor0] | 8.7689μs | 8.1129μs | 123.2610 KOps/s | 127.5079 KOps/s | |
test_stack[memmap_tensor0] | 0.1544ms | 39.4094μs | 25.3747 KOps/s | 25.4780 KOps/s | |
test_reshape_pytree | 38.0665μs | 35.9130μs | 27.8451 KOps/s | 28.6798 KOps/s | |
test_reshape_td | 51.8483μs | 49.5414μs | 20.1851 KOps/s | 20.4285 KOps/s | |
test_view_pytree | 34.0395μs | 33.0535μs | 30.2540 KOps/s | 30.6666 KOps/s | |
test_view_td | 9.5569μs | 8.8542μs | 112.9404 KOps/s | 113.2393 KOps/s | |
test_unbind_pytree | 38.4465μs | 37.2545μs | 26.8424 KOps/s | 27.4741 KOps/s | |
test_unbind_td | 0.1540ms | 0.1500ms | 6.6673 KOps/s | 6.8372 KOps/s | |
test_split_pytree | 43.7794μs | 41.8514μs | 23.8940 KOps/s | 24.2901 KOps/s | |
test_split_td | 0.1196ms | 0.1177ms | 8.4942 KOps/s | 8.5346 KOps/s | |
test_add_pytree | 47.0733μs | 45.0446μs | 22.2002 KOps/s | 22.7341 KOps/s | |
test_add_td | 64.3831μs | 62.4276μs | 16.0186 KOps/s | 16.4166 KOps/s | |
test_distributed | 69.7990μs | 69.7990μs | 14.3269 KOps/s | 14.7931 KOps/s | |
test_tdmodule | 51.0990μs | 24.6616μs | 40.5488 KOps/s | 40.9115 KOps/s | |
test_tdmodule_dispatch | 41.2505ms | 55.7931μs | 17.9234 KOps/s | 19.4652 KOps/s | |
test_tdseq | 0.1786ms | 30.6602μs | 32.6155 KOps/s | 32.1887 KOps/s | |
test_tdseq_dispatch | 95.3990μs | 54.0545μs | 18.4999 KOps/s | 18.6912 KOps/s | |
test_instantiation_functorch | 1.6036ms | 1.5204ms | 657.7083 Ops/s | 662.6847 Ops/s | |
test_instantiation_td | 6.0842ms | 1.1985ms | 834.3755 Ops/s | 880.0086 Ops/s | |
test_exec_functorch | 0.1803ms | 0.1743ms | 5.7360 KOps/s | 5.8828 KOps/s | |
test_exec_td | 0.3038ms | 0.3011ms | 3.3215 KOps/s | 3.3892 KOps/s | |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_common_ops | 1.7412ms | 1.4958ms | 668.5264 Ops/s | 646.9268 Ops/s | |
test_creation | 6.1950μs | 3.9354μs | 254.1028 KOps/s | 217.0127 KOps/s | |
test_creation_empty | 20.5381μs | 11.6174μs | 86.0781 KOps/s | 75.4562 KOps/s | |
test_creation_nested_1 | 36.6261μs | 23.2324μs | 43.0433 KOps/s | 38.2269 KOps/s | |
test_creation_nested_2 | 41.4152μs | 23.6129μs | 42.3497 KOps/s | 34.9997 KOps/s | |
test_clone | 45.4242μs | 33.0175μs | 30.2870 KOps/s | 27.9657 KOps/s | |
test_getitem[int] | 61.4114μs | 43.9763μs | 22.7395 KOps/s | 23.6805 KOps/s | |
test_getitem[slice_int] | 0.1119ms | 90.5720μs | 11.0409 KOps/s | 11.3994 KOps/s | |
test_getitem[range] | 0.1438ms | 0.1168ms | 8.5624 KOps/s | 8.6275 KOps/s | |
test_getitem[tuple] | 0.1011ms | 82.2942μs | 12.1515 KOps/s | 13.1245 KOps/s | |
test_getitem[list] | 0.1299ms | 0.1052ms | 9.5014 KOps/s | 9.5453 KOps/s | |
test_setitem_dim[int] | 2.1241ms | 71.9805μs | 13.8926 KOps/s | 14.8040 KOps/s | |
test_setitem_dim[slice_int] | 2.8672ms | 0.1285ms | 7.7814 KOps/s | 8.1470 KOps/s | |
test_setitem_dim[range] | 9.8842ms | 0.1392ms | 7.1861 KOps/s | 7.4844 KOps/s | |
test_setitem_dim[tuple] | 5.0997ms | 0.1100ms | 9.0939 KOps/s | 9.4267 KOps/s | |
test_setitem | 56.7643μs | 45.8520μs | 21.8093 KOps/s | 21.5682 KOps/s | |
test_set | 0.1151ms | 45.7297μs | 21.8676 KOps/s | 22.8544 KOps/s | |
test_set_shared | 0.3891ms | 0.2851ms | 3.5079 KOps/s | 3.1937 KOps/s | |
test_update | 0.1242ms | 46.7934μs | 21.3705 KOps/s | 20.8710 KOps/s | |
test_update_nested | 0.1193ms | 67.8972μs | 14.7282 KOps/s | 14.0989 KOps/s | |
test_set_nested | 90.0645μs | 58.7392μs | 17.0244 KOps/s | 17.2272 KOps/s | |
test_set_nested_new | 0.1089ms | 78.8169μs | 12.6876 KOps/s | 12.6587 KOps/s | |
test_select | 0.1709ms | 0.1252ms | 7.9867 KOps/s | 8.1071 KOps/s | |
test_creation[device0] | 1.6380ms | 0.6179ms | 1.6185 KOps/s | 1.6349 KOps/s | |
test_creation_from_tensor | 0.8791ms | 0.5796ms | 1.7253 KOps/s | 1.5934 KOps/s | |
test_add_one[memmap_tensor0] | 80.0394μs | 64.4643μs | 15.5125 KOps/s | 15.3387 KOps/s | |
test_contiguous[memmap_tensor0] | 16.1741μs | 11.6590μs | 85.7708 KOps/s | 73.6024 KOps/s | |
test_stack[memmap_tensor0] | 0.2761ms | 74.9104μs | 13.3493 KOps/s | 16.0926 KOps/s | |
test_reshape_pytree | 49.6243μs | 38.4782μs | 25.9887 KOps/s | 25.7622 KOps/s | |
test_reshape_td | 92.5585μs | 66.4194μs | 15.0558 KOps/s | 16.1451 KOps/s | |
test_view_pytree | 56.8453μs | 35.3719μs | 28.2710 KOps/s | 28.1142 KOps/s | |
test_view_td | 12.2931μs | 8.9310μs | 111.9692 KOps/s | 101.9729 KOps/s | |
test_unbind_pytree | 66.5284μs | 43.9337μs | 22.7616 KOps/s | 23.7802 KOps/s | |
test_unbind_td | 0.3461ms | 0.2153ms | 4.6439 KOps/s | 4.9311 KOps/s | |
test_split_pytree | 98.4437μs | 56.2513μs | 17.7774 KOps/s | 20.7696 KOps/s | |
test_split_td | 0.2361ms | 0.1648ms | 6.0661 KOps/s | 6.2999 KOps/s | |
test_add_pytree | 70.4255μs | 63.2403μs | 15.8127 KOps/s | 15.7246 KOps/s | |
test_add_td | 0.1833ms | 0.1113ms | 8.9843 KOps/s | 9.5092 KOps/s | |
test_distributed | 0.1625ms | 0.1625ms | 6.1538 KOps/s | 5.6721 KOps/s | |
test_tdmodule | 2.1849ms | 39.0934μs | 25.5798 KOps/s | 27.3572 KOps/s | |
test_tdmodule_dispatch | 4.5261ms | 82.1722μs | 12.1696 KOps/s | 12.4297 KOps/s | |
test_tdseq | 2.3490ms | 49.0917μs | 20.3700 KOps/s | 20.7986 KOps/s | |
test_tdseq_dispatch | 6.1115ms | 93.5635μs | 10.6879 KOps/s | 10.4861 KOps/s | |
test_instantiation_functorch | 2.2201ms | 1.8937ms | 528.0589 Ops/s | 551.3513 Ops/s | |
test_instantiation_td | 2.0751ms | 1.4785ms | 676.3503 Ops/s | 695.7380 Ops/s | |
test_exec_functorch | 0.3449ms | 0.2834ms | 3.5292 KOps/s | 3.7548 KOps/s | |
test_exec_td | 0.5581ms | 0.4600ms | 2.1740 KOps/s | 2.2161 KOps/s | |
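
The Change column is empty in both tables. As a rough aid for reading them, here is a minimal sketch of how a relative change could be derived from the Ops and Ops on Repo HEAD cells. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling, and the formula (percent difference relative to HEAD) is an assumption about what the column is meant to show. The sample rows are copied from the first table.

```python
def parse_ops(cell: str) -> float:
    """Convert a cell such as '26.1502 KOps/s' to operations per second."""
    value, unit = cell.split()
    # Only the two units that appear in the tables are handled here.
    scale = {"Ops/s": 1.0, "KOps/s": 1e3}[unit]
    return float(value) * scale


def relative_change(ops: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD (assumed formula)."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base * 100.0


# (Ops, Ops on Repo HEAD) pairs taken from the first table.
rows = {
    "test_common_ops": ("978.4535 Ops/s", "999.1627 Ops/s"),
    "test_set": ("26.1502 KOps/s", "32.4972 KOps/s"),
    "test_creation_from_tensor": ("2.4830 KOps/s", "2.1453 KOps/s"),
}

for name, (ops, head) in rows.items():
    print(f"{name}: {relative_change(ops, head):+.2f}%")
```

A negative value means the PR branch measured fewer operations per second than HEAD. Single-run differences of a few percent are often within benchmark noise, so they may not indicate a real regression.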
Labels: CLA Signed, enhancement
Note: printing of nested lazy stacks is broken.