@vmoens (Collaborator) commented May 29, 2023

No description provided.

@facebook-github-bot added the CLA Signed label May 29, 2023

github-actions bot commented May 29, 2023

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 47. Improved: 0. Worsened: 1.

Detailed results

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_common_ops | 1.1415ms | 1.0883ms | 918.8837 Ops/s | 921.6080 Ops/s | -0.30% |
| test_creation | 3.6270μs | 3.3165μs | 301.5221 KOps/s | 313.4310 KOps/s | -3.80% |
| test_creation_empty | 14.3630μs | 13.4348μs | 74.4336 KOps/s | 73.7563 KOps/s | +0.92% |
| test_creation_nested_1 | 23.2270μs | 21.8796μs | 45.7047 KOps/s | 45.2899 KOps/s | +0.92% |
| test_creation_nested_2 | 25.0484μs | 23.5947μs | 42.3824 KOps/s | 41.8106 KOps/s | +1.37% |
| test_clone | 25.1844μs | 22.1842μs | 45.0770 KOps/s | 46.5282 KOps/s | -3.12% |
| test_getitem[int] | 38.0323μs | 26.1085μs | 38.3017 KOps/s | 39.3570 KOps/s | -2.68% |
| test_getitem[slice_int] | 54.3567μs | 53.6907μs | 18.6252 KOps/s | 18.6347 KOps/s | -0.05% |
| test_getitem[range] | 62.7072μs | 60.3363μs | 16.5738 KOps/s | 17.0456 KOps/s | -2.77% |
| test_getitem[tuple] | 50.9720μs | 50.3546μs | 19.8592 KOps/s | 20.1337 KOps/s | -1.36% |
| test_getitem[list] | 55.2926μs | 52.5155μs | 19.0420 KOps/s | 19.4754 KOps/s | -2.23% |
| test_setitem_dim[int] | 69.8010μs | 38.7259μs | 25.8225 KOps/s | 26.0694 KOps/s | -0.95% |
| test_setitem_dim[slice_int] | 0.1795ms | 69.7486μs | 14.3372 KOps/s | 14.4014 KOps/s | -0.45% |
| test_setitem_dim[range] | 0.1597ms | 69.2900μs | 14.4321 KOps/s | 14.7210 KOps/s | -1.96% |
| test_setitem_dim[tuple] | 0.1161ms | 63.4519μs | 15.7600 KOps/s | 16.0637 KOps/s | -1.89% |
| test_setitem | 32.3365μs | 30.6841μs | 32.5902 KOps/s | 33.3474 KOps/s | -2.27% |
| test_set | 31.9554μs | 30.2205μs | 33.0901 KOps/s | 34.0207 KOps/s | -2.74% |
| test_set_shared | 0.1757ms | 0.1708ms | 5.8557 KOps/s | 5.8617 KOps/s | -0.10% |
| test_update | 40.2806μs | 38.0599μs | 26.2744 KOps/s | 26.4293 KOps/s | -0.59% |
| test_update_nested | 56.7748μs | 54.4389μs | 18.3692 KOps/s | 18.5498 KOps/s | -0.97% |
| test_set_nested | 39.7825μs | 37.9983μs | 26.3170 KOps/s | 26.9553 KOps/s | -2.37% |
| test_set_nested_new | 53.6797μs | 51.6599μs | 19.3574 KOps/s | 19.5335 KOps/s | -0.90% |
| test_select | 83.7512μs | 81.7090μs | 12.2386 KOps/s | 12.2377 KOps/s | +0.01% |
| test_creation[device0] | 1.2853ms | 0.5158ms | 1.9388 KOps/s | 1.9563 KOps/s | -0.90% |
| test_creation_from_tensor | 0.6195ms | 0.4895ms | 2.0430 KOps/s | 2.0932 KOps/s | -2.40% |
| test_add_one[memmap_tensor0] | 37.0985μs | 30.1962μs | 33.1167 KOps/s | 33.9227 KOps/s | -2.38% |
| test_contiguous[memmap_tensor0] | 8.5431μs | 8.1057μs | 123.3707 KOps/s | 122.8023 KOps/s | +0.46% |
| test_stack[memmap_tensor0] | 0.2145ms | 47.3707μs | 21.1101 KOps/s | 21.1500 KOps/s | -0.19% |
| test_reshape_pytree | 31.2234μs | 28.1779μs | 35.4888 KOps/s | 34.7666 KOps/s | +2.08% |
| test_reshape_td | 43.2516μs | 38.8159μs | 25.7626 KOps/s | 25.9871 KOps/s | -0.86% |
| test_view_pytree | 27.4504μs | 25.7009μs | 38.9092 KOps/s | 38.3218 KOps/s | +1.53% |
| test_view_td | 7.8381μs | 6.9194μs | 144.5213 KOps/s | 141.5328 KOps/s | +2.11% |
| test_unbind_pytree | 32.8634μs | 30.4851μs | 32.8029 KOps/s | 32.9325 KOps/s | -0.39% |
| test_unbind_td | 0.1487ms | 0.1453ms | 6.8806 KOps/s | 7.0480 KOps/s | -2.38% |
| test_split_pytree | 36.0215μs | 34.4705μs | 29.0103 KOps/s | 28.9960 KOps/s | +0.05% |
| test_split_td | 96.3703μs | 93.2677μs | 10.7218 KOps/s | 10.6328 KOps/s | +0.84% |
| test_add_pytree | 40.3086μs | 38.3737μs | 26.0595 KOps/s | 26.2963 KOps/s | -0.90% |
| test_add_td | 65.8729μs | 63.0711μs | 15.8551 KOps/s | 15.8741 KOps/s | -0.12% |
| test_distributed | 90.5010μs | 90.5010μs | 11.0496 KOps/s | 11.3766 KOps/s | -2.87% |
| test_tdmodule | 0.1259ms | 24.4623μs | 40.8792 KOps/s | 40.8698 KOps/s | +0.02% |
| test_tdmodule_dispatch | 0.2850ms | 54.4256μs | 18.3737 KOps/s | 18.3552 KOps/s | +0.10% |
| test_tdseq | 0.1063ms | 33.3355μs | 29.9981 KOps/s | 30.6992 KOps/s | -2.28% |
| test_tdseq_dispatch | 0.1412ms | 64.8085μs | 15.4301 KOps/s | 15.7099 KOps/s | -1.78% |
| test_instantiation_functorch | 1.3453ms | 1.2860ms | 777.5826 Ops/s | 775.1231 Ops/s | +0.32% |
| test_instantiation_td | 1.0617ms | 1.0060ms | 993.9941 Ops/s | 989.3293 Ops/s | +0.47% |
| test_exec_functorch | 0.1798ms | 0.1590ms | 6.2885 KOps/s | 6.3094 KOps/s | -0.33% |
| test_exec_td | 0.3116ms | 0.3003ms | 3.3300 KOps/s | 3.5920 KOps/s | **-7.30%** |

github-actions bot commented May 29, 2023

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 47. Improved: 2. Worsened: 15.

Detailed results

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_common_ops | 2.3517ms | 2.0833ms | 480.0018 Ops/s | 501.1851 Ops/s | -4.23% |
| test_creation | 10.7961μs | 5.8669μs | 170.4475 KOps/s | 181.7439 KOps/s | **-6.22%** |
| test_creation_empty | 31.7452μs | 23.5504μs | 42.4622 KOps/s | 43.3477 KOps/s | -2.04% |
| test_creation_nested_1 | 53.9494μs | 40.5469μs | 24.6628 KOps/s | 24.5242 KOps/s | +0.57% |
| test_creation_nested_2 | 58.6544μs | 42.6719μs | 23.4346 KOps/s | 25.2507 KOps/s | **-7.19%** |
| test_clone | 56.9494μs | 41.1243μs | 24.3165 KOps/s | 24.7048 KOps/s | -1.57% |
| test_getitem[int] | 65.6149μs | 50.5641μs | 19.7769 KOps/s | 20.7356 KOps/s | -4.62% |
| test_getitem[slice_int] | 0.1439ms | 0.1092ms | 9.1576 KOps/s | 9.3641 KOps/s | -2.21% |
| test_getitem[range] | 0.1590ms | 0.1281ms | 7.8071 KOps/s | 7.7821 KOps/s | +0.32% |
| test_getitem[tuple] | 0.1204ms | 97.0062μs | 10.3086 KOps/s | 10.7997 KOps/s | -4.55% |
| test_getitem[list] | 0.1332ms | 0.1104ms | 9.0547 KOps/s | 8.9340 KOps/s | +1.35% |
| test_setitem_dim[int] | 3.9800ms | 79.0555μs | 12.6493 KOps/s | 12.8133 KOps/s | -1.28% |
| test_setitem_dim[slice_int] | 3.2109ms | 0.1470ms | 6.8049 KOps/s | 6.9334 KOps/s | -1.85% |
| test_setitem_dim[range] | 5.0931ms | 0.1461ms | 6.8453 KOps/s | 6.9869 KOps/s | -2.03% |
| test_setitem_dim[tuple] | 4.2353ms | 0.1261ms | 7.9302 KOps/s | 8.2861 KOps/s | -4.29% |
| test_setitem | 0.1180ms | 66.1301μs | 15.1217 KOps/s | 15.7444 KOps/s | -3.95% |
| test_set | 90.3876μs | 65.0297μs | 15.3776 KOps/s | 15.7241 KOps/s | -2.20% |
| test_set_shared | 0.3757ms | 0.3256ms | 3.0712 KOps/s | 3.0220 KOps/s | +1.63% |
| test_update | 97.9387μs | 78.8688μs | 12.6793 KOps/s | 13.6469 KOps/s | **-7.09%** |
| test_update_nested | 0.1588ms | 0.1135ms | 8.8134 KOps/s | 9.6925 KOps/s | **-9.07%** |
| test_set_nested | 0.1001ms | 79.5972μs | 12.5633 KOps/s | 13.2773 KOps/s | **-5.38%** |
| test_set_nested_new | 0.1630ms | 0.1069ms | 9.3508 KOps/s | 9.6333 KOps/s | -2.93% |
| test_select | 0.2350ms | 0.1669ms | 5.9925 KOps/s | 6.2101 KOps/s | -3.50% |
| test_creation[device0] | 1.7828ms | 0.6308ms | 1.5853 KOps/s | 1.5907 KOps/s | -0.34% |
| test_creation_from_tensor | 0.7332ms | 0.6073ms | 1.6467 KOps/s | 1.4426 KOps/s | **+14.15%** |
| test_add_one[memmap_tensor0] | 0.1419ms | 68.1872μs | 14.6655 KOps/s | 16.2474 KOps/s | **-9.74%** |
| test_contiguous[memmap_tensor0] | 22.4602μs | 12.5393μs | 79.7494 KOps/s | 84.5198 KOps/s | **-5.64%** |
| test_stack[memmap_tensor0] | 0.2837ms | 71.8406μs | 13.9197 KOps/s | 9.2841 KOps/s | **+49.93%** |
| test_reshape_pytree | 94.0577μs | 48.5732μs | 20.5875 KOps/s | 21.4045 KOps/s | -3.82% |
| test_reshape_td | 0.1110ms | 73.6442μs | 13.5788 KOps/s | 13.6931 KOps/s | -0.84% |
| test_view_pytree | 58.6924μs | 45.1451μs | 22.1508 KOps/s | 22.9494 KOps/s | -3.48% |
| test_view_td | 30.2292μs | 11.9233μs | 83.8695 KOps/s | 85.6318 KOps/s | -2.06% |
| test_unbind_pytree | 0.1389ms | 51.3375μs | 19.4789 KOps/s | 19.8983 KOps/s | -2.11% |
| test_unbind_td | 0.3776ms | 0.2874ms | 3.4800 KOps/s | 3.5457 KOps/s | -1.85% |
| test_split_pytree | 0.1024ms | 60.2421μs | 16.5997 KOps/s | 17.2994 KOps/s | -4.04% |
| test_split_td | 0.2314ms | 0.1809ms | 5.5275 KOps/s | 5.8331 KOps/s | **-5.24%** |
| test_add_pytree | 75.8125μs | 70.0533μs | 14.2748 KOps/s | 14.2521 KOps/s | +0.16% |
| test_add_td | 0.1895ms | 0.1483ms | 6.7444 KOps/s | 6.7579 KOps/s | -0.20% |
| test_distributed | 88.5010μs | 88.5010μs | 11.2993 KOps/s | 13.2624 KOps/s | **-14.80%** |
| test_tdmodule | 4.0400ms | 48.0766μs | 20.8001 KOps/s | 22.3367 KOps/s | **-6.88%** |
| test_tdmodule_dispatch | 66.4068ms | 0.1088ms | 9.1894 KOps/s | 10.1797 KOps/s | **-9.73%** |
| test_tdseq | 1.0240ms | 61.8390μs | 16.1710 KOps/s | 16.5080 KOps/s | -2.04% |
| test_tdseq_dispatch | 1.2015ms | 0.1137ms | 8.7944 KOps/s | 9.2990 KOps/s | **-5.43%** |
| test_instantiation_functorch | 3.0771ms | 2.3482ms | 425.8499 Ops/s | 442.6968 Ops/s | -3.81% |
| test_instantiation_td | 9.7531ms | 1.8575ms | 538.3602 Ops/s | 588.6656 Ops/s | **-8.55%** |
| test_exec_functorch | 0.4169ms | 0.3260ms | 3.0677 KOps/s | 3.3902 KOps/s | **-9.51%** |
| test_exec_td | 0.6515ms | 0.5673ms | 1.7626 KOps/s | 1.8617 KOps/s | **-5.32%** |
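
For readers checking the numbers, the Change column appears to be the relative difference between Ops and Ops on Repo HEAD, and the emphasized rows are the ones counted as Improved or Worsened. The sketch below reproduces that arithmetic for a few CPU rows. It is not the benchmark bot's code: the function names are hypothetical, and the 5% cutoff is an assumption inferred from which rows are emphasized. Because the table shows rounded values, recomputing from them can differ from the reported figure in the last digit.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput of this PR relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumed cutoff, not a documented one."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) in KOps/s, copied from the CPU table above.
cpu_rows = {
    "test_creation_from_tensor": (1.6467, 1.4426),
    "test_stack[memmap_tensor0]": (13.9197, 9.2841),
    "test_distributed": (11.2993, 13.2624),
    "test_select": (5.9925, 6.2101),
}

for name, (ops, ops_head) in cpu_rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions the first two rows come out as improved, `test_distributed` as worsened, and `test_select` as unchanged, which matches the emphasis in the table.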

@vmoens merged commit aebcd08 into main Jun 2, 2023
@vmoens deleted the fix_apply_lazy_td branch June 21, 2023 14:39