[Feature] Composite.batch_size #2597

Merged: 2 commits, Nov 24, 2024

[ghstack-poisoned]
pytorch-bot bot commented Nov 23, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2597

Note: Links to docs will display an error until the docs builds have been completed.

❌ 13 New Failures, 5 Unrelated Failures

As of commit ddc878b with merge base 152bc81:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Nov 23, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
[ghstack-poisoned]
@vmoens merged commit ddc878b into gh/vmoens/44/base Nov 24, 2024
40 of 58 checks passed
vmoens added a commit that referenced this pull request Nov 24, 2024
ghstack-source-id: 621884a559a71e80a4be36c7ba984fd08be47952
Pull Request resolved: #2597
@vmoens deleted the gh/vmoens/44/head branch November 24, 2024 08:16

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 7. Worsened: 11.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.4230s 0.4202s 2.3796 Ops/s 2.2426 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_transformed 0.6046s 0.5977s 1.6731 Ops/s 1.6590 Ops/s $\color{#35bf28}+0.85\%$
test_serial 1.3402s 1.3338s 0.7497 Ops/s 0.7271 Ops/s $\color{#35bf28}+3.12\%$
test_parallel 1.2979s 1.2852s 0.7781 Ops/s 0.7684 Ops/s $\color{#35bf28}+1.27\%$
test_step_mdp_speed[True-True-True-True-True] 0.2634ms 27.1115μs 36.8848 KOps/s 36.9594 KOps/s $\color{#d91a1a}-0.20\%$
test_step_mdp_speed[True-True-True-True-False] 66.5550μs 16.1549μs 61.9007 KOps/s 62.8139 KOps/s $\color{#d91a1a}-1.45\%$
test_step_mdp_speed[True-True-True-False-True] 89.6040μs 15.2992μs 65.3627 KOps/s 64.6951 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[True-True-True-False-False] 99.3900μs 8.9857μs 111.2880 KOps/s 112.7694 KOps/s $\color{#d91a1a}-1.31\%$
test_step_mdp_speed[True-True-False-True-True] 88.4760μs 29.0760μs 34.3926 KOps/s 34.4733 KOps/s $\color{#d91a1a}-0.23\%$
test_step_mdp_speed[True-True-False-True-False] 58.4090μs 17.7066μs 56.4761 KOps/s 57.2759 KOps/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[True-True-False-False-True] 80.2600μs 17.4405μs 57.3379 KOps/s 57.2637 KOps/s $\color{#35bf28}+0.13\%$
test_step_mdp_speed[True-True-False-False-False] 48.3610μs 10.8030μs 92.5666 KOps/s 94.0365 KOps/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[True-False-True-True-True] 89.1060μs 30.7924μs 32.4756 KOps/s 32.4166 KOps/s $\color{#35bf28}+0.18\%$
test_step_mdp_speed[True-False-True-True-False] 60.4940μs 19.4242μs 51.4822 KOps/s 52.1000 KOps/s $\color{#d91a1a}-1.19\%$
test_step_mdp_speed[True-False-True-False-True] 73.0370μs 17.0538μs 58.6381 KOps/s 58.2653 KOps/s $\color{#35bf28}+0.64\%$
test_step_mdp_speed[True-False-True-False-False] 38.3120μs 10.6798μs 93.6345 KOps/s 94.0626 KOps/s $\color{#d91a1a}-0.46\%$
test_step_mdp_speed[True-False-False-True-True] 86.9330μs 32.2576μs 31.0004 KOps/s 31.0981 KOps/s $\color{#d91a1a}-0.31\%$
test_step_mdp_speed[True-False-False-True-False] 74.3790μs 21.1210μs 47.3462 KOps/s 48.1733 KOps/s $\color{#d91a1a}-1.72\%$
test_step_mdp_speed[True-False-False-False-True] 59.7220μs 18.5670μs 53.8589 KOps/s 53.1714 KOps/s $\color{#35bf28}+1.29\%$
test_step_mdp_speed[True-False-False-False-False] 69.9710μs 12.3097μs 81.2368 KOps/s 81.6780 KOps/s $\color{#d91a1a}-0.54\%$
test_step_mdp_speed[False-True-True-True-True] 74.7100μs 30.4179μs 32.8753 KOps/s 32.6089 KOps/s $\color{#35bf28}+0.82\%$
test_step_mdp_speed[False-True-True-True-False] 0.6340ms 19.3840μs 51.5888 KOps/s 51.5539 KOps/s $\color{#35bf28}+0.07\%$
test_step_mdp_speed[False-True-True-False-True] 62.0660μs 19.4613μs 51.3839 KOps/s 51.1815 KOps/s $\color{#35bf28}+0.40\%$
test_step_mdp_speed[False-True-True-False-False] 58.3690μs 11.9116μs 83.9517 KOps/s 82.4079 KOps/s $\color{#35bf28}+1.87\%$
test_step_mdp_speed[False-True-False-True-True] 72.6260μs 32.0413μs 31.2097 KOps/s 30.4933 KOps/s $\color{#35bf28}+2.35\%$
test_step_mdp_speed[False-True-False-True-False] 77.3850μs 20.9763μs 47.6729 KOps/s 48.0387 KOps/s $\color{#d91a1a}-0.76\%$
test_step_mdp_speed[False-True-False-False-True] 2.8675ms 21.2863μs 46.9785 KOps/s 46.8264 KOps/s $\color{#35bf28}+0.32\%$
test_step_mdp_speed[False-True-False-False-False] 52.7080μs 13.5806μs 73.6343 KOps/s 73.7520 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[False-False-True-True-True] 97.1120μs 34.0485μs 29.3698 KOps/s 29.7661 KOps/s $\color{#d91a1a}-1.33\%$
test_step_mdp_speed[False-False-True-True-False] 56.3150μs 22.6951μs 44.0623 KOps/s 43.8346 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[False-False-True-False-True] 80.7950μs 21.0386μs 47.5316 KOps/s 47.7173 KOps/s $\color{#d91a1a}-0.39\%$
test_step_mdp_speed[False-False-True-False-False] 56.8170μs 13.6240μs 73.3998 KOps/s 73.4802 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[False-False-False-True-True] 98.2440μs 35.3452μs 28.2924 KOps/s 28.6348 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[False-False-False-True-False] 67.9480μs 24.0456μs 41.5876 KOps/s 41.5787 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[False-False-False-False-True] 64.6410μs 22.4287μs 44.5857 KOps/s 44.7090 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[False-False-False-False-False] 56.6860μs 15.1795μs 65.8782 KOps/s 66.1637 KOps/s $\color{#d91a1a}-0.43\%$
test_values[generalized_advantage_estimate-True-True] 11.7213ms 9.7406ms 102.6634 Ops/s 101.8899 Ops/s $\color{#35bf28}+0.76\%$
test_values[vec_generalized_advantage_estimate-True-True] 38.5739ms 33.5705ms 29.7881 Ops/s 29.4150 Ops/s $\color{#35bf28}+1.27\%$
test_values[td0_return_estimate-False-False] 0.2677ms 0.2117ms 4.7227 KOps/s 5.0189 KOps/s $\textbf{\color{#d91a1a}-5.90\%}$
test_values[td1_return_estimate-False-False] 24.9613ms 24.5208ms 40.7817 Ops/s 40.7111 Ops/s $\color{#35bf28}+0.17\%$
test_values[vec_td1_return_estimate-False-False] 35.4216ms 33.6116ms 29.7516 Ops/s 29.5776 Ops/s $\color{#35bf28}+0.59\%$
test_values[td_lambda_return_estimate-True-False] 38.0106ms 34.9089ms 28.6460 Ops/s 27.8690 Ops/s $\color{#35bf28}+2.79\%$
test_values[vec_td_lambda_return_estimate-True-False] 37.5048ms 33.6783ms 29.6927 Ops/s 29.8687 Ops/s $\color{#d91a1a}-0.59\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.5740ms 8.3291ms 120.0613 Ops/s 116.7139 Ops/s $\color{#35bf28}+2.87\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.0569ms 1.7905ms 558.5138 Ops/s 550.1745 Ops/s $\color{#35bf28}+1.52\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.6595ms 0.3583ms 2.7909 KOps/s 2.7032 KOps/s $\color{#35bf28}+3.24\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 40.2249ms 38.2618ms 26.1357 Ops/s 23.3584 Ops/s $\textbf{\color{#35bf28}+11.89\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.9918ms 3.0592ms 326.8784 Ops/s 326.1022 Ops/s $\color{#35bf28}+0.24\%$
test_dqn_speed[False-None] 0.1153s 1.5064ms 663.8154 Ops/s 728.4292 Ops/s $\textbf{\color{#d91a1a}-8.87\%}$
test_dqn_speed[False-backward] 1.9465ms 1.8282ms 546.9743 Ops/s 539.9654 Ops/s $\color{#35bf28}+1.30\%$
test_dqn_speed[True-None] 0.8090ms 0.4692ms 2.1313 KOps/s 2.0967 KOps/s $\color{#35bf28}+1.65\%$
test_dqn_speed[True-backward] 0.9706ms 0.9169ms 1.0906 KOps/s 938.5640 Ops/s $\textbf{\color{#35bf28}+16.20\%}$
test_dqn_speed[reduce-overhead-None] 0.6176ms 0.4688ms 2.1333 KOps/s 2.1134 KOps/s $\color{#35bf28}+0.94\%$
test_dqn_speed[reduce-overhead-backward] 0.9648ms 0.9136ms 1.0946 KOps/s 1.1040 KOps/s $\color{#d91a1a}-0.85\%$
test_ddpg_speed[False-None] 3.7334ms 2.8020ms 356.8940 Ops/s 359.6083 Ops/s $\color{#d91a1a}-0.75\%$
test_ddpg_speed[False-backward] 4.0511ms 3.9213ms 255.0155 Ops/s 250.7696 Ops/s $\color{#35bf28}+1.69\%$
test_ddpg_speed[True-None] 1.5892ms 1.0084ms 991.7153 Ops/s 988.3484 Ops/s $\color{#35bf28}+0.34\%$
test_ddpg_speed[True-backward] 2.1046ms 1.9516ms 512.3971 Ops/s 512.0839 Ops/s $\color{#35bf28}+0.06\%$
test_ddpg_speed[reduce-overhead-None] 1.1860ms 1.0064ms 993.6465 Ops/s 980.0113 Ops/s $\color{#35bf28}+1.39\%$
test_ddpg_speed[reduce-overhead-backward] 2.0676ms 1.9455ms 514.0023 Ops/s 448.4877 Ops/s $\textbf{\color{#35bf28}+14.61\%}$
test_sac_speed[False-None] 9.6834ms 8.0686ms 123.9374 Ops/s 124.2194 Ops/s $\color{#d91a1a}-0.23\%$
test_sac_speed[False-backward] 12.1007ms 11.1666ms 89.5529 Ops/s 92.0054 Ops/s $\color{#d91a1a}-2.67\%$
test_sac_speed[True-None] 2.1318ms 1.8775ms 532.6216 Ops/s 520.8131 Ops/s $\color{#35bf28}+2.27\%$
test_sac_speed[True-backward] 3.9300ms 3.7212ms 268.7312 Ops/s 279.5880 Ops/s $\color{#d91a1a}-3.88\%$
test_sac_speed[reduce-overhead-None] 2.1421ms 1.8977ms 526.9591 Ops/s 527.1500 Ops/s $\color{#d91a1a}-0.04\%$
test_sac_speed[reduce-overhead-backward] 3.8096ms 3.6679ms 272.6333 Ops/s 275.0714 Ops/s $\color{#d91a1a}-0.89\%$
test_redq_speed[False-None] 14.3737ms 13.1677ms 75.9437 Ops/s 78.8714 Ops/s $\color{#d91a1a}-3.71\%$
test_redq_speed[False-backward] 23.7974ms 22.5607ms 44.3248 Ops/s 44.5788 Ops/s $\color{#d91a1a}-0.57\%$
test_redq_speed[True-None] 6.1494ms 5.1278ms 195.0152 Ops/s 204.0692 Ops/s $\color{#d91a1a}-4.44\%$
test_redq_speed[True-backward] 13.9843ms 12.9884ms 76.9919 Ops/s 81.8927 Ops/s $\textbf{\color{#d91a1a}-5.98\%}$
test_redq_speed[reduce-overhead-None] 5.9917ms 5.1709ms 193.3904 Ops/s 189.4164 Ops/s $\color{#35bf28}+2.10\%$
test_redq_speed[reduce-overhead-backward] 13.3569ms 13.0176ms 76.8192 Ops/s 75.4704 Ops/s $\color{#35bf28}+1.79\%$
test_redq_deprec_speed[False-None] 15.3696ms 13.4371ms 74.4206 Ops/s 73.3620 Ops/s $\color{#35bf28}+1.44\%$
test_redq_deprec_speed[False-backward] 21.8701ms 19.6343ms 50.9314 Ops/s 51.7771 Ops/s $\color{#d91a1a}-1.63\%$
test_redq_deprec_speed[True-None] 5.9054ms 3.8102ms 262.4566 Ops/s 265.2177 Ops/s $\color{#d91a1a}-1.04\%$
test_redq_deprec_speed[True-backward] 8.8095ms 8.6137ms 116.0942 Ops/s 123.0988 Ops/s $\textbf{\color{#d91a1a}-5.69\%}$
test_redq_deprec_speed[reduce-overhead-None] 8.0837ms 3.9576ms 252.6801 Ops/s 273.2517 Ops/s $\textbf{\color{#d91a1a}-7.53\%}$
test_redq_deprec_speed[reduce-overhead-backward] 9.1747ms 8.7355ms 114.4758 Ops/s 113.3285 Ops/s $\color{#35bf28}+1.01\%$
test_td3_speed[False-None] 8.4257ms 7.9155ms 126.3351 Ops/s 126.1455 Ops/s $\color{#35bf28}+0.15\%$
test_td3_speed[False-backward] 12.0504ms 10.4697ms 95.5136 Ops/s 95.8511 Ops/s $\color{#d91a1a}-0.35\%$
test_td3_speed[True-None] 1.8580ms 1.7586ms 568.6330 Ops/s 542.3586 Ops/s $\color{#35bf28}+4.84\%$
test_td3_speed[True-backward] 3.6461ms 3.4725ms 287.9764 Ops/s 290.9333 Ops/s $\color{#d91a1a}-1.02\%$
test_td3_speed[reduce-overhead-None] 1.9899ms 1.7573ms 569.0693 Ops/s 559.3852 Ops/s $\color{#35bf28}+1.73\%$
test_td3_speed[reduce-overhead-backward] 3.7857ms 3.5238ms 283.7865 Ops/s 295.5177 Ops/s $\color{#d91a1a}-3.97\%$
test_cql_speed[False-None] 36.6770ms 35.3622ms 28.2788 Ops/s 27.8643 Ops/s $\color{#35bf28}+1.49\%$
test_cql_speed[False-backward] 47.7618ms 46.0821ms 21.7004 Ops/s 21.4804 Ops/s $\color{#35bf28}+1.02\%$
test_cql_speed[True-None] 17.0020ms 16.0541ms 62.2894 Ops/s 64.4313 Ops/s $\color{#d91a1a}-3.32\%$
test_cql_speed[True-backward] 24.8100ms 23.4739ms 42.6005 Ops/s 44.5249 Ops/s $\color{#d91a1a}-4.32\%$
test_cql_speed[reduce-overhead-None] 17.5178ms 16.4076ms 60.9473 Ops/s 64.0887 Ops/s $\color{#d91a1a}-4.90\%$
test_cql_speed[reduce-overhead-backward] 25.6874ms 23.4813ms 42.5871 Ops/s 44.4418 Ops/s $\color{#d91a1a}-4.17\%$
test_a2c_speed[False-None] 10.6271ms 7.6509ms 130.7028 Ops/s 133.8615 Ops/s $\color{#d91a1a}-2.36\%$
test_a2c_speed[False-backward] 15.4442ms 15.0619ms 66.3926 Ops/s 70.3305 Ops/s $\textbf{\color{#d91a1a}-5.60\%}$
test_a2c_speed[True-None] 5.4645ms 4.8576ms 205.8649 Ops/s 232.6789 Ops/s $\textbf{\color{#d91a1a}-11.52\%}$
test_a2c_speed[True-backward] 11.9740ms 11.2693ms 88.7366 Ops/s 92.5906 Ops/s $\color{#d91a1a}-4.16\%$
test_a2c_speed[reduce-overhead-None] 5.0008ms 4.2413ms 235.7750 Ops/s 234.4248 Ops/s $\color{#35bf28}+0.58\%$
test_a2c_speed[reduce-overhead-backward] 11.9003ms 11.1707ms 89.5199 Ops/s 91.6222 Ops/s $\color{#d91a1a}-2.29\%$
test_ppo_speed[False-None] 8.1525ms 7.6527ms 130.6733 Ops/s 131.9475 Ops/s $\color{#d91a1a}-0.97\%$
test_ppo_speed[False-backward] 16.1237ms 15.4031ms 64.9221 Ops/s 66.0444 Ops/s $\color{#d91a1a}-1.70\%$
test_ppo_speed[True-None] 4.0718ms 3.8031ms 262.9436 Ops/s 265.0581 Ops/s $\color{#d91a1a}-0.80\%$
test_ppo_speed[True-backward] 10.2502ms 9.9003ms 101.0073 Ops/s 102.1986 Ops/s $\color{#d91a1a}-1.17\%$
test_ppo_speed[reduce-overhead-None] 4.3623ms 3.8037ms 262.9027 Ops/s 264.1937 Ops/s $\color{#d91a1a}-0.49\%$
test_ppo_speed[reduce-overhead-backward] 10.7194ms 9.9720ms 100.2812 Ops/s 103.0138 Ops/s $\color{#d91a1a}-2.65\%$
test_reinforce_speed[False-None] 7.7545ms 6.6563ms 150.2327 Ops/s 152.4972 Ops/s $\color{#d91a1a}-1.48\%$
test_reinforce_speed[False-backward] 12.8024ms 10.5322ms 94.9468 Ops/s 102.3251 Ops/s $\textbf{\color{#d91a1a}-7.21\%}$
test_reinforce_speed[True-None] 3.4743ms 2.8225ms 354.3005 Ops/s 359.0864 Ops/s $\color{#d91a1a}-1.33\%$
test_reinforce_speed[True-backward] 9.5984ms 8.8568ms 112.9071 Ops/s 113.5110 Ops/s $\color{#d91a1a}-0.53\%$
test_reinforce_speed[reduce-overhead-None] 3.4411ms 2.7856ms 358.9893 Ops/s 365.3238 Ops/s $\color{#d91a1a}-1.73\%$
test_reinforce_speed[reduce-overhead-backward] 9.2843ms 8.7356ms 114.4744 Ops/s 112.1207 Ops/s $\color{#35bf28}+2.10\%$
test_iql_speed[False-None] 33.4320ms 32.0906ms 31.1618 Ops/s 30.8923 Ops/s $\color{#35bf28}+0.87\%$
test_iql_speed[False-backward] 46.7417ms 45.0896ms 22.1780 Ops/s 21.9520 Ops/s $\color{#35bf28}+1.03\%$
test_iql_speed[True-None] 12.7041ms 10.9128ms 91.6355 Ops/s 87.8883 Ops/s $\color{#35bf28}+4.26\%$
test_iql_speed[True-backward] 23.0500ms 22.2734ms 44.8966 Ops/s 43.3959 Ops/s $\color{#35bf28}+3.46\%$
test_iql_speed[reduce-overhead-None] 12.6087ms 11.0759ms 90.2860 Ops/s 90.6262 Ops/s $\color{#d91a1a}-0.38\%$
test_iql_speed[reduce-overhead-backward] 23.5005ms 22.0703ms 45.3097 Ops/s 44.7259 Ops/s $\color{#35bf28}+1.31\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8606ms 5.0481ms 198.0950 Ops/s 204.9194 Ops/s $\color{#d91a1a}-3.33\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8597ms 0.5262ms 1.9006 KOps/s 1.8978 KOps/s $\color{#35bf28}+0.15\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7478ms 0.5057ms 1.9775 KOps/s 1.9433 KOps/s $\color{#35bf28}+1.76\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.9793ms 4.7883ms 208.8421 Ops/s 203.1328 Ops/s $\color{#35bf28}+2.81\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.5951ms 0.5098ms 1.9615 KOps/s 1.8873 KOps/s $\color{#35bf28}+3.93\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6957ms 0.4876ms 2.0509 KOps/s 1.9980 KOps/s $\color{#35bf28}+2.65\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.2297ms 1.6414ms 609.2480 Ops/s 584.7531 Ops/s $\color{#35bf28}+4.19\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.1157ms 1.5826ms 631.8586 Ops/s 608.8224 Ops/s $\color{#35bf28}+3.78\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.6813ms 5.0634ms 197.4941 Ops/s 203.1919 Ops/s $\color{#d91a1a}-2.80\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.8893ms 0.6549ms 1.5270 KOps/s 1.4634 KOps/s $\color{#35bf28}+4.35\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9083ms 0.6365ms 1.5712 KOps/s 1.5345 KOps/s $\color{#35bf28}+2.39\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.4468ms 4.8842ms 204.7433 Ops/s 201.5425 Ops/s $\color{#35bf28}+1.59\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7222ms 0.5286ms 1.8920 KOps/s 1.8534 KOps/s $\color{#35bf28}+2.08\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 7.6818ms 0.5208ms 1.9201 KOps/s 1.9456 KOps/s $\color{#d91a1a}-1.31\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.3036ms 4.9541ms 201.8546 Ops/s 200.3398 Ops/s $\color{#35bf28}+0.76\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0239ms 0.5157ms 1.9392 KOps/s 1.8726 KOps/s $\color{#35bf28}+3.55\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7800ms 0.5042ms 1.9835 KOps/s 1.9426 KOps/s $\color{#35bf28}+2.10\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.3081ms 5.0634ms 197.4947 Ops/s 199.7277 Ops/s $\color{#d91a1a}-1.12\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2695ms 0.6553ms 1.5261 KOps/s 1.4147 KOps/s $\textbf{\color{#35bf28}+7.87\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9049ms 0.6402ms 1.5621 KOps/s 1.4867 KOps/s $\textbf{\color{#35bf28}+5.07\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5199s 14.4710ms 69.1035 Ops/s 234.5231 Ops/s $\textbf{\color{#d91a1a}-70.53\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 4.9472ms 2.2564ms 443.1890 Ops/s 429.9235 Ops/s $\color{#35bf28}+3.09\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.5852ms 1.4496ms 689.8541 Ops/s 689.2585 Ops/s $\color{#35bf28}+0.09\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.3762ms 4.2387ms 235.9231 Ops/s 226.4119 Ops/s $\color{#35bf28}+4.20\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 7.0672ms 2.3246ms 430.1735 Ops/s 437.8839 Ops/s $\color{#d91a1a}-1.76\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.6484ms 1.3443ms 743.8989 Ops/s 786.1514 Ops/s $\textbf{\color{#d91a1a}-5.37\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4780s 13.9280ms 71.7979 Ops/s 239.2105 Ops/s $\textbf{\color{#d91a1a}-69.99\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.7987ms 2.4922ms 401.2563 Ops/s 393.7211 Ops/s $\color{#35bf28}+1.91\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.6267ms 1.4705ms 680.0568 Ops/s 673.4217 Ops/s $\color{#35bf28}+0.99\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.4703ms 11.1634ms 89.5781 Ops/s 86.6967 Ops/s $\color{#35bf28}+3.32\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.9851ms 14.1826ms 70.5091 Ops/s 68.9988 Ops/s $\color{#35bf28}+2.19\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.6221ms 20.1382ms 49.6569 Ops/s 48.6058 Ops/s $\color{#35bf28}+2.16\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 14.5634ms 14.3924ms 69.4810 Ops/s 67.5596 Ops/s $\color{#35bf28}+2.84\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.3981ms 19.9051ms 50.2383 Ops/s 49.0421 Ops/s $\color{#35bf28}+2.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 15.9527ms 15.3995ms 64.9370 Ops/s 60.7070 Ops/s $\textbf{\color{#35bf28}+6.97\%}$
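
For reference, the Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns. The sketch below recomputes it for a few rows copied from the table above. The helper names (`parse_ops`, `relative_change`, `classify`) are hypothetical and not part of the benchmark tooling, and the 5% threshold is an assumption: the report does not state it, although it is consistent with which rows are shown in bold.

```python
# Multipliers for the throughput units that appear in the report.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '1.0906 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption, not a documented value."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_simple": ("2.3796 Ops/s", "2.2426 Ops/s"),
    "test_transformed": ("1.6731 Ops/s", "1.6590 Ops/s"),
    "test_dqn_speed[False-None]": ("663.8154 Ops/s", "728.4292 Ops/s"),
    "test_dqn_speed[True-backward]": ("1.0906 KOps/s", "938.5640 Ops/s"),
}

for name, (ops, ops_head) in rows.items():
    pct = relative_change(parse_ops(ops), parse_ops(ops_head))
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Note that both columns have to be converted to the same unit first, since some rows mix Ops/s and KOps/s.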

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: 12. Worsened: 17.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.7288s 0.7282s 1.3732 Ops/s 1.3709 Ops/s $\color{#35bf28}+0.17\%$
test_transformed 0.9677s 0.9672s 1.0340 Ops/s 1.0298 Ops/s $\color{#35bf28}+0.40\%$
test_serial 2.1330s 2.1189s 0.4720 Ops/s 0.4862 Ops/s $\color{#d91a1a}-2.92\%$
test_parallel 2.0656s 1.9585s 0.5106 Ops/s 0.5205 Ops/s $\color{#d91a1a}-1.91\%$
test_step_mdp_speed[True-True-True-True-True] 0.1405ms 35.3172μs 28.3148 KOps/s 29.2987 KOps/s $\color{#d91a1a}-3.36\%$
test_step_mdp_speed[True-True-True-True-False] 67.1220μs 19.8761μs 50.3117 KOps/s 51.4132 KOps/s $\color{#d91a1a}-2.14\%$
test_step_mdp_speed[True-True-True-False-True] 55.2910μs 19.0557μs 52.4777 KOps/s 53.0027 KOps/s $\color{#d91a1a}-0.99\%$
test_step_mdp_speed[True-True-True-False-False] 58.4810μs 11.3292μs 88.2672 KOps/s 89.2302 KOps/s $\color{#d91a1a}-1.08\%$
test_step_mdp_speed[True-True-False-True-True] 88.7120μs 36.8298μs 27.1519 KOps/s 27.3733 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[True-True-False-True-False] 51.8310μs 21.3745μs 46.7848 KOps/s 46.2573 KOps/s $\color{#35bf28}+1.14\%$
test_step_mdp_speed[True-True-False-False-True] 77.7220μs 21.0517μs 47.5020 KOps/s 47.0822 KOps/s $\color{#35bf28}+0.89\%$
test_step_mdp_speed[True-True-False-False-False] 57.0710μs 12.9905μs 76.9792 KOps/s 76.6275 KOps/s $\color{#35bf28}+0.46\%$
test_step_mdp_speed[True-False-True-True-True] 79.3510μs 38.2410μs 26.1499 KOps/s 26.1376 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[True-False-True-True-False] 69.5010μs 23.4098μs 42.7171 KOps/s 42.4104 KOps/s $\color{#35bf28}+0.72\%$
test_step_mdp_speed[True-False-True-False-True] 51.2810μs 20.7265μs 48.2475 KOps/s 47.9795 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[True-False-True-False-False] 52.3610μs 13.0381μs 76.6983 KOps/s 75.3582 KOps/s $\color{#35bf28}+1.78\%$
test_step_mdp_speed[True-False-False-True-True] 91.3320μs 40.5574μs 24.6564 KOps/s 24.5917 KOps/s $\color{#35bf28}+0.26\%$
test_step_mdp_speed[True-False-False-True-False] 54.9210μs 25.2301μs 39.6353 KOps/s 38.8064 KOps/s $\color{#35bf28}+2.14\%$
test_step_mdp_speed[True-False-False-False-True] 61.6320μs 23.0878μs 43.3130 KOps/s 43.5016 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[True-False-False-False-False] 57.2320μs 14.9877μs 66.7212 KOps/s 66.6085 KOps/s $\color{#35bf28}+0.17\%$
test_step_mdp_speed[False-True-True-True-True] 72.6920μs 39.1870μs 25.5187 KOps/s 26.0358 KOps/s $\color{#d91a1a}-1.99\%$
test_step_mdp_speed[False-True-True-True-False] 54.5510μs 23.6594μs 42.2665 KOps/s 42.2571 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[False-True-True-False-True] 68.7720μs 24.2904μs 41.1685 KOps/s 40.0146 KOps/s $\color{#35bf28}+2.88\%$
test_step_mdp_speed[False-True-True-False-False] 47.1110μs 14.4997μs 68.9667 KOps/s 66.5906 KOps/s $\color{#35bf28}+3.57\%$
test_step_mdp_speed[False-True-False-True-True] 90.5520μs 41.3827μs 24.1647 KOps/s 24.8523 KOps/s $\color{#d91a1a}-2.77\%$
test_step_mdp_speed[False-True-False-True-False] 63.7510μs 25.6491μs 38.9877 KOps/s 39.0123 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[False-True-False-False-True] 3.6106ms 26.8872μs 37.1925 KOps/s 38.1402 KOps/s $\color{#d91a1a}-2.49\%$
test_step_mdp_speed[False-True-False-False-False] 92.7220μs 16.5047μs 60.5887 KOps/s 60.1899 KOps/s $\color{#35bf28}+0.66\%$
test_step_mdp_speed[False-False-True-True-True] 75.0710μs 42.4983μs 23.5303 KOps/s 23.7375 KOps/s $\color{#d91a1a}-0.87\%$
test_step_mdp_speed[False-False-True-True-False] 77.8820μs 27.4095μs 36.4837 KOps/s 36.0908 KOps/s $\color{#35bf28}+1.09\%$
test_step_mdp_speed[False-False-True-False-True] 68.5110μs 26.5228μs 37.7035 KOps/s 38.3347 KOps/s $\color{#d91a1a}-1.65\%$
test_step_mdp_speed[False-False-True-False-False] 45.3710μs 16.7912μs 59.5549 KOps/s 60.6342 KOps/s $\color{#d91a1a}-1.78\%$
test_step_mdp_speed[False-False-False-True-True] 95.7620μs 43.5522μs 22.9609 KOps/s 22.8453 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[False-False-False-True-False] 57.6820μs 29.1028μs 34.3610 KOps/s 34.9369 KOps/s $\color{#d91a1a}-1.65\%$
test_step_mdp_speed[False-False-False-False-True] 56.9110μs 27.5645μs 36.2786 KOps/s 36.2470 KOps/s $\color{#35bf28}+0.09\%$
test_step_mdp_speed[False-False-False-False-False] 61.1120μs 18.0708μs 55.3380 KOps/s 54.6667 KOps/s $\color{#35bf28}+1.23\%$
test_values[generalized_advantage_estimate-True-True] 26.3625ms 25.3443ms 39.4566 Ops/s 39.6954 Ops/s $\color{#d91a1a}-0.60\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1182s 3.2625ms 306.5175 Ops/s 321.3091 Ops/s $\color{#d91a1a}-4.60\%$
test_values[td0_return_estimate-False-False] 0.1016ms 79.1813μs 12.6293 KOps/s 12.8691 KOps/s $\color{#d91a1a}-1.86\%$
test_values[td1_return_estimate-False-False] 58.4243ms 56.1143ms 17.8208 Ops/s 17.5677 Ops/s $\color{#35bf28}+1.44\%$
test_values[vec_td1_return_estimate-False-False] 1.3432ms 1.0831ms 923.2515 Ops/s 924.4489 Ops/s $\color{#d91a1a}-0.13\%$
test_values[td_lambda_return_estimate-True-False] 90.8848ms 88.7747ms 11.2645 Ops/s 11.3196 Ops/s $\color{#d91a1a}-0.49\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2992ms 1.0795ms 926.3337 Ops/s 924.2966 Ops/s $\color{#35bf28}+0.22\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.9942ms 24.8600ms 40.2252 Ops/s 40.7568 Ops/s $\color{#d91a1a}-1.30\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0497ms 0.7492ms 1.3348 KOps/s 1.3560 KOps/s $\color{#d91a1a}-1.56\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7477ms 0.6694ms 1.4938 KOps/s 1.5052 KOps/s $\color{#d91a1a}-0.75\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5328ms 1.4721ms 679.2810 Ops/s 681.3556 Ops/s $\color{#d91a1a}-0.30\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7292ms 0.6854ms 1.4590 KOps/s 1.4057 KOps/s $\color{#35bf28}+3.79\%$
test_dqn_speed[False-None] 7.7170ms 1.4410ms 693.9497 Ops/s 726.1379 Ops/s $\color{#d91a1a}-4.43\%$
test_dqn_speed[False-backward] 2.0812ms 2.0294ms 492.7533 Ops/s 499.5471 Ops/s $\color{#d91a1a}-1.36\%$
test_dqn_speed[True-None] 0.7055ms 0.5217ms 1.9168 KOps/s 1.8612 KOps/s $\color{#35bf28}+2.98\%$
test_dqn_speed[True-backward] 1.2334ms 1.1886ms 841.3562 Ops/s 838.3722 Ops/s $\color{#35bf28}+0.36\%$
test_dqn_speed[reduce-overhead-None] 0.5861ms 0.5383ms 1.8578 KOps/s 1.8498 KOps/s $\color{#35bf28}+0.43\%$
test_dqn_speed[reduce-overhead-backward] 1.1305ms 1.0500ms 952.3701 Ops/s 925.6872 Ops/s $\color{#35bf28}+2.88\%$
test_ddpg_speed[False-None] 3.6546ms 2.6919ms 371.4850 Ops/s 378.5182 Ops/s $\color{#d91a1a}-1.86\%$
test_ddpg_speed[False-backward] 4.4806ms 4.1094ms 243.3449 Ops/s 248.9693 Ops/s $\color{#d91a1a}-2.26\%$
test_ddpg_speed[True-None] 1.1818ms 1.1052ms 904.8501 Ops/s 942.4328 Ops/s $\color{#d91a1a}-3.99\%$
test_ddpg_speed[True-backward] 2.3582ms 2.2812ms 438.3563 Ops/s 438.5877 Ops/s $\color{#d91a1a}-0.05\%$
test_ddpg_speed[reduce-overhead-None] 1.1784ms 1.1040ms 905.7984 Ops/s 929.0156 Ops/s $\color{#d91a1a}-2.50\%$
test_ddpg_speed[reduce-overhead-backward] 2.0366ms 1.8053ms 553.9209 Ops/s 570.3648 Ops/s $\color{#d91a1a}-2.88\%$
test_sac_speed[False-None] 8.6145ms 8.0293ms 124.5440 Ops/s 131.7950 Ops/s $\textbf{\color{#d91a1a}-5.50\%}$
test_sac_speed[False-backward] 11.6510ms 11.0877ms 90.1903 Ops/s 92.0729 Ops/s $\color{#d91a1a}-2.04\%$
test_sac_speed[True-None] 1.5666ms 1.5062ms 663.9017 Ops/s 625.6620 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_sac_speed[True-backward] 3.4078ms 3.3336ms 299.9773 Ops/s 316.5043 Ops/s $\textbf{\color{#d91a1a}-5.22\%}$
test_sac_speed[reduce-overhead-None] 22.5877ms 12.4449ms 80.3544 Ops/s 80.4863 Ops/s $\color{#d91a1a}-0.16\%$
test_sac_speed[reduce-overhead-backward] 1.5062ms 1.4643ms 682.9082 Ops/s 767.5098 Ops/s $\textbf{\color{#d91a1a}-11.02\%}$
test_redq_speed[False-None] 8.0523ms 7.2491ms 137.9484 Ops/s 136.8354 Ops/s $\color{#35bf28}+0.81\%$
test_redq_speed[False-backward] 12.1002ms 11.3699ms 87.9514 Ops/s 90.5016 Ops/s $\color{#d91a1a}-2.82\%$
test_redq_speed[True-None] 2.0171ms 1.9514ms 512.4583 Ops/s 499.5018 Ops/s $\color{#35bf28}+2.59\%$
test_redq_speed[True-backward] 3.8434ms 3.7439ms 267.0979 Ops/s 265.6549 Ops/s $\color{#35bf28}+0.54\%$
test_redq_speed[reduce-overhead-None] 2.0088ms 1.9504ms 512.7041 Ops/s 510.3883 Ops/s $\color{#35bf28}+0.45\%$
test_redq_speed[reduce-overhead-backward] 3.8090ms 3.7492ms 266.7266 Ops/s 268.4820 Ops/s $\color{#d91a1a}-0.65\%$
test_redq_deprec_speed[False-None] 9.1578ms 8.6358ms 115.7976 Ops/s 116.1769 Ops/s $\color{#d91a1a}-0.33\%$
test_redq_deprec_speed[False-backward] 12.3724ms 11.8482ms 84.4009 Ops/s 83.9221 Ops/s $\color{#35bf28}+0.57\%$
test_redq_deprec_speed[True-None] 2.3509ms 2.2639ms 441.7217 Ops/s 438.2761 Ops/s $\color{#35bf28}+0.79\%$
test_redq_deprec_speed[True-backward] 4.1332ms 4.0699ms 245.7073 Ops/s 257.4202 Ops/s $\color{#d91a1a}-4.55\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4686ms 2.3045ms 433.9372 Ops/s 439.9478 Ops/s $\color{#d91a1a}-1.37\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.1304ms 4.0596ms 246.3310 Ops/s 245.1620 Ops/s $\color{#35bf28}+0.48\%$
test_td3_speed[False-None] 7.5740ms 7.4971ms 133.3846 Ops/s 134.4809 Ops/s $\color{#d91a1a}-0.82\%$
test_td3_speed[False-backward] 10.5802ms 10.0181ms 99.8195 Ops/s 100.6471 Ops/s $\color{#d91a1a}-0.82\%$
test_td3_speed[True-None] 1.5464ms 1.5333ms 652.1912 Ops/s 651.5101 Ops/s $\color{#35bf28}+0.10\%$
test_td3_speed[True-backward] 3.2688ms 3.1771ms 314.7485 Ops/s 311.9477 Ops/s $\color{#35bf28}+0.90\%$
test_td3_speed[reduce-overhead-None] 78.0453ms 24.6663ms 40.5411 Ops/s 38.8845 Ops/s $\color{#35bf28}+4.26\%$
test_td3_speed[reduce-overhead-backward] 1.3986ms 1.3588ms 735.9681 Ops/s 791.8702 Ops/s $\textbf{\color{#d91a1a}-7.06\%}$
test_cql_speed[False-None] 16.5798ms 15.6047ms 64.0833 Ops/s 65.2450 Ops/s $\color{#d91a1a}-1.78\%$
test_cql_speed[False-backward] 21.5304ms 20.9339ms 47.7695 Ops/s 48.9199 Ops/s $\color{#d91a1a}-2.35\%$
test_cql_speed[True-None] 3.1516ms 2.8500ms 350.8754 Ops/s 336.7094 Ops/s $\color{#35bf28}+4.21\%$
test_cql_speed[True-backward] 5.5521ms 5.1103ms 195.6825 Ops/s 194.6342 Ops/s $\color{#35bf28}+0.54\%$
test_cql_speed[reduce-overhead-None] 21.1090ms 12.8755ms 77.6672 Ops/s 77.1408 Ops/s $\color{#35bf28}+0.68\%$
test_cql_speed[reduce-overhead-backward] 1.6718ms 1.6086ms 621.6407 Ops/s 622.2290 Ops/s $\color{#d91a1a}-0.09\%$
test_a2c_speed[False-None] 3.1618ms 3.0724ms 325.4787 Ops/s 325.8290 Ops/s $\color{#d91a1a}-0.11\%$
test_a2c_speed[False-backward] 6.7382ms 6.1614ms 162.3019 Ops/s 160.5223 Ops/s $\color{#35bf28}+1.11\%$
test_a2c_speed[True-None] 1.0210ms 0.9748ms 1.0258 KOps/s 1.0174 KOps/s $\color{#35bf28}+0.82\%$
test_a2c_speed[True-backward] 2.9067ms 2.7345ms 365.7009 Ops/s 367.8637 Ops/s $\color{#d91a1a}-0.59\%$
test_a2c_speed[reduce-overhead-None] 0.3828s 12.0368ms 83.0785 Ops/s 89.3264 Ops/s $\textbf{\color{#d91a1a}-6.99\%}$
test_a2c_speed[reduce-overhead-backward] 1.1594ms 1.1244ms 889.4017 Ops/s 1.0202 KOps/s $\textbf{\color{#d91a1a}-12.82\%}$
test_ppo_speed[False-None] 3.6368ms 3.5306ms 283.2409 Ops/s 287.2764 Ops/s $\color{#d91a1a}-1.40\%$
test_ppo_speed[False-backward] 7.2207ms 6.8606ms 145.7592 Ops/s 151.6851 Ops/s $\color{#d91a1a}-3.91\%$
test_ppo_speed[True-None] 0.9703ms 0.9171ms 1.0904 KOps/s 1.0761 KOps/s $\color{#35bf28}+1.33\%$
test_ppo_speed[True-backward] 2.7626ms 2.6943ms 371.1529 Ops/s 397.9731 Ops/s $\textbf{\color{#d91a1a}-6.74\%}$
test_ppo_speed[reduce-overhead-None] 0.5352ms 0.4822ms 2.0736 KOps/s 1.9079 KOps/s $\textbf{\color{#35bf28}+8.69\%}$
test_ppo_speed[reduce-overhead-backward] 1.1626ms 1.1141ms 897.5731 Ops/s 897.5576 Ops/s $+0.00\%$
test_reinforce_speed[False-None] 2.2714ms 2.1267ms 470.2147 Ops/s 475.2312 Ops/s $\color{#d91a1a}-1.06\%$
test_reinforce_speed[False-backward] 3.2495ms 3.2111ms 311.4157 Ops/s 312.4943 Ops/s $\color{#d91a1a}-0.35\%$
test_reinforce_speed[True-None] 0.9988ms 0.8118ms 1.2319 KOps/s 1.2336 KOps/s $\color{#d91a1a}-0.14\%$
test_reinforce_speed[True-backward] 2.6077ms 2.5202ms 396.7880 Ops/s 395.8570 Ops/s $\color{#35bf28}+0.24\%$
test_reinforce_speed[reduce-overhead-None] 22.0569ms 11.7129ms 85.3760 Ops/s 86.8344 Ops/s $\color{#d91a1a}-1.68\%$
test_reinforce_speed[reduce-overhead-backward] 1.2284ms 1.1773ms 849.3817 Ops/s 852.6250 Ops/s $\color{#d91a1a}-0.38\%$
test_iql_speed[False-None] 9.2600ms 8.8148ms 113.4450 Ops/s 115.3566 Ops/s $\color{#d91a1a}-1.66\%$
test_iql_speed[False-backward] 13.3330ms 12.7669ms 78.3273 Ops/s 79.0279 Ops/s $\color{#d91a1a}-0.89\%$
test_iql_speed[True-None] 1.8845ms 1.8089ms 552.8237 Ops/s 590.6741 Ops/s $\textbf{\color{#d91a1a}-6.41\%}$
test_iql_speed[True-backward] 4.3803ms 4.3120ms 231.9117 Ops/s 231.6225 Ops/s $\color{#35bf28}+0.12\%$
test_iql_speed[reduce-overhead-None] 20.0163ms 11.4088ms 87.6519 Ops/s 88.8974 Ops/s $\color{#d91a1a}-1.40\%$
test_iql_speed[reduce-overhead-backward] 1.6080ms 1.5614ms 640.4352 Ops/s 719.5738 Ops/s $\textbf{\color{#d91a1a}-11.00\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.5067ms 6.0493ms 165.3089 Ops/s 163.2751 Ops/s $\color{#35bf28}+1.25\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.4944ms 0.3161ms 3.1639 KOps/s 3.5247 KOps/s $\textbf{\color{#d91a1a}-10.24\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6523ms 0.2984ms 3.3515 KOps/s 3.4154 KOps/s $\color{#d91a1a}-1.87\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0655ms 5.8434ms 171.1319 Ops/s 171.1237 Ops/s $+0.00\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0469ms 0.3100ms 3.2260 KOps/s 3.3834 KOps/s $\color{#d91a1a}-4.65\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5405ms 0.2971ms 3.3664 KOps/s 3.5680 KOps/s $\textbf{\color{#d91a1a}-5.65\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6299ms 1.3949ms 716.9132 Ops/s 806.4436 Ops/s $\textbf{\color{#d91a1a}-11.10\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5309ms 1.1975ms 835.0707 Ops/s 783.5364 Ops/s $\textbf{\color{#35bf28}+6.58\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1278ms 6.0238ms 166.0072 Ops/s 166.6383 Ops/s $\color{#d91a1a}-0.38\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.4185ms 0.4532ms 2.2064 KOps/s 2.1238 KOps/s $\color{#35bf28}+3.89\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6937ms 0.4620ms 2.1644 KOps/s 2.3393 KOps/s $\textbf{\color{#d91a1a}-7.48\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0353ms 5.9065ms 169.3060 Ops/s 171.0513 Ops/s $\color{#d91a1a}-1.02\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.8174ms 0.3321ms 3.0111 KOps/s 3.3911 KOps/s $\textbf{\color{#d91a1a}-11.21\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6014ms 0.3129ms 3.1956 KOps/s 3.7180 KOps/s $\textbf{\color{#d91a1a}-14.05\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9994ms 5.8005ms 172.3998 Ops/s 169.6865 Ops/s $\color{#35bf28}+1.60\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.2666ms 0.2626ms 3.8082 KOps/s 2.8996 KOps/s $\textbf{\color{#35bf28}+31.33\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4479ms 0.2435ms 4.1063 KOps/s 3.1989 KOps/s $\textbf{\color{#35bf28}+28.37\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1823ms 6.0711ms 164.7156 Ops/s 166.8835 Ops/s $\color{#d91a1a}-1.30\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.6511ms 0.4315ms 2.3176 KOps/s 2.1865 KOps/s $\textbf{\color{#35bf28}+5.99\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 9.2952ms 0.4218ms 2.3707 KOps/s 2.1854 KOps/s $\textbf{\color{#35bf28}+8.48\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.8164ms 5.2163ms 191.7080 Ops/s 191.5114 Ops/s $\color{#35bf28}+0.10\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 6.1672ms 1.9688ms 507.9214 Ops/s 455.9262 Ops/s $\textbf{\color{#35bf28}+11.40\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.6215ms 1.2499ms 800.0513 Ops/s 798.7173 Ops/s $\color{#35bf28}+0.17\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4843s 14.9000ms 67.1139 Ops/s 190.2218 Ops/s $\textbf{\color{#d91a1a}-64.72\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.1599ms 2.0159ms 496.0602 Ops/s 452.0293 Ops/s $\textbf{\color{#35bf28}+9.74\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.4700ms 1.1785ms 848.5622 Ops/s 806.5626 Ops/s $\textbf{\color{#35bf28}+5.21\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.4771ms 5.5360ms 180.6346 Ops/s 33.0834 Ops/s $\textbf{\color{#35bf28}+446.00\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.9879ms 2.1909ms 456.4372 Ops/s 544.0089 Ops/s $\textbf{\color{#d91a1a}-16.10\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.0286ms 1.3633ms 733.5291 Ops/s 720.1979 Ops/s $\color{#35bf28}+1.85\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.6219ms 12.4211ms 80.5085 Ops/s 76.9628 Ops/s $\color{#35bf28}+4.61\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.0662ms 16.2972ms 61.3602 Ops/s 57.7227 Ops/s $\textbf{\color{#35bf28}+6.30\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.0990ms 17.2290ms 58.0415 Ops/s 56.2087 Ops/s $\color{#35bf28}+3.26\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.4858ms 16.5962ms 60.2549 Ops/s 60.3073 Ops/s $\color{#d91a1a}-0.09\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 19.0850ms 17.4951ms 57.1590 Ops/s 57.3896 Ops/s $\color{#d91a1a}-0.40\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.9451ms 18.0541ms 55.3891 Ops/s 56.3475 Ops/s $\color{#d91a1a}-1.70\%$
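
The last column in each row appears to be the relative difference between the two throughput columns (this PR versus the repo HEAD baseline), with changes larger than roughly 5% shown in bold. A minimal sketch of that arithmetic follows, assuming this reading of the columns. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling.

```python
# Hypothetical helpers for re-deriving the "change" column from the two
# throughput columns; the column interpretation is an assumption.

def parse_ops(text: str) -> float:
    """Convert a throughput string such as '1.0258 KOps/s' to ops per second."""
    value, unit = text.split()
    scale = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}[unit]
    return float(value) * scale


def relative_change(current: str, baseline: str) -> float:
    """Percentage change of `current` throughput relative to `baseline`."""
    cur = parse_ops(current)
    base = parse_ops(baseline)
    return (cur - base) / base * 100.0


# test_td3_speed[True-backward]: the table reports +0.90%
print(f"{relative_change('314.7485 Ops/s', '311.9477 Ops/s'):+.2f}%")

# test_a2c_speed[reduce-overhead-backward]: mixed units, the table reports -12.82%
print(f"{relative_change('889.4017 Ops/s', '1.0202 KOps/s'):+.2f}%")
```

Under this reading, a positive value means the PR run was faster than the baseline. Large swings on single rows, such as the `ListStorage` populate benchmarks, may reflect run-to-run noise rather than a real change.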
