feat: specialize dispatches for faster concrete array generation (#213)
* feat: specialize dispatches for faster concrete array generation
* chore: apply formatting suggestion

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent b6ee968, commit a17315c
Showing 2 changed files with 37 additions and 1 deletion.
Reactant.jl Benchmarks
| Benchmark suite | Current (ns) | Previous (ns) | Ratio |
|---|---|---|---|
| ViT base (256 x 256 x 3 x 32)/forward/CUDA/Reactant (optimize = :after_enzyme) | 1263749401 | 1322156738 | 0.96 |
| ViT base (256 x 256 x 3 x 32)/forward/CUDA/Reactant | 1254668396 | 1293942538 | 0.97 |
| ViT base (256 x 256 x 3 x 32)/forward/CUDA/Reactant (optimize = :before_enzyme) | 1218277318 | 1224868312 | 0.99 |
| ViT base (256 x 256 x 3 x 32)/forward/CUDA/Reactant (optimize = :only_enzyme) | 2376495016 | 2323944334 | 1.02 |
| ViT base (256 x 256 x 3 x 32)/forward/CUDA/Lux | 217726580 | 216612531 | 1.01 |
| ViT base (256 x 256 x 3 x 32)/forward/CPU/Reactant (optimize = :after_enzyme) | 7226166416 | 6954798003 | 1.04 |
| ViT base (256 x 256 x 3 x 32)/forward/CPU/Reactant | 5511150207 | 5103509804 | 1.08 |
| ViT base (256 x 256 x 3 x 32)/forward/CPU/Reactant (optimize = :before_enzyme) | 5102020848 | 5081171584 | 1.00 |
| ViT base (256 x 256 x 3 x 32)/forward/CPU/Reactant (optimize = :only_enzyme) | 6993217459 | 6720851214 | 1.04 |
| ViT base (256 x 256 x 3 x 32)/forward/CPU/Lux | 38085761917 | 36264215655 | 1.05 |
| ViT small (256 x 256 x 3 x 4)/forward/CUDA/Reactant (optimize = :after_enzyme) | 1208392095 | 1325295976 | 0.91 |
| ViT small (256 x 256 x 3 x 4)/forward/CUDA/Reactant | 1331979590 | 1316239703 | 1.01 |
| ViT small (256 x 256 x 3 x 4)/forward/CUDA/Reactant (optimize = :before_enzyme) | 1228565001 | 1223956642 | 1.00 |
| ViT small (256 x 256 x 3 x 4)/forward/CUDA/Reactant (optimize = :only_enzyme) | 2452231772 | 2499287891 | 0.98 |
| ViT small (256 x 256 x 3 x 4)/forward/CUDA/Lux | 8748209 | 8665141 | 1.01 |
| ViT small (256 x 256 x 3 x 4)/forward/CPU/Reactant (optimize = :after_enzyme) | 1578057500 | 1575352408 | 1.00 |
| ViT small (256 x 256 x 3 x 4)/forward/CPU/Reactant | 1557311922 | 1567227136 | 0.99 |
| ViT small (256 x 256 x 3 x 4)/forward/CPU/Reactant (optimize = :before_enzyme) | 1557684126 | 1566092027.5 | 0.99 |
| ViT small (256 x 256 x 3 x 4)/forward/CPU/Reactant (optimize = :only_enzyme) | 2769517816 | 2878841123 | 0.96 |
| ViT small (256 x 256 x 3 x 4)/forward/CPU/Lux | 3303048898.5 | 2685299362 | 1.23 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CUDA/Reactant (optimize = :after_enzyme) | 1303432996 | 1239206197.5 | 1.05 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CUDA/Reactant | 1292627349.5 | 1289136308 | 1.00 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CUDA/Reactant (optimize = :before_enzyme) | 1312140581.5 | 1237433180 | 1.06 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CUDA/Reactant (optimize = :only_enzyme) | 2608146101 | 2746675598 | 0.95 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CUDA/Lux | 22645472 | 22719307 | 1.00 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CPU/Reactant (optimize = :after_enzyme) | 2183323759 | 2131005675 | 1.02 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CPU/Reactant | 2161824787 | 2126128561 | 1.02 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CPU/Reactant (optimize = :before_enzyme) | 2150773246 | 2131061285 | 1.01 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CPU/Reactant (optimize = :only_enzyme) | 3353554606 | 3402150262 | 0.99 |
| ViT tiny (256 x 256 x 3 x 32)/forward/CPU/Lux | 6032060527 | 5740208504 | 1.05 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CUDA/Reactant (optimize = :after_enzyme) | 1315388210 | 1262392264.5 | 1.04 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CUDA/Reactant | 1313576758.5 | 1258413265.5 | 1.04 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CUDA/Reactant (optimize = :before_enzyme) | 1308732662.5 | 1270552917.5 | 1.03 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CUDA/Reactant (optimize = :only_enzyme) | 2435356858 | 2586048537 | 0.94 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CUDA/Lux | 6572926 | 7031315 | 0.93 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CPU/Reactant (optimize = :after_enzyme) | 1416310529 | 1421898963 | 1.00 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CPU/Reactant | 1409069455 | 1430099101 | 0.99 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CPU/Reactant (optimize = :before_enzyme) | 1410196431 | 1422752680 | 0.99 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CPU/Reactant (optimize = :only_enzyme) | 2620146990 | 2655576241 | 0.99 |
| ViT tiny (256 x 256 x 3 x 4)/forward/CPU/Lux | 1384443752 | 1274970277 | 1.09 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CUDA/Reactant (optimize = :after_enzyme) | 1325713657.5 | 1274806307.5 | 1.04 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CUDA/Reactant | 1268777827.5 | 1310390497 | 0.97 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CUDA/Reactant (optimize = :before_enzyme) | 1294207842.5 | 1302121842 | 0.99 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CUDA/Reactant (optimize = :only_enzyme) | 2374603722 | 2624706431 | 0.90 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CUDA/Lux | 12110782.5 | 12297131 | 0.98 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CPU/Reactant (optimize = :after_enzyme) | 1711411728 | 1734648342 | 0.99 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CPU/Reactant | 1707811998 | 1716005516 | 1.00 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CPU/Reactant (optimize = :before_enzyme) | 1709512803 | 1705670596 | 1.00 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CPU/Reactant (optimize = :only_enzyme) | 2924854567 | 2930317875 | 1.00 |
| ViT tiny (256 x 256 x 3 x 16)/forward/CPU/Lux | 2927891069 | 3071789485.5 | 0.95 |
| ViT small (256 x 256 x 3 x 16)/forward/CUDA/Reactant (optimize = :after_enzyme) | 1270178508 | 1351363548 | 0.94 |
| ViT small (256 x 256 x 3 x 16)/forward/CUDA/Reactant | 1317660758 | 1300008762 | 1.01 |
| ViT small (256 x 256 x 3 x 16)/forward/CUDA/Reactant (optimize = :before_enzyme) | 1263311709 | 1285804476 | 0.98 |
| ViT small (256 x 256 x 3 x 16)/forward/CUDA/Reactant (optimize = :only_enzyme) | 2584191843 | 2521378317 | 1.02 |
| ViT small (256 x 256 x 3 x 16)/forward/CUDA/Lux | 27307540.5 | 27302342 | 1.00 |
| ViT small (256 x 256 x 3 x 16)/forward/CPU/Reactant (optimize = :after_enzyme) | 2190938487 | 2243314517 | 0.98 |
| ViT small (256 x 256 x 3 x 16)/forward/CPU/Reactant | 2166284687 | 2209743795 | 0.98 |
| ViT small (256 x 256 x 3 x 16)/forward/CPU/Reactant (optimize = :before_enzyme) | 2137987987 | 2196379717 | 0.97 |
| ViT small (256 x 256 x 3 x 16)/forward/CPU/Reactant (optimize = :only_enzyme) | 3415666738 | 3417637678 | 1.00 |
| ViT small (256 x 256 x 3 x 16)/forward/CPU/Lux | 6038343271.5 | 5737977502 | 1.05 |
| ViT small (256 x 256 x 3 x 32)/forward/CUDA/Reactant (optimize = :after_enzyme) | 1233854317 | 1239628004 | 1.00 |
| ViT small (256 x 256 x 3 x 32)/forward/CUDA/Reactant | 1299829181.5 | 1471450378 | 0.88 |
| ViT small (256 x 256 x 3 x 32)/forward/CUDA/Reactant (optimize = :before_enzyme) | 1226243251 | 1188148551.5 | 1.03 |
| ViT small (256 x 256 x 3 x 32)/forward/CUDA/Reactant (optimize = :only_enzyme) | 2393640923 | 2290586722 | 1.04 |
| ViT small (256 x 256 x 3 x 32)/forward/CUDA/Lux | 52646968 | 52692914.5 | 1.00 |
| ViT small (256 x 256 x 3 x 32)/forward/CPU/Reactant (optimize = :after_enzyme) | 3006477320 | 2982530184 | 1.01 |
| ViT small (256 x 256 x 3 x 32)/forward/CPU/Reactant | 2989128551 | 2990386476 | 1.00 |
| ViT small (256 x 256 x 3 x 32)/forward/CPU/Reactant (optimize = :before_enzyme) | 3003357676 | 3011396498 | 1.00 |
| ViT small (256 x 256 x 3 x 32)/forward/CPU/Reactant (optimize = :only_enzyme) | 4443262702 | 4338309706 | 1.02 |
| ViT small (256 x 256 x 3 x 32)/forward/CPU/Lux | 24545735518 | 11645205146 | 2.11 |
| ViT base (256 x 256 x 3 x 16)/forward/CUDA/Reactant (optimize = :after_enzyme) | 1288108103 | 1216846465 | 1.06 |
| ViT base (256 x 256 x 3 x 16)/forward/CUDA/Reactant | 1247053980 | 1268232903.5 | 0.98 |
| ViT base (256 x 256 x 3 x 16)/forward/CUDA/Reactant (optimize = :before_enzyme) | 1260403416 | 1322945944 | 0.95 |
| ViT base (256 x 256 x 3 x 16)/forward/CUDA/Reactant (optimize = :only_enzyme) | 2513765600 | 2578098657 | 0.98 |
| ViT base (256 x 256 x 3 x 16)/forward/CUDA/Lux | 70692019 | 70862545 | 1.00 |
| ViT base (256 x 256 x 3 x 16)/forward/CPU/Reactant (optimize = :after_enzyme) | 3164689242 | 3193348347 | 0.99 |
| ViT base (256 x 256 x 3 x 16)/forward/CPU/Reactant | 3166667974 | 3203590115 | 0.99 |
| ViT base (256 x 256 x 3 x 16)/forward/CPU/Reactant (optimize = :before_enzyme) | 3168332239 | 3154476044 | 1.00 |
| ViT base (256 x 256 x 3 x 16)/forward/CPU/Reactant (optimize = :only_enzyme) | 4510953172 | 4523619517 | 1.00 |
| ViT base (256 x 256 x 3 x 16)/forward/CPU/Lux | 12354970629 | 9115055641 | 1.36 |
| ViT base (256 x 256 x 3 x 4)/forward/CUDA/Reactant (optimize = :after_enzyme) | 1242550154 | 1289173742 | 0.96 |
| ViT base (256 x 256 x 3 x 4)/forward/CUDA/Reactant | 1270011702 | 1268715834.5 | 1.00 |
| ViT base (256 x 256 x 3 x 4)/forward/CUDA/Reactant (optimize = :before_enzyme) | 1308184956.5 | 1269718689.5 | 1.03 |
| ViT base (256 x 256 x 3 x 4)/forward/CUDA/Reactant (optimize = :only_enzyme) | 2564144412 | 2796130400 | 0.92 |
| ViT base (256 x 256 x 3 x 4)/forward/CUDA/Lux | 20737061 | 20728567 | 1.00 |
| ViT base (256 x 256 x 3 x 4)/forward/CPU/Reactant (optimize = :after_enzyme) | 1846241603 | 1845983114 | 1.00 |
| ViT base (256 x 256 x 3 x 4)/forward/CPU/Reactant | 1845891211 | 1840843333 | 1.00 |
| ViT base (256 x 256 x 3 x 4)/forward/CPU/Reactant (optimize = :before_enzyme) | 1838778303 | 1844902802 | 1.00 |
| ViT base (256 x 256 x 3 x 4)/forward/CPU/Reactant (optimize = :only_enzyme) | 3067201183 | 3070132545 | 1.00 |
| ViT base (256 x 256 x 3 x 4)/forward/CPU/Lux | 3142722042.5 | 3473525524.5 | 0.90 |
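
For readers checking the numbers: each benchmark entry lists two timings in nanoseconds followed by a ratio, and the listed ratios are consistent with dividing the first timing by the second and rounding to two decimals. The sketch below reproduces that arithmetic for two entries. The helper names `ratio` and `ns_to_s` are hypothetical and not part of Reactant.jl or github-action-benchmark, and reading the ratio as first over second is an assumption inferred from the values, not something the comment states.

```python
# Hypothetical helpers for sanity-checking the benchmark comment.
# Assumption: ratio = first timing / second timing, rounded to 2 decimals.

def ratio(first_ns: float, second_ns: float) -> float:
    """Return first/second rounded to two decimals, as the comment appears to do."""
    return round(first_ns / second_ns, 2)

def ns_to_s(ns: float) -> float:
    """Convert nanoseconds to seconds for easier reading."""
    return ns / 1e9

# Timings copied from the benchmark comment.
# ViT base (256 x 256 x 3 x 32)/forward/CUDA/Reactant (optimize = :after_enzyme)
print(ratio(1263749401, 1322156738), ns_to_s(1263749401))
# ViT small (256 x 256 x 3 x 32)/forward/CPU/Lux
print(ratio(24545735518, 11645205146), ns_to_s(24545735518))
```

Under that reading, a ratio below 1.00 means the first timing is the faster of the two. Most entries fall within a few percent of 1.00, with a few larger swings among the `CPU/Lux` entries.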
This comment was automatically generated by workflow using github-action-benchmark.