Remove Redundant Functions + 60% MD Speedup #98

GNiendorf · 2024-09-26T20:16:52Z

Below is run on the L40.

This PR Timing:

Master Timing:

MD creation time goes from 1.0ms to 0.4ms, meaning that the redundant matrix creation mentioned below represented the majority of the MD creation time on GPU... 😱 @slava77

GNiendorf · 2024-09-26T20:17:02Z

/run standalone

GNiendorf · 2024-09-26T20:19:46Z

RecoTracker/LSTCore/src/alpaka/MiniDoublet.h

-    float miniDeltaLooseTilted[3] = {0.4f, 0.4f, 0.4f};
-    float miniDeltaEndcap[5][15];
-
-    for (size_t i = 0; i < 5; i++) {


This matrix creation code was being run many times and, it seems, accounted for a significant portion of the MD creation time.
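
As a rough illustration of why rebuilding a small table on every call can dominate a short kernel, here is a minimal sketch. It is not the code from this PR: the function names, the `Table` struct, and the fill value `kPlaceholder` are hypothetical. Only the 5x15 shape comes from the removed `miniDeltaEndcap[5][15]` declaration, and the real per-cell values are not shown in this thread. Hoisting the table into a compile-time constant is one common way to avoid the repeated work, not necessarily what this PR does.

```cpp
#include <cstddef>

// Shape taken from the removed declaration: float miniDeltaEndcap[5][15].
constexpr std::size_t kRows = 5;
constexpr std::size_t kCols = 15;

// Placeholder fill value (assumption): the real per-cell values are elided above.
constexpr float kPlaceholder = 0.4f;

// "Before" pattern: every call fills all 75 entries, then reads just one of them.
inline float thresholdRebuiltPerCall(std::size_t row, std::size_t col) {
  float table[kRows][kCols];
  for (std::size_t i = 0; i < kRows; i++) {
    for (std::size_t j = 0; j < kCols; j++) {
      table[i][j] = kPlaceholder;
    }
  }
  return table[row][col];
}

// Hypothetical wrapper so the table can be returned from a constexpr function.
struct Table {
  float v[kRows][kCols];
};

// "After" pattern: the same table is built once, at compile time (C++14 or later).
constexpr Table makeTable() {
  Table t{};
  for (std::size_t i = 0; i < kRows; i++) {
    for (std::size_t j = 0; j < kCols; j++) {
      t.v[i][j] = kPlaceholder;
    }
  }
  return t;
}

constexpr Table kTable = makeTable();

// Each call is now a single lookup into the precomputed table.
inline float thresholdFromConstant(std::size_t row, std::size_t col) {
  return kTable.v[row][col];
}
```

Both functions return the same value for any valid `(row, col)`; the difference is only how often the table is filled.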

github-actions · 2024-09-26T20:32:30Z

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     44.5    326.0    116.8     47.7     93.5    500.2    113.4    150.7    101.4      2.3    1496.7     951.9+/- 247.8     410.9   explicit_cache[s=4] (target branch)
   avg     48.1    324.0    116.0     46.6     92.5    509.4    114.3    150.4    101.5      2.5    1505.4     947.8+/- 246.7     409.2   explicit_cache[s=4] (this PR)

slava77 · 2024-09-26T21:47:37Z

> MD creation time goes from 1.0ms to 0.4ms, meaning that the redundant matrix creation mentioned below represented the majority of the MD creation time on GPU

nice find/fix.
Thank you.

slava77 · 2024-09-26T21:50:41Z

apparently this is addressing SegmentLinking/TrackLooper#303

VourMa · 2024-09-27T08:12:57Z

Indeed, nice catch and clean up. I would even consider adding this to the CMSSW PR (especially if the renaming to adhere to CMSSW naming is real and not an artefact of the rebases). What do you think?

@ariostas Is the _devel branch fully up to date with the branch for the CMSSW PR?

ariostas · 2024-09-27T12:26:10Z

The _devel branch is not up to date since there are still some big changes coming. I actually just noticed that the _devel PR is a bit of a mess, so we'll have to figure that out at some point. But I agree that this should go into the _realfiles branch. This is great!

I think there's other PRs by Gavin that should also be included in _realfiles since there have been review comments about them. In particular, #39 and #54.

VourMa · 2024-09-27T12:34:44Z

> I think there's other PRs by Gavin that should also be included in _realfiles since there have been review comments about them. In particular, #39 and #54.

I would be less eager to also push these ones, because:

- They introduce a new (physics) feature (lower pT cut) which is not strictly needed at this stage.
- They require new data files, with all the complication this brings to the cms-sw way of running things for PRs.
- They define a clear-cut development, separated from the rest (up to a good degree).

So I think they would be great for a standalone PR.

ariostas · 2024-09-27T15:06:08Z

> I would be less eager to also push these ones, because...

That's true. But Andrea has asked for occupancy tables instead of a bunch of if-else statements, and if we want to fix SDL_INF we'll have to update the data files anyway, so maybe we could just get that out of the way.

slava77 · 2024-09-27T16:48:38Z

/run all

github-actions · 2024-09-27T17:01:12Z

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     44.1    325.7    115.6     45.1     91.0    502.0    113.6    149.4    100.7      2.2    1489.5     943.4+/- 247.6     406.8   explicit_cache[s=4] (target branch)
   avg     47.6    324.3    116.0     45.6     91.8    511.2    112.9    150.0    100.8      2.4    1502.5     943.8+/- 245.0     411.4   explicit_cache[s=4] (this PR)

slava77 · 2024-09-27T17:37:10Z

linter check isn't happy here either

github-actions · 2024-09-27T18:20:21Z

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks

The full set of validation and comparison plots can be found here.

GNiendorf · 2024-09-27T18:46:31Z

@ariostas When I run the code format I see a bunch of suggestions for files not related to this PR:

slava77 · 2024-09-27T18:52:01Z

> @ariostas When I run the code format I see a bunch of suggestions for files not related to this PR:

did you update to the new release or still reused the old release area with a new cmsrel?

GNiendorf · 2024-09-27T18:54:39Z

Oh no, I haven't updated. Let me do that and try again.

slava77 · 2024-09-27T18:54:59Z

/run all

github-actions · 2024-09-27T19:07:15Z

The PR was built and ran successfully in standalone mode. Here are some of the comparison plots.

The full set of validation and comparison plots can be found here.

Here is a timing comparison:

   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Reset    Event     Short             Rate
   avg     45.0    326.4    116.0     47.9     92.9    501.9    113.2    150.3    100.6      2.5    1496.7     949.8+/- 244.3     410.6   explicit_cache[s=4] (target branch)
   avg     47.2    323.1    115.9     46.9     91.5    506.0    113.4    148.9    101.7      2.3    1496.9     943.6+/- 249.2     410.0   explicit_cache[s=4] (this PR)

ariostas · 2024-09-27T19:20:32Z

> When I run the code format I see a bunch of suggestions for files not related to this PR:

I have the same issue. I'm not sure how to get around it. I think they didn't format all files after clang-format got updated

github-actions · 2024-09-27T20:15:02Z

The PR was built and ran successfully with CMSSW. Here are some plots.

OOTB All Tracks

The full set of validation and comparison plots can be found here.

slava77 · 2024-09-27T22:47:17Z

the linter complaints are for parts of the files not touched by this PR

slava77 · 2024-09-27T22:50:14Z

> Here is a timing comparison:

there is practically no change in the CPU backend: MD changes from 326.4 to 323.1; the other run was faster by around 1 s.
I guess the optimization works out differently (or something else is very slow)

remove redundant functions

8ec2910

GNiendorf mentioned this pull request Sep 26, 2024

Extensive Code Duplication #96

Open

GNiendorf changed the title ~~Remove Redundant Functions~~ Remove Redundant Functions + 60% MD Speedup Sep 26, 2024

GNiendorf marked this pull request as ready for review September 26, 2024 20:35

GNiendorf mentioned this pull request Sep 26, 2024

All Functions Are Global in LST Namespace #99

Open

code-format

f32ee1a

GNiendorf mentioned this pull request Sep 27, 2024

Fix rzChisquared NaN's on CPU Backend for T5 #97

Merged

slava77 approved these changes Sep 27, 2024

slava77 merged commit 3000f9b into CMSSW_14_1_0_pre3_LST_X_LSTCore_realfiles_batch1_devel Sep 27, 2024
2 of 3 checks passed

slava77 mentioned this pull request Nov 4, 2024

CMSSW Integration of LST cms-sw/cmssw#45117

Merged

3 tasks

ariostas mentioned this pull request Nov 6, 2024

Devel changes that need to be rebased #117

Open

15 tasks

ariostas mentioned this pull request Nov 20, 2024

Remove redundant functions + fix T5 duplicate cleaning + bug fixes (rebase PR98, PR97, PR111, and PR124) #129

Merged
