JIT: implied Boolean range assertions for local prop #109481

AndyAyersMS · 2024-11-02T15:39:01Z

If morph sees an assignment of 0/1 to a local, have it generate a [0..1] range assertion in addition to the constant assertion.

This helps morph propagate more Boolean ranges at merges, which allows morph to elide more casts.

Also, bump up the local assertion table size since morph is now generating more assertions.
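To illustrate why the extra range assertion helps at merges, here is a minimal sketch. It assumes a merge keeps only the assertions that hold on every incoming path (a set intersection). The `Assertion` struct and the `GenForConstStore` and `Merge` helpers are hypothetical stand-ins, not the JIT's actual `AssertionDsc` or morph code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified assertion: "local is in [lo..hi]".
// lo == hi models a constant assertion.
struct Assertion
{
    int     local;
    int64_t lo;
    int64_t hi;

    bool operator==(const Assertion& other) const
    {
        return local == other.local && lo == other.lo && hi == other.hi;
    }
};

using AssertionSet = std::vector<Assertion>;

// Assertions generated for "local = value".
// withImpliedRange models the behavior described above: a store of 0 or 1
// also yields a [0..1] range assertion.
AssertionSet GenForConstStore(int local, int64_t value, bool withImpliedRange)
{
    AssertionSet out;
    out.push_back(Assertion{local, value, value});
    if (withImpliedRange && (value == 0 || value == 1))
    {
        out.push_back(Assertion{local, 0, 1});
    }
    return out;
}

// Assumed merge rule: keep only assertions present on both incoming paths.
AssertionSet Merge(const AssertionSet& a, const AssertionSet& b)
{
    AssertionSet out;
    for (const Assertion& x : a)
    {
        if (std::find(b.begin(), b.end(), x) != b.end())
        {
            out.push_back(x);
        }
    }
    return out;
}
```

With only constant assertions, merging a path that stored 0 with a path that stored 1 leaves nothing. With the implied range, the [0..1] assertion is common to both paths and survives the merge.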


dotnet-policy-service · 2024-11-02T15:39:43Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

AndyAyersMS · 2024-11-02T15:39:45Z

Split off from #109469. Seems beneficial on its own. Let's see how costly it is..

AndyAyersMS · 2024-11-02T17:56:19Z

> Split off from #109469. Seems beneficial on its own. Let's see how costly it is..

Surprisingly costly... let's try smaller BVs.

AndyAyersMS · 2024-11-02T18:05:05Z

Might need to do some actual data modelling to get a better idea on how to rightsize these bit vectors.

The old formula I created used lvaCount, which seems a bit wrongheaded: we should only be generating assertions for tracked locals. However, it is not clear whether lvaTrackedCount is better, or even accurate (it may be an underestimate).

We have metrics tracking (roughly) how many assertions we drop because of table size limits. The count is rough because once the table is full, each repeated attempt to add, say, x == 0 fails and counts as a miss, even though it would have created only one table entry. So we could start with that. But perhaps we'd need to run with super-large tables that never miss, collect the table data and the available measurements, look at the distribution, and try to find a good compromise: a formula based on the measurements that gets us most of the assertions in most of the cases.
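A small sketch of why the dropped-assertion metric overcounts, assuming a table that deduplicates entries and counts every failed add as a miss. `BoundedTable` and its members are hypothetical and only model the counting behavior described above.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical fixed-capacity assertion table that deduplicates entries.
struct BoundedTable
{
    explicit BoundedTable(std::size_t cap) : capacity(cap) {}

    std::size_t           capacity;
    std::set<std::string> entries;
    std::size_t           missedAttempts = 0; // what a simple miss counter reports
    std::set<std::string> missedUnique;       // entries a large-enough table would hold

    // Returns true if the assertion is (or already was) in the table.
    bool Add(const std::string& assertion)
    {
        if (entries.count(assertion) != 0)
        {
            return true; // duplicate of an existing entry, no new slot needed
        }
        if (entries.size() < capacity)
        {
            entries.insert(assertion);
            return true;
        }
        ++missedAttempts;               // every failed attempt counts...
        missedUnique.insert(assertion); // ...but it is only one distinct entry
        return false;
    }
};
```

Adding the same assertion three times to a full table reports three misses, while a table that never fills up would have needed one more slot.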

AndyAyersMS · 2024-11-03T18:20:36Z

Here's some data from ASP.NET, with the assertion table size set to 4096. About 82000 methods. The current table sizing loses about 29000 assertions (with the change in this PR, which creates more assertions than usual).

There is a decent linear trend of roughly 1.17 assertions per tracked local, but that is too conservative for many small methods.

The number of tracked locals is fairly predictably about 0.76 of the total number of locals, so either measure can be used to build a sizing heuristic.
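As a worked example of the two ratios quoted above, the following sketch turns a local count into an estimated assertion count. The function names are hypothetical, the constants are just the fitted values from this comment, and this is not the sizing heuristic itself (the trend alone is noted above as too conservative for many small methods).

```cpp
#include <cassert>
#include <cmath>

// Fitted values quoted in the comment above (ASP.NET data).
constexpr double kTrackedPerLocal      = 0.76; // tracked locals per local
constexpr double kAssertionsPerTracked = 1.17; // assertions per tracked local

// Estimated number of tracked locals given the total number of locals.
double EstimateTrackedLocals(double totalLocals)
{
    return kTrackedPerLocal * totalLocals;
}

// Estimated number of assertions from the linear trend alone.
double EstimateAssertions(double trackedLocals)
{
    return kAssertionsPerTracked * trackedLocals;
}
```

For a method with 100 locals this gives about 76 tracked locals and, from the trend, about 89 assertions (1.17 × 76).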

Adding sizing and a trend line for the new heuristic to the original chart (orange below), it does a good job of not missing too many assertions (about 3.5K get dropped). And (not shown) it almost always picks bigger vectors than current main (bigger or the same for 79.5K methods, smaller for 2.5K) and, perhaps not surprisingly, they are about 32 bits longer on average.

Unfortunately pin is broken on my box so I can't dig in with inst count and see if it's the extra BV width or the extra assertion construction cost or the extra morph actions that is the TP culprit.

Zooming in to the smaller methods (less than 200 tracked locals)

On 64-bit hosts the cost of the BV gets rounded up to the nearest multiple of 64, so those 96-bit vectors are actually 128 bits. So assuming BV cost is the predominant TP impact, we can claw some back for the small cases (which are overwhelmingly the frequent cases) by limiting the BV size to 64. If we do this for cases with fewer than 24 live locals (see extreme detail below) we only lose assertions in a handful of methods.
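The rounding arithmetic can be sketched as follows. `RoundUpToWord` and `ClampSmallMethod` are hypothetical helpers that only illustrate the numbers in this comment (64-bit words, a 64-bit cap below 24 live locals); they are not the heuristic that was actually committed.

```cpp
#include <cassert>

constexpr unsigned kWordBits = 64; // assumed host word size in bits

// Bits actually occupied once a bit vector is rounded up to whole words.
constexpr unsigned RoundUpToWord(unsigned bits)
{
    return (bits + kWordBits - 1) / kWordBits * kWordBits;
}

// Hypothetical cap: small methods (fewer than 24 live locals) get at most
// one word, so a 96-bit request does not silently cost 128 bits.
constexpr unsigned ClampSmallMethod(unsigned requestedBits, unsigned liveLocals)
{
    return (liveLocals < 24 && requestedBits > kWordBits) ? kWordBits : requestedBits;
}
```

For example, `RoundUpToWord(96)` is 128, while `ClampSmallMethod(96, 10)` keeps the vector at a single 64-bit word.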

AndyAyersMS · 2024-11-03T18:39:33Z

Proposed revised heuristic:


jakobbotsch · 2024-11-04T10:40:38Z

I wonder how it would affect things if we used a dynamically expansible (up to some maximum) bit vector implementation instead... Say use the lowest bit to indicate whether this is a pointer to a (length, words) tuple or just inline 63 (or 31) bits.
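A rough sketch of the tagged representation suggested here, under a few assumptions: the low bit set means the remaining 63 (or 31) bits are stored inline, the low bit clear means the value is a pointer to heap storage, and a `std::vector` stands in for the (length, words) tuple. `TaggedBitVec` is a hypothetical name; nothing like it is claimed to exist in the JIT, and the "up to some maximum" limit is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

class TaggedBitVec
{
public:
    // 63 inline bits on 64-bit hosts, 31 on 32-bit hosts.
    static constexpr unsigned kInlineBits = sizeof(uintptr_t) * 8 - 1;

    TaggedBitVec() = default;
    TaggedBitVec(const TaggedBitVec&) = delete;
    TaggedBitVec& operator=(const TaggedBitVec&) = delete;
    ~TaggedBitVec()
    {
        if (!IsInline())
        {
            delete Words();
        }
    }

    // Low bit set: payload is inline. Low bit clear: m_rep is a pointer.
    bool IsInline() const { return (m_rep & 1) != 0; }

    void Set(unsigned index)
    {
        if (IsInline() && index < kInlineBits)
        {
            m_rep |= uintptr_t(1) << (index + 1); // bit 0 is the tag
            return;
        }
        if (IsInline())
        {
            Spill(); // first out-of-range bit: move payload to the heap
        }
        std::vector<uint64_t>& words = *Words();
        if (index / 64 >= words.size())
        {
            words.resize(index / 64 + 1, 0);
        }
        words[index / 64] |= uint64_t(1) << (index % 64);
    }

    bool Test(unsigned index) const
    {
        if (IsInline())
        {
            return index < kInlineBits && ((m_rep >> (index + 1)) & 1) != 0;
        }
        const std::vector<uint64_t>& words = *Words();
        return index / 64 < words.size() && ((words[index / 64] >> (index % 64)) & 1) != 0;
    }

private:
    std::vector<uint64_t>* Words() const
    {
        return reinterpret_cast<std::vector<uint64_t>*>(m_rep);
    }

    void Spill()
    {
        // Heap pointers are at least 2-byte aligned, so their low bit is 0.
        auto* words = new std::vector<uint64_t>(1, uint64_t(m_rep >> 1));
        m_rep       = reinterpret_cast<uintptr_t>(words);
    }

    uintptr_t m_rep = 1; // empty inline set
};
```

The common small case never allocates, and only sets that grow past the inline capacity pay for heap storage and the extra indirection.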

AndyAyersMS · 2024-11-20T18:00:02Z

> I wonder how it would affect things if we used a dynamically expansible (up to some maximum) bit vector implementation instead... Say use the lowest bit to indicate whether this is a pointer to a (length, words) tuple or just inline 63 (or 31) bits.

Phoenix had a fairly sophisticated sparse BV setup, maybe we can just steal it.

I need to profile this to see if the BV size is the culprit -- will try one of my other boxes.

AndyAyersMS · 2024-11-22T18:51:26Z

Looks like BV cost is not the dominant factor

range\local-boolean-range.csproj]
  8796189684   9164784738  0.06%  |  public: struct GenTree * __cdecl Compiler::optAssertionProp_LclVar(unsigned __int64 *const &, struct GenTreeLclVarCommon *, struct Statement *)
           0    284474182  0.05%  |  `Compiler::fgAssertionGen'::`2'::<lambda_2>::operator()
   977146327   1196551328  0.04%  |  private: void __cdecl Compiler::fgAssertionGen(struct GenTree *)
  3532984967   3751969201  0.04%  |  public: unsigned short __cdecl Compiler::optAddAssertion(struct Compiler::AssertionDsc *)
   721753622    857188262  0.02%  |  public: static unsigned __int64 * __cdecl BitSetOps<unsigned __int64 *, 1, struct BitVecTraits *, struct BitVecTraits>::MakeCopy(struct BitVecTraits *, unsigned __int64 *)
 11823108036  11925253145  0.02%  |  public: void * __cdecl ArenaAllocator::allocateMemory(unsigned __int64)
  1017382342   1103530350  0.01%  |  public: static void __cdecl BitSetOps<unsigned __int64 *, 1, struct BitVecTraits *, struct BitVecTraits>::IntersectionD(struct BitVecTraits *, unsigned __int64 *&, unsigned __int64 *)
  2747803608   2806466352  0.01%  |  __security_check_cookie
   186332106    237772711  0.01%  |  public: static void __cdecl BitSetOps<unsigned __int64 *, 1, struct BitVecTraits *, struct BitVecTraits>::DiffD(struct BitVecTraits *, unsigned __int64 *&, unsigned __int64 *)
   952832850    984661705  0.01%  |  public: unsigned __int64 *& __cdecl Compiler::GetAssertionDep(unsigned int)

Instead it's the cost of making the new assertions and then later on filtering through them. I think we can improve the filtering part a little. Will do that separately as it should be no diff without the other changes here.

AndyAyersMS · 2024-11-23T01:44:16Z

Local TP of this PR plus #110091 vs baseline (win x64 libraries tests no TC) shows we come out ahead:

  8796189684   3168571960  -0.97%  |  public: struct GenTree * __cdecl Compiler::optAssertionProp_LclVar(unsigned __int64 *const &, struct GenTreeLclVarCommon *, struct Statement *)
  3540234974    305186960  -0.56%  |  public: struct GenTree * __cdecl Compiler::optCopyAssertionProp(struct Compiler::AssertionDsc *, struct GenTreeLclVarCommon *, struct Statement *)
   721753622   1331032982   0.10%  |  public: static unsigned __int64 * __cdecl BitSetOps<unsigned __int64 *, 1, struct BitVecTraits *, struct BitVecTraits>::MakeCopy(struct BitVecTraits *, unsigned __int64 *)
  1017382342   1505682806   0.08%  |  public: static void __cdecl BitSetOps<unsigned __int64 *, 1, struct BitVecTraits *, struct BitVecTraits>::IntersectionD(struct BitVecTraits *, unsigned __int64 *&, unsigned __int64 *)
   952832850   1311702304   0.06%  |  public: unsigned __int64 *& __cdecl Compiler::GetAssertionDep(unsigned int)
  1660842818   1967928213   0.05%  |  protected: void __cdecl JitExpandArray<struct ValueNumStore::Chunk *>::EnsureCoversInd(unsigned int)
           0    284474182   0.05%  |  `Compiler::fgAssertionGen'::`2'::<lambda_2>::operator()
 11823108036  12090888934   0.05%  |  public: void * __cdecl ArenaAllocator::allocateMemory(unsigned __int64)
   977146327   1196551328   0.04%  |  private: void __cdecl Compiler::fgAssertionGen(struct GenTree *)
  3532984967   3751969201   0.04%  |  public: unsigned short __cdecl Compiler::optAddAssertion(struct Compiler::AssertionDsc *)

though that perhaps had outsized benefits from #110091

AndyAyersMS · 2024-11-23T16:24:26Z

New diffs

Worst-case TP regression is about 0.3%, which was paid for by #110091.

SPMI failures are unrelated, likely something to do with the recent ISA additions.... opened #110106

[02:08:27] ISSUE: <ASSERT> #140 D:\a\_work\1\s\src\coreclr\jit\compiler.cpp (2281) - Assertion failed 'instructionSetFlags.Equals(EnsureInstructionSetFlagsAreValid(instructionSetFlags))' in 'PlatformBenchmarks.Startup:Configure(Microsoft.AspNetCore.Builder.IApplicationBuilder):this' during 'Pre-import' (IL size 1; hash 0x79211bb5; Optimization-Level-Not-Yet-Set)

@EgorBo PTAL
cc @dotnet/jit-contrib

dotnet-issue-labeler bot added the area-CodeGen-coreclr label Nov 2, 2024

dotnet-policy-service bot assigned AndyAyersMS Nov 2, 2024

smaller table

6d948b7

build-analysis bot mentioned this pull request Nov 2, 2024

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

3 tasks

revise table size, add metrics, allow specifying details file during replay

a4e00de

Merge branch 'main' into LocalBooleanRangeAssertions

84a2e01

AndyAyersMS mentioned this pull request Nov 22, 2024

JIT: filter local assertion set #110091

Merged

Merge branch 'main' into LocalBooleanRangeAssertions

149b0d9

build-analysis bot mentioned this pull request Nov 23, 2024

Build on Windows Fails sometimes with fatal error C1090: PDB API call failed #48070

Open
