Adapt Tarjan generic cycle detector for use in Crossgen2 #71426

trylek · 2022-06-29T11:52:12Z

This change modifies Crossgen2 to use the generic cycle detector
originally implemented for NativeAOT to trim the infinite generic
expansion observed in the public LanguageExt NuGet package and
tracked by issue #66079.

At this point I have only implemented a back-compat option to
opt out of the generic cycle detection in case it turns out to
cause problems for certain payloads. I haven't added an option
to set the cutoff level: in practice, LanguageExt compilation
still fails when the cutoff is set to anything higher than 0.
Moreover, for Crossgen2, skipping precompilation of certain
methods merely reduces performance slightly; it does not cause
runtime failures the way it does with NativeAOT.
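
For readers unfamiliar with the technique named in the title, here is a minimal Python sketch of Tarjan's strongly connected components algorithm applied to a toy dependency graph. It is not the Crossgen2 or NativeAOT code. The graph shape, the node names (`Foo<T>`, `Bar<T>`, `Baz<T>`, `List<T>`) and the helper `cyclic_nodes` are assumptions made purely for illustration.

```python
def tarjan_scc(graph):
    """Return the strongly connected components of a directed graph.

    `graph` maps each node to a list of successor nodes.
    """
    index = {}        # discovery order of each visited node
    lowlink = {}      # smallest index reachable from the node's subtree
    stack = []        # nodes of the components still being built
    on_stack = set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # `node` is the root of a component: pop its members.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in list(graph):
        if node not in index:
            visit(node)
    return components


def cyclic_nodes(graph):
    """Nodes that sit on a cycle: multi-node components or self-loops."""
    result = set()
    for component in tarjan_scc(graph):
        first = component[0]
        if len(component) > 1 or first in graph.get(first, ()):
            result.update(component)
    return result


# Hypothetical graph: an edge X -> Y means "expanding X requires Y".
graph = {
    "Foo<T>": ["Bar<T>"],
    "Bar<T>": ["Foo<T>"],   # Foo -> Bar -> Foo forms a cycle
    "Baz<T>": ["Baz<T>"],   # direct self-reference
    "List<T>": [],          # not part of any cycle
}
print(sorted(cyclic_nodes(graph)))
```

In this sketch, the nodes reported by `cyclic_nodes` play the role of the entities that a detector would treat as potentially cycle-forming.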

Thanks

Tomas

Fixes: #66079

/cc @dotnet/crossgen-contrib

MichalStrehovsky · 2022-06-30T00:07:09Z

> LanguageExt compilation still fails when the cutoff is set to anything higher than 0

You might need to add a callout to the detection in more than one place. If anything higher than 0 doesn't work, I suspect the problem is that we don't have a callout in all the places that need it. I would at least try to look where the cycle is forming if we have a higher cutoff because that part of crossgen2 might be reachable if user code has a different kind of cycle.

src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/ReadyToRunCodegenCompilation.cs

davidwrighton

We need cleaner error handling here.

src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/ReadyToRunCodegenCompilation.cs

mangod9 · 2022-11-07T14:47:53Z

@trylek is this still being considered or should it be closed?

trylek · 2022-11-07T19:04:43Z

It should ultimately be fixed, but we can close it for now. I'll reopen this once we're done with the experiments and start working on .NET 8 features.

trylek · 2023-05-19T23:11:35Z

I believe I have finally managed to make this work for the repro case described in #66079. The main finding is that the depth cutoff heuristic doesn't work well for LanguageExt.Core: many generic types in this assembly have lots of type parameters, so what basically happens is a form of exponential explosion that doesn't necessarily repeat a given generic type recursively within a single type parameter. To cater for this I have introduced a second heuristic, the breadth cutoff. It corresponds to the total number of occurrences, within a generic type, of types marked as poisoned (i.e. potentially forming cycles) that causes the dependency to be identified as potentially cycle-forming. With the default value of 4 for the depth cutoff I have measured the size of the compiled LanguageExt.Core assembly (Windows x64 release) while varying the breadth cutoff:

| Breadth cutoff | LanguageExt.Core size |
| --- | --- |
| 1 | 32 MB |
| 2 | 49 MB |
| 3 | 120 MB |
| 4 | 592 MB |
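
To make the difference between the two heuristics concrete, here is a small Python sketch under simplifying assumptions; it is not the logic in the PR. The `TypeInst` class, the type name `Wrap`, the way depth is counted (poisoned types along one nesting path), and the choice to flag a type when a count is strictly greater than the cutoff are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeInst:
    """A generic instantiation: a type name plus its type arguments."""
    name: str
    args: tuple = ()


def poisoned_depth(t, poisoned):
    """Largest number of poisoned types on any single nesting path of t."""
    own = 1 if t.name in poisoned else 0
    deepest = max((poisoned_depth(a, poisoned) for a in t.args), default=0)
    return own + deepest


def poisoned_breadth(t, poisoned):
    """Total number of poisoned type occurrences anywhere inside t."""
    own = 1 if t.name in poisoned else 0
    return own + sum(poisoned_breadth(a, poisoned) for a in t.args)


def is_potentially_cyclic(t, poisoned, depth_cutoff=4, breadth_cutoff=2):
    # Assumption: "exceeds the cutoff" means strictly greater than it.
    return (poisoned_depth(t, poisoned) > depth_cutoff
            or poisoned_breadth(t, poisoned) > breadth_cutoff)


poisoned = {"Wrap"}
leaf = TypeInst("int")

# Wrap<Wrap<Wrap<Wrap<Wrap<int>>>>>: one deep chain of poisoned types.
deep = leaf
for _ in range(5):
    deep = TypeInst("Wrap", (deep,))

# Triple<Wrap<int>, Wrap<int>, Wrap<int>>: shallow but wide.
wide = TypeInst("Triple", tuple(TypeInst("Wrap", (leaf,)) for _ in range(3)))

print(poisoned_depth(deep, poisoned), poisoned_breadth(deep, poisoned))
print(poisoned_depth(wide, poisoned), poisoned_breadth(wide, poisoned))
```

The `wide` instantiation never nests a poisoned type inside itself, so a depth check alone lets it through. Only the total-occurrence count catches it, which is the kind of many-type-parameter growth the comment describes.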

In light of these findings I set the default breadth cutoff to 2. After discussing the matter with Manish I have come to the conclusion that for now it's probably better to make this opt-in only, especially since the compilation takes longer due to the need to build the module tables of poisoned types. If at some point we decide to make this the default (possibly with an opt-out switch), there are several places where we might be able to optimize the algorithm, e.g. by parallelizing the initial poisoned types generation, by caching the results of IsDeepPossiblyCyclicInstantiation, or by improving the quadratic algorithm for checking the depth cutoff. I'm still working on validating the change in the lab, but for now I believe it should be feature complete.

trylek · 2023-05-19T23:16:54Z

One other aspect worth mentioning is that at this point the change is slightly hacky in that it manually brings in bits of NativeAOT logic pertinent to the generic cycle detection, along with bits of logging logic. I think it would be useful to unify this between Crossgen2 and NativeAOT, either by putting the relevant code in one of the existing common projects or by creating a new one shared between the two compilers.

MichalStrehovsky · 2023-05-22T03:20:54Z

Is it possible to write a test where crossgen would previously fail and now works? We have some NativeAOT specific testing in smoketests for generic recursions that were killing the NAOT compiler, mostly extracted from the real world patterns.

src/coreclr/tools/Common/TypeSystem/Common/TypeSystemContext.cs

src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/CompilerTypeSystemContext.Aot.cs

src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/ReadyToRunCompilerContext.cs

src/coreclr/tools/aot/ILCompiler.ReadyToRun/ILCompiler.ReadyToRun.csproj

src/coreclr/tools/aot/ILCompiler/ILCompilerRootCommand.cs

...ols/aot/ILCompiler.DependencyAnalysisFramework/ILCompiler.DependencyAnalysisFramework.csproj

I'll address the code reorg (GenericCycleDetection) and regression test in subsequent commits. Thanks Tomas

So I have a trivial depth cutoff test working; without enabling generic cycle detection, Crossgen2 crashes after about an hour on my box with the Arithmetic Overflow described in the issue dotnet#66079. As the next step I'll work on the equivalent breadth test, which is somewhat trickier. Thanks Tomas

davidwrighton

I'd like the handling of the fixups for compiled methods to be tweaked a bit. Also, do we know what impact this has on the performance of crossgen?

davidwrighton · 2023-06-09T21:24:48Z

...r/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/MethodWithGCInfo.cs

This is dropping the dependencies of the fixup. I'm not very comfortable with that, and agree with @MichalStrehovsky here. Instead, could we detect the cycle when adding the fixup to _fixups and, if we find one, mark the method with whatever details we use when the entire method can't be compiled?

trylek · 2023-06-09T21:31:34Z

Thanks David for your feedback, I'll modify the fixup management based on your suggestion. The impact on Crossgen2 performance is quite detrimental; that's why it's just an opt-in for now. If we decided to turn it on by default, we would at the very least need to improve the perf quite a bit. For the LanguageExt assembly mentioned in the original issue, the module analysis alone takes something like a minute on my devbox.

MichalStrehovsky · 2023-06-09T21:39:37Z

> just the module analysis takes something like a minute on my devbox

Is that a debug build? From what I saw when enabling this, it usually takes single-digit milliseconds to analyze an assembly.

trylek · 2023-06-09T22:48:56Z

Thanks David and Michal for your feedback. I have updated the PR along the lines of David's suggestion. Regarding Michal's comment, I have instrumented CycleInfoHashtable to show the time spent in the analysis, and this is the result for the LanguageExt.Core assembly from the original issue repro:

Cycle analysis for LanguageExt.Core took 360382 msecs
Cycle analysis for System.Private.CoreLib took 45 msecs
Cycle analysis for LanguageExt.Core took 363084 msecs
Cycle analysis for LanguageExt.Core took 363232 msecs
Cycle analysis for LanguageExt.Core took 363388 msecs
Cycle analysis for LanguageExt.Core took 364608 msecs
Cycle analysis for LanguageExt.Core took 365848 msecs
Cycle analysis for LanguageExt.Core took 367601 msecs
Cycle analysis for LanguageExt.Core took 368732 msecs

Apparently one problem is that the internal Crossgen2 parallelism triggers the initial analysis of the assembly on all the threads by virtue of the lock-free hashtable; I'm not sure whether some form of locking would improve this. This is the slowest of my devboxes, but I doubt its Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz is a thousand times slower than Michal's box. I think another factor could be that the LanguageExt.Core assembly is a really large one, about 10 MB or on par with CoreLib, just much more complex in terms of heavy use of generic structs everywhere. Please feel free to try it out in NativeAOT mode; I don't see how the Tarjan cycle detector could be so much faster there, considering it basically just uses the ECMA / typesystem code that is shared between the two compilers.

trylek · 2023-06-09T23:04:03Z

I have experimentally added a somewhat absurd lock around the lock-free hashtable and it indeed made things go faster in general:

Cycle analysis for LanguageExt.Core took 96969 msecs
Cycle analysis for System.Private.CoreLib took 32 msecs
Emitting R2R PE file: D:\git\AndreSteenveld.CrossgenLanguageExt\obj\Release\net6.0\win-x64\R2R\LanguageExt.Core.dll

In light of this I think it might make sense to consider dropping the parallelized hashtable population in this particular case, but technically such a change would be somewhat orthogonal to my primary effort and I would feel more comfortable making it in a separate change if we agree it's the right way to go. Also, my somewhat ham-fisted approach locks the hashtable globally, whereas a more subtle fix could lock only its population for a particular module; I think that two different modules can happily be scanned in parallel.
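
The "lock only the population for a particular module" idea can be sketched as follows in Python. This is an illustration only: the class `ModuleCycleInfoCache`, its methods, and the `analyze` callback are hypothetical names, not the CycleInfoHashtable API.

```python
import threading


class ModuleCycleInfoCache:
    """Caches one analysis result per module.

    A short-lived guard protects the dictionaries, while a separate lock
    per module serializes the expensive analysis of that module only.
    """

    def __init__(self, analyze):
        self._analyze = analyze
        self._results = {}
        self._module_locks = {}
        self._guard = threading.Lock()

    def get(self, module):
        with self._guard:
            if module in self._results:
                return self._results[module]
            module_lock = self._module_locks.setdefault(module, threading.Lock())

        # Threads asking for the same module queue up here; threads asking
        # for other modules hold different locks and proceed in parallel.
        with module_lock:
            with self._guard:
                if module in self._results:
                    return self._results[module]
            result = self._analyze(module)  # expensive, runs once per module
            with self._guard:
                self._results[module] = result
            return result


calls = []


def analyze(module):
    calls.append(module)  # record every analysis actually performed
    return f"cycle info for {module}"


cache = ModuleCycleInfoCache(analyze)
requests = ["LanguageExt.Core"] * 8 + ["System.Private.CoreLib"] * 2
threads = [threading.Thread(target=cache.get, args=(m,)) for m in requests]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(calls))
```

With this shape, eight concurrent requests for the same module trigger a single analysis instead of eight competing ones, which is the duplicated work visible in the timing log above.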

trylek · 2023-06-09T23:07:14Z

(Just to clarify regarding Michal's above question, these are measurements on Windows.x64.Release build.)

MichalStrehovsky · 2023-06-09T23:51:40Z

Oh, if it's a 10 MB assembly, I'm not too concerned. This is absolutely huge by any standards. CoreLib is a good example of a "normal" big assembly and that's 45 ms, which is the expected ballpark.

The locking may or may not benefit things. Normal-sized assemblies take single-digit ms, which is comparable with jitting a couple of methods. It would only help if we could somehow figure out up front that things will take longer (maybe from the size of the typespec table?)

trylek · 2023-06-10T00:38:26Z

Thanks Michal for your additional feedback. Looking at the compilation in the debugger, I see that LanguageExt.Core has NumberOfRows = 0x000090ed (about 37K) in the TypeSpec table; in contrast, SPC has just NumberOfRows = 0x00000636 (about 1.6K) in the same table. The interesting question is how to actually utilize this number in the compilation: we could possibly treat it as an indicator that we need to run the generic cycle analysis, that we should avoid parallelization in its initial scanning, and / or that additional tweaks in the compilation pipeline are needed. We'd also need to establish and measure additional heuristics, e.g. whether this is supposed to be compared with a hardcoded constant, somehow weighted as a percentage of the number of TypeDesc / MethodDesc table entries, or something completely different.

As part of investigation of the bug dotnet#66079 and implementation of the fix dotnet#71426 I noticed that one aggravating factor is that Crossgen2 starts analyzing the same assembly multiple times on its various parallel threads; this busywork additionally makes the app compete for access to the same metadata, making the initial analysis even slower. In accordance with Michal's suggestion from the PR thread dotnet#71426 I propose to modify the generic cycle detector to run the initial analysis single-threaded if the module in question is expected to be "generics-heavy" using the number of TypeSpec rows in its ECMA metadata as an indicator of module generic complexity. I have written a simple managed app scanning all runtime framework assemblies, ASP.NET assemblies and assemblies used in internal CoreCLR testing. I found out that the largest number of TypeSpec rows is in FSharp.Core (3855), followed by Microsoft.CodeAnalysis (3148). Based on these findings I have set the initial value of the cutoff for single-threaded analysis to 5000. Thanks Tomas
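
As a rough illustration of the proposed switch, the Python sketch below applies the 5000-row cutoff to the TypeSpec row counts quoted in this thread. The function name, the constant name, and the strict "greater than" comparison are assumptions; the thread only states the cutoff value.

```python
# Cutoff value quoted in the thread; the name is hypothetical.
SINGLE_THREADED_TYPESPEC_CUTOFF = 5000


def should_analyze_single_threaded(typespec_rows,
                                   cutoff=SINGLE_THREADED_TYPESPEC_CUTOFF):
    """Treat a module as "generics-heavy" when its TypeSpec table is large.

    Assumption: strictly more rows than the cutoff selects the
    single-threaded path.
    """
    return typespec_rows > cutoff


# TypeSpec row counts as reported in the comments above.
typespec_rows = {
    "LanguageExt.Core": 0x90ED,        # 37101
    "System.Private.CoreLib": 0x636,   # 1590
    "FSharp.Core": 3855,
    "Microsoft.CodeAnalysis": 3148,
}

for module, rows in typespec_rows.items():
    mode = "single-threaded" if should_analyze_single_threaded(rows) else "parallel"
    print(f"{module}: {rows} TypeSpec rows -> {mode} initial analysis")
```

Of the four modules listed, only LanguageExt.Core crosses the cutoff, so the framework assemblies surveyed would keep their current parallel behavior.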

trylek added the area-crossgen2-coreclr label Jun 29, 2022

trylek requested review from MichalStrehovsky and davidwrighton June 29, 2022 11:52

ghost assigned trylek Jun 29, 2022

MichalStrehovsky reviewed Jun 30, 2022

src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/ReadyToRunCodegenCompilation.cs

davidwrighton requested changes Jul 1, 2022

src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/ReadyToRunCodegenCompilation.cs

ghost added the needs-author-action label Jul 1, 2022

trylek force-pushed the Crossgen2GenericCycleDetector branch from d456e05 to 47788a4 Compare July 11, 2022 12:18

ghost removed the needs-author-action label Jul 11, 2022

trylek closed this Nov 7, 2022

ghost locked as resolved and limited conversation to collaborators Dec 8, 2022

trylek mentioned this pull request May 4, 2023

Crossgen2 work for .NET 8 #85736

Closed

46 tasks

trylek reopened this May 19, 2023

trylek force-pushed the Crossgen2GenericCycleDetector branch 2 times, most recently from b4f5253 to 8c77748 Compare May 19, 2023 22:56

build-analysis bot mentioned this pull request May 20, 2023

Tracking issue for CI build timeouts #76454

Closed

MichalStrehovsky reviewed May 22, 2023

dotnet unlocked this conversation May 22, 2023

trylek force-pushed the Crossgen2GenericCycleDetector branch 2 times, most recently from 0040ad7 to 42fe681 Compare May 23, 2023 22:03

MichalStrehovsky reviewed May 24, 2023

...ols/aot/ILCompiler.DependencyAnalysisFramework/ILCompiler.DependencyAnalysisFramework.csproj

build-analysis bot mentioned this pull request May 24, 2023

Assert failure in GC/API/NoGCRegion/Callback_Svr test #86612

Closed

runfoapp bot mentioned this pull request May 24, 2023

Infra improvements for Helix #68176

Closed

build-analysis bot mentioned this pull request May 25, 2023

Could not load file or assembly 'Microsoft.CodeAnalysis.NetAnalyzers #84995

Closed

trylek added 12 commits June 9, 2023 19:50

Address Michal's local PR feedback (simple functionality shuffles)

9c34303

I'll address the code reorg (GenericCycleDetection) and regression test in subsequent commits. Thanks Tomas

Revert change to pre-existing command-line option naming to reduce churn

2722de8

Move cycle detector under GenericCycleDetection per Michal's PR feedback

d3e50b5

Delete no longer needed logging changes

09ce974

Revert incorrect addition of READYTORUN define from DependencyAnalysi…

ef8f33c

…sFramework

Adjust DepthTest so that it only uses the depth cutoff

f6dc9b6

Implement unit tests for generic cycle detection

f41ab02

Fix generic breadth test

d3a44b3

Address Michal's PR feedback, fix one unit test to avoid lab timeouts

96fc08b

Make Depth3Test unsupported on 32-bit platforms where it OOMS Crossgen2

1ec93be

Fix typo in MSBuild condition

dfd968b

trylek force-pushed the Crossgen2GenericCycleDetector branch from 0660b5f to dfd968b Compare June 9, 2023 17:50

davidwrighton requested changes Jun 9, 2023

ghost added the needs-author-action label Jun 9, 2023

ghost removed the needs-author-action label Jun 9, 2023

Reorganize handling of method fixups per David's PR feedback

7b3dab7

davidwrighton approved these changes Jun 13, 2023

trylek merged commit bc4cbb6 into dotnet:main Jun 13, 2023

trylek deleted the Crossgen2GenericCycleDetector branch June 13, 2023 00:49

trylek mentioned this pull request Jun 13, 2023

Perf optimization of module scanning for generic cycle detection #87532

Closed

ghost locked as resolved and limited conversation to collaborators Jul 13, 2023
