
TcSymbolUses - significant memory consumption and allocations #8021

@TIHan


In IncrementalBuild.fs, this line:

let tcSymbolUses = sink.GetSymbolUses()

Creates a TcSymbolUses object:

/// Represents container for all name resolutions that were met so far when typechecking some particular file
///
/// This is a memory-critical data structure - allocations of this data structure and its immediate contents
/// is one of the highest memory long-lived data structures in typical uses of IDEs. Not many of these objects
/// are allocated (one per file), but they are large because the allUsesOfAllSymbols array is large.
type TcSymbolUses

Every time a project gets analyzed in the IDE, our background compiler/incremental builder calls GetSymbolUses on each file and caches the result in a TcAccumulator:

return {tcAcc with ...
                    tcSymbolUsesRev=tcSymbolUses :: tcAcc.tcSymbolUsesRev
                     ... }

The purpose of doing this is to allow 'find all refs' and rename to work across projects/files for symbols. The caching also allows you to do a second 'find all refs' on any other symbol very quickly, because it's all in memory. This is becoming an issue.

There are two problems:

  1. GetSymbolUses allocates many large arrays of a struct type that is 28 bytes (on 32-bit), just from the call itself. Thankfully the arrays are now chunked thanks to @baronfel's fix: TcSymbolUseData cleanup per #6084 #6089. It used to allocate one huge array (LOH problem - very bad!), which was even worse.
  2. Caching the result of GetSymbolUses in a TcAccumulator. All of those allocations are now rooted and live in Gen2.
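To illustrate why chunking helps with the first problem, here is a minimal sketch, not the actual code from the fix. It assumes the usual .NET large object heap threshold of 85,000 bytes; `maxChunkLength`, `chunkBelowLoh`, and the 64-byte header allowance are hypothetical names and values chosen only for illustration.

```fsharp
// Arrays of roughly 85,000 bytes or more are allocated on the .NET large object heap (LOH).
let lohThresholdBytes = 85000

// Hypothetical helper: the largest element count that keeps one array below the
// threshold, leaving a rough 64-byte allowance for the array header (an assumption).
let maxChunkLength (itemSizeBytes: int) =
    (lohThresholdBytes - 64) / itemSizeBytes

// Hypothetical helper: split one large array into several arrays that each stay
// off the LOH. Array.chunkBySize is part of FSharp.Core.
let chunkBelowLoh (itemSizeBytes: int) (items: 'T[]) : 'T[][] =
    Array.chunkBySize (maxChunkLength itemSizeBytes) items

// Usage: 7,000 placeholder items treated as if each were 28 bytes wide.
let chunks = chunkBelowLoh 28 (Array.zeroCreate<int> 7000)
printfn "%d chunks, largest has %d items" chunks.Length (chunks |> Array.map Array.length |> Array.max)
```

Chunking only avoids the LOH; the total number of bytes allocated stays the same, which is why the second problem (rooting the results in the cache) still matters.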

I made a change to never call GetSymbolUses in IncrementalBuilder, which means no allocations or caching of any kind. The caveat is that 'find all refs' and rename across projects/files do not work. My test is to load VisualFSharp.sln with one file open, service.fs, wait a bit for memory to settle (VS performs a GC when it is idle), and then perform a 'find all refs' on a bool type. 'Find all refs' will not work, but it will still type-check every project. Below are the comparisons with the change:

service.fs, on first load, peak memory usage:

No change: [screenshot: first_load]

With change: [screenshot: new_first_load]

Waited for memory to settle:

No change: [screenshot: after_gc]

With change: [screenshot: new_after_gc]

Peak memory usage for 'find all refs' on a bool type:

No change (notice the CPU usage - it wasn't even done): [screenshot: find_all_refs]

With change: [screenshot: new_find_all_refs]

After 'find all refs' completed, waited for memory to settle:

No change: [screenshot: find_all_refs_after]

With change: [screenshot: new_find_all_refs_after]

--

The peak memory usage without the change for 'find all refs' is getting close to dangerous territory. VS will do an extreme stop-the-world GC and even compact the LOH, which is incredibly visible to the user and a poor experience.

With that said, caching all symbols in memory is currently not viable for large projects. The number of allocations and the amount of rooting going on are significant.

Possible solution:

Until we develop a better caching strategy and/or a smarter mechanism to determine symbols, I think the best approach is to never call GetSymbolUses. Instead, we should re-type-check all files and, in the type-check Sink, capture only what we think is the symbol we are trying to find, then later determine which symbols are the real ones. This would eliminate most, if not all, of these allocations and would not root anything. The only downside is that a second 'find all refs' will be no faster than the first.
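As a rough sketch of the idea, assuming a much simpler shape than the compiler's actual sink interface: `CandidateUse` and `FilteringSink` below are hypothetical types made up for illustration, and matching by name alone stands in for "what we think is the symbol".

```fsharp
/// Hypothetical, simplified record of one name resolution (not the compiler's real type).
type CandidateUse =
    { FileName: string
      Line: int
      Column: int
      Name: string }

/// Hypothetical sink that records only resolutions whose name matches the target,
/// instead of recording every symbol use in every file.
type FilteringSink(targetName: string) =
    let candidates = ResizeArray<CandidateUse>()

    /// Called for each name resolution during type-checking (assumed callback shape).
    member this.NotifyNameResolution(fileName: string, line: int, column: int, name: string) =
        if name = targetName then
            candidates.Add({ FileName = fileName; Line = line; Column = column; Name = name })

    /// Candidates collected so far; a later pass would decide which ones are real matches.
    member this.Candidates = candidates.ToArray()

// Usage: only the resolution named "bool" is kept; the other one is never stored.
let sink = FilteringSink("bool")
sink.NotifyNameResolution("service.fs", 10, 4, "bool")
sink.NotifyNameResolution("service.fs", 11, 8, "string")
printfn "%d candidate(s) captured" sink.Candidates.Length
```

Because the sink lives only for the duration of one 'find all refs' request, nothing it collects stays rooted once the request completes.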
