Improvements in string handling in FSharp.Core and/or FSC #9501

@abelbraaksma

Is your feature request related to a problem?
Strings are omnipresent, and recent discussions and profiling have shown that string handling can be improved, in ways that may also apply to other parts of the ecosystem.

I've created this issue to discuss some options.

Describe the solution you'd like
Several options have come forward and are up for discussion:

  • Find cases where StringBuilder is used and investigate whether improvements are viable with StringBuilderCache (currently private in the BCL, uses thread-local storage).
  • Where heavy string copying is done locally, investigate the use of ArrayPool, possibly cherry-picking from System.Buffers. The advantage here is that zeroing of memory is skipped. This cannot be used for shared instances of StringBuilder or strings.
  • On hot paths, check whether using ReadOnlySpan and/or String.Create can help. This is .NET Standard 2.1, so we can only do that in FCS and tooling, I think.
  • Where high GC pressure and/or LOH pressure is recognized, we should perhaps consider using fixed, similar to how the BCL does it. For string scenarios this can bring pressure down by 2x.
  • Use the techniques shown in "Performance of certain build-in functions, esp the ones related to Array manipulation" (#9390) with fixed-size char arrays, possibly further improved with array pools.
  • Improve array manipulations by dipping into the ArrayPool technique and/or by speeding up array creation and copying in scenarios similar to Array.map/collect.
  • Investigate whether we can use the IL instructions cpblk or initblk in certain scenarios. These have recently been highly improved in the CLR.
  • Perhaps look into ZString, some techniques there (pooling, zero-allocation strings) may be helpful: https://github.com/Cysharp/ZString
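
To make the ArrayPool and StringBuilderCache ideas above a bit more concrete, here is a minimal sketch. It is not code from FSharp.Core or the BCL: `SbCache`, `reverseWithPool`, `joinWords` and the 360-character retention threshold are all hypothetical names and values chosen for illustration. It assumes a target where `System.Buffers.ArrayPool` is available.

```fsharp
open System
open System.Buffers
open System.Text

/// Hypothetical per-thread StringBuilder cache. It only illustrates the
/// thread-local idea; it is not the BCL's internal StringBuilderCache.
type SbCache private () =
    [<ThreadStatic; DefaultValue>]
    static val mutable private cached : StringBuilder

    /// Hand out the cached builder if it is large enough, else allocate one.
    static member Acquire(capacity: int) : StringBuilder =
        let sb = SbCache.cached
        if not (isNull sb) && sb.Capacity >= capacity then
            SbCache.cached <- null   // detach so nested callers do not share it
            sb.Clear()
        else
            StringBuilder(capacity)

    /// Produce the string and keep small builders around for the next caller.
    static member GetStringAndRelease(sb: StringBuilder) : string =
        let result = sb.ToString()
        // Assumed threshold: avoid pinning very large buffers to a thread.
        if sb.Capacity <= 360 then SbCache.cached <- sb
        result

/// Reverse a string using a rented scratch buffer instead of a fresh array.
/// The rented array may be longer than requested and is not zeroed.
let reverseWithPool (s: string) : string =
    let pool = ArrayPool<char>.Shared
    let buffer = pool.Rent(s.Length)
    try
        s.CopyTo(0, buffer, 0, s.Length)
        Array.Reverse(buffer, 0, s.Length)
        String(buffer, 0, s.Length)   // copies out, so the buffer can be returned
    finally
        pool.Return(buffer)

/// Join words with single spaces, using the cached builder.
let joinWords (words: string list) : string =
    let sb = SbCache.Acquire 64
    for w in words do
        if sb.Length > 0 then sb.Append(' ') |> ignore
        sb.Append(w) |> ignore
    SbCache.GetStringAndRelease sb
```

Both helpers keep the pooled object strictly local: the rented array and the cached builder never escape the function, which matches the restriction above that pooling cannot be used for shared instances.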

Additional context
My idea is to discuss and brainstorm perf ideas (please check the previous discussions) and the scope, and then, where applicable, use separate issues for defined targets. One area I've been quietly working on is the Array.xxx functions, which are already highly optimized, but some measurements suggest there's room for further improvement.