Skip to content

Conversation

@Thorium
Copy link
Contributor

@Thorium Thorium commented Nov 19, 2023

I did some code cleanup without significant changes, and managed to cut around 10% of unit-test-execution times in my laptop (which might be coincidence).

I cloned fsharp-repo and my unit-tests on main branch fails for
FSharp.Compiler.Scripting.UnitTests.InteractiveTests.Script with nuget package that yields out of order dependencies works correctly So I have to do this PR to see CI results.

  • I tried to not touch functionality of anything.
  • I tried to keep the test-files as they are, so all the possible performance gain is from real code.
  • This came to touch quite a big amount of files, so I don't care if you merge or not, maybe just copy&paste some meaningful bits, whatever: I use F# daily, so any speedup would help.

Here are the main changes:

  • Instead of doing 2 dictionary lookups if whateverDict.ContainsKey x then Some whateverDict.[x] else None this uses .TryContainsKey to do the same business with only one lookup.
  • When your nested for-loops are having list.Contains, rather convert the nested list first to a Set for faster searching. So instead of for x in xs do if List.contains x ys this does let ySet = ys |> Set.ofList; for x in xs do if Set.contains x ySet (meanwhile still applicable, reverted to separate PR due to memory concerns)
  • I also changed some small DU to a Struct, and Some int to ValueSome int (reverted to avoid changing public api)
  • FSharpLint: Recommends to use isNull x over x = null because it's faster
  • FSharpLint: Recommends to use List.collect(fun x -> ...) over List.map(fun x -> ...) |> List.concat etc to avoid iterating lists multiple times.
  • FSharpLint: Recommends to use List.map myFunc over List.map(fun x -> myFunc x), and I'm not sure if it will cut some extra lambda from AST or not. For this reason it also prefers >> but I generally try to avoid too much point-free style.

I didn't include FSharpLint as tool because it gave also many false-positives like member-access setting to false: MyClass(x, MyMember = false), second parameter should not be replaced to false, because it's a named setter, not Boolean comparison.

@Thorium Thorium requested a review from a team as a code owner November 19, 2023 02:15
Some ContainsKey double lookups to TryContainsKey
Some nested list Contains to set Contains.
Some FSharpLint rules
also "dotnet fantomas . -r" but I bet it'll make merging harder.
@Thorium
Copy link
Contributor Author

Thorium commented Nov 19, 2023

This is "how fast F# code executes" in current main:
image

This is "how fast F# code executes" in this PR:
image

However there are 3 failing tests because this PR made trimmed FSharp.Core.dll smaller:

CheckTrim : Test failed with unexpected net8.0  -  trimmed FSharp.Core.dll length:
Expected:
	287744 Bytes
Actual:
	287232 Bytes

I think I need to update the test rather than trying to make the file larger.

@kerams
Copy link
Contributor

kerams commented Nov 19, 2023

Nice changes. However, I would not take the CI duration difference at face value. They sometimes wildly fluctuate between runs.

@Thorium
Copy link
Contributor Author

Thorium commented Nov 19, 2023

I find the more verbose case much easier to understand,

I know there is a tiny difference on readability, but the other one makes more complex and slower AST.

I would not take the CI duration difference at face value.

I know, but that was the only thing I could measure easily the build time, it was consistently faster also in my local machine.
Besides of people saving on build times on daily basis, the performance improvement is achieved by FSharp.Core working faster, not a build-time trick. And AOT saves half a megabyte.

@vzarytovskii
Copy link
Member

Nice changes. However, I would not take the CI duration difference at face value. They sometimes wildly fluctuate between runs.

Yeah, CI runs aren't trustworthy, they use different machines and containers.

@vzarytovskii
Copy link
Member

I find the more verbose case much easier to understand,

I know there is a tiny difference on readability, but the other one makes more complex and slower AST.

I am sure JIT will optimize branching. It sure can't be that significant.

@Thorium
Copy link
Contributor Author

Thorium commented Nov 19, 2023

Modifying 80 files, I don't want to make separate PR per each one. And I don't want to cause more work for you. These are all little independent changes without public API, so you could do merge so that you just unstage/ignore the parts you don't like.

@Thorium
Copy link
Contributor Author

Thorium commented Nov 19, 2023

I am sure JIT will optimize branching.

I like your optimism. :-)

@vzarytovskii
Copy link
Member

Modifying 80 files, I don't want to make separate PR per each one. And I don't want to cause more work for you. These are all little independent changes without public API, so you could do merge so that you just unstage/ignore the parts you don't like.

It looks alright, the only singular part I'm slightly worried about is set allocation and GC. Want to hear others' opinions.

@vzarytovskii
Copy link
Member

I am sure JIT will optimize branching.

I like your optimism. :-)

Hm, actually, JIT generates exactly the same code for both:

https://sharplab.io/#v2:EYLgtghglgdgNAGxAMwM5wC4gqsAfBAUwwAJkB7ckiE4EgYxIBMSBeAWACgSSplqSGABaEYggE4BXQlx6EEfWiQBkyhirUthoidNkl5qQmQgIj+rkVLAI4gXUYsO3AXjxLV6z0wsuuQA

@Thorium
Copy link
Contributor Author

Thorium commented Nov 19, 2023

Ok I think I found even better way to do the check. Now reverted some changes. Interesting to see if AOT size changes again.

But in longer run I think the unit-test testing exact binary byte-size of dll is not sustainable solution. It'll just make the 30min build fail twice.

@vzarytovskii
Copy link
Member

But in longer run I think the unit-test testing exact binary byte-size of dll is not sustainable solution. It'll just make the 30min build fail twice.

Its purpose is exactly that. To verify consciously that size change is desired.

@Thorium
Copy link
Contributor Author

Thorium commented Nov 19, 2023

But in longer run I think the unit-test testing exact binary byte-size of dll is not sustainable solution. It'll just make the 30min build fail twice.

Its purpose is exactly that. To verify consciously that size change is desired.

Can it be run locally quickly? E.g. build Release without testing then look the size?

@Thorium
Copy link
Contributor Author

Thorium commented Nov 19, 2023

It looks alright, the only singular part I'm slightly worried about is set allocation and GC. Want to hear others' opinions.

I'll split this PR to 2, to send the set allocations as a separate PR, so that can be explored more in detail.

@En3Tho
Copy link
Contributor

En3Tho commented Nov 19, 2023

@vzarytovskii Your sharplab points to x86 version of code. You have to choose x64 explicitly there. X86 code is generally worse

@vzarytovskii
Copy link
Member

@vzarytovskii Your sharplab points to x86 version of code. You have to choose x64 explicitly there. X86 code is generally worse

Either it doesn't work or x64 yields the same thing

@Thorium
Copy link
Contributor Author

Thorium commented Nov 19, 2023

@vzarytovskii As far as I can see all the issues were resolved and this would ready for merge.

@Thorium Thorium requested a review from vzarytovskii November 19, 2023 15:00
@vzarytovskii
Copy link
Member

I did some benchmarks (very rough ones), and the difference is within statistical error for the most part.

Both compilers are not ngen'd or crossgen'd, so difference in elapsed time will be even smaller in SDK.

Compiling compiler itself:
Before, elapsed, seconds: 26,0397-26,3572
After, elapsed, seconds: 25,3997-25,7576

Before, working set, mb: 3312
After, working set, mb: 3483

Before, GC, (Gen0, Gen1, Gen2), number of: 22, 13, 7
After, GC, (Gen0, Gen1, Gen2), number of: 23, 13, 7

So it's marginally faster, migth be GC-ing and allocating a bit more.

I am alright with merging it if folks are fine with trimming changes @0101 @T-Gro

@Thorium
Copy link
Contributor Author

Thorium commented Nov 20, 2023

I did some benchmarks (very rough ones)

Was this with the List.contains -> Set.contains or without? As I already removed that from this PR as non-beneficial.

@vzarytovskii
Copy link
Member

I did some benchmarks (very rough ones)

Was this with the List.contains -> Set.contains or without? As I already removed that from this PR as non-beneficial.

With just latest commit from the branch. So probably without. Not sure where do allocations come from, they might as well come from the differences in the code we compile (less lambda lifting).

@Thorium
Copy link
Contributor Author

Thorium commented Nov 20, 2023

Hm, actually, JIT generates exactly the same code for both:

Off-topic but look at this:
https://sharplab.io/#v2:EYLgtghglgdgNAGxAMwM5wC4gqsAfBAUwwAJkB7ckiE4EgXgFgAoEkqZakjAC0JjIQEqQizaEEHWtz4DkQkWJISR3AE4BXUaxIsipYBDVc6THTHIYAFDTx5aASiXPmQA

@smoothdeveloper
Copy link
Contributor

Thinking of infrastructure regarding the "compiler doesn't generate performant IL":

  • putting the sample code under a test with IL baselines, under a "optimization" folder
  • the test would have comments in a certain format (some json?) highlighting version of F# which generates imperfect/faulty IL, and the version it got fixed (whenever this occur)

Then we can track tests added like this to tickets in a csv fail which is maintained in the repository, and helps with the process of having a "Tracking" meta issue about all codegen / optimization related issue, and generating some release notes about code generation.

I can give this a look to sketch something, taking the comments and issues opened about those here, and maybe in the issue tracker, and add to the IL baseline tests and list the tests in a csv file to make a PR.

Copy link
Contributor

@psfinaki psfinaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in its current shape the changes are beneficial and benign. Up for merging.

@T-Gro
Copy link
Member

T-Gro commented Nov 21, 2023

I am also fine with merging, approved + marked my comments as resolved.

@vzarytovskii
Copy link
Member

Sigh...fantomas changes from latest Fantomas update...it's going now to affect majority of PRs :(

@vzarytovskii
Copy link
Member

@nojaf FYI - rather small example of what's usually happening when a lot of small formatting changes introduced.

# Conflicts:
#	src/Compiler/CodeGen/IlxGen.fs
#	src/Compiler/Driver/CompilerDiagnostics.fs
#	src/Compiler/Driver/GraphChecking/TrieMapping.fs
#	src/Compiler/Interactive/fsi.fs
#	src/Compiler/Service/ServiceParamInfoLocations.fs
#	src/Compiler/SyntaxTree/SyntaxTreeOps.fs
#	src/Compiler/Utilities/sformat.fs
#	src/FSharp.Core/QueryExtensions.fs
#	src/FSharp.Core/seq.fs
@vzarytovskii
Copy link
Member

@Thorium thanks for taking care of it. It was rather fast. Did you do manual resolution or with something more "smart"?

@Thorium
Copy link
Contributor Author

Thorium commented Nov 22, 2023

Manual, I know what are the changes and I see what Fantomas did, and there were actually not too many conflicts. Just had to force because I didn't want to deal with GitHub partial successful automation. I recommend Kdiff3 as mergetool.

@psfinaki psfinaki merged commit 2a25184 into dotnet:main Nov 22, 2023
@nojaf
Copy link
Contributor

nojaf commented Nov 23, 2023

@nojaf FYI - rather small example of what's usually happening when a lot of small formatting changes introduced.

Well, this was bound to happen for all open PRs. But again, I do not perceive this as a blocker to not update Fantomas. There will never be a good time when there are no open PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

9 participants