Skip to content

Conversation

@nojaf
Copy link
Contributor

@nojaf nojaf commented Jun 26, 2023

Sorting the finishers,
Before folding the first state,
Destinies entwined,
In cosmic symphony, they await.

Some of you may have seen the following error message when compiling FSharp.Compiler.Service.fsproj using the new --test:GraphBasedChecking flag.

MSBuild version 17.7.0-preview-23320-09+d30eef85d for .NET
C:\Users\nojaf\Projects\fsharp\src\Compiler\Driver\GraphChecking\Continuation.fsi(2,17): error FS0248: Two modules named 'Continuation' occur in two parts of this assembly [C:\Users\nojaf\Projects\fsharp\src\Compiler\FSharp.Compiler.Service.fsproj::TargetFramework=netstandard2.0]

Build FAILED.

C:\Users\nojaf\Projects\fsharp\src\Compiler\Driver\GraphChecking\Continuation.fsi(2,17): error FS0248: Two modules named 'Continuation' occur in two parts of this assembly [C:\Users\nojaf\Projects\fsharp\src\Compiler\FSharp.Compiler.Service.fsproj::TargetFramework=netstandard2.0]
    0 Warning(s)
    1 Error(s)

You can easily reproduce this problem when you add

if inp.FileName.EndsWith("Continuation.fsi") then
                Thread.Sleep 1000

in CheckOneInputWithCallback in ParseAndCheckInputs.fs.

It was initially discovered reliably by @safesparrow via

#r "nuget: CliWrap, 3.6.0"

open System
open System.IO
open CliWrap

task {
    for idx in [ 0..50 ] do
        printfn "Start %i" idx

        let dll =
            FileInfo(
                @"C:\Users\nojaf\Projects\fsharp\artifacts\obj\FSharp.Compiler.Service\Debug\netstandard2.0\FSharp.Compiler.Service.dll"
            )

        if dll.Exists then
            dll.Delete()

        let! _ =
            Cli
                .Wrap("dotnet") // Assumption is that you use a preview 7.0.4xx SDK (by removing the global.json)
                .WithArguments(
                    "build FSharp.Compiler.Service.fsproj"
                    + " -bl /p:OtherFlags=\"--test:ParallelIlxGen --test:ParallelOptimization --test:GraphBasedChecking --test:DumpCheckingGraph --parallelreferenceresolution\""
                    + " --no-dependencies --no-restore -noWarn:MSB4011 /v:quiet /p:BUILDING_USING_DOTNET=true"
                )
                .WithWorkingDirectory(@"C:\Users\nojaf\Projects\fsharp\src\Compiler")
                .WithStandardOutputPipe(PipeTarget.ToDelegate(printfn "%s"))
                .WithStandardErrorPipe(PipeTarget.ToDelegate(printfn "%s"))
                .ExecuteAsync()
                .Task

        ()
}

In this example, it would fail after a couple of attempts. And show the error FS0248: Two modules named 'Continuation' occur in two parts of this assembly message from:

errorR(Error(FSComp.SR.tastTwoModulesWithSameNameInAssembly(textOfPath path2), entity2.Range))

After some investigation, we noticed that when preparing a certain TcState to be used in CheckOneInputWithCallback, the order of the folded finishers is random.
Initially, this didn't seem to matter. And to me, it is still somewhat unclear whether it does.

To recapitulate what happens before CheckOneInputWithCallback to construct the TcState:
combineResults gets called for the next node we wish to process

let combineResults
(emptyState: 'State)
(deps: ProcessedNode<'Item, 'State * Finisher<'ChosenItem, 'State, 'FinalFileResult>>[])
(transitiveDeps: ProcessedNode<'Item, 'State * Finisher<'ChosenItem, 'State, 'FinalFileResult>>[])
(sortResultsToAdd: 'Item -> 'Item -> int)
(folder: 'State -> Finisher<'ChosenItem, 'State, 'FinalFileResult> -> 'State)
: 'State =
match deps with
| [||] -> emptyState
| _ ->
// Instead of starting with empty state,
// reuse state produced by the dependency with the biggest number of transitive dependencies.
// This is to reduce the number of folds required to achieve the final state.
let biggestDependency =
let sizeMetric (node: ProcessedNode<_, _>) = node.Info.TransitiveDeps.Length
deps |> Array.maxBy sizeMetric
let firstState = biggestDependency.Result |> fst
// Find items not already included in the state.
// Note: Ordering is not preserved due to reusing results of the biggest child
// rather than starting with empty state
let itemsPresent =
set
[|
yield! biggestDependency.Info.TransitiveDeps
yield biggestDependency.Info.Item
|]
let resultsToAdd =
transitiveDeps
|> Array.filter (fun dep -> itemsPresent.Contains dep.Info.Item = false)
|> Array.distinctBy (fun dep -> dep.Info.Item)
|> Array.sortWith (fun a b -> sortResultsToAdd a.Info.Item b.Info.Item)
|> Array.map (fun dep -> dep.Result |> snd)
// Fold results not already included and produce the final state
let state = Array.fold folder firstState resultsToAdd
state

Here, the firstState is picked by looking at the biggest dependency, in order to reduce the number of folds that are still required to have all dependencies information of the current node in the TcState.

In the concrete case of Driver\GraphChecking\FileContentMapping.fs there are four nodes left to fold over the firstState in order to get the correct starting state to begin processing FileContentMapping.fs:

  • NodeToTypeCheck.ArtificialImplFile "SyntaxTree\SyntaxTreeOps.fsi"
  • NodeToTypeCheck.ArtificialImplFile "Driver\GraphChecking\Continuation.fsi"
  • NodeToTypeCheck.PhysicalFile "SyntaxTree\SyntaxTreeOps.fsi"
  • NodeToTypeCheck.PhysicalFile "Driver\GraphChecking\Continuation.fsi"

(What is an ArtificialImplFile again? It is adding the results of a typed-check signature file to the TcEnv of implementation files. Mimicking as if we type-checked the actual implementation file).

Currently, this order is random. And so far that didn't seem to matter. Eventually, once everything is folded, all information is available in both TcEnv in TcState and all is fine.

However, given the Two modules named 'Continuation' occur in two parts of this assembly error, the order appears to be relevant after all.

The old setting --test:ParallelCheckingWithSignatureFilesOn, always did respect this order. It used the same gimmick of adding signature information to the TcEnv of implementation files. But it always respected the file order.

What currently feels wrong to me here, is that we fold the information of an implementation file to the TcState.TcEnvFromImpls of implementation files (be it a copy of the signature file) first before the actual physical signature is merged on top of the TcState.TcEnvFromSignatures of signature files. This means that NodeToTypeCheck.ArtificialImplFile of X is folded before the NodeToTypeCheck.PhysicalFile of X. If X has both the ArtificialImplFile (copy of signature data for implementation TcEnv) and PhysicalFile (real signature), that should always be folded first.
This is not the expected way this should happen. For the same reason, your signature file always needs to be above your implementation in your project.

My educated guess is that this might impact the CcuThunk of the TcState and thus leads us to the error we see. I have yet to find real evidence for this. Nor can I really put my finger on why this happens for Continuation.fsi and no other files. Speculation is that both Continuation.fs and Continuation.fsi are very small and this might be related.

Once I sorted the order of finishers, I was no longer able to reproduce the problem.

On a side note, the graph-based node code can be a bit confusing to debug at times. That is why I added the NodeToTypeCheck to the Finisher. I think we might benefit from a few more additions to improve the debugging experience.

@safesparrow I would be in favour of removing the generics in TypeCheckingGraphProcessing.processTypeCheckingGraph, I don't think they add that much, to be honest. Thoughts?

Lastly, when you think about it. Continuation.fsi would also just be a private module for FileContentMapping.fs. As it really is only used there and contains two helper functions.

@nojaf nojaf requested a review from a team as a code owner June 26, 2023 11:22
@nojaf nojaf marked this pull request as draft June 26, 2023 11:26
@safesparrow
Copy link
Contributor

After some investigation, we noticed that when preparing a certain TcState to be used in CheckOneInputWithCallback, the order of the folded finishers is random.
Initially, this didn't seem to matter. And to me, it is still somewhat unclear whether it does.

I think we do have a good grasp on why processing artificial files before signature files leads to an error - signature files check their contents for duplicates against CcuSig, but add results to tcsRootSigs.
ArtificialImplFiles also check against CcuSig, but add to CcuSig as well.

So processing of a signature file does not affect further duplicate checks, because it doesn't modify CcuSig.

Processing of ArtificialImplFile (or real impl file) does add its symbols to CcuSig and causes signature file to detect duplicates.

This is equivalent to processing an implementation file before its signature (ignoring the fact this would fail at earlier stages).

So it makes sense that order should be maintained.

The one thing I wasn't sure of was why this wasn't spotted before, given it's a generic issue. This is still not clear to me.
Ordering can be different on each run due to using unordered dictionaries in graph/node handling - but why this doesn't error for all other files, I'm not sure.

The change is definitely the right thing to do regardless.

Copy link
Contributor

@0101 0101 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I just started investigating this error in Transparent Compiler. Will try to apply the changes there and see if it helps. Even though I'm not using processTypeCheckingGraph or combineResults, it looks like the underlying issue might be the same.

@vzarytovskii
Copy link
Member

Nice! I just started investigating this error in Transparent Compiler. Will try to apply the changes there and see if it helps. Even though I'm not using processTypeCheckingGraph or combineResults, it looks like the underlying issue might be the same.

Oh, is it the one when we have wrong data after unpicking?

@safesparrow
Copy link
Contributor

@safesparrow I would be in favour of removing the generics in TypeCheckingGraphProcessing.processTypeCheckingGraph, I don't think they add that much, to be honest. Thoughts?

I don't mind either way.
The generics were used in an attempt to separate the scheduling from the type-checking internals, AST types etc., to make them easier to use in tests - but this part of the code doesn't have tests atm. And the abstraction does make it more difficult to reason about the code for little benefit.

@0101
Copy link
Contributor

0101 commented Jun 27, 2023

Oh, is it the one when we have wrong data after unpicking?

No that's a separate issue. Most likely unrelated to this.

@safesparrow I would be in favour of removing the generics in TypeCheckingGraphProcessing.processTypeCheckingGraph, I don't think they add that much, to be honest. Thoughts?

I don't mind either way. The generics were used in an attempt to separate the scheduling from the type-checking internals, AST types etc., to make them easier to use in tests - but this part of the code doesn't have tests atm. And the abstraction does make it more difficult to reason about the code for little benefit.

Can we keep it for now in case we'll be able to re-use it in Transparent Compiler? Where we'd potentially need to drag more data through it. Not sure yet if it's possible but would be nice if it was unified.

@nojaf
Copy link
Contributor Author

nojaf commented Jun 27, 2023

Can we keep it for now in case we'll be able to re-use it in Transparent Compiler?

I genuinely think it would be great if we could address that situation when it arises. We can easily identify and streamline the code when we notice any duplication.

@nojaf nojaf marked this pull request as ready for review June 27, 2023 09:41
@vzarytovskii vzarytovskii merged commit 778df95 into dotnet:main Jun 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants