-
Notifications
You must be signed in to change notification settings - Fork 833
Description
This may be too large in scope for #1103, but I believe it's worth considering.
Thomas Wies gave a presentation at Compose Conference 2016 titled "Improving Type Error Localization for Languages with Type Inference". The presentation video is about 50 minutes long, and the corresponding paper is 18 pages long counting the bibliography, so I'll try my best to summarize them. I certainly won't be able to do it justice, though, so watching the 50-minute video is recommended at some point. (Beware: there's an audio problem at 24:54 in that video, in which a beep that lasts for about 40 seconds is accidentally superimposed over the audo. Turn your headphone volume down before that point to spare your ears; at 25:35 the beep goes away and Thomas resumes his presentation).
What
What - an error message that could be improved, and a code example to trigger it
The code example that Thomas Wies starts discussing around 10:45 of the video (the slide comes on a few seconds later), adapted to something that will compile in F#, is basically the following:
type Point = float * float
let example2 (lst:Point list): Point list =
// ... a real example would have some more code here ...
let rec loop lst x y dir acc =
if lst = [] then
acc
else
printfn "Debugging message"
loop lst 0.0 0.0 0.0 [(0.0,0.0)] |> List.revThe compiler produces two error messages from this code. The first error is on the [(0.0,0.0)] argument when loop is called, and the error message is:
This expression was expected to have type
unit
but here has type
'a list
Why
Why the error message is confusing
An F# newbie who wrote that code would be confused by that error message. This value is getting passed in as the acc parameter, which he intended to be treated as a list of Points. Why would acc be considered to have the unit type? He doesn't have any () values anywhere in his function, so where is that coming from? (An experienced programmer would, of course, remember that printfn returns unit, but an F# newbie probably still thinks of printfn as a "statement" and doesn't realize that his printfn invocation is causing the return type of loop to be considered unit.)
Further, there is other information in the code that might allow the compiler to infer that the acc parameter, and the return type of loop, are both intended to be a list. The result of loop is piped into List.rev, suggesting that it should have been a list. And passing [(0.0,0.0)] to loop as the value of acc also suggests that the programmer's intent is to use a list.
How
How the error message could be improved
Thomas Wies's presentation suggests a weight-based heuristic, whose details I won't go into here in order to save space. But essentially, every place where the types are mismatched should be looked at, and a weight assigned to each. Then at each place, the compiler would consider "Here the expression has type A, but elsewhere it has type B. What if this expression had had type B instead -- would that cause the type mismatch to go away?" And of all the possible "switch type A to type B" or "switch type B to type A" conversions, the one with the smallest weight (which would usually mean the one that involved the fewest type changes to the code) would be chosen as the error message.
In this code, that heuristic would identify several places in the code where there's a type mismatch between 'a list (or in some cases Point list) and unit:
- The
accparameter ofloop - The
thenbranch whereaccis returned fromloop, thereby constrainingloop's return type to be the same asacc - The
elsebranch where the return value ofprintfnis returned fromloop, thereby constrainingloop's return type to be the same as whatprintfnreturns - The place where
[(0.0,0.0)]is passed toloopas the value of theaccparameter in this function call - The place where
List.revis called with the return value ofloop
The compiler can infer that all five of these locations must have the same type 'T. Locations 1 and 2 do not constrain 'T. Location 3 constrains 'T to be unit, location 4 constrains it to be Point list (or, more accurately, (float * float) list, but I'll call it Point list from now on), and location 5 constrains it to be 'a list. Following a basic weight-based heuristic, we could assign a weight of 0 to locations 1 and 2 since they impose no constraints, and a weight of 1 to the three other locations. Then the compiler would consider each option in turn:
- Changing the type at location 3 to be
Point listwould satisfy all constraints, since'a listis compatible withPoint list. So this change gets a score of 1. - Changing the type at location 4 to be
unitand changing no other locations would not be enough to satisfy all constraints. - Changing the type at location 5 to be
unitand changing no other locations would not be enough to satisfy all constraints. - Changing the type at locations 4 and 5 to be
unitwould satisfy all constraints. That involves making two changes at locations with weight 1, so this change gets a score of 2.
The lowest-score change that would satisfy all constraints is to change the type at location 3, so that is the one the compiler would base the error message on. And the error message would then identify the expression printfn "Debugging message" as the source of the error, and give the error message:
This expression was expected to have type
Point list
but here has type
unit
Drawbacks
Why we might not want to do this
This seems like it would be a non-trivial change to implement in the compiler -- certainly it seems it would require more work than most of the other suggestions from #1103. Although 39:12 in Thomas Weis's presentation, he starts talking about how this idea could be implemented, and he asserts that "it doesn't require all that much work to actually build this into a compiler." So maybe this would be feasible to implement -- if not in F# 4.1, then perhaps in 4.2 or 5.0. And if this could make it into F# 4.1, it would certainly make for a much friendlier experience for newbies.
So I believe this idea is at least worth considering: watch Wies's presentation, read his paper, and decide whether the benefit would be worth the work.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status