Skip to content

Conversation

@saul
Copy link
Contributor

@saul saul commented Mar 8, 2017

This PR adds a flag to the %A format specifier: @. This flag escapes control characters and other characters that need to be escaped in F# source code string literals.

Before

image

After

image


This allows us to directly copy strings that have been formatted with %@A straight into code - and they'll 'just work'.

Remaining work:

  • Print strings with %@A as verbatim strings if they are 'path like' (i.e only contain backslashes as their escaped characters)
  • Write test for %+A and %@A (and a combination of the two)
  • Write test showing that FSI outputs the 'it' variable as %@A by default
  • Fix current tests

@smoothdeveloper
Copy link
Contributor

It makes a lot of sense :)

Can we get an idea if performance changed / improved on large strings?

I'm seeing we had List.rev but were using indexer on the string rather than Seq.cast.

@saul
Copy link
Contributor Author

saul commented Mar 8, 2017

Neither check nor conv were used in the old implementation. This implementation will definitely be slower as it is actually doing something, whether this is something we care about I'll leave that for @dsyme to decide.

@saul
Copy link
Contributor Author

saul commented Mar 8, 2017

Locally (uncommitted) I'm trying out formatting "path like" strings (no escaped characters other than backslashes) as verbatim strings. I'd say the vast majority of times that I've seen verbatim strings, it's for paths (in C# and F#):

image

@smoothdeveloper
Copy link
Contributor

@saul I see that the code is not printing the whole string, but is it correct that formatString can be used in other parts?

I've made another implementation of the function:

  let formatString2 (s:string) =
      let builder = System.Text.StringBuilder(s.Length)
      builder.Append("\"") |> ignore
      for c in s do
          builder.Append(formatChar false c) |> ignore
      builder.Append("\"") |> ignore
      string builder

and it seems to run lot faster (tried on a 4mb xml file):

image

I think using StringBuilder is sensible here.

@saul
Copy link
Contributor Author

saul commented Mar 8, 2017

That's a shame - I was hoping that writing it in idiomatic F# would be performant enough. I'll try your 'large string' benchmark tomorrow.

Thanks for looking into this @smoothdeveloper

@smoothdeveloper
Copy link
Contributor

@saul I'm sure you can turn my code into idiomatic F#, Seq.iter and making local function for builder.Append, we will have best of both worlds :)

@dsyme
Copy link
Contributor

dsyme commented Mar 9, 2017

My suggestion is that we should use %@A or some other new qualifier to activate this, since otherwise it will likely be a breaking change.

@KevinRansom
Copy link
Contributor

I thought that I had already commented that this would be a breaking change, lots of code uses %A to output readable text, and undoubtedly works around or perhaps relies on the current quirky behavior, so an alternate formatting char would be useful.

@saul
Copy link
Contributor Author

saul commented Mar 9, 2017

@dsyme @KevinRansom What about if a string is nested within a record? How would we print it with %A then?

Are you suggesting that we keep all string formatting the same unless a different flag has been supplied? How do we do %+A with this new string formatting flag then?

Personally I think it makes more sense if we accept it as a breaking change - if people want the old behaviour it's very easily replicated with sprintf "\"%O\"" s.

I think another flag just adds even more complexity to printf. I did a straw poll of seasoned F# developers at my office today and about 10% of them knew what %+A did. I think the discoverability of flags is shoddy and it would be best if we were correct by default.

@KevinRansom
Copy link
Contributor

@saul

FSharp has many thousands of active customers, a breaking change in this will affect a portion of them in some way. The pain addressed by this PR is probably insufficient to justify a language breaking change. Certainly if C# had this feature there is no way it would get past the breaking change committee. I have seen code-reviews were ToString() modifications were rejected. I'm sure that we can figure out a non-breaking way of adding the type of formatting you are suggesting.

@saul
Copy link
Contributor Author

saul commented Mar 9, 2017

Thanks @KevinRansom, that's fine - I'll implement it as a '@' flag for now and update this PR in due course :)

Also @smoothdeveloper I did some perf tests today, and here are the results:

$ fsi formatString.fsx --optimize+
52 msecs (imperative for loop with string builder)
51 msecs (imperative for loop with string builder (double capacity))
53 msecs (String.iter (piped) with string builder (double capacity))
52 msecs (String.iter (composition) with string builder (double capacity))
53 msecs (String.iter (composition) with string builder (quad capacity))
51 msecs (String.iter (composition) with string builder (double capacity, string constructor))
188 msecs (functional cast, map and String.concat)
74 msecs (functional cast, map and String.Join)

This is the average of 25 runs of each benchmark, parsing a .dll file and a .json file. The first 6 are very similar, with each run they tend to swap numbers. I think I'll go with the String.iter solution and update this PR.

@saul
Copy link
Contributor Author

saul commented Mar 9, 2017

Latest commit:

image

Current behaviour:

  • FSI prints 'it' with strings escaped
  • %A works as before
  • %@A dictates that a string should be escaped

I'm going to mark this WIP as I still have some work to do (will update original post).

@saul saul changed the title Escape control characters when %A printf'ing strings Add flag to the %A printf specifier that escapes control characters Mar 9, 2017
@saul saul changed the title Add flag to the %A printf specifier that escapes control characters [WIP] Add flag to the %A printf specifier that escapes control characters Mar 9, 2017
@cartermp
Copy link
Contributor

👍, I like it with %@A since it's consistent with the @ string literals. Do we think we'll need a quick RFC for this? I think it's a small enough feature that it doesn't necessarily need a new RFC, but it would be good to track somewhere so that we can go back around F# 4.2 timeframe and list it as a new thing in the FSharp.Core.

let o =
Microsoft.FSharp.Text.StructuredPrintfImpl.FormatOptions.Default
|> fun o ->
if useZeroWidth then { o with PrintWidth = 0}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use let bindings - that's normal style in FSharp.Core. If you like give all the o different names.

fieldInfo.GetValue(obj)

let formatChar isChar c =
match c with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure but perhaps we should be escaping some entire classes of Unicode characters as well. See https://en.wikipedia.org/wiki/Template:General_Category_(Unicode). Though I suppose it's not strictly necessary - the rule is that F# should be able to parse what it prints.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I right in saying that F# files are parsed as UTF-8? In which case I don't think we need to escape anything else other than what's there already.

Copy link
Contributor

@dsyme dsyme Mar 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The relevant part of the spec will be the part on strings - and yes, I believe the ones you've got are the only characters that must be escaped. Even some of those need not be - like \t in strings - but I think you're aiming for a string that looks "good" rather than full of strange characters.


[<Test>]
member __.``Standard characters are correctly escaped``() =
test "%A" "\n" "\"\\n\""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests will need updating. Also cover +@A and @+A etc.

@saul
Copy link
Contributor Author

saul commented Mar 10, 2017

@dsyme See my latest commit - how do I add the new SprintfAFormatSpecifiers.fs and FsiEscapesStringsByDefault.fs QA tests to the fsharpqa suite?

@cartermp Where should I raise the RFC? Do you want me to do it in https://github.com/fsharp/fslang-suggestions?

@KevinRansom
Copy link
Contributor

Add the test cases to this file: visualfsharp/tests/fsharpqa/Source/Printing/env.lst

Something like:

	SOURCE=SprintfAFormatSpecifiers.fs			# SprintfAFormatSpecifiers.fs
	SOURCE=FsiEscapesStringsByDefault.fs			# FsiEscapesStringsByDefault.fs

Kevin

@saul
Copy link
Contributor Author

saul commented Mar 11, 2017

Thanks @KevinRansom, I'll make that change when I'm back.

The more pressing issue is that the tests don't compile as they're being compiled by F# 4.1 which doesn't support the @ flag. Do I need to make these QA tests instead of unit tests too?

@cartermp
Copy link
Contributor

cartermp commented Mar 11, 2017

@saul I think just an issue here to track the work + state the basic design of the format string is good enough.

Edit: Yup, RFC

@dsyme
Copy link
Contributor

dsyme commented Mar 11, 2017

The more pressing issue is that the tests don't compile as they're being compiled by F# 4.1 which doesn't support the @ flag. Do I need to make these QA tests instead of unit tests too?

That's odd. Have you run build.cmd proto on your machine to update the Proto compiler? Or is this on CI (which rebuilds the proto compiler automatically)?

@KevinRansom
Copy link
Contributor

KevinRansom commented Mar 11, 2017

@dsyme I will take a look, right now.
The tests should be built with the compiler made by the build not the proto-compiler.

@dsyme
Copy link
Contributor

dsyme commented Mar 11, 2017

@cartermp Where should I raise the RFC? Do you want me to do it in https://github.com/fsharp/fslang-suggestions?

Yes, add the RFC there, thanks!

@KevinRansom
Copy link
Contributor

KevinRansom commented Mar 11, 2017

Sadly ci_part1 works fine on my machine .... doh!!! that was sauls master branch!!! trying again with the real deal.

@KevinRansom
Copy link
Contributor

KevinRansom commented Mar 11, 2017

When building FSharp.Core.dll for the portable profiles we get this:

  C:\Users\kevinr.REDMOND\AppData\Local\Temp\.NETPortable,Version=v4.5,Profile=Profile7.AssemblyAttributes.fs


c:\saul\visualfsharp\src\utils\sformat.fs(940,71): error FS0001: This expression was expected to have type�    'char'    �but here has type�
    'obj' [c:\saul\visualfsharp\src\fsharp\FSharp.Core\FSharp.Core.fsproj]

c:\saul\visualfsharp\src\utils\sformat.fs(944,45): error FS0001: This expression was expected to have type�    'obj'    �but here has type�
   'char' [c:\saul\visualfsharp\src\fsharp\FSharp.Core\FSharp.Core.fsproj]
Done Building Project "c:\saul\visualfsharp\src\fsharp\FSharp.Core\FSharp.Core.fsproj" (Build target(s)) -- FAILED.

Done Building Project "c:\saul\visualfsharp\build-everything.proj" (default targets) -- FAILED.


Build FAILED.

I think there may be an incompatability in the portable profile that causes us to prefer IEnumerable, rather than IEnumerable<char>

Yep!!! profile 7 version of string does not support IEnumerable<char> .... wow!!!!!
extracted using reflector from profile7 System.Runtime

.class public auto ansi sealed beforefieldinit String
    extends System.Object
    implements System.Collections.IEnumerable, System.IComparable, System.IComparable`1<string>, System.IEquatable`1<string>

extracted using reflector from net45 mscorlib

.class public auto ansi serializable sealed beforefieldinit String
    extends System.Object
    implements System.IComparable, System.ICloneable, System.IConvertible, System.IComparable`1<string>, System.Collections.Generic.IEnumerable`1<char>, System.Collections.IEnumerable, System.IEquatable`1<string>
{

Just testing the fix, I'll send a PR when it's done.

@KevinRansom
Copy link
Contributor

Now it looks like a regression here:

Compiled with built compiler:

c:\saul\visualfsharp\tests\fsharpqa\Source\Printing>\saul\visualfsharp\release\net40\bin\fsc WidthForAFormatter.fs
Microsoft (R) F# Compiler version 4.1
Copyright (c) Microsoft Corporation. All Rights Reserved.

WidthForAFormatter.fs(9,13): error FS0741: Unable to parse format string ''A' format does not support precision'

c:\saul\visualfsharp\tests\fsharpqa\Source\Printing>uedit64 WidthForAFormatter.fs

Compiled with released compiler:

c:\saul\visualfsharp\tests\fsharpqa\Source\Printing>fsc WidthForAFormatter.fs
Microsoft (R) F# Compiler version 4.1
Copyright (c) Microsoft Corporation. All Rights Reserved.

c:\saul\visualfsharp\tests\fsharpqa\Source\Printing>

I have pushed the portable7 fix to your fork.

I will let you take a look at the regression.

Kevin

@saul
Copy link
Contributor Author

saul commented Mar 12, 2017

Thanks @KevinRansom - I've committed a fix for that regression and added the new test to the suite.

If the tests go green then it's ready.

@KevinRansom
Copy link
Contributor

@saul a couple more test failures that look they are related to the change.

+++ Conformance\ObjectOrientedTypeDefinitions\InterfaceTypes (E_InheritInterface.fs) +++

*** The following necessary lines were never matched:
***	\(10,21-10,22\):.+error FS1207:.+Interfaces inherited by other interfaces should be declared using 'inherit \.\.\.' instead of 'interface \.\.\.'$


*** The following necessary lines were incorrectly matched:

Unexpected Compiler Output 
FAIL
+++ Printing (FsiEscapesStringsByDefault.fs) +++
Microsoft (R) F# Compiler version 4.1
Copyright (c) Microsoft Corporation. All Rights Reserved.

*** The following necessary lines were never matched:
***	val s : string = "Complex\n\"string"


*** The following necessary lines were incorrectly matched:

Unexpected Compiler Output 
FAIL



@saul
Copy link
Contributor Author

saul commented Mar 14, 2017

Thanks @KevinRansom - I believe I know what the second error is, but I'm afraid I've no idea how I could have caused the first one.

I'll get around to fixing the latter ASAP!

@KevinRansom
Copy link
Contributor

@dotnet-bot test this please

@saul
Copy link
Contributor Author

saul commented Aug 23, 2017

@dotnet-bot test this please

@KevinRansom
Copy link
Contributor

@dotnet-bot test this please

@cartermp
Copy link
Contributor

cartermp commented Feb 5, 2018

@KevinRansom Now that we don't build PCLs anymore, are the concerns there no longer valid?

Also conflicts.

@KevinRansom
Copy link
Contributor

@saul do you think we should still pursue this PR?

The conflict is easy to resolve PrintfTests.fs has moved from below src to below test.

@KevinRansom
Copy link
Contributor

@saul,

please reopen this PR when you are ready to pursue it further.

Kevin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants