-
Notifications
You must be signed in to change notification settings - Fork 4.9k
When encoder is null, use JavaScriptEncoder.Default to check for NeedsEscaping #42023
Conversation
scalablecory
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the SIMD code was moved into the default encoder? lgtm
| } | ||
| return encoder == null ? | ||
| JavaScriptEncoder.Default.FindFirstCharacterToEncodeUtf8(value) : | ||
| encoder.FindFirstCharacterToEncodeUtf8(value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be:
(encoder ?? JavaScriptEncoder.Default).FindFirstCharacterToEncodeUtf8(value)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. This also produces smallest asm.
| idx = encoder == null ? | ||
| JavaScriptEncoder.Default.FindFirstCharacterToEncode(ptr, value.Length) : | ||
| encoder.FindFirstCharacterToEncode(ptr, value.Length); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the goto getting us here now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not much perf benefit on its own, but we do produce smaller disassembly without having multiple return points in a fixed block, which could help with inlining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really? it seems like the difference would be a jmp vs a ret; is there some stack space cleanup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, re-checked the assembly diff, and removing the goto is better in terms of asm size.
// Some implementations of JavascriptEncoder.FindFirstCharacterToEncode may not accept
// null pointers and gaurd against that. Hence, check up-front and fall down to return -1.
if (value.IsEmpty)
{
return -1; // All characters are allowed.
}
fixed (char* ptr = value)
{
return (encoder ?? JavaScriptEncoder.Default).FindFirstCharacterToEncode(ptr, value.Length);
}versus
public static unsafe int NeedsEscaping(ReadOnlySpan<char> value, JavaScriptEncoder encoder)
{
fixed (char* ptr = value)
{
int idx = 0;
// Some implementations of JavascriptEncoder.FindFirstCharacterToEncode may not accept
// null pointers and gaurd against that. Hence, check up-front and fall down to return -1.
if (value.IsEmpty)
{
idx = -1; // All characters are allowed.
goto Return;
}
idx = (encoder ?? JavaScriptEncoder.Default).FindFirstCharacterToEncode(ptr, value.Length);
Return:
return idx;
}
}See diff (3 fewer instructions total):
https://www.diffchecker.com/uqTkCcBl
Yep. |
|
Unrelated test failure: https://github.com/dotnet/corefx/issues/41968 |
| using System.Numerics; | ||
| using System.Runtime.CompilerServices; | ||
| using System.Runtime.InteropServices; | ||
| using System.Runtime.CompilerServices; // Do not remove. Needed for Int32LsbToHexDigit when !BUILDING_INBOX_LIBRARY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when !BUILDING_INBOX_LIBRARY
Put into #if !BUILDING_INBOX_LIBRARY to make this (code-) explicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I will do this separately (for master) since it is primarily a cosmetic change and I want to cherry-pick this change for 3.1 (was waiting on CI to be green for a while :p).
…sEscaping (dotnet#42023) * When encoder is null, use JavaScriptEncoder.Default to check for NeedsEscaping. * Remove unnecessary unsafe keyword and add comment to using directive. * Address feedback. * Remove gotos and move the IsEmpty check outside the fixed block.
…sEscaping (dotnet/corefx#42023) * When encoder is null, use JavaScriptEncoder.Default to check for NeedsEscaping. * Remove unnecessary unsafe keyword and add comment to using directive. * Address feedback. * Remove gotos and move the IsEmpty check outside the fixed block. Commit migrated from dotnet/corefx@daadeef
See related PRs #41845, #41933
Now that the built-in
JavaScriptEncoderimplementations have been optimized, S.T.Json can leverage those rather than having its own (duplicate) logic. This is true, at least for theNeedsEscapingcheck.This PR effectively reverts the optimizations made in #41845 since those went down to S.T.E.W.
There is no behavioral change here.
Ideally, we'd get rid of the encoding logic as well, when encoder == null, but the S.T.E.W implementation needs to be made faster first (for the common cases) to avoid the regression.
Something like the following change (which treats null as JavaScriptEncoder.Deafult across the board)
ahsonkhan@ced1ee0
corefx/src/System.Text.Json/src/System/Text/Json/Writer/JsonWriterHelper.Escaping.cs
Lines 301 to 330 in 623703d
cc @steveharter, @GrabYourPitchforks, @gfoidl