Skip to content

Conversation

@jozkee
Copy link
Member

@jozkee jozkee commented Feb 27, 2020

Fixes #30524

Introduces the mechanism that is going to be used for extending the supported types on TKey in a Dictionary<TKey, TValue>. The purpose of this PR is to give a taste of the way that we are going to tackle the problem.

Adds support for int, enum, Guid and object (When object is one of the supported types during runtime).

In subsequent PRs we should be able to extend this support to the rest of dictionaries supported by the JsonSerializer i.e. IDIctionary<TKey, TValue>, ImmutableDictionary<TKey, TValue>, etc.

I hold off of adding support for dictionaries annotated with the JsonExtensionData attribute but those can also be included in the previous paragraph.

@jozkee jozkee requested a review from ahsonkhan as a code owner February 27, 2020 06:09
}
else
{
state.Current.PolymorphicJsonPropertyInfo = state.Current.DeclaredJsonPropertyInfo!.RuntimeClassInfo.ElementClassInfo!.PolicyProperty;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steveharter This was causing me problems since it causes that the JsonClassInfo takes its value from PolymorphicJsonPropertyInfo when processing the JsonExtensionData dictionary, I removed the line and ran the tests and nothing broke, so maybe you might know better what was the purpose behind this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GrabYourPitchforks
Copy link
Member

A reminder: it is not safe to deserialize a hash bucket-style collection (e.g., Dictionary<TKey, TValue>, HashSet<TKey>) unless TKey = string. Such collections are subject to denial-of-service attacks if populated with attacker-chosen keys.

This shouldn't prevent a "please allow me to deserialize my desired dictionary type" feature from going through. But:

  • Please double-check that System.Text.Json does not instantiate any objects of type Dictionary<[anything other than string], TValue> unless the applications object model explicitly calls for such.

  • We should consider documenting that deserializing to these collection types is not a safe operation and that web service developers should not include them in object models dealing with untrusted data.

@Clockwork-Muse
Copy link
Contributor

A reminder: it is not safe to deserialize a hash bucket-style collection (e.g., Dictionary<TKey, TValue>, HashSet<TKey>) unless TKey = string.

... string being safe because it uses randomized hashing, not due to any other properties, right?

Which means that a dictionary taking a (correctly implemented) IEqualityComparer for other types should also be safe.

Side note: should we look into providing a wrapping comparer to perform randomized hashing?

@GrabYourPitchforks
Copy link
Member

Yes, it's because Dictionary<string, ...> benefits from randomized hashing (both in .NET Core and in .NET Full Framework). Making other key types resilient against hash collisions has come up before, but prototyping a solution usually uncovers compat and reliability problems that stunts making much progress.

}

[Fact]
public static void TestNotSuportedExceptionIsThrown()
Copy link
Member Author

@jozkee jozkee Feb 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this PR, NotSupportedExceptions are less often thrown since I don't check for the dictionary being supported until we actually call the dictionary KeyConverter. Is this behavior acceptable?

@steveharter Does your changes in #32669 related to delayed initialization also affect this behavior?

I can see that I need to fix this to avoid behavior changes but I just wanted to point this out since maybe this can be an acceptable or reasonable change.

The following table shows how this behave on master vs the feature branch.

private class UnsupportedDictionaryWrapper
{
    public Dictionary<int[], int> Dictionary { get; set; }
}

private class UnsupportedDictionaryWrapper_Wrapper
{
    public UnsupportedDictionaryWrapper Wrapper { get; set; }
}

Throws NotSupportedException?

Type JSON Branch: master Branch: feature
Dictionary<int[], int> yes yes
Dictionary<int[], int> {} yes yes
Dictionary<int[], int> null yes no
UnsupportedDictionaryWrapper yes no
UnsupportedDictionaryWrapper {} yes no
UnsupportedDictionaryWrapper null yes no
UnsupportedDictionaryWrapper {"Dictionary":{}} yes yes
UnsupportedDictionaryWrapper {"Dictionary":null} yes no
UnsupportedDictionaryWrapper_Wrapper no no
UnsupportedDictionaryWrapper_Wrapper {} no no
UnsupportedDictionaryWrapper_Wrapper null no no
UnsupportedDictionaryWrapper_Wrapper {"Wrapper":{}} yes no
UnsupportedDictionaryWrapper_Wrapper {"Wrapper":null} no no
UnsupportedDictionaryWrapper_Wrapper {"Wrapper":{"Dictionary":{}} yes yes
UnsupportedDictionaryWrapper_Wrapper {"Wrapper":{"Dictionary":null} yes no
List<Dictionary<int[], int>> no no
List<Dictionary<int[], int>> null no no
List<Dictionary<int[], int>> [] yes no
List<Dictionary<int[], int>> [""] yes no
List<Dictionary<int[], int>> [null] yes no
List<Dictionary<int[], int>> [{}] yes yes

cc @layomia

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delayed validation makes sense to me. We do that in other cases as well for perf (including validating collection elements, elements of elements, etc).

As long as we continue to have tests for unsupported cases we should be fine. We should also check the Exception messages for the proper substrings (might require #32669)

// Support JSON Path on exceptions.
public byte[]? JsonPropertyName; // This is Utf8 since we don't want to convert to string until an exception is thown.
public string? JsonPropertyNameAsString; // This is used for dictionary keys and re-entry cases that specify a property name.
internal byte[]? DictionaryKeyName; // This will contain the Utf8 Json property name that represents a dictionary key; used to defer parsing on async/re-entry.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that I can re-use JsonPropertyName for this purpose, however, JsonPropertyName will no longer be exclusive for exceptions.
@steveharter @layomia

ThrowHelper.ThrowJsonException_DeserializeUnableToConvertValue(TypeToConvert);
}

state.Current.JsonPropertyNameAsString = reader.GetString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need to set JsonPropertyNameAsString?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since DefaultDictionaryConverter.OnTryRead is still being used by other dictionaries like ImmutableDictionary or IDictionary; right now it is still needed.

We can remove it once we spread the TKey changes to those converters as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep the code clean/consistent (and to avoid the unnecessary GetString() call here), we should extend key support for all the dictionary converters.


namespace System.Text.Json.Serialization.Converters
{
internal sealed class ObjectKeyConverter : KeyConverter<object>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we just use JsonDocument, how valuable is this?

What does GetHashCode() return from JsonDocument?

I think this scenario would be valuable once we add true polymorphic deserialization (and not just use JsonDocument).

Copy link
Member Author

@jozkee jozkee Feb 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand the concern here. What this converter does is take the JSON property name as a string and create a JsonElement when the TKey is object in order to emulate the behavior of the ObjectConverter

What does the GetHashCode() method have to do here?

Copy link
Contributor

@steveharter steveharter Feb 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The string-based key creates a JsonElement which is used as a key in a dictionary, so GetHashCode() is called on the JsonElement by the dictionary during get\add operations.

Since JsonElement uses a default GetHashCode() implementation I don't see any value of having it as a key. If instead we supported polymorphic deserialization, the resulting object (not JsonElement) could have its own implementation of GetHashCode() and thus be useful as a key.

Also since JsonElement is a value type, the Equals() implementation won't be ideal since it will call Equals() on each field (slow). In summary, JsonElement was not designed to support being a dictionary key.

Copy link
Member Author

@jozkee jozkee Mar 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should not support such type, otherwise we would fail on round-tripping if someone have a property Dictionary<object, object>.

What does GetHashCode() return from JsonDocument?

I think you mean JsonElement. It does return different values even for the same string so yes, you might have problems on get/add operations on the dictionary https://dotnetfiddle.net/pVlMi3

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should wait until we have an option to return boxed primitives over JsonElement before we consider allowing typeof(object) as a dictionary key.

Copy link
Contributor

@ahsonkhan ahsonkhan Mar 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should wait until we have an option to return boxed primitives over JsonElement before we consider allowing typeof(object) as a dictionary key.

Why? We have object property support already, why not extend that for dictionary keys now with the same semantics? We still need to flesh out the semantics/behavior when folks don't use that option, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to flesh out the semantics/behavior when folks don't use that option, right?

That's fair.

For reasons called out in https://github.com/dotnet/runtime/pull/32909/files/53bfb9636b04b157b945ca59a1a6e958a791e00b#r385742410, calling .ContainsKey on the dictionary would only work if you "know" the specific JsonElement you are querying. The dictionary would only be useful when iterating over all key-value pairs. Lookup would be busted in general.

{
get
{
if (_keyConverter == null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if a different approach would be to add new internal methods to JsonConverter to support keys. That way, we can just use the existing converters and don't need a new variable here, or add a new static ConcurrentDictionary.

Copy link
Member Author

@jozkee jozkee Feb 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, you mean virtual methods that are implemented only for Dictionary converters, right?
But then how do you determine which method call for each specific TKey depending on its type?
You would have to use delegates (or something alike) for each type, i.e. one for int, one for Guid, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking we could add two new virtual methods to JsonConverter<T> like:

internal virtual string ReadAsString(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
{ throw new NotSupportedException(); }

internal virtual void WriteAsString(Utf8JsonWriter writer, TValue value, JsonSerializerOptions options)
{ throw new NotSupportedException(); }

which would be implemented by the Int32Converter, GuidConverter, etc.

* Throw InvalidOperationException on StringKeyConverter
@jozkee
Copy link
Member Author

jozkee commented Mar 2, 2020

CI issue is unrelated and is being tracked on #2280.

ThrowHelper.ThrowJsonException_DeserializeUnableToConvertValue(TypeToConvert);
}

state.Current.JsonPropertyNameAsString = reader.GetString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep the code clean/consistent (and to avoid the unnecessary GetString() call here), we should extend key support for all the dictionary converters.

where TCollection : IDictionary
{
protected override void Add(object? value, JsonSerializerOptions options, ref ReadStack state)
protected override void Add(string _, object? value, JsonSerializerOptions options, ref ReadStack state)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would should use the key parsed with the key converter in DictionaryDefaultConverter instead of JsonPropertyNameAsString.

Comment on lines +23 to +24
// If we need to apply the policy, we are forced to get a string since that is the only type that ConvertName can take as argument.
else if (options.DictionaryKeyPolicy != null && !state.Current.IgnoreDictionaryKeyPolicy)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would one want a naming policy applied to anything that is not a string e.g. Guid, Uri, Int32? I think we should only apply this on strings and wait for user feedback before extending it. We'll also avoid extra ConvertName calls.


namespace System.Text.Json.Serialization.Converters
{
internal sealed class ObjectKeyConverter : KeyConverter<object>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should wait until we have an option to return boxed primitives over JsonElement before we consider allowing typeof(object) as a dictionary key.

@jozkee jozkee closed this Jun 17, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 10, 2020
@jozkee jozkee deleted the ExtendTKeySupport branch February 3, 2021 17:44
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

New Asp.NET Core 3.0 Json doesn't serialize Dictionary<Key,Value>

7 participants