CSHARP-3985: Support multiple SerializerRegistries #1592

Open
wants to merge 5 commits into base: main

papafe
Contributor

@papafe papafe commented Jan 13, 2025

@rstam rstam self-requested a review January 14, 2025 17:11
@papafe
Contributor Author

papafe commented Jan 30, 2025

This is still a very quick and dirty proof of concept, just to verify that we can actually create a custom domain and pass it down the call stack. Of course, this broke several things and works only in certain cases, but it's a starting point.
The main idea here is:

  • The serialization domain (IBsonSerializationDomain) can be set up at the client/database/collection level.
  • The serialization domain is also added to the serialization context (BsonSerializationContext/BsonDeserializationContext), so it is passed down wherever it is needed.
  • To get the serialization domain from the collection level to the BsonSerializationContext, it is passed down in the MessageEncoderSettings.
  • From MessageEncoderSettings, the domain is then added to the BsonBinaryWriterSettings in MessageBinaryEncoderBase.
  • Finally, the writer settings are read in the serialization context constructor.

(This works similarly for BsonSerializationContext.)
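
To make the chain of hops above concrete, here is a minimal sketch. Every type in it (CollectionSettings, EncoderSettings, WriterSettings, SerializationContext, DomainFlow) is a hypothetical stand-in invented for illustration, not the driver's real class, and the real settings objects carry far more state.

```csharp
using System;

// Hypothetical stand-ins; none of these are the driver's real types.
public interface ISerializationDomain
{
    string Name { get; }
}

public sealed class SerializationDomain : ISerializationDomain
{
    public SerializationDomain(string name) { Name = name; }
    public string Name { get; }
}

// Stand-in for collection-level settings, where the domain is configured.
public sealed class CollectionSettings
{
    public ISerializationDomain SerializationDomain { get; set; }
}

// Stand-in for MessageEncoderSettings: carries the domain towards the encoder.
public sealed class EncoderSettings
{
    public ISerializationDomain SerializationDomain { get; set; }
}

// Stand-in for the writer settings that the encoder builds.
public sealed class WriterSettings
{
    public ISerializationDomain SerializationDomain { get; set; }
}

// Stand-in for the serialization context, which reads the domain from the writer settings.
public sealed class SerializationContext
{
    public SerializationContext(WriterSettings writerSettings)
    {
        if (writerSettings == null) throw new ArgumentNullException(nameof(writerSettings));
        SerializationDomain = writerSettings.SerializationDomain;
    }

    public ISerializationDomain SerializationDomain { get; }
}

public static class DomainFlow
{
    // Walks the same hops as the list above: collection -> encoder settings -> writer settings -> context.
    public static SerializationContext CreateContext(CollectionSettings collectionSettings)
    {
        if (collectionSettings == null) throw new ArgumentNullException(nameof(collectionSettings));
        var encoderSettings = new EncoderSettings { SerializationDomain = collectionSettings.SerializationDomain };
        var writerSettings = new WriterSettings { SerializationDomain = encoderSettings.SerializationDomain };
        return new SerializationContext(writerSettings);
    }
}
```

The point of the sketch is only that the same domain instance set at the collection level is the one the context ends up holding; no static state is consulted along the way.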

I'm not trying to say that this is the perfect way, but I just tried to come up with the "most direct" way to go from the collection to the serialization contexts, so it's mostly a proof of concept.

Also, I have not yet touched the conventions, the discriminators and the class maps, which are also static and need to be part of the serialization domain.

@papafe papafe requested review from rstam and sanych-sun January 31, 2025 11:14
Contributor

@rstam rstam left a comment

I didn't review every single file. I tried to pick the most relevant ones.

Overall this looks like the right direction.

But... it's going to be a pain to finish and to review.

@papafe papafe marked this pull request as ready for review May 26, 2025 14:57
@papafe papafe requested a review from a team as a code owner May 26, 2025 14:57
@papafe papafe marked this pull request as draft May 26, 2025 14:58
@papafe papafe removed the request for review from a team May 26, 2025 15:04
@papafe
Contributor Author

papafe commented Jul 3, 2025

The PR is now open for review. There are still a couple of open questions, but given the size of this PR I thought it would be better to start getting feedback as early as possible. What follows is both a summary and, in part, a guide on how to read the PR itself.

The aim of this PR is to move away from the global serialization settings we use right now (BsonSerializer, BsonClassMap, ...) and towards a more flexible system that uses instance-level serialization settings.
Among other things, this should allow us:

  • To have different serialization settings depending on the client/database/collection
  • To improve the testability of serialization/deserialization, as we'll move away from global state

As a first step in that direction, this PR tries to remove all internal use of the global state, without changes for developers using the API. We decided not to make the changes public because we would like to take this opportunity to revamp, simplify and reorganise the serialization API.

The main idea behind the PR is to create a serialization domain, represented by the IBsonSerializationDomain interface, and to pass this interface around so that every place in our code that currently refers to the global state uses this instantiated domain instead.
IBsonSerializationDomain is an interface that contains all the global state that was previously held in static classes, namely BsonSerializer, BsonClassMap, ConventionRegistry and BsonDefaults. The methods and properties of BsonSerializer are represented directly in the interface, while the other static classes are exposed as properties of IBsonSerializationDomain.
One thing to note is that I tried not to modify the current global serialization API, even though there are various improvements that could be made, including removing unnecessary methods, grouping methods better by functionality (for example by creating a DiscriminatorRegistry) and reducing the use of locks. I decided against making any kind of improvement, as this PR is already quite broad and further changes would have complicated it even more. Nevertheless, this is something that can be done in a follow-up PR, mostly once we have decided on the shape of the public API.
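
As a rough illustration of what moving state from static classes into a domain instance buys us, here is a minimal sketch. The ISerializationDomain interface below is a hypothetical, heavily simplified model (a single serializer registry keyed by type); it is not the shape of IBsonSerializationDomain in this PR.

```csharp
using System;
using System.Collections.Concurrent;
using System.Globalization;

// Hypothetical, heavily simplified model; not the PR's actual interface.
public interface ISerializationDomain
{
    void RegisterSerializer<T>(Func<T, string> serializer);
    string Serialize<T>(T value);
}

public sealed class SerializationDomain : ISerializationDomain
{
    // Per-instance registry: the kind of state that previously lived in a static field.
    private readonly ConcurrentDictionary<Type, Delegate> _serializers =
        new ConcurrentDictionary<Type, Delegate>();

    public void RegisterSerializer<T>(Func<T, string> serializer)
    {
        if (serializer == null) throw new ArgumentNullException(nameof(serializer));
        if (!_serializers.TryAdd(typeof(T), serializer))
        {
            throw new InvalidOperationException($"A serializer for {typeof(T)} is already registered.");
        }
    }

    public string Serialize<T>(T value)
    {
        // Use the serializer registered in THIS domain, if any.
        if (_serializers.TryGetValue(typeof(T), out var registered))
        {
            return ((Func<T, string>)registered)(value);
        }

        // Otherwise fall back to a culture-invariant default representation.
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }
}
```

With this shape, registering a custom int serializer in one domain leaves a second domain untouched, which is exactly the isolation that the static registries cannot provide and that makes tests independent of each other.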

In order for the domain to be accessible where it is needed, I've added a domain property in multiple classes, among which:

  • MongoClientSettings/MongoDatabaseSettings/MongoCollectionSettings
  • BsonSerializationContext/BsonDeserializationContext
  • BsonReaderSettings/BsonWriterSettings
  • TranslationContext
  • MongoQueryProvider
  • MessageEncoderSettings

These new properties make it possible to pass the domain down to where it is needed. This also required enriching interfaces with methods or properties that use the domain. For public interfaces, this was done by creating an internal interface that derives from the public one and contains the new methods that take the domain as input, as you can see with IMemberMapConventionInternal for example. To keep the implementation of those interfaces hidden from the public API, they have been implemented explicitly.
Also, to keep compatibility with the current API, I have created the static class InternalExtensions (in both the Bson and Driver packages) with internal extension methods that select the appropriate method to call depending on whether a class implements the internal (enriched) interface or only the public one. For example, take a look at InternalExtensions.ApplyInternal.
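
The pattern described above can be sketched as follows. All names here (IMemberConvention, IMemberConventionInternal, CamelCaseConvention, UpperCaseConvention) and the string-based Apply signature are hypothetical simplifications for illustration; the real conventions in the PR operate on member maps, not strings.

```csharp
using System;

// Hypothetical, simplified stand-ins for the real types.
public interface ISerializationDomain
{
    string Name { get; }
}

public sealed class SerializationDomain : ISerializationDomain
{
    public SerializationDomain(string name) { Name = name; }
    public string Name { get; }
}

// The existing public interface: its shape must not change.
public interface IMemberConvention
{
    string Apply(string memberName);
}

// Internal interface deriving from the public one and adding a domain-aware overload.
internal interface IMemberConventionInternal : IMemberConvention
{
    string Apply(string memberName, ISerializationDomain domain);
}

// A built-in convention: the domain-aware overload is implemented explicitly,
// so it does not appear on the public surface of the class.
public sealed class CamelCaseConvention : IMemberConventionInternal
{
    public string Apply(string memberName) => Format(memberName, "global");

    string IMemberConventionInternal.Apply(string memberName, ISerializationDomain domain)
        => Format(memberName, domain.Name);

    private static string Format(string memberName, string domainName)
        => char.ToLowerInvariant(memberName[0]) + memberName.Substring(1) + "@" + domainName;
}

// A user-defined convention that only knows about the public interface.
public sealed class UpperCaseConvention : IMemberConvention
{
    public string Apply(string memberName) => memberName.ToUpperInvariant();
}

internal static class InternalExtensions
{
    // Prefer the domain-aware overload when the convention supports it;
    // otherwise fall back to the public method, which ignores the domain.
    public static string ApplyInternal(this IMemberConvention convention, string memberName, ISerializationDomain domain)
    {
        if (convention is IMemberConventionInternal internalConvention)
        {
            return internalConvention.Apply(memberName, domain);
        }

        return convention.Apply(memberName);
    }
}
```

Internal callers always go through ApplyInternal, so built-in conventions see the domain while third-party implementations of the public interface keep working unchanged.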

Another thing to note is that I did not remove calls to the global state (mostly to BsonSerializer) everywhere I could, because I think in some cases using it is the most appropriate choice. In those cases (for example inside the Authentication.AWS or Driver.Encryption packages) I think BsonSerializer is used as a container for "default serialization", for example when deserializing keys for encryption. I think in the future we need to create a "default" serialization domain that can't be modified and only contains default serializers/deserializers to be used in those cases. At the moment the use of BsonSerializer is potentially risky, as we do not know how developers are modifying the global settings, even though we don't expect them to modify the most common serializers.

In order to keep the changes private and let Drivers and other packages access the internals of Bson, I had to use the InternalsVisibleToAttribute.

As a final note, I've left lots of comments in the code to record why certain decisions were taken, along with a couple of open questions. Those comments have specific tags so they can be found and removed.

@papafe papafe requested a review from rstam July 3, 2025 15:27
@papafe papafe marked this pull request as ready for review July 3, 2025 15:27
@papafe papafe requested a review from BorisDog July 3, 2025 15:27
/// <summary>
/// Gets the settings of the reader.
/// </summary>
BsonReaderSettings Settings { get; }
Contributor

I think it was just an omission.

In any case I don't think we need this.

Contributor Author

Removed

@@ -62,7 +62,7 @@ public interface IBsonReader : IDisposable
/// <summary>
/// Pops the settings.
/// </summary>
- void PopSettings();
+ void PopSettings(); //TODO Why do we have push and pop methods? They are not used. We should remove them.
Contributor

Something for a different PR maybe?

Contributor Author

Yes, it was mostly a reminder. I'll remove this.

Contributor Author

Done

@@ -25,7 +25,7 @@ namespace MongoDB.Bson.IO
/// <summary>
/// Represents a BSON reader for some external format (see subclasses).
/// </summary>
- public abstract class BsonReader : IBsonReader
+ public abstract class BsonReader : IBsonReaderInternal
Contributor

IBsonReaderInternal not needed.

Contributor Author

Done

@@ -77,6 +79,16 @@ public BsonReaderSettings FrozenCopy()
}
}

internal IBsonSerializationDomain SerializationDomain
Contributor

This should not be a reader setting.

Classes in this directory are low level I/O classes that just do I/O.

Serialization is one level up and is built on TOP of these I/O classes.

Contributor Author

Removed

/// <summary>
/// //TODO
/// </summary>
internal IBsonSerializationDomain SerializationDomain
Contributor

This should not be a writer setting.

Classes in this directory are low level I/O classes that just do I/O.

Serialization is one level up and is built on TOP of these I/O classes.

Contributor Author

Removed

@@ -0,0 +1,317 @@
using System;
Contributor

Add copyright.

Contributor Author

Added

{
return type.IsGenericType && type.GetGenericTypeDefinition() == typeof(Nullable<>);
}
// Commented out because there is an identical method in Bson assembly (and also in this assembly...).
Contributor

See CSHARP-5632.

The duplication between Bson and Driver was because they were internal.

Be careful if you remove any of the duplication here. I don't think all the duplicate methods are 100% identical and there might be subtle reasons why (but I don't know them if there are).

Contributor Author

Makes sense. It seems that the method works the same.

{
- var discriminator = discriminatorConvention.GetDiscriminator(nominalType, actualType);
+ var discriminator = discriminatorConvention.GetDiscriminatorInternal(nominalType, actualType, serializationDomain);
Contributor

It's going to be VERY difficult and fragile to remember NOT to call GetDiscriminator and to call GetDiscriminatorInternal (or whatever we decide to call it) instead...

Contributor Author

I agree with this. One thing I was thinking is that maybe we could use something like:

#if DEBUG
[Obsolete("Please use GetDiscriminatorInternal")]
#endif
public void GetDiscriminator() { }

Not necessarily this, but something to remind us not to use this internally (while eventually allowing tests).

Contributor Author

Another possibility is to use something like https://www.nuget.org/packages/Microsoft.CodeAnalysis.BannedApiAnalyzers/ that will raise an error while compiling. What do you think?


namespace MongoDB.Driver.Support
{
internal static class InternalExtensions
Contributor

I would prefer to see this in a file called IPipelineStageDefinitionExtensions.cs.

I also would not mix extension methods for different types in the same file (that didn't happen here though, but it did in some other file).

<Compile Include="..\MongoDB.Shared\SequenceComparer.cs" Link="Shared\SequenceComparer.cs" />
<Compile Include="..\MongoDB.Shared\Hasher.cs" Link="Shared\Hasher.cs" />
</ItemGroup>
<!-- The followings have been removed because they are accessed directly through MongoDB.Bson, otherwise there will be a double definition -->
Contributor

Did you mean "INDIRECTLY through MongoDB.Bson"?

Contributor

InternalsVisibleTo sucks.

Contributor Author

Yes, "indirectly". And yes, but that's the only way to go to keep all the new things internal :)

/// <returns>All registered class maps.</returns>
public IEnumerable<BsonClassMap> GetRegisteredClassMaps()
{
_serializationDomain.ConfigLock.EnterReadLock();
Contributor

Do we need to synchronize such an external collection via a shared lock?

Contributor Author

To be honest, I did not give much thought to the whole synchronization/locking that we currently have. I didn't want to make even more changes than we currently have so I just tried to keep what we were doing before.

{
- BsonSerializer.ConfigLock.EnterReadLock();
+ var configLock = context.SerializationDomain!.ConfigLock;
+ configLock.EnterReadLock();
Contributor

_frozen check can be done outside the lock.

Contributor Author

This is the way it's done currently, and I didn't modify it so that it's easier to compare with what we currently have. I'd suggest not touching it for now.
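
For reference, the suggestion of checking _frozen outside the lock could look roughly like the sketch below. It assumes a freeze-once object guarded by a ReaderWriterLockSlim; FreezableClassMap and FreezeWorkCount are hypothetical names, the choice of a write lock is this sketch's own, and none of it is the driver's actual code.

```csharp
using System;
using System.Threading;

// Hypothetical class; the _frozen field mirrors the discussion, but this is not the driver's code.
public sealed class FreezableClassMap
{
    private readonly ReaderWriterLockSlim _configLock = new ReaderWriterLockSlim();
    private volatile bool _frozen;
    private int _freezeWorkCount;

    public bool IsFrozen => _frozen;

    // Number of times the one-time freeze work actually ran (expected to be 1 at most).
    public int FreezeWorkCount => _freezeWorkCount;

    public FreezableClassMap Freeze()
    {
        // Fast path: once frozen, the flag never goes back to false, so no lock is needed.
        if (_frozen)
        {
            return this;
        }

        _configLock.EnterWriteLock();
        try
        {
            // Re-check under the lock: another thread may have frozen the map in the meantime.
            if (!_frozen)
            {
                _freezeWorkCount++; // stands in for the expensive one-time freeze work
                _frozen = true;
            }
        }
        finally
        {
            _configLock.ExitWriteLock();
        }

        return this;
    }
}
```

The volatile flag makes the unlocked read safe, and the second check inside the lock keeps the freeze work from running twice when several threads race to freeze the same map.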

__dynamicArraySerializerWasSet = true;
__dynamicArraySerializer = value;
}
get => BsonSerializer.DefaultSerializationDomain.BsonDefaults.DynamicArraySerializer;
Contributor

Is it possible for this class to be immutable?

Contributor Author

We could when we have builders and we'll have only setters.


namespace MongoDB.Bson
{
internal class BsonDefaultsDomain : IBsonDefaults
Contributor

Is it possible for this class to be immutable?
Also do we need to maintain Defaults entities under domain, as opposed to just Settings (which are initialized to defaults)?

Contributor Author

Same as before, we can decide to make this immutable when this is created with builders.
Regarding the second point, would you like the class to be renamed to Settings and initialize those from BsonDefaults that are kept static?

/// //TODO
/// </summary>
/// <returns></returns>
internal static IBsonSerializationDomain CreateSerializationDomain() => new BsonSerializationDomain();
Contributor

Future cleanup: No need for factory method?

Contributor Author

Yes, I added a comment that it can be removed.

@@ -20,6 +20,7 @@

namespace MongoDB.Bson.Serialization
{
//DOMAIN-API We should consider making all our serialization provider classes sealed or internal.
Contributor

4.0 ticket?

Contributor Author

All the comments tagged with //DOMAIN-API relate to changes that can't (or shouldn't) be made now and should be addressed later.


/* QUESTION I removed the part where we set the dynamic serializers from the BsonDefaults, and delay it until we have a serialization domain (when we build the DeserializationContext).
* This is technically changing the public behaviour, but it's in a builder, I do not think it will affect anyone. Same done for the serialization context.
*/
Contributor

Would there be an observable effect for this change?
When BsonDeserializationContext instance is returned by Build, dynamic serializers are set in both cases?

Contributor Author

This is actually not necessary anymore since we're passing the domain here, so I've removed it.

@@ -118,7 +118,8 @@ public object DeserializeValue(BsonValue value)
var tempDocument = new BsonDocument("value", value);
using (var reader = new BsonDocumentReader(tempDocument))
{
- var context = BsonDeserializationContext.CreateRoot(reader);
+ //QUESTION Is it correct we only need a default domain here?
+ var context = BsonDeserializationContext.CreateRoot(reader, BsonSerializer.DefaultSerializationDomain);
Contributor

In this case, should the relevant domain be passed along with the _serializer?

Contributor Author

In this case my doubt is whether what we actually need here is what I previously called a "standard" domain.

internal class ConventionRegistryDomain : IConventionRegistryDomain
{
private readonly List<ConventionPackContainer> _conventionPacks = [];
private readonly object _lock = new();
Contributor

This can be made lock-free with a simple array. Or concurrent dictionary (lock-free for reads).
This is minor as this is not a hot path.
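
A minimal sketch of the lock-free array idea, assuming a copy-on-write array that is swapped in with a compare-and-exchange. LockFreeConventionRegistry and ConventionPackEntry are hypothetical names (the latter stands in for ConventionPackContainer), and the real registry has more operations than Register.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical entry type; stands in for the PR's ConventionPackContainer.
public sealed class ConventionPackEntry
{
    public ConventionPackEntry(string name) { Name = name; }
    public string Name { get; }
}

public sealed class LockFreeConventionRegistry
{
    // Never mutated in place: writers publish a new array, readers take a snapshot.
    private ConventionPackEntry[] _packs = Array.Empty<ConventionPackEntry>();

    public void Register(ConventionPackEntry pack)
    {
        if (pack == null) throw new ArgumentNullException(nameof(pack));

        while (true)
        {
            var current = Volatile.Read(ref _packs);

            // Build the replacement array: a copy of the current one plus the new entry.
            var updated = new ConventionPackEntry[current.Length + 1];
            Array.Copy(current, updated, current.Length);
            updated[current.Length] = pack;

            // Publish only if no other writer replaced the array in the meantime; otherwise retry.
            if (ReferenceEquals(Interlocked.CompareExchange(ref _packs, updated, current), current))
            {
                return;
            }
        }
    }

    // Readers need no lock: the published array is never modified after publication.
    public IReadOnlyList<ConventionPackEntry> Snapshot() => Volatile.Read(ref _packs);
}
```

Writes cost an array copy, which is acceptable here because registration is rare and, as noted, this is not a hot path; reads are a single volatile load.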

Contributor Author

Yes, I tried to keep the code as close as possible to what it was before, in order to have:

  • fewer contentious points
  • fewer things to review, given the size of this PR

// Assert.Equal(expectedVal, toString);
// }

//The first section demonstrates that the class maps are also separated
Contributor

I think we also need more explicit testing of multiple domains working side by side.

Contributor Author

I agree. Testing so far has been minimal, just enough to verify that the basic functionality works.

@papafe papafe added the chore Label to hide PR from generated Release Notes label Aug 19, 2025
@papafe
Contributor Author

papafe commented Aug 20, 2025

Some notes about the tags used in comments:

  • //DOMAIN-API used for changes that can't be made now because they affect the public API, but that we should probably make in the next major version
  • //QUESTION questions that should be answered
  • //FP TODO mostly TODOs for myself
  • //EXIT mostly marks convenience constructors. I've added lots of constructors that take the domain as input, but I didn't want to modify the test files to pass the default domain in each call, as this would result in even more changes to review. For this reason I've added this tag to mark those methods/constructors that can eventually be addressed in another PR.

@papafe papafe requested review from rstam and BorisDog August 20, 2025 10:21
Labels
chore Label to hide PR from generated Release Notes
4 participants