
Conversation

@osa1
Member

@osa1 osa1 commented May 7, 2025

Currently, when generating the ProtobufEnum value for an enum value on the
wire, we call the valueOf function in the enum field's FieldInfo, which
directly indexes the enum class's _byValue map.

With this PR we avoid the closure call by making _byValue public and indexing
it directly.
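The shape of the change can be sketched as follows. This is a minimal illustration, not the actual package:protobuf source; the `initByValue` helper here only mirrors the role of the real one.

```dart
// Sketch: the decoder used to go through a `valueOf` closure stored in
// FieldInfo; after this PR it indexes the number -> value map directly.

class ProtobufEnum {
  final int value;
  final String name;
  const ProtobufEnum(this.value, this.name);

  /// Illustrative stand-in: builds the number -> enum map at runtime
  /// from the list of known values.
  static Map<int, T> initByValue<T extends ProtobufEnum>(List<T> values) =>
      {for (final v in values) v.value: v};
}

void main() {
  const values = [ProtobufEnum(0, 'UNKNOWN'), ProtobufEnum(1, 'STARTED')];
  final byValue = ProtobufEnum.initByValue(values);

  // Before: the decoder called a closure stored in the field info.
  final ProtobufEnum? Function(int) valueOf = (n) => byValue[n];
  print(valueOf(1)?.name); // closure call

  // After: index the (now public) map directly, no closure call.
  print(byValue[1]?.name);
}
```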

Wasm benchmarks:

  • Before: protobuf_PackedEnumDecoding(RunTimeRaw): 48200.0 us
  • After: protobuf_PackedEnumDecoding(RunTimeRaw): 42120.0 us
  • Diff: -12.6%

AOT benchmarks:

  • Before: protobuf_PackedEnumDecoding(RunTimeRaw): 49180.0 us
  • After: protobuf_PackedEnumDecoding(RunTimeRaw): 45726.82 us
  • Diff: -7%

See also alternative PR: #981.

@osa1 osa1 marked this pull request as draft May 8, 2025 08:10
osa1 added a commit to osa1/protobuf.dart that referenced this pull request May 8, 2025
This is an alternative to google#980 that doesn't make a big difference in
terms of performance of AOT compiled benchmarks, but makes a big
difference when compiling to Wasm, compared to google#980.

When decoding an enum value, we call a callback in the enum field's
`FieldInfo`. The callback then indexes a map mapping enum numbers to
Dart values.

When these conditions hold:

- The known enum numbers are all positive. (so that we can use a list
  and index it with the number)

- The known enum numbers amount to more than 70% of the largest known enum
  number. (so that the list won't have a lot of `null` entries, wasting
  heap space)

We now generate a list instead of a map to map enum numbers to Dart
values.

Note: similar to the map, the list is allocated at runtime. No new code is
generated per message or enum type.
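The representation choice above can be sketched like this. The 70% density threshold is the one from this commit message; `buildLookup` and the use of plain ints as stand-ins for generated enum values are illustrative, not the real API.

```dart
/// Returns either a List<int?> indexed directly by enum number (dense
/// case) or a Map<int, int> (sparse case, or when a number can't serve
/// as a list index).
Object buildLookup(List<int> knownNumbers) {
  final usableAsIndex = knownNumbers.every((n) => n >= 0);
  final maxNumber = knownNumbers.reduce((a, b) => a > b ? a : b);
  final denseEnough = knownNumbers.length > 0.7 * maxNumber;
  if (usableAsIndex && denseEnough) {
    final list = List<int?>.filled(maxNumber + 1, null);
    for (final n in knownNumbers) {
      list[n] = n; // a real implementation would store the ProtobufEnum here
    }
    return list;
  }
  return {for (final n in knownNumbers) n: n};
}

void main() {
  print(buildLookup([0, 1, 2, 3]) is List); // dense: list
  print(buildLookup([0, 1, 1000]) is Map); // sparse: map
}
```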

Wasm benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 48200.0 us`
- PR google#980: `protobuf_PackedEnumDecoding(RunTimeRaw): 42120.0 us`
    - Diff: -12.6%
- After: `protobuf_PackedEnumDecoding(RunTimeRaw): 35733.3 us`
    - Diff against PR google#980: -15%
    - Diff against master: -25%

AOT benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 49180.0 us`
- PR google#980: `protobuf_PackedEnumDecoding(RunTimeRaw): 45726.82 us`
    - Diff: -7%
- This PR: `protobuf_PackedEnumDecoding(RunTimeRaw): 42929.7 us`
    - Diff against PR google#980: -6%
    - Diff against master: -12%
@osa1 osa1 requested a review from mkustermann May 8, 2025 09:57
@osa1 osa1 marked this pull request as ready for review May 8, 2025 10:08
// Enum.
void e<T>(int tagNumber, String name, int fieldType,
    {dynamic defaultOrMaker,
    ValueOfFunc? valueOf,
Collaborator

Aren't these changes removing those parameters from public APIs? They seem to be breaking changes, so old generated code will no longer work with the newest package:protobuf. If package:protobuf has to move to a new major version, that causes trouble for the ecosystem:

If an app depends transitively on A and B, and A uses the old protobuf version while B uses the new one, then we get a constraint conflict.

Member Author

@osa1 osa1 May 8, 2025

Yes, it's a breaking change. You need to update protoc_plugin and protobuf together and re-generate your proto classes.

We've done it in the past, for example with the most recent protobuf and protoc_plugin releases.

If an app depends transitively on A and B, and A uses the old protobuf version while B uses the new one, then we get a constraint conflict.

Yes, but then you can't do any major version bumps or breaking changes at all. We deal with it the same way we've dealt with it in the past.

Collaborator

So from your experience it doesn't cause too much churn on the ecosystem? Then it's OK, but you'll have to bump the major version number.

Member Author

So from your experience it doesn't cause too much churn on the ecosystem?

I don't have much experience updating downstream code after major version bumps, but I expect this particular change not to be a big problem.

Maybe someone who works on updating dependencies can comment how much churn this kind of thing causes.

Internally, no one will notice this change, other than maybe their benchmarks getting faster.

Contributor

Releasing a major version (w/ planning, batching breaking changes, planning deprecations, ...) should be just a normal part of the process. The diamond constraint issue becomes a problem when one of the packages in the cycle has become unmaintained (rev'ing their deps doesn't happen or happens w/ lots of latency). You're more likely to run into that when more things depend on the package being released - when it's further down the overall ecosystem stack.

For something like protobuf I'd just do regular planning for a new major version but would still rev when necessary. Even with breaking changes, you want to minimize the work for customers - they will eventually need to rev from 3 => 4, or 4 => 5.

There are a few packages that are very core to the ecosystem, and you have to be cautious about rev'ing, or choose to not rev at all (like, the direct deps of flutter - https://github.com/flutter/flutter/blob/master/packages/flutter/pubspec.yaml#L8).


{dynamic defaultOrMaker,
this.subBuilder,
this.valueOf,
this.enumValueMap,
Collaborator

Instead of getting this new parameter, why can't we take enumValues we already get and calculate this based on that? And if we can do that, we can decide whether we calculate a list or a map depending on density of the enum values, etc.

Member Author

Instead of getting this new parameter, why can't we take enumValues we already get and calculate this based on that? And if we can do that, we can decide whether we calculate a list or a map depending on density of the enum values, etc.

We have one function to decode enum int values:

ProtobufEnum? _decodeEnum(
    int tagNumber, ExtensionRegistry? registry, int rawValue) {
  final f = valueOfFunc(tagNumber);
  if (f != null) {
    return f(rawValue);
  }
  return registry
      ?.getExtension(qualifiedMessageName, tagNumber)
      ?.valueOf
      ?.call(rawValue);
}

For it to index a list sometimes and a map other times, we would need a type test or a closure, as in the master branch and the alternative PR.

Re: generating the list or map from the enumValues argument: the map and list should be per-type but allocated at runtime, so they need to live in a static final field in the generated enum classes. Example:

static final $core.Map<$core.int, CodeGeneratorResponse_Feature> _byValue =
    $pb.ProtobufEnum.initByValue(values);

And then this static field needs to be passed to FieldInfo so that BuilderInfo can find it from the FieldInfo.

Does this make sense?

Collaborator

Makes sense.

But instead of passing the representation (e.g. Map<int, XX> or List<XX> here) or a valueOf closure, one could pass an instance of a ProtobufEnumDescriptor that has a .decode() method.

This could be a direct call. The descriptor could hold a List<T?> of values and a bool (whether it's indexed by value or binary searched). It would then look up and return the value. So there would be no indirect calls (neither closure calls nor dispatch table calls). We would also pass the integer argument unboxed (in a closure call we pass it boxed).

Something like

// package:protobuf
class ProtobufEnum {}

class ProtobufEnumDescriptor {
  final bool _binarySearch;
  final List<ProtobufEnum?> _byValue;
  ProtobufEnumDescriptor(this._byValue, {required bool useBinarySearch})
      : _binarySearch = useBinarySearch;

  ProtobufEnum? lookup(int value) { ... }
}

// foo.pbenum.dart
class FooEnum extends ProtobufEnum {
  ...
  static final info_ =
      ProtobufEnumDescriptor(values, useBinarySearch: <if sparse>);
}

// foo.pb.dart
class Foo extends GeneratedMessage {
  final info_ = BuilderInfo()
    ..enumField(FooEnum.info_);
}

Then the decoder would see

ProtobufEnum? _decodeEnum(
    int tagNumber, ExtensionRegistry? registry, int rawValue) {
  final f = enumDescriptorOf(tagNumber);
  if (f != null) {
    // Direct call, passing `rawValue` unboxed; does array lookup or binary search.
    return f.lookup(rawValue);
  }
  ...;
}

Would that work?
If so, wouldn't that be the most efficient way to do it?
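One possible body for the `lookup` method in the sketch above. This is my illustration, not code from this PR: the `PbEnum` class, the free-standing `lookup` function, and the assumption that the sparse list is sorted by enum number are all hypothetical.

```dart
// Stand-in for a generated ProtobufEnum subclass.
class PbEnum {
  final int value;
  const PbEnum(this.value);
}

PbEnum? lookup(List<PbEnum?> byValue, int wireValue,
    {required bool binarySearch}) {
  if (!binarySearch) {
    // Dense representation: the list is indexed directly by enum number.
    if (wireValue < 0 || wireValue >= byValue.length) return null;
    return byValue[wireValue];
  }
  // Sparse representation: `byValue` holds the known values sorted by
  // enum number; binary search for `wireValue`.
  var lo = 0, hi = byValue.length - 1;
  while (lo <= hi) {
    final mid = (lo + hi) ~/ 2;
    final v = byValue[mid]!.value;
    if (v == wireValue) return byValue[mid];
    if (v < wireValue) {
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return null;
}

void main() {
  final sparse = [PbEnum(1), PbEnum(5), PbEnum(1000)];
  print(lookup(sparse, 5, binarySearch: true)?.value); // 5
  print(lookup(sparse, 7, binarySearch: true)); // null
}
```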

Member Author

I think this would work and I agree that it should improve things.

Compared to the current PR, it's similar to passing a list (instead of a map) plus a bool for whether the list should be binary searched or indexed directly.

A BuilderInfo.enumField alone wouldn't work: the enum infos need to be attached to field infos for repeated and map fields as well, so we would also need to pass the info to BuilderInfo.m (which adds map field info) and BuilderInfo.addRepeated.

A ProtobufEnumDescriptor class would also mean a level of indirection when accessing the list and the boolean flag for whether to binary search. So perhaps passing the bool flag as an extra argument to the current methods would perform best.

I'll first update the other PR with binary search because that's easy to do. Then update benchmarks to have some sparse enums/enums with large encodings. We can merge the benchmarks separately. Then revisit this idea.

@osa1
Member Author

osa1 commented May 12, 2025

Closing this one as alternatives #981 and #985 are faster.

@osa1 osa1 closed this May 12, 2025
@osa1 osa1 deleted the improve_enum_decoding branch May 12, 2025 10:55
osa1 added a commit that referenced this pull request May 15, 2025
When decoding an enum value, we call a callback in the enum field's `FieldInfo`. The callback then indexes a map mapping enum numbers to Dart values.

When these conditions hold:

- The known enum numbers are all positive. (so that we can use a list and index it with the number)

- The known enum numbers amount to more than 70% of the largest known enum number. (so that the list won't have a lot of `null` entries, wasting heap space)

We now generate a list instead of a map to map enum numbers to Dart values.

Similar to the map, the list is allocated at runtime. No new code is generated per message or enum type.

AOT benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 47585.14 us.`
- After: `protobuf_PackedEnumDecoding(RunTimeRaw): 38974.566666666666 us.`
- Diff: -18%

Wasm benchmarks:

- Before: `protobuf_PackedEnumDecoding(RunTimeRaw): 52225.0 us.`
- After: `protobuf_PackedEnumDecoding(RunTimeRaw): 34283.33333333333 us.`
- Diff: -34%

**Alternatives considered:**

- #980 uses a map always, but eliminates the `valueOf` closure.
- #985 uses a list always, and does binary search in the list when the list is "shallow".
- #987 is the same as #985, but instead of calling the `valueOf` closure it stores an extra field in `FieldInfo`s for whether to binary search or directly index.

These are all slower than the current PR.