Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

Conversation

@dkwingsmt
Copy link
Contributor

This relands #50931.

Nothing has changed. The crash that caused the previous revert has been fixed by flutter/flutter#144212.

@jonahwilliams
Copy link
Contributor

Copy link
Contributor

@jonahwilliams jonahwilliams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets hold until we understand why the benchmarks changed?

@goderbauer
Copy link
Member

This caused some odd changes in various benchmarks

Wait, do I understand this right: worst_frame_build_time went DOWN when this landed last time and went back up when it was reverted? It's showing a performance improvement with this change? It's also curious, that it effects BUILD time at all since that's all framework-side work which this patch doesn't change?

@jonahwilliams
Copy link
Contributor

Is it showing a performance improvement, or are we measuring things differently? The build time measurement is collected by the FrameTimingsRecorder which is impacted by this patch.

@goderbauer
Copy link
Member

@jonahwilliams
Copy link
Contributor

Its measured by the frame timings recorder for the e2e, but if I look through the list I don't see any e2e tests so 🤷‍♂️

@jonahwilliams
Copy link
Contributor

It looks the most pronounce on the moto g4, which is a pretty slow device. Dropping the slowest frame could cause the rest of the metrics to improve as well.

@dkwingsmt
Copy link
Contributor Author

dkwingsmt commented Feb 28, 2024

@chinmaygarde
Copy link
Member

Triage: It seems like we are trying to explain an apparent performance improvement where we didn't expect any and are now wondering if we are measuring the right things or if there is a correctness issue. Can we grab timelines from a local run with this patch applied to see if we can explain our observations?

@chinmaygarde
Copy link
Member

If this fixes an unrelated performance issue, thats awesome. But we should still understand the observation.

@jonahwilliams
Copy link
Contributor

Some potential pointers here:

Consider looking at the timeline traces before/after. Perhaps there is an expensive frame that is missing now, or the frame is there but being measured differently?

You can profile this locally, or look for the existing traces in CI. For example: https://ci.chromium.org/ui/p/flutter/builders/prod/Linux_android%20android_picture_cache_complexity_scoring_perf__timeline_summary/11955/overview

Look in the "log links" section, save the timeline trace (not the summary) and open in about://tracing

@dkwingsmt
Copy link
Contributor Author

dkwingsmt commented Mar 4, 2024

I've did some device tests on a local Moto G4. I still have no idea what happened, but here's my result for now.

First of all, the metric change is consistently reproducible. The app seem to run fine on the device with no noticeable dropped frames or other smoothness problems. The frame count stay the same as well, swaying within 677~683 before and after the change.

Second, the following chart shows the frame build time against frame start time before and after the change.

image

It seems that build time got a global reduction and got much more consistent, instead of having one or two spikes chopped.

Third, I tried to apply changes little by little and do device test for each revision, and found the change that will trigger the build time drop. A large portion of the PR does not affect the build time, then the revision
a8c0dc3 does, then there are a few changes remaining that seem to not affect the build time.

|-----------------------------------------|---|---------|
^ Current head                                          ^  After PR
                                              ^ culprit revision (a8c0dc)
                                          ^ rev before the culprit revision
Build time
~900                                    ~900  ~550       ~550

To explain what the culprit revision does, the following is the series of events that take place when a view is rendered before the culprit commit:

  1. A LayerTreeTask is created and stored.
  2. frame_timings_recorder_->RecordBuildEnd() and update vsync target time
  3. The frame's layer tree is pushed to the pipeline.
  4. The pipeline is given to the rasterizer.
  5. The framework performs clean-up, such as scene.dispose().
  6. Potentially notify idle.

After the culprit commit, the codepath becomes 1->2->5->3->4->6
So, the work between BuildStart and BuildEnd has not changed at all, but the measured build time decreases from ~900 to ~550.

(More mysteriously, if I change the code path to 1->2->3->5->4->6, then the app is stuck before the first frame.)

As I said, I still have no idea what caused the change. Maybe it's because the garbage collection can be done earlier to not disrupt the next frame?

@goderbauer
Copy link
Member

A theory: Prior to this change, did the FlutterView.render call block while (3) and (4) happened? This could mean that prior to this change that work was (maybe incorrectly?) counted towards BUILD time (as measured from the start of the PlatformDispatcher.onBeginFrame call to the end of the PlatformDispatcher.onDrawFrame call). After this change, do (3) and (4) happen after the end of the PlatformDispatcher.onDrawFrame call and hence don't count into BUILD times anymore?

In other words, is FlutterView.render now faster because the work is delayed until after the PlatformDispatcher.onDrawFrame call ends? This should be fairly easy to validate by instrumenting the FlutterView.render call in the framework before and after this change.

@dkwingsmt
Copy link
Contributor Author

dkwingsmt commented Mar 4, 2024

@goderbauer The list of actions I mentioned here already splits of the work done by FlutterView.render.
Before the change, the FlutterView.render call is action 1, 2, 3, and 4. And yes, they're blocked since they all run on the UI thread.
Action 3 and 4 are never counted towards the build time, since the build time ends at action 2.

(Note that I'm talking about before and after the culprit revision, which is enough to trigger the build time drop. For the entire PR, yes, the build time includes more actions than the current main, a necessary change, but isn't related to the metrics change apparently.)

Also, if we suspect that the metric difference is work being included into/excluded from the build time, then this difference should be counted towards the raster time. The following chart is the raster time vs raster begin time:
image
The chart does not show significant changes.

@goderbauer
Copy link
Member

Before the change, the FlutterView.render call is action 1, 2, 3, and 4. And yes, they're blocked since they all run on the UI thread.
Action 3 and 4 are never counted towards the build time, since the build time ends at action 2.

I don't understand this. If 3 and 4 are part of FlutterView.render they will count towards the BUILD time as measured by the benchmark because that time ends here (after FlutterView.render completes, which is part of the persistent frame callback handling a few lines above): https://github.com/flutter/flutter/blob/1a0dc8f1e11892f8f2365ffb6a9617abc39d4b15/packages/flutter/lib/src/scheduler/binding.dart#L1331

@goderbauer
Copy link
Member

Maybe the confusion is coming from what the benchmark is measuring as BUILD time. As stated in #51019 (comment) I believe it is measured on the framework side from [1] to [2]. It is not measured engine-side by the frame_timings_recorder (unless I am misunderstanding something...)

[1] https://github.com/flutter/flutter/blob/1a0dc8f1e11892f8f2365ffb6a9617abc39d4b15/packages/flutter/lib/src/scheduler/binding.dart#L1200
[2] https://github.com/flutter/flutter/blob/1a0dc8f1e11892f8f2365ffb6a9617abc39d4b15/packages/flutter/lib/src/scheduler/binding.dart#L1331

@dkwingsmt
Copy link
Contributor Author

I'm creating a new PR since this PR was created on the flutter/engine repo by mistake.
Let's move to #51186

@dkwingsmt dkwingsmt closed this Mar 4, 2024
@dkwingsmt dkwingsmt mentioned this pull request Mar 4, 2024
8 tasks
auto-submit bot pushed a commit that referenced this pull request Mar 7, 2024
This relands #50931.

The crash that caused the 4th revert has been fixed by flutter/flutter#144212.

[There has been discussion](#51019) on why the benchmark in previous attempts show significant drop in build time. This PR addresses it using option a as described in [this comment](#51186 (comment)).

This PR also addresses flutter/flutter#144584 with option 1. A test is added.

[C++, Objective-C, Java style guides]: https://github.com/flutter/engine/blob/main/CONTRIBUTING.md#style
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants