Create benchmarks for DisplayListBuilder #34910
I think we need to tailor how we apply all of the bits and pieces. Running all variants of drawRect/drawAtlas/etc. is probably fine, but on each build out we run all of the transform, all of the clip, and all of the saveLayer calls together. Plus, we have pretty much all of the attributes set every which way. I think a likely reason the mapRect call had such an impact is that we left the DL with a nasty perspective matrix for all the rest of the ops to be built on, but perspective is a corner case in practice. One technique might be to break the big array into smaller groups that we can then mix and match in more controlled ways:
We then could have (the following is a full list, but we might want to start with a subset, see below):
For starters I'd like to see a simple small number of benchmarks with the most common variations:
I'm not sure what to do about attributes for now. I do want to know how long they take, but I don't want lots of random settings running through most of the benchmarks. The reason to benchmark them is that they can do deep copies of complex attributes, and it would be good to know how much that costs us in the long run. I don't think that is urgent at this time; it is something we can add later.
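As a rough illustration of the grouping idea in this comment, here is a minimal sketch. Everything in it is hypothetical: `ToyBuilder`, `OpGroup`, `TransformOps`, `ClipOps`, `DrawOps`, and `BuildFrom` are stand-in names that only record op names, not the real DisplayListBuilder API or the PR's code. It only shows how small groups could be combined selectively instead of running one big array.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a display list builder: it only records op names.
struct ToyBuilder {
  std::vector<std::string> ops;
  void Record(const std::string& name) { ops.push_back(name); }
};

// One benchmarkable op, and a small named group of such ops.
using Op = std::function<void(ToyBuilder&)>;
using OpGroup = std::vector<Op>;

// Each group covers one family of calls (names are illustrative only).
OpGroup TransformOps() {
  return {[](ToyBuilder& b) { b.Record("translate"); },
          [](ToyBuilder& b) { b.Record("scale"); }};
}

OpGroup ClipOps() {
  return {[](ToyBuilder& b) { b.Record("clipRect"); }};
}

OpGroup DrawOps() {
  return {[](ToyBuilder& b) { b.Record("drawRect"); },
          [](ToyBuilder& b) { b.Record("drawAtlas"); }};
}

// Builds one list from exactly the groups selected for a benchmark variant,
// applying the groups in the order given.
ToyBuilder BuildFrom(const std::vector<OpGroup>& groups) {
  assert(!groups.empty());
  ToyBuilder builder;
  for (const OpGroup& group : groups) {
    for (const Op& op : group) {
      op(builder);
    }
  }
  return builder;
}
```

A draw-only variant would be `BuildFrom({DrawOps()})`, and a variant with plain transforms would be `BuildFrom({TransformOps(), DrawOps()})`, so a perspective transform only appears in the variants that ask for it.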
@flar
(Comment updated with new numbers on 7/29/22) While the results graphed and discussed here represent considerations for #34365, they actually point out the value of the benchmarks being added in this PR. It looks like the new code constructs DL + bounds (the most common case by far) 10-20% slower than the previous code, possibly because it has to maintain all of the extra rectangles. DL + RTree is similarly a little slower, but the gap is 5-10%. The timing with the ImageFilter saveLayer is also much worse: 20-30% slower. Where the new code shines, though, is in computing both bounds and RTree on a new DL, which is quite a bit faster across the board, a 25-45% savings.
The data originally graphed in this comment has been updated and the new graphs are now included in the previous comment.
Those last 2 comments are really about the benefit of the work done in #34365. This PR, which simply adds the benchmarks, will be important regardless of that other work, so I'll review it with intent to land it as work progresses on the subsequent developments of DL/bounds/RTree...
I updated the benchmark data at the top again.
flar
left a comment
Everything LGTM, autosubmitting now to start generating data.
I was having trouble finding the results in Skia perf until I realized that the test name was misspelled: https://flutter-engine-perf.skia.org/e/?queries=executable%3D._display_list_builder_benchmarks Fix: #35104

Benchmarks for display list builder/display list.
fixes flutter/flutter#108265
In the process of writing the benchmark, I found that generating the RTree itself is not expensive. However, the call matrix().mapRect(&bounds) is very expensive.
Also, I think we should use the second set of data (main + PR #34835) as the base data, because its function is correct.
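To make the cost difference concrete, here is a toy model of mapping a rectangle through a 3x3 matrix. It is an illustrative sketch only, not Skia's or the engine's implementation: `Rect`, `Matrix3`, `IsScaleTranslate`, and `MapRect` are hypothetical names, and the perspective path assumes every corner has a positive w (no clipping is done). The point is that a scale/translate matrix needs only a few multiply-adds, while the general path transforms all four corners and divides each by w.

```cpp
#include <algorithm>
#include <array>

struct Rect {
  float left, top, right, bottom;
};

// Row-major 3x3 matrix: [sx kx tx; ky sy ty; p0 p1 p2].
using Matrix3 = std::array<float, 9>;

// True when the matrix has no skew, rotation, or perspective terms.
bool IsScaleTranslate(const Matrix3& m) {
  return m[1] == 0 && m[3] == 0 && m[6] == 0 && m[7] == 0 && m[8] == 1;
}

// Returns the axis-aligned bounds of r after transformation by m.
// Assumption: w > 0 at every corner on the general path.
Rect MapRect(const Matrix3& m, const Rect& r) {
  if (IsScaleTranslate(m)) {
    // Fast path: map two opposite corners, then sort in case of negative scale.
    float x0 = m[0] * r.left + m[2], x1 = m[0] * r.right + m[2];
    float y0 = m[4] * r.top + m[5], y1 = m[4] * r.bottom + m[5];
    return {std::min(x0, x1), std::min(y0, y1),
            std::max(x0, x1), std::max(y0, y1)};
  }
  // General path: transform all four corners with a perspective divide.
  const float xs[4] = {r.left, r.right, r.right, r.left};
  const float ys[4] = {r.top, r.top, r.bottom, r.bottom};
  Rect out{0, 0, 0, 0};
  for (int i = 0; i < 4; ++i) {
    float w = m[6] * xs[i] + m[7] * ys[i] + m[8];
    float px = (m[0] * xs[i] + m[1] * ys[i] + m[2]) / w;
    float py = (m[3] * xs[i] + m[4] * ys[i] + m[5]) / w;
    if (i == 0) {
      out = {px, py, px, py};
    } else {
      out.left = std::min(out.left, px);
      out.top = std::min(out.top, py);
      out.right = std::max(out.right, px);
      out.bottom = std::max(out.bottom, py);
    }
  }
  return out;
}
```

If the builder is left holding a perspective matrix, every subsequent bounds update in this model takes the general path, which fits the suspicion raised in the review that the benchmark setup exaggerated the cost.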
cc @flar
1. main
2. main + PR #34365