[Impeller] Support 'SyncMode' in 'CommandBuffer::SubmitCommands' #37772

ColdPaleLight · 2022-11-19T03:32:56Z

Added SyncMode to specify how to synchronize when CommandBuffer:::SubmitCommands.

Note that the reason for adding this parameter instead of using CompleteCallback is following:

The CompleteCallback of OpenGLES does not ensure that the CommandBuffer has been completed.
Make it more convenient for developers.

Pre-launch Checklist

I read the [Contributor Guide] and followed the process outlined there for submitting PRs.
I read the [Tree Hygiene] wiki page, which explains my responsibilities.
I read and followed the [Flutter Style Guide] and the [C++, Objective-C, Java style guides].
I listed at least one issue that this PR fixes in the description above.
I added new tests to check the change I am making or feature I am adding, or Hixie said the PR is test-exempt. See [testing the engine] for instructions on writing and running engine tests.
I updated/added relevant documentation (doc comments with ///).
I signed the [CLA].
All existing and new tests are passing.

impeller/entity/entity_pass.cc

dnfield

This seems better to me than #37704 but I'd like @chinmaygarde and @bdero's thoughts on how this might get misused or any other concerns.

impeller/aiks/aiks_context.cc

ColdPaleLight · 2022-11-28T02:48:53Z

friendly ping @chinmaygarde @bdero

bdero · 2022-11-28T05:27:20Z

impeller/renderer/backend/gles/command_buffer_gles.cc

+        gl.Flush();
+        break;
+      case CommandBuffer::SyncMode::kWaitUntilCompleted:
+        gl.Finish();


So looking at #37709, a couple of ideas come to mind:

For GLES: Since the blit pass uses glReadPixels, and GL commands are enqueued serially, is it actually necessary to call glFinish before downloading the texture? My understanding is that glReadPixels will synchronously block until any previous commands which attach the texture have completed on the GPU. In any case, doing the glFlush doesn't hurt and is probably a good idea.

This is a situation where it seems like the underlying rendering backend should own the choice about how to handle synchronization; we have an opportunity to avoid burdening users of the render layer API with an unnecessary choice and subtle bugs that are difficult to track down when used incorrectly:

For Metal (assuming waitUntilScheduled + hazard tracking isn't working as a sufficient stopgap for this), synchronization can be automatically managed by the renderer layer using native synchronization primitives:

Have every TextureMTL track an optional MTLEvent, say TextureMTL::readable_event_.

When a Metal RenderPassMTL encoding begins, create one new MTLEvent.

For each attached texture in the render pass descriptor, set/overwrite the TextureMTL::readable_event_ to the new one.

At the end of RenderPassMTL encoding, encode a signaling command for the event with encodeSignalEvent.

Prior to encoding the Metal blit pass containing the device->host transfer, encode a wait command with encodeWaitForEvent against the source TextureMTL's event (if one has been set).

For Vulkan, we can create a timeline semaphore for every attachable VkImage being tracked which lasts the lifetime of the VkImage, incrementing the attached texture semaphores during RenderPassVK (but I'd recommend holding off on this until the Vulkan backend is further along, of course).

The above approach can be lightly tweaked later on if/when we support parallel command encoding (for example, by making the MTLEvents handle manipulation thread safe). Also, we can get rid of the kWaitUntilScheduled hack later on by extending this approach to track whether read events have finished, so that commands which write to a texture will wait until the reads are done regardless of the encoding order across command buffers.

Sorry about the runaround here! This aspect of our API is a bit unresolved as we haven't thrown together/documented a strategy for handling device resource synchronization comprehensively yet.

FYI @chinmaygarde, I'm thinking now might be the time to clean up my old notes about this and do a write up detailing how we can achieve maximal parallel encoding without exposing any explicit synchronization primitives above the render layer. It's feeling like the right time to address this since it's been popping up in the discussions around Vulkan too.

@bdero It looks like you don't want to change the API, and expect to be able to solve the problem differently on different backends.

I guess we can further discuss the following two scenarios encountered on developing ui.Image.toByteData.

Scenario 1: Render the display list to texture on the Raster thread, and access the texture on the IO thread.

final ui.PictureRecorder recorder = ui.PictureRecorder(); final ui.Canvas canvas = ui.Canvas(recorder); canvas.drawRect(const Rect.fromLTWH(0.0, 0.0, 100, 100), Paint()..color = Colors.red); final ui.Picture picture = recorder.endRecording(); ui.Image image = await picture.toImage(100, 100); // render picture to texture on raster thread ByteData? byteData = await image.toByteData(); // use texture on IO thread

OpenGLES

On the backend of OpenGLES, when the IO thread accesses the texture, we cannot know whether the command that the Raster thread renders the content to the texture is executed or not.

In this PR, I want to do a glFinish on the raster thread to solve this problem (I'm not sure if using glFlush here will also solve the problem). If we don't want developers to specify SyncMode explicitly, we need to introduce a synchronization mechanism, such as glFenceSync. However, these API are only available in GLES 3.0, not in GLES2.0.

Metal

In this scenario, it seems that Metal will not encounter this problem, because the commands corresponding to render to texture and access texture are all in the same CommandQueue, and the order in which they enter the queue in this scenario is determined (if I am wrong , please correct me).

Vulkan

TODO

Scenario 2: Blit operation to copy texture to buffer. We need to ensure that the cpu buffer has content after the blit operation API call ends.

OpenGLES

As you said, the implementation of OpenGLES uses glReadPixels, so there is no need to call glFinish.

Metal

We need to ensure that after the blit operation is completed, the cpu buffer already has data. This PR proposes to use SyncMode to ensure that if the API is not changed, we can also implement the corresponding functions through CompletionCallback, but it is not very convenient for developers to use.

Vulkan

TODO

Summarize

In scenario 1, if the OpenGLES backend does not have SyncMode, it seems that there is no way to ensure that the logic is correct, because OpenGLES 2.0 does not support glFenceSync.

In Scenario 2, Introducing SyncMode can make things sampler, otherwise CompletionCallback and fml::AutoResetWaitableEvent must be used which is a little complicated.

Do you have any thoughts on the above two questions? Thanks!

On the backend of OpenGLES, when the IO thread accesses the texture, we cannot know whether the command that the Raster thread renders the content to the texture is executed or not.

The render pass and blit pass should both be encoded on the raster thread in serial. Would there be any issues with having picture.toImage and image.toByteData enqueue raster tasks with the same priority?

If we don't want developers to specify SyncMode explicitly, we need to introduce a synchronization mechanism, such as glFenceSync. However, these API are only available in GLES 3.0, not in GLES2.0.

Given the behavior of glReadPixels and the fact that GLES 2 commands should be encoded serially anyway (as the GL context is not guaranteed thread safe), I don't think there's a synchronization issue lurking between the framebuffer write and texture read that needs to be resolved.

The glReadPixels GLES 2 docs indicates that glReadPixels waits for completion of all previous commands that write to framebuffers:

glFinish does not return until the effects of all previously called GL commands are complete. Such effects include all changes to GL state, all changes to connection state, and all changes to the frame buffer contents.

Interestingly, the GLES 1.1 docs happen to be more explicit about it, but the description itself is really no different -- it would have been nice if this helpful note had survived to later specs. :)

glFinish is NOT required before a call to eglSwapBuffers or glReadPixels. glFinish can take some time and for performance reasons it is best to use this function infrequently and only when necessary.

Metal
We need to ensure that after the blit operation is completed, the cpu buffer already has data. This PR proposes to use SyncMode to ensure that if the API is not changed, we can also implement the corresponding functions through CompletionCallback, but it is not very convenient for developers to use.

While blocking the raster thread will be unavoidable for GL, using a completion callback for device->host transfers would allow us to avoid blocking the raster thread unnecessarily on Metal, Vulkan, and most other backends we're likely to implement over time (for Vulkan, we could invoke the callback from a worker thread after waiting on a fence, for example).

Given the behavior of glReadPixels and the fact that GLES 2 commands should be encoded serially anyway (as the GL context is not guaranteed thread safe), I don't think there's a synchronization issue lurking between the framebuffer write and texture read that needs to be resolved.

I think things are a little bit different here. There are two threads here, and each thread has its own EGLContext (the relationship between the two EGLContexts is a shared context). In this case, there is no way to guarantee that the render of the Raster thread must occur before the IO accesses the texture. And this is not just theoretical, in fact, I encountered this problem in the development of ui.Image.toByteData, it really happened.

~~That is to say, if the content is rendered on the raster thread, and then glReadPixels is called on the io thread, there is no guarantee that the expected content can be read.~~

Ahh, I see. You mean "if the render pass and blit pass both be encoded on the raster thread in serial, then there is no more synchronization issue".

Yes, that's right.

The render pass and blit pass should both be encoded on the raster thread in serial. Would there be any issues with having picture.toImage and image.toByteData enqueue raster tasks with the same priority?

That sounds good, I'll give it a try :)

There are two threads here, and each thread has its own EGLContext (the relationship between the two EGLContexts is a shared context).

Ah, I wasn't aware we were making a second context on the IO thread. Do you foresee any downside to just using one context and performing all of the GLES activity on the raster thread given the lack of appropriate synchronization primitives in GLES 2? We could always add a multi-context solution later on that gets used in the presence of GLES 3 procs, but I suspect the overhead savings would be somewhat negligible.

Ahh, I see. You mean "if the render pass and blit pass both be encoded on the raster thread in serial, then there is no more synchronization issue".

Indeed!

Do you foresee any downside to just using one context and performing all of the GLES activity on the raster thread given the lack of appropriate synchronization primitives in GLES 2?

Since the context of the raster thread is used for rendering, and the context of the io thread is used for texture uploading, I am worried that all gl activity on the raster thread will cause jank.

Ah, right. Sorry, I meant more with respect to the device->host transfer. Performing new resource host->device transfers on the IO thread seems perfectly reasonable and safe.

ColdPaleLight · 2022-12-01T06:04:51Z

Closing it since #37709 has been landed, this API is no longer needed.

ColdPaleLight added 2 commits November 19, 2022 11:32

[Impeller] Support ‘SyncMode’ in 'CommandBuffer:::SubmitCommands'

1a1cc2a

Clean code

1b611f9

ColdPaleLight changed the title ~~[Impeller] Support ‘SyncMode’ in 'CommandBuffer:::SubmitCommands'~~ [Impeller] Support 'SyncMode' in 'CommandBuffer:::SubmitCommands' Nov 19, 2022

ColdPaleLight mentioned this pull request Nov 19, 2022

[Impeller] Add a new API 'Context::WaitUntilCommandsCompleted' to ensure that commands are completed. #37704

Closed

8 tasks

ColdPaleLight commented Nov 19, 2022

View reviewed changes

impeller/entity/entity_pass.cc Show resolved Hide resolved

Add a check

ac84daa

ColdPaleLight requested review from chinmaygarde and dnfield November 19, 2022 04:11

ColdPaleLight self-assigned this Nov 19, 2022

ColdPaleLight added the e: impeller label Nov 19, 2022

ColdPaleLight mentioned this pull request Nov 19, 2022

[Impeller] Implement 'ui.Image.toByteData()' #37709

Merged

8 tasks

ColdPaleLight changed the title ~~[Impeller] Support 'SyncMode' in 'CommandBuffer:::SubmitCommands'~~ [Impeller] Support 'SyncMode' in 'CommandBuffer::SubmitCommands' Nov 21, 2022

dnfield reviewed Nov 21, 2022

View reviewed changes

impeller/aiks/aiks_context.cc Outdated Show resolved Hide resolved

ColdPaleLight requested a review from bdero November 22, 2022 00:42

ColdPaleLight and others added 4 commits November 22, 2022 10:45

Merge branch 'flutter:main' into support_sync_mode

7f82399

Use CommandBuffer::SyncMode as param instead of bool

347007b

Clean code

96f69a2

Merge branch 'main' into support_sync_mode

b170635

bdero reviewed Nov 28, 2022

View reviewed changes

ColdPaleLight requested a review from bdero November 28, 2022 09:43

ColdPaleLight closed this Dec 1, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Impeller] Support 'SyncMode' in 'CommandBuffer::SubmitCommands' #37772

[Impeller] Support 'SyncMode' in 'CommandBuffer::SubmitCommands' #37772

Uh oh!

ColdPaleLight commented Nov 19, 2022 •

edited

Loading

Uh oh!

Uh oh!

dnfield left a comment

Uh oh!

Uh oh!

ColdPaleLight commented Nov 28, 2022

Uh oh!

bdero Nov 28, 2022 •

edited

Loading

Uh oh!

ColdPaleLight Nov 28, 2022

Uh oh!

bdero Nov 28, 2022 •

edited

Loading

Uh oh!

ColdPaleLight Nov 28, 2022 •

edited

Loading

Uh oh!

bdero Nov 28, 2022 •

edited

Loading

Uh oh!

ColdPaleLight Nov 29, 2022

Uh oh!

bdero Nov 29, 2022

Uh oh!

ColdPaleLight commented Dec 1, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[Impeller] Support 'SyncMode' in 'CommandBuffer::SubmitCommands' #37772

[Impeller] Support 'SyncMode' in 'CommandBuffer::SubmitCommands' #37772

Uh oh!

Conversation

ColdPaleLight commented Nov 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pre-launch Checklist

Uh oh!

Uh oh!

dnfield left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ColdPaleLight commented Nov 28, 2022

Uh oh!

bdero Nov 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ColdPaleLight Nov 28, 2022

Choose a reason for hiding this comment

Scenario 1: Render the display list to texture on the Raster thread, and access the texture on the IO thread.

OpenGLES

Metal

Vulkan

Scenario 2: Blit operation to copy texture to buffer. We need to ensure that the cpu buffer has content after the blit operation API call ends.

OpenGLES

Metal

Vulkan

Summarize

Uh oh!

bdero Nov 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ColdPaleLight Nov 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bdero Nov 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ColdPaleLight Nov 29, 2022

Choose a reason for hiding this comment

Uh oh!

bdero Nov 29, 2022

Choose a reason for hiding this comment

Uh oh!

ColdPaleLight commented Dec 1, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ColdPaleLight commented Nov 19, 2022 •

edited

Loading

bdero Nov 28, 2022 •

edited

Loading

bdero Nov 28, 2022 •

edited

Loading

ColdPaleLight Nov 28, 2022 •

edited

Loading

bdero Nov 28, 2022 •

edited

Loading