Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

Conversation

@cbracken
Copy link
Member

@cbracken cbracken commented Oct 18, 2024

Previously, we dispatched a task to a background thread to wait for the first frame to be ready. In the background task, we then throw a task on the main thread that invokes that user-provided callback to notify them that the first frame is ready.

During ARC migration, we ensured that self was strongly-captured in the task that runs on the main thread, in order to prevent the possibility that the last remaining reference to the FlutterEngine instance be the one on the background thread.

However, with the previous code, it's possible that the callback is dispatched to the main thread, executes, and completes before the block on the background thread finishes. In that case, the last remaining FlutterEngine reference may still be the one on the background thread.

In order to prevent this scenario, we use dispatch_group_notify to ensure that the tasks are executed sequentially, and that the callback task is not dispatched to the main thread until the background task has completed. Further, we capture FlutterEngine strongly in the second task to ensure that it is the last remaining task.

Why do we need to capture self strongly in the task fired to the main queue? Imagine the following:

  • We queue up the first task on the background thread.
  • We queue up the callback task on the main thread to be executed only after the background task completes.
  • The task on the background thread starts running.
  • That task captures weakSelf strongly and checks it -- it's not nil, yay.
  • That task now makes the blocking first frame call. Time passes. While that time passes all other strong references to self go away. That second block only captured self weakly, so the background task now has the only remaining strong reference. Uh oh!
  • We hit the end of the background task block and since we have the last remaining strong reference, we dealloc FlutterEngine... on the background thread.
  • KABOOM.

No changes to tests since this is a fix to a race condition that doesn't affect app semantics, isn't testable as written, and adding shims to improve testability will likely make some already subtle/difficult-to-reason-about code even more complex and difficult to reason about.

Issue: flutter/flutter#156177

Pre-launch Checklist

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • I read the Tree Hygiene wiki page, which explains my responsibilities.
  • I read and followed the Flutter Style Guide and the C++, Objective-C, Java style guides.
  • I listed at least one issue that this PR fixes in the description above.
  • I added new tests to check the change I am making or feature I am adding, or the PR is test-exempt. See testing the engine for instructions on writing and running engine tests.
  • I updated/added relevant documentation (doc comments with ///).
  • I signed the CLA.
  • All existing and new tests are passing.

If you need help, consider asking for advice on the #hackers-new channel on Discord.

Previously, we dispatched a task to a background thread to wait for the
first frame to be ready. In the background task, we then throw a task on
the main thread that invokes that user-provided callback to notify them
that the first frame is ready.

During ARC migration, we ensured that self was strongly-captured in the
task that runs on the main thread, in order to prevent the possibility
that the last remaining reference to the FlutterEngine instance be the
one on the background thread.

However, with the previous code, it's possible that the callback is
dispatched to the main thread, executes, and completes before the block
on the background thread finishes. In that case, the last remaining
FlutterEngine reference may *still* be the one on the background thread.

In order to prevent this scenario, we use `dispatch_groupd_notify` to
ensure that the tasks are executed sequentially, and that the callback
task is not dispatched to the main thread until the background task has
completed. Further, we capture FlutterEngine strongly in the second task
to ensure that *it* is the last remaining task.

No changes to tests since this isn't testable as written.

Issue: flutter/flutter#156177
@flutter-dashboard
Copy link

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption, contact "@test-exemption-reviewer" in the #hackers channel in Discord (don't just cc them here, they won't see it!).

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing. The test exemption team is a small volunteer group, so all reviewers should feel empowered to ask for tests, without delegating that responsibility entirely to the test exemption group.

Copy link
Contributor

@stuartmorgan-g stuartmorgan-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, with the previous code, it's possible that the callback is
dispatched to the main thread, executes, and completes before the block
on the background thread finishes.

Great catch, that's a subtle one. This is one of those things I know in theory, but it's really hard to remember when reading code that "ends" with the thread hop back.

@stuartmorgan-g
Copy link
Contributor

test-exempt: there's no way to force the potential race to happen, so all we can feasibly do AFAICT is to treat this as fixing potential flake in existing tests.

@cbracken cbracken added the autosubmit Merge PR when tree becomes green via auto submit App label Oct 18, 2024
@hellohuanlin
Copy link
Contributor

hellohuanlin commented Oct 18, 2024

However, with the previous code, it's possible that the callback is
dispatched to the main thread, executes, and completes before the block
on the background thread finishes.

Isn't strongSelf.shell.WaitForFirstFrame(waitTime) blocking (based on its name, i didn't read the code to be honest)? I'd expect the background thread to be done with the call before hopping to the main thread.

@hellohuanlin
Copy link
Contributor

Oh nvm i saw your comment in chat.

@auto-submit auto-submit bot merged commit 06240d4 into flutter:main Oct 19, 2024
29 checks passed
@cbracken cbracken deleted the execute-tasks-sequentially branch October 19, 2024 00:38
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Oct 19, 2024
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Oct 19, 2024
jonahwilliams pushed a commit to flutter/flutter that referenced this pull request Oct 19, 2024
…157215)

flutter/engine@76d310e...dd37446

2024-10-19 [email protected] Move keymap from
FlKeyboardViewDelegate to FlKeyboardManager (flutter/engine#55942)
2024-10-19 [email protected] [UI] fix scene builder parameter
naming. (flutter/engine#55969)
2024-10-19 [email protected] iOS: Improve thread safety of first frame
callback (flutter/engine#55966)
2024-10-18 [email protected] iOS: Fix flaky tests (remove timeouts)
(flutter/engine#55961)
2024-10-18 [email protected] [Impeller] allocate the impeller
onscreen texture from the render target cache. (flutter/engine#55943)
2024-10-18 [email protected] Roll Fuchsia Linux SDK from
9F_NaKPd2twhbPwP7... to tNQZ8d5mRYpe3--lk... (flutter/engine#55963)
2024-10-18 [email protected] Started filtering
out close line segments in rrect polylines. (flutter/engine#55929)
2024-10-18 [email protected] [Impeller] libImpeller: Allow custom
font registrations. (flutter/engine#55934)
2024-10-18 [email protected] Re-reland "iOS: Migrate FlutterEngine to
ARC" (flutter/engine#55962)
2024-10-18 [email protected] [Impeller] libImpeller: Add a README.
(flutter/engine#55940)
2024-10-18 [email protected] iOS: Eliminate needless profiler metrics
ivar (flutter/engine#55957)
2024-10-18 98614782+auto-submit[bot]@users.noreply.github.com Reverts
"[Impeller] one descriptor pool per frame. (#55939)"
(flutter/engine#55959)
2024-10-18 [email protected] Revert "Reland "iOS: Migrate FlutterEngine
to ARC" (#55937)" (flutter/engine#55954)
2024-10-18 [email protected] Roll Dart SDK from
993d3069f42e to a51df90298ca (7 revisions) (flutter/engine#55951)
2024-10-18 [email protected] [engine] add back opt out for merged
threads. (flutter/engine#55952)
2024-10-18 [email protected] [Impeller] one descriptor pool per
frame. (flutter/engine#55939)
2024-10-18 [email protected] [Impeller] add mechanism for sharing
bdf inputs. (flutter/engine#55701)

Also rolling transitive DEPS:
  fuchsia/sdk/core/linux-amd64 from 9F_NaKPd2twh to tNQZ8d5mRYpe

If this roll has caused a breakage, revert this CL and stop the roller
using the controls here:
https://autoroll.skia.org/r/flutter-engine-flutter-autoroll
Please CC [email protected],[email protected] on the revert to
ensure that a human
is aware of the problem.

To file a bug in Flutter:
https://github.com/flutter/flutter/issues/new/choose

To report a problem with the AutoRoller itself, please file a bug:
https://issues.skia.org/issues/new?component=1389291&template=1850622

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
nick9822 pushed a commit to nick9822/flutter that referenced this pull request Dec 18, 2024
)

Previously, we dispatched a task to a background thread to wait for the first frame to be ready. In the background task, we then throw a task on the main thread that invokes that user-provided callback to notify them that the first frame is ready.

During ARC migration, we ensured that self was strongly-captured in the task that runs on the main thread, in order to prevent the possibility that the last remaining reference to the FlutterEngine instance be the one on the background thread.

However, with the previous code, it's possible that the callback is dispatched to the main thread, executes, and completes before the block on the background thread finishes. In that case, the last remaining FlutterEngine reference may *still* be the one on the background thread.

In order to prevent this scenario, we use `dispatch_group_notify` to ensure that the tasks are executed sequentially, and that the callback task is not dispatched to the main thread until the background task has completed. Further, we capture FlutterEngine strongly in the second task to ensure that *it* is the last remaining task.

Why do we need to capture self strongly in the task fired to the main queue? Imagine the following:
* We queue up the first task on the background thread.
* We queue up the callback task on the main thread to be executed only after the background task completes.
* The task on the background thread starts running.
* That task captures weakSelf strongly and checks it -- it's not nil, yay. 
* That task now makes the blocking first frame call. Time passes. While that time passes all other strong references to self go away. That second block only captured self weakly, so the background task now has the only remaining strong reference. Uh oh!
* We hit the end of the background task block and since we have the last remaining strong reference, we dealloc FlutterEngine... on the background thread.
* KABOOM.

No changes to tests since this is a fix to a race condition that doesn't affect app semantics, isn't testable as written, and adding shims to improve testability will likely make some already subtle/difficult-to-reason-about code even more complex and difficult to reason about.

Issue: flutter#156177

[C++, Objective-C, Java style guides]: https://github.com/flutter/engine/blob/main/CONTRIBUTING.md#style
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

autosubmit Merge PR when tree becomes green via auto submit App platform-ios

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants