Add AsyncSource template for speculative execution #903
Conversation
@oerling has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Nice, code looks pretty clean and easy to understand :) A few minor API and test comments:
velox/common/base/AsyncSource.h
Outdated
// will either wait for the make to finish or run the make on its
// own thread.
bool isPending() const {
  return make_ && !item_;
shouldn't this test making_ instead of make_? For instance, this will return true if the object was just constructed but before prepare() was called, which sounds a bit counter-intuitive at a first look at the API. Is that the intended behavior?
// Returns the item to the first caller and nullptr to subsequent callers. If
// the item is preparing on the executor, waits for the item and otherwise
// makes it on the caller thread.
std::unique_ptr<Item> move() {
Another way to implement this method (in a truly async way) is to make it return a Future instead of blocking internally. The client of this API could then decide whether to block or to propagate that future. Internally you would have to hold a SharedPromise instead of a Promise.
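For illustration, a rough sketch of that alternative, holding a folly::SharedPromise and handing out SemiFutures. The FutureSource name and members here are hypothetical, not the actual AsyncSource code:

#include <folly/futures/SharedPromise.h>

#include <functional>
#include <memory>

// Hypothetical FutureSource sketching the SharedPromise-based variant
// suggested above; not the actual AsyncSource implementation.
template <typename Item>
class FutureSource {
 public:
  explicit FutureSource(std::function<std::unique_ptr<Item>()> make)
      : make_(std::move(make)) {}

  // Runs 'make_' (typically on a background executor) and fulfills the
  // shared promise so that every waiter is notified.
  void prepare() {
    promise_.setValue(std::shared_ptr<Item>(make_()));
  }

  // Returns a SemiFuture; the caller decides whether to block on it or to
  // chain it onto an executor of its choosing.
  folly::SemiFuture<std::shared_ptr<Item>> get() {
    return promise_.getSemiFuture();
  }

 private:
  std::function<std::unique_ptr<Item>()> make_;
  // SharedPromise allows multiple getSemiFuture() calls; the value must be
  // copyable, hence shared_ptr instead of unique_ptr.
  folly::SharedPromise<std::shared_ptr<Item>> promise_;
};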
if (make) {
  return make();
}
auto& exec = folly::QueuedImmediateExecutor::instance();
This is the part where ideally you would return the SemiFuture so that the client could attach their preferred executor instead of blocking the current thread.
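For example, if move() returned a folly::SemiFuture, the caller could pick between blocking and chaining; 'source', 'cpuExecutor', 'onItem' and 'mustBlock' below are placeholders for illustration only:

// Hypothetical caller-side usage if move() returned a SemiFuture.
folly::SemiFuture<std::shared_ptr<Item>> future = source.move();
if (mustBlock) {
  // Block on the calling thread, which matches the current behavior.
  onItem(std::move(future).get());
} else {
  // Attach a preferred executor and continue asynchronously; the returned
  // Future would need to be kept alive by the caller.
  auto chained = std::move(future).via(cpuExecutor).thenValue(
      [](std::shared_ptr<Item> item) { onItem(std::move(item)); });
}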
  int32_t id;
};

TEST(AsyncSourceTest, basic) {
I would expect the "basic" test to test the basic API: construct a single async source, check isPending() and hasValue(), then call prepare(), check that the methods return the expected results again, then move the results out, etc.
Then have the test below as a "multi-thread" test that checks the concurrent behavior.
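For instance, such a basic test might look roughly like the sketch below, assuming the constructor takes the make function and that prepare(), hasValue() and move() behave as documented in the diff; this is not the test as merged:

TEST(AsyncSourceTest, basicSketch) {
  AsyncSource<int32_t> source([]() { return std::make_unique<int32_t>(42); });

  // Freshly constructed: no value yet.
  EXPECT_FALSE(source.hasValue());

  // prepare() runs the make function; normally this happens on an executor.
  source.prepare();
  EXPECT_TRUE(source.hasValue());

  // The first move() hands out the item, subsequent calls return nullptr.
  auto item = source.move();
  ASSERT_NE(item, nullptr);
  EXPECT_EQ(*item, 42);
  EXPECT_EQ(source.move(), nullptr);
}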
The use case is to schedule a lot of computations with the expectation that most will be used, but the order of use is not necessarily a queue order. If this were strictly a queue, this would be a future. So the pattern is that many threads throw things to be executed ahead of time, for example opening files, processing file metadata, pre-reading some densely referenced columns. Now these computations may advance at very different rates, and therefore strictly queueing them is not optimal. Suppose a big query fills the prefetch queue; a fast query with a very small read would then be blocked behind the slow one.
The use case has the user of the data blocking on the data, so a use case like a future ready callback or chaining is not intended. If we had move() return a SemiFuture, this would be moving a scheduled item from one executor to another. Here we just move it to the sync user; we only differ in the queue order.
As an example, consider receiving splits. There are some 100 splits that will be consumed by 15 threads. Let's say that the next 30 splits are queued for file open and metadata. The worker threads then look at the split queue and try to get one that is ready. If there is nothing ready, they take the first not-ready one. This could be executing, but if not, it should not be queued behind some other query's split; instead it is better to do the operation synchronously on the caller thread, since it has nothing better to do. There are cases where doing anything other than a sync wait is impractical, e.g. DwrfReader waiting for the next stripe. So we don't really have a good case for returning a SemiFuture.
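To make the split example concrete, here is a condensed sketch of that pattern. SplitState, openAndReadMetadata, ioExecutor and nextSplits are illustrative placeholders, not Velox code:

// Producer side: queue the next splits for background open/metadata work.
std::vector<std::shared_ptr<AsyncSource<SplitState>>> prefetched;
for (const auto& split : nextSplits) {
  auto source = std::make_shared<AsyncSource<SplitState>>(
      [split]() { return openAndReadMetadata(split); });
  ioExecutor->add([source]() { source->prepare(); });
  prefetched.push_back(source);
}

// Consumer side: prefer a split whose preparation already finished; if none
// is ready, take the first one and let move() run the work on this thread,
// turning the still-queued background call into a no-op. Erasing consumed
// entries from 'prefetched' is omitted for brevity.
std::unique_ptr<SplitState> next;
for (const auto& source : prefetched) {
  if (source->hasValue()) {
    next = source->move();
    break;
  }
}
if (!next && !prefetched.empty()) {
  next = prefetched.front()->move();
}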
+ bool isPending() const {
+ return make_ && !item_;
shouldn't this test making_ instead of make_? For instance, this will return true if the object was just constructed but before prepare() was called, which sounds a bit counter-intuitive at a first look at the API. Is that the intended behavior?
A: This is not needed, but the comment is correct.
Thanks for replying to the comments and for the tests. Looks good!
AsyncSource encapsulates a background computation. These can be scheduled on a background executor. The difference between this and a future is that when the user requires the result, AsyncSource will perform the async operation on the caller's thread and will turn the background operation into a no-op.
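For illustration, a minimal sketch of that contract, using the member names from the diff; the real class also handles a make() that is already running on the executor, which this sketch omits:

#include <functional>
#include <memory>
#include <mutex>

// Minimal sketch of the contract described above, not the merged class.
template <typename Item>
class AsyncSourceSketch {
 public:
  explicit AsyncSourceSketch(std::function<std::unique_ptr<Item>()> make)
      : make_(std::move(make)) {}

  // Called from the background executor: run make() and store the item.
  void prepare() {
    std::function<std::unique_ptr<Item>()> make;
    {
      std::lock_guard<std::mutex> l(mutex_);
      make = std::move(make_);
      make_ = nullptr;
    }
    if (make) {
      auto item = make();
      std::lock_guard<std::mutex> l(mutex_);
      item_ = std::move(item);
    }
  }

  // If prepare() already produced the item, return it; otherwise run make()
  // on the caller's thread, which makes the queued background call a no-op.
  std::unique_ptr<Item> move() {
    std::function<std::unique_ptr<Item>()> make;
    {
      std::lock_guard<std::mutex> l(mutex_);
      if (item_) {
        return std::move(item_);
      }
      make = std::move(make_);
      make_ = nullptr;
    }
    return make ? make() : nullptr;
  }

 private:
  std::mutex mutex_;
  std::function<std::unique_ptr<Item>()> make_;
  std::unique_ptr<Item> item_;
};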