@devmotion
Member

@devmotion devmotion commented Apr 28, 2020

I had a closer look at the implementation due to #64 and ended up with a huge amount of changes, so it might be better to split this PR...

Basically, it includes:

  • A CTask wrapper for Tasks for which task copying is enabled
  • An implementation of the iteration interface for CTask that just consumes
  • A cleaner CTaskException
  • A lot of additional comments in the source code and (hopefully) cleaner syntax
  • A cleaner implementation of the test suite, by, e.g., grouping together related tests in test sets and replacing @assert with @test
  • Including the implementation and tests of TRef
  • Update of the README (while writing the tests I noticed that using the same variable name inside of the function and for the task is a bit unfortunate, since it breaks in a local scope, e.g., inside of @testset).
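
The scoping pitfall behind the README change can be reproduced without Libtask. The following is a minimal sketch, assuming that a function body stands in for the local scope that @testset introduces; `demo_local_scope` and its variables are hypothetical names, not taken from the README.

```julia
# Hypothetical reproduction: `demo_local_scope` stands in for a local scope
# such as the body of `@testset`.
function demo_local_scope()
    function f()
        t = 0          # `t` is also a local of the enclosing scope, so this rebinds it
        return t + 1
    end
    t = "task handle"  # placeholder for something like `t = CTask(f)`
    f()                # running `f` overwrites the outer `t`
    return t
end

result = demo_local_scope()
```

At global scope the two `t`s stay separate because the `t` inside `f` is local to `f`; inside a local scope they are the same variable, so renaming one of them (for example, calling the task `ctask`) avoids the clash.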

@devmotion
Member Author

Hmmm seems I introduced a bug in Julia 1.0 🤔

src/ctask.jl Outdated
ct = _current_task()
res = func()
ct.result = res
ct.state = :done

Member

this change might be problematic during task switching, although I can't remember the exact details

maybe @KDr2 can clarify?

Member Author

Hmm yes, I was wondering if it is needed. I removed it (which worked fine on Julia > 1.0 it seems) since in

Libtask.jl/src/ctask.jl

Lines 189 to 190 in 2ca2a89

isa(p.storage, IdDict) && haskey(p.storage, :_libtask_state) &&
(p.state = p.storage[:_libtask_state])
ct.state is set to ct.storage[:_libtask_state] if it exists. But maybe it's actually needed.
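
The quoted check can be read as a guarded restore: the state is overwritten only when the task-local storage exists and holds a `:_libtask_state` entry. Here is a minimal sketch on a stand-in type; `StoredStateTask` and `restore_state!` are hypothetical names and not part of Libtask or Base.

```julia
# Hypothetical stand-in for a task; only the two fields used by the check.
mutable struct StoredStateTask
    state::Symbol
    storage::Union{Nothing,IdDict{Any,Any}}
end

# Same logic as the quoted short-circuit chain, written as an `if`.
function restore_state!(p::StoredStateTask)
    if p.storage isa IdDict && haskey(p.storage, :_libtask_state)
        p.state = p.storage[:_libtask_state]
    end
    return p
end

# Storage with an entry: the stored state wins.
with_entry = restore_state!(
    StoredStateTask(:runnable, IdDict{Any,Any}(:_libtask_state => :done)),
)
# No storage at all: the state is left untouched.
without_storage = restore_state!(StoredStateTask(:runnable, nothing))
```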

Member Author

I'll check what happens if I revert the changes that you commented on.

Member Author

It seems to resolve the issue but I would still be interested to know the details - @KDr2, can you explain why this separate state is needed?

Member Author

And there's still something not quite right it seems.

Member

@yebai yebai Apr 28, 2020

Seems not related to Libtask, more like a Julia Windows x86 issue on the nightly build

Member

> I would still be interested to know the details

Any copy of a task should NOT finish or fail before the original task finishes or fails. So I added a wait() at the end of each branch in the function task_wrapper, which means the tasks and their copies are never done... It's a bad solution, but I couldn't find a better way to ensure a task doesn't end before its origin ends.

If we assign :done or :failed directly to task.state, the task ends, and this may happen before its origin ends.

The root reason is how Julia starts and ends a task (pseudo code):

jl_task_t *t = get_current_task(); // (a)
t->run_the_julia_code(); // (b) here, we copy the task
finish_and_clean(t); // (c)

a. t is the original task.
b. The original task and its copies all run this code. Let's assume a copy (say, task_ct) ends first; control flow then goes to (c).
c. We get here because a copy (task_ct) ended while the original task has not ended yet, but t still refers to the original task, so the cleanup is done on the original...

There are two ways to fix this:

  • At point (c), refresh the variable t (t = get_current_task();) and then clean it up. This is the perfect solution, but it requires modifying the code of JuliaLang...
  • Do some hack: update the variable t (which is on the stack) in a copied task. This is hard, arch-specific, and not portable...
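
The (a)/(b)/(c) sequence and the first fix can be imitated with plain structs. This is only a toy model under stated assumptions: `SimTask`, `SimRuntime`, and the functions below are hypothetical and do not correspond to Julia runtime or Libtask APIs, and the "copy" is modelled simply as a new object becoming the current task while the body runs.

```julia
# Hypothetical toy types; not the Julia runtime's task representation.
mutable struct SimTask
    name::String
    state::Symbol
end

mutable struct SimRuntime
    current::SimTask
end

# Stand-in for step (b): during the body a copy is created, and the copy is
# the task that reaches the end of the body first.
function run_body_copy_ends_first!(rt::SimRuntime)
    rt.current = SimTask("copy", :runnable)
    return nothing
end

# Steps (a)-(c) as described: `t` is captured before the body runs.
function start_task_stale!(rt::SimRuntime)
    t = rt.current                  # (a) the original task
    run_body_copy_ends_first!(rt)   # (b) the copy ends first
    t.state = :done                 # (c) cleanup hits the stale `t`
    return t
end

# First proposed fix: refresh `t` at (c) before cleaning up.
function start_task_refreshed!(rt::SimRuntime)
    run_body_copy_ends_first!(rt)   # (b) the copy ends first
    t = rt.current                  # refreshed: now the copy
    t.state = :done                 # (c) cleanup hits the task that ended
    return t
end

original_a = SimTask("original", :runnable)
cleaned_a = start_task_stale!(SimRuntime(original_a))

original_b = SimTask("original", :runnable)
cleaned_b = start_task_refreshed!(SimRuntime(original_b))
```

In the stale variant the original is marked `:done` even though only the copy finished; in the refreshed variant the cleanup lands on the copy and the original stays `:runnable`.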

Member

Thanks, @KDr2. As @devmotion mentioned, removing this hack seems to work fine for Julia versions from v1.1 onwards. It would be interesting to understand why. Also, can you add some notes on this hack in a separate PR for future reference?

Member

OK, I will do some investigation and write some docs. But I think it should not work under Julia > 1.1 without the hack either, since I just looked into the relevant code in JuliaLang today; maybe the bug is just not triggered by the current test cases. Anyway, I will write this down and make it clear how it works and how it doesn't.

Member Author

In any case, thanks for the detailed explanations!

Member

@KDr2 KDr2 left a comment

👍

@yebai yebai merged commit 5cf4774 into TuringLang:master Apr 29, 2020
@devmotion devmotion deleted the refactor branch April 29, 2020 09:09
@yebai yebai mentioned this pull request Apr 29, 2020