-
Notifications
You must be signed in to change notification settings - Fork 9
Use recorded tape to impl task copying #93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
To solve the 3rd issue, maybe we can find the instruction which produces the input(argument) of UPDATE: I tried it and found that it's not that simple. |
# @test consume(ctask) == 4 # we get 4 in the old version
@test consume(ctask) == 2 # --> 3.I'm confused. Here we are interpreting the instructions on the tape for both |
|
The IR of Thus, the tape is: The time we copy the task is when In the copied task, we copy the tape, and replay it from This only affects SHARED DATA, which doesn't concern us much and seems not to have a clear semantic in the old Libtask. |
|
@KDr2 I think the reason is the difference between the new function (instr::Instruction{typeof(produce)})()
args = val(instr.input[1])
tape = gettape(instr)
tf = tape.owner
ttask = tf.owner
ch = ttask.channel
put!(ch, args)
# check whether there is a waiting consumer in the queue, if yes, continue;
# if not, put the task to sleep until the queue becomes non-empty.
# This would delay the `produce` action until there is a demand.
endfThis queuing mechanism is similar to the one we have in the |
|
You are right, but I'm afraid that maybe the queue mechanism wouldn't help here? When the execution reaches the point where the comment is placed, the data which will be produced is already determined by the last instruction, whether we wait for a consumer or not, that determined value would not change. A right mechanism would be, we check if there's a consumer, if not, we wait at the instruction which computes the data, instead of waiting at the instruction of |
|
Take |
|
I think the queue mechanism can work if we ignore the first EDIT: We currently only allow copying tasks after at least one |
|
The purpose of inserting a dummy So if we want to delay the computing, we should insert the dummy But, on the other hand, it is NOT very easy to determine which instruction(s) affects the computing of data, for example: In this tape, we should put the dummy More precisely, the copy only can occur during a call to If the data is of primitive types or deepcopy-implemented types (like If the data is shared between the tasks, we should make the semantic of In the old version, we didn't clear up this either I think, but it works intuitively due to the stack copy. |
|
Issue 3 fixed. Thanks to @yebai. |
|
Great, thanks @KDr2 - let's now move forward to fix the integration tests with Turing and AdvancedPS. |
|
Isn't it a bit surprising snd potentially misleading if Libtask.jl is not based on Libtask_jll anymore but a tape based approach? Wouldn't it be cleaner to register a new package? I assume the next release would be breaking (seems safer even if the API is not changed), so downstream packages would require manual updates anyway. |
|
It is significantly more challenging to port In the longer run, we might consider depreciating |
|
I think it is great if we don't have to deal with the Libtask_jll issues and new Julia versions don't break Turing anymore. I'm not against this approach at all. My main question is just if it is reasonable to merge it into Libtask.jl given that it is not based on Libtask_jll anymore and such a fundamentally different approach. I think it should not matter for downstream packages if the tape approach is available in a breaking Libtask release or a new package. Different packages might even motivate a common API and a more modular design in the future where packages and/or users can choose the preferred backend. |
|
Had a quick look at the integration test for AdvancedPS. We're using infinite loop which are not supported by this I think. Tests don't hang anymore with finite loops but fail on: InvalidStateException("Channel is closed.", :closed)This seems to happen whenever we're using anonymous functions: function outer()
for i in 1:10
produce(10)
end
end
ctask = CTask(outer)
consume(ctask) # Fine, gives 10
ctask2 = CTask(() -> outer()) # Similar to the way AdvancedPS.Trace works
consume(ctask2) # Fails on Channel error |
|
It's because the limitations of the new approach:
|
|
@KDr2 There are valid cases where we would like to construct a CTask(outer::Function, args::Tuple, ...) # call `outer(args)` internallyThis would avoid the need to create anonymous functions for tasks. |
|
But I saw some cases like this: in a function which is used to create a ctask. so maybe we should track 1 more depth down? |
The |
phipsgabler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this, it's an elegant tape implementation.
I do however share @devmotion's concern that it is a bit weird to completely replace the underlying library without changing the name. Why not make a separate package, and archive Libtask.jl?
The Channel Closed Exception is a good indicator I think? |
|
I updated --- a/src/container.jl
+++ b/src/container.jl
@@ -7,13 +7,7 @@ end
const Particle = Trace
function Trace(f, rng::TracedRNG)
- ctask = let f = f
- Libtask.CTask() do
- res = f(rng)
- Libtask.produce(nothing)
- return res
- end
- end
+ ctask = Libtask.CTask(f, rng)
# add backward reference
newtrace = Trace(f, ctask, rng)
@@ -62,13 +56,8 @@ function forkr(trace::Trace)
newf = reset_model(trace.f)
Random123.set_counter!(trace.rng, 1)
- ctask = let f = trace.ctask.task.code
- Libtask.CTask() do
- res = f()(trace.rng)
- Libtask.produce(nothing)
- return res
- end
- end
+ func = trace.ctask.tf.func
+ ctask = Libtask.CTask(func, trace.rng)There are still some test failures. |
|
Another problem with updating AdvancedPS is that Turing does not support (and I assume is not compatible with) the many unreleased changes in its master branch. |
|
Most test cases of AdvancedPS passed now: https://github.com/TuringLang/AdvancedPS.jl/pull/35/files |
|
I just found a way to support So we don't need to specialize |
@KDr2 The new |
That's brilliant! 😄 |
I copied the "taped" implementation from https://github.com/KDr2/Taped.jl and updated the
TArrayandTRef. Now all test cases have passed, but some changes had been made to let them pass:while true->for _ in RANG)Task copying ONLY occurs at the time of callingproduce, i.e., when the task is waiting on an unbuffered channel. This will cause some differences as to the value we get the first time we callconsumeafter a task is copied.The details of item 3:
On line
--> 1., the task callsproducewitht[1]which is computed and has the value of 2, and at this point, the task is scheduled off, then the copy occurs. then:--> 3)produceinstruction, with its cached input which is 2, is replayed, and we also get 2 (line--> 2)This happens on shared data.
UPDATE: The 3rd issue has been fixed.