
Conversation

@xukai92 (Member) commented May 12, 2017

Address #237

Also fix #245

Almost done. But I also want to implement gradient caching - we can reuse some of the gradients computed inside leapfrog. Gradient use in NUTS is especially inefficient because our leapfrog re-computes the gradient at the start of each call, and the NUTS recursion calls leapfrog one step at a time:

  • With our trick of caching the gradient inside leapfrog, a t-step trajectory costs t + 1 gradient evaluations.
  • Calling leapfrog one step at a time costs 1 + 1 evaluations per call, so t one-step calls cost 2t evaluations in total.

I think it's fine to just cache the gradient in a dictionary inside spl.info?
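As a minimal sketch of what this could look like (assuming `spl.info` is a `Dict` and a `gradient(θ, model, spl)` function is available; the names here are hypothetical):

```julia
# Hypothetical gradient cache stored inside spl.info, keyed by position.
function cached_gradient(θ::Vector{Float64}, model, spl)
    cache = get!(spl.info, :grad_cache, Dict{Vector{Float64},Vector{Float64}}())
    # Compute and store the gradient only on a cache miss.
    get!(cache, copy(θ)) do
        gradient(θ, model, spl)
    end
end
```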

@xukai92 (Member, Author) commented May 12, 2017

HMC on LDA before caching was 429s (https://travis-ci.org/yebai/Turing.jl/jobs/231713149) and after is 369s (https://travis-ci.org/yebai/Turing.jl/jobs/231726143). Caching saves one gradient evaluation whenever the previous step is not rejected.

@xukai92 (Member, Author) commented May 12, 2017

218s after setting chunksize to 60 (https://travis-ci.org/yebai/Turing.jl/jobs/231734804)
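For reference, the setting would look roughly like this, assuming Turing exposes a `setchunksize` helper for the ForwardDiff chunk size (treat the exact API as an assumption):

```julia
using Turing

# Larger ForwardDiff chunks mean fewer forward passes per gradient,
# at the cost of more work per pass; 60 was the sweet spot here.
Turing.setchunksize(60)
```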

@yebai (Member) commented May 12, 2017

> I think it's fine to just cache the gradient in a dictionary inside spl.info?

Yes, we can do that first.

> 218s after setting chunksize to 60

Nice!

@xukai92 (Member, Author) commented May 12, 2017

I think AppVeyor fails because the cache dict uses Vector{Dual} as keys, which somehow isn't supported on win32. I will fix it tomorrow.
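One possible workaround would be to strip the `Dual` wrappers and key the cache on plain `Float64` values; a hedged sketch with a hypothetical helper (the actual fix may differ):

```julia
using ForwardDiff: Dual, value

# Hypothetical helper: convert a vector of Duals to plain Float64s
# so the cache key type is portable across platforms.
to_key(θ::Vector{<:Dual}) = Float64[value(x) for x in θ]
to_key(θ::Vector{Float64}) = θ
```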

@yebai (Member) commented May 13, 2017

UPDATE:

Collecting 1000 samples for LDA:

Turing.HMCDA takes 58.7 seconds.
Turing.NUTS takes 301.90 seconds.
Stan.NUTS takes 7.66134 seconds.

@xukai92 (Member, Author) commented May 13, 2017

@yebai I guess I will leave the vectorization of assume for the future. The reason is that vectorization only makes things faster if reconstruct, vectorize, link, and invlink are all vectorized, and vectorizing link and invlink seems to be tricky for SimplexDistribution (which LDA needs).
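For context, here is a generic stick-breaking-style simplex transform (an illustration, not Turing's exact link): each coordinate depends on the running remainder of the stick, which is why a naive element-wise vectorization doesn't apply.

```julia
# Map a point on the K-simplex to R^{K-1}. Each step consumes part of the
# remaining "stick", so the loop is inherently sequential, not element-wise.
# (Stan's version adds a centering shift; omitted here for brevity.)
function simplex_link(x::Vector{Float64})
    K = length(x)
    y = Vector{Float64}(undef, K - 1)
    remaining = 1.0
    for k in 1:(K - 1)
        z = x[k] / remaining
        y[k] = log(z) - log(1 - z)   # logit of the stick fraction
        remaining -= x[k]
    end
    return y
end
```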

Shall we merge this PR then?

@xukai92 (Member, Author) commented May 13, 2017

Another related issue I was stuck on is the matrix convention: the default functions all treat things as matrices, while we currently write things as a vector of vectors because of our chain interface issue (https://github.com/yebai/Turing.jl/issues/207).
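A small illustration of the mismatch (generic Julia, not Turing code): samples stored as a vector of vectors have to be concatenated before matrix-oriented functions can consume them.

```julia
# Per-iteration samples stored as a vector of vectors...
samples = [rand(3) for _ in 1:5]

# ...must be stitched into a 3×5 matrix for matrix-oriented functions.
S = reduce(hcat, samples)
@assert size(S) == (3, 5)
```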

@xukai92 (Member, Author) commented May 14, 2017

Can we merge this to master?
