-
Notifications
You must be signed in to change notification settings - Fork 19
Closed
Description
I gave my variant of fasta on the Benchmarks Game tracker and got directed here for comments.
The basic idea was that the x86 one core benchmarks put Haskell well in the lead. I just took the multithreaded Rust program and copied Haskell's implementation.
This has two main techniques:
- The cyclic part writes only slices of a prepared extended length string directly, rather than using a lazy cyclical stream or such. I improved on Haskell's method here by avoiding a second
B.putStrLn ""
by temporarily mutating the buffer to have an appropriately placed newline. - The lookup table results are cached. There seems to be a lot of variability in this. Of the top 5 in the single-threaded benchmarks, the 3 Haskell implementations all use a cache, the C and Fortran use non-short-circuiting linear searches. Also, the current Rust does a short-circuiting linear search and FWIW Python does a binary search. So who knows.
- Writing for this is done in blocks like the multithreaded Rust version.
Then I parallelized the RNG by implementing a somewhat trivial future
method that allows getting parallel streams of the same RNG. This makes parallelism on my two cores almost 100% efficient, although I've no idea what'll happen with more than two.
I also stole the while let Err(_) = wr.lock().unwrap().write(...)
idea, but I added a std::thread::yield_now()
. IIRC, it seems to help, although I can't remember by how much.
I can make a pull request if wanted.
Metadata
Metadata
Assignees
Labels
No labels