Skip to content

Conversation

feadoor
Copy link
Contributor

@feadoor feadoor commented Oct 24, 2016

BigUint and BigInt can now be multiplied by a BigDigit, re-using the same buffer for the output, thereby reducing allocations and copying.

Points for comment:

  • These operations are only unidirectional. That is, I've only implemented Mul<BigDigit> for BigUint, and not Mul<BigUint> for BigDigit. It won't be any trouble to add the other direction if it's felt that it should exist.
  • I haven't implemented by-reference equivalents (such as impl<'a> Mul<BigDigit> for &'a BigUint). That's because you would not be able to reuse the same buffer for output, and that misses the point somewhat. Again, not a lot of trouble to add it in if it would be useful, but I'm less convinced of that.
  • Should there also be impl Mul<i32> for BigInt, or can users cope with multiplying by u32 and negating the result?

BigUint and BigInt can now be multiplied by a BigDigit, re-using the same buffer for the output, thereby reducing allocations and copying.
@feadoor
Copy link
Contributor Author

feadoor commented Oct 24, 2016

As an example of the benefit, I compared the speed of calculating 1000000! with and without scalar multiplication (cf issue #234), using the following two functions:

extern crate num;
use num::{BigUint, One};

fn factorial_fast(n: u32) -> BigUint {
    let mut f: BigUint = One::one();
    for i in 1..(n+1) {
        f = f * i;
    }
    f
}

fn factorial_slow(n: u32) -> BigUint {
    let mut f: BigUint = One::one();
    for i in 1..(n+1) {
        let bu: BigUint = From::from(i);
        f = f * bu;
    }
    f
}

The results were:

factorial_slow(1_000_000):

real    7m48.038s
user    6m20.716s
sys     1m27.170s

factorial_fast(1_000_000):

real    3m8.596s
user    3m8.497s
sys     0m0.052s

@cuviper
Copy link
Member

cuviper commented Oct 24, 2016

That speedup is great, especially in eliminating sys time!

These operations are only unidirectional. That is, I've only implemented Mul for BigUint, and not Mul for BigDigit. It won't be any trouble to add the other direction if it's felt that it should exist.

Yes please, we should have this both ways. I don't think users should have to worry about order when it's so easy to make both work.

I haven't implemented by-reference equivalents (such as impl<'a> Mul for &'a BigUint). That's because you would not be able to reuse the same buffer for output, and that misses the point somewhat. Again, not a lot of trouble to add it in if it would be useful, but I'm less convinced of that.

Right, you don't get the buffer reuse, but the actual multiplication might still be faster.

Should there also be impl Mul for BigInt, or can users cope with multiplying by u32 and negating the result?

Hmm. This opens a can of worms, that we might actually want Mul for all integer types, if only because I don't really like adding more BigDigit to the API surface. I'd eventually like to break that into an internal implementation detail. So, privileging u32 seems wrong from that perspective, but maybe it's a reasonable starting point.

For that matter, we probably should have scalars for all ops too. There's a stronger performance reason to have Mul and maybe Div too, but implementing the rest is better for API consistency. I don't want users to have to keep in mind, "what are the corner cases that num implemented?" Rather, they can just know that bigint now implements ops with scalar operands, across the board.

That's a lot of scope creep, I realize. If you're game, I'd love to see the full monty. If you're not, we should at least plan for how much work it is to get there. It might just be that we should fill out naive implementations everywhere first (convert to bigint and do a normal op), then optimize when we can.

@feadoor
Copy link
Contributor Author

feadoor commented Oct 26, 2016

I agree with you there, in terms of implementing the arithmetic operations for all integer types. I think the work breaks down into pieces quite nicely:

  • Implementing the operations for 32-bit scalars.
  • Getting the operations to work in both directions and with references.
  • Getting the operations to work with smaller integer types by promoting to 32-bit.
  • Implementing the operations for 64-bit scalars (which is a bit more fiddly). We can probably still get buffer-reuse here, but the associated costs are greater (at least for multiplication, it takes significantly more operations on primitive types). Maybe investigate whether the benefits outweigh the drawbacks, and if converting to BigInt / BigUint first is better.

The first and last bullets are where the real work is, the others can almost certainly be done with some useful macros, in the style of the rest of num-bigint.

Does that general approach sound like the right thing to do? I think I'm game - even though, as you say, it's a lot more than I originally set out to do with this PR.

@cuviper
Copy link
Member

cuviper commented Oct 26, 2016

Yes, I think your breakdown sounds great, and we can merge those incrementally over several PRs.

One note on the 64-bit part - even if we find it hard to do a fully-optimized scalar version, we can probably still avoid a full BigUint conversion with a heap allocation. Just converting to a local [lo,hi] array can probably let it re-use the internals of the normal operators.

remexre pushed a commit to remexre/num that referenced this pull request Jun 1, 2017
Refactor json_internal macro to need no helper macros
@cuviper
Copy link
Member

cuviper commented Jun 30, 2017

Closing in favor of #313

@cuviper cuviper closed this Jun 30, 2017
bors bot added a commit that referenced this pull request Jul 12, 2017
313: Scalar operations across all integer types r=cuviper

With my apologies for opening a new PR, and also for the 8 month delay, this continues the work started in #237 - the discussion there outlines the goals I was aiming for. I suppose this supersedes that PR and the other one can now be closed.

This PR adds support for Add, Sub, Mul, Div and Rem operations involving one BigInt/BigUint and one primitive integer, with operands in either order, and any combination of owned/borrowed arguments.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants