Skip to content

Conversation

@daxpedda
Copy link
Contributor

@daxpedda daxpedda commented Jan 7, 2022

Just a start, I'm gonna continue to pick them off one by one.

Copy link
Contributor

@mikelodder7 mikelodder7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make the prod by a nonzero?
That would prevent the case of accidentally calling hash to field for 0 bytes and getting points at infinity

@mikelodder7
Copy link
Contributor

This is cool By the way

@daxpedda
Copy link
Contributor Author

daxpedda commented Jan 7, 2022

I believe I covered all possible overflows now. The only thing that's not according to spec is the limit on the output size of the hash. It should actually be Or<<L as IsLess<U256>>, <HashT::OutputSize as IsLessOrEqual<U256>>>, sadly there is no such thing as an Or, but I think that's alright for now.

@daxpedda daxpedda marked this pull request as ready for review January 7, 2022 05:35
@daxpedda
Copy link
Contributor Author

daxpedda commented Jan 7, 2022

Can you make the prod by a nonzero? That would prevent the case of accidentally calling hash to field for 0 bytes and getting points at infinity

I'm surprised I can't find anything in the spec, it says nowhere that it has to be a non-zero as far as I searched.

@mikelodder7
Copy link
Contributor

mikelodder7 commented Jan 7, 2022

I'm surprised I can't find anything in the spec, it says nowhere that it has to be a non-zero as far as I searched.

I know but it’s a case we should check anyway

@daxpedda
Copy link
Contributor Author

daxpedda commented Jan 7, 2022

Maybe we should report it?

@mikelodder7
Copy link
Contributor

Agree. I’ll file a ticket at IETF

@daxpedda
Copy link
Contributor Author

daxpedda commented Jan 7, 2022

I hope the comments I added aren't too much, otherwise this is good to go @tarcieri.

@daxpedda daxpedda mentioned this pull request Jan 7, 2022
@tarcieri
Copy link
Member

tarcieri commented Jan 7, 2022

I think something like this would be cool with proper const generics, but speaking as a long-time power user I really think you're going to have a bad time using it with generic-array/typenum.

For starters, the caller needs to know all possible input message lengths at compile time? Will they?

You have a fairly large section of code that's generic around the message size. That section of code is going to monomorphize for the size of each input message, which could lead to code bloat.

But the real kicker to me is generic-array is bad for arrays this size, due to the limitations of typenum.

U65536 works, but what if you want something that's 65535 bytes? typenum doesn't provide a U65535 alias.

After U1024, there are large ranges of numbers which typenum doesn't provide aliases for:

  • 1025-2047
  • 2049-4095
  • 4097-8191
  • 8193-9999
  • 10001-16383
  • 16385-32767
  • 32769-65535

While there are hacks you can use to express those values, they are very much not going to be ergonomic to use.

@daxpedda
Copy link
Contributor Author

daxpedda commented Jan 7, 2022

I was hoping that in practice only certain sizes are used in very low ranges so that this wouldn't be an issue. I don't disagree though.

A proposal: we could provide an alternative fallible run-time checked version, but keep this one around to make sure the hash2curve implementation stays infallible. We could keep the infallible version of ExpandMsg internal too.

WDYT?

@tarcieri
Copy link
Member

tarcieri commented Jan 7, 2022

You should definitely change the internal implementation to a fallible, type-erased one that operates on slices to avoid monomorphization code bloat. The type safe one could call expect (although of course the compiler won't be able to prove it can remove the panic)

Infallibility/fallibility creates a pretty big divide per #871. If you want to support public fallible, slice-based APIs in addition to infallible type-safe ones you will need to have this paired fallible/infallible approach for 6 different methods at least (12 methods total).

The monomorphization code bloat problem will impact all of these methods as well, so maybe at the very least you should have private, fallible slice-based methods the infallible type-safe ones call.

Having gotten burned by large generic-arrays in the past I really do worry about the ergonomics here. These are not going to be fun APIs to use if you have messages >1024 bytes. And they will be an enormous pain if anyone wants to use them with variable-sized messages unless the fallible APIs are public.

@daxpedda
Copy link
Contributor Author

daxpedda commented Jan 7, 2022

Alright, let me see if I can remove these constraints without the other improvements I've done so far.

@daxpedda
Copy link
Contributor Author

daxpedda commented Jan 7, 2022

I'm pretty sure I basically reverted everything I've done so far, sad that this can't be salvaged. Let's continue on #871.

@daxpedda daxpedda closed this Jan 7, 2022
@tarcieri
Copy link
Member

tarcieri commented Jan 7, 2022

I think the core idea is still good, but we need generic_const_exprs to stabilize to address some of the issues I highlighted and make it more ergonomic

tarcieri pushed a commit that referenced this pull request Jun 13, 2025
This PR introduces two changes:
- Remove requirements that were only relevant for oversized `DST`s. Now
these requirements are checked on runtime. While this is unfortunate,
the currently limitations simply was that usage with regular sized
`DST`s incurred limitations that were not necessary.
- Change `len_in_bytes` from `NonZero<usize>` to `NonZero<u16>`. This
isn't a big improvement because the error is just moved from
`expand_msg()` to the various `GroupDigest` methods.

Companion PR: RustCrypto/elliptic-curves#1256.

---
I know I have been refactoring this API over and over again, but I
actually think this is the last of it (apart from
#872 with
`generic_const_exprs`).

But for completions sake I want to mention the following [from the
spec](https://www.rfc-editor.org/rfc/rfc9380.html#section-5.3.1-8):
> It is possible, however, to entirely avoid this overhead by taking
advantage of the fact that Z_pad depends only on H, and not on the
arguments to expand_message_xmd. To do so, first precompute and save the
internal state of H after ingesting Z_pad. Then, when computing b_0,
initialize H using the saved state. Further details are implementation
dependent and are beyond the scope of this document.

In summary, we could cache this part:
```rust
let mut b_0 = HashT::default();
b_0.update(&Array::<u8, HashT::BlockSize>::default());
```
Doing this requires passing `ExpandMsg` state, which would change the
entire API having to add a parameter to every function.

However, as the spec mentions, the cost of not caching it is most likely
negligible. We will see in the future if this shows up in benchmarks and
if it does we can re-evaluate. I don't believe this will be the case
though.

Alternatively, we could add a trait to `digest` which allows users to
construct a hash prefixed with a `BlockSize` full of zeros that has been
computed at compile-time. Which would also require no changes to the API
except binding to this trait.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants