Skip to content

Conversation

@bsilverthorn
Copy link

The existing implementation unfortunately falls prey to Java signed byte arithmetic in a couple of places. The hash of almost any odd-length buffer involving trailing 0xff bytes, for example, will not match the SipHash reference C implementation. This problem also makes it easy to generate collisions.

This pull request applies the typical Java unsigned-math workaround where necessary. The output now matches the reference implementation in my testing.

Apologies for the couple of accidental whitespace changes.

@bsilverthorn
Copy link
Author

I haven't had time to add a test case myself, unfortunately, that specifically highlights this problem. Should be straightforward to write, though (see the note above regarding trailing 0xff bytes, for example).

That said: I'd suggest taking a look at this. The existing implementation works for simple test cases, but will often compute incorrect results on real data. (It may even fail to prevent hash flooding, but I haven't convinced myself of that.)

@thomasmueller
Copy link

I found the same problem. Note: the length can not be negative, so the change "((long) data.length) & 0xff" is not needed. It is unfortunate the that official SipHash test vectors doesn't include randomly generated data...

@thomasmueller
Copy link

By the way, you may also want to fix "b[0] = (byte) (l & 0xff);" and so on (for those cases, the "& 0xff" is not needed).

@bsilverthorn
Copy link
Author

Good point. Reverted the length change and some whitespace noise. Leaving the b[0] code alone for this PR.

private static long lastBlock(byte[] data, int iter) {
long last = ((long) data.length) << 56;
int off = iter * 8;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one more change for this PR:

diff --git a/src/main/java/com/github/emboss/siphash/SipHash.java b/src/main/java/com/github/emboss/siphash/SipHash.java
index 9f31287..00c4640 100644
--- a/src/main/java/com/github/emboss/siphash/SipHash.java
+++ b/src/main/java/com/github/emboss/siphash/SipHash.java
@@ -21,7 +21,7 @@ public class SipHash {
     }
 
     private static long lastBlock(byte[] data, int iter) {
-        long last = ((long) data.length) << 56;
+        long last = (((long) data.length) & 0xff) << 56;
         int off = iter * 8;
 
         switch (data.length % 8) {

tpikonen pushed a commit to tpikonen/m8b that referenced this pull request Oct 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants