Skip to content

Conversation

@HeartSaVioR
Copy link
Contributor

@HeartSaVioR HeartSaVioR commented Nov 29, 2019

What changes were proposed in this pull request?

This patch fixes the bug on ArrayKeyIndexType.hashCode() as it is simply calling Array.hashCode() which in turn calls Object.hashCode(). That should be Arrays.hashCode() to reflect the elements in the array.

Why are the changes needed?

I've encountered the bug in #25811 while adding test codes for #25811, and I've split the fix into individual PR to speed up reviewing. Without this patch, ArrayKeyIndexType would bring various issues when it's used as type of collections.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

I've skipped adding UT as ArrayKeyIndexType is in test and the patch is pretty simple one-liner.

@HeartSaVioR HeartSaVioR changed the title [SPARK-30075][CORE] Fix the hashCode implementation of ArrayKeyIndexType correctly [SPARK-30075][CORE][TESTS] Fix the hashCode implementation of ArrayKeyIndexType correctly Nov 29, 2019
@HeartSaVioR
Copy link
Contributor Author

HeartSaVioR commented Nov 29, 2019

I'm also seeing possible spot to touch: ArrayWrappers.

https://github.com/apache/spark/blob/master/common/kvstore/src/main/java/org/apache/spark/util/kvstore/ArrayWrappers.java

ComparableXXXArray implements its own hashCode, which would work as IDE creates similar code, but how about simply using Arrays.hashCode()? The major difference between custom code and Arrays.hashCode() would be "for loop with int index" and "for each loop", which I'm not sure there's outstanding difference - the array is expected to be fairly small. Arrays.hashCode() is also null safe while custom code is not.

If it also makes sense, I'll address it altogether in this PR. Thanks in advance!

@SparkQA
Copy link

SparkQA commented Nov 29, 2019

Test build #114605 has finished for PR 26709 at commit 5de21fa.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HeartSaVioR
Copy link
Contributor Author

Retest this, please

@SparkQA
Copy link

SparkQA commented Nov 29, 2019

Test build #114619 has finished for PR 26709 at commit 5de21fa.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@Override
public int hashCode() {
return key.hashCode();
return Arrays.hashCode(key);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that equals() uses id too, how about Arrays.hashCode(key) ^ Arrays.hashCode(id)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might be the reason of participating only key, but your suggestion would be symmetrical and no big deal (as it had been completely broken). Thanks for the suggestion! Applied.

@SparkQA
Copy link

SparkQA commented Nov 30, 2019

Test build #114644 has finished for PR 26709 at commit 7a74f06.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HeartSaVioR
Copy link
Contributor Author

HeartSaVioR commented Nov 30, 2019

Retest this, please

@HyukjinKwon
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Nov 30, 2019

Test build #114664 has finished for PR 26709 at commit 7a74f06.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 30, 2019

Test build #114667 has finished for PR 26709 at commit 7a74f06.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen srowen closed this in e04a634 Dec 2, 2019
@srowen
Copy link
Member

srowen commented Dec 2, 2019

Merged to master

@HeartSaVioR
Copy link
Contributor Author

Thanks all for reviewing and merging!

@HeartSaVioR HeartSaVioR deleted the SPARK-30075 branch December 2, 2019 22:43
attilapiros pushed a commit to attilapiros/spark that referenced this pull request Dec 6, 2019
…yIndexType correctly

### What changes were proposed in this pull request?

This patch fixes the bug on ArrayKeyIndexType.hashCode() as it is simply calling Array.hashCode() which in turn calls Object.hashCode(). That should be Arrays.hashCode() to reflect the elements in the array.

### Why are the changes needed?

I've encountered the bug in apache#25811 while adding test codes for apache#25811, and I've split the fix into individual PR to speed up reviewing. Without this patch, ArrayKeyIndexType would bring various issues when it's used as type of collections.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

I've skipped adding UT as ArrayKeyIndexType is in test and the patch is pretty simple one-liner.

Closes apache#26709 from HeartSaVioR/SPARK-30075.

Authored-by: Jungtaek Lim (HeartSaVioR) <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants