Skip to content

Conversation

@ueshin
Copy link
Member

@ueshin ueshin commented Jun 10, 2022

What changes were proposed in this pull request?

Backport of #36812.

Fixes ArraySort to throw an exception when the comparator returns null.

Also updates the doc to follow the corrected behavior.

Why are the changes needed?

When the comparator of ArraySort returns null, currently it handles it as 0 (equal).

According to the doc,

It returns -1, 0, or 1 as the first element is less than, equal to, or greater than
the second element. If the comparator function returns other
values (including null), the function will fail and raise an error.

It's fine to return non -1, 0, 1 integers to follow the Java convention (still need to update the doc, though), but it should throw an exception for null result.

Does this PR introduce any user-facing change?

Yes, if a user uses a comparator that returns null, it will throw an error after this PR.

The legacy flag spark.sql.legacy.allowNullComparisonResultInArraySort can be used to restore the legacy behavior that handles null as 0 (equal).

How was this patch tested?

Added some tests.

…rator returns null

### What changes were proposed in this pull request?

Fixes `ArraySort` to throw an exception when the comparator returns `null`.

Also updates the doc to follow the corrected behavior.

### Why are the changes needed?

When the comparator of `ArraySort` returns `null`, currently it handles it as `0` (equal).

According to the doc,

```
It returns -1, 0, or 1 as the first element is less than, equal to, or greater than
the second element. If the comparator function returns other
values (including null), the function will fail and raise an error.
```

It's fine to return non -1, 0, 1 integers to follow the Java convention (still need to update the doc, though), but it should throw an exception for `null` result.

### Does this PR introduce _any_ user-facing change?

Yes, if a user uses a comparator that returns `null`, it will throw an error after this PR.

The legacy flag `spark.sql.legacy.allowNullComparisonResultInArraySort` can be used to restore the legacy behavior that handles `null` as `0` (equal).

### How was this patch tested?

Added some tests.
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM (Pending CIs)

dongjoon-hyun pushed a commit that referenced this pull request Jun 10, 2022
…comparator returns null

### What changes were proposed in this pull request?

Backport of #36812.

Fixes `ArraySort` to throw an exception when the comparator returns `null`.

Also updates the doc to follow the corrected behavior.

### Why are the changes needed?

When the comparator of `ArraySort` returns `null`, currently it handles it as `0` (equal).

According to the doc,

```
It returns -1, 0, or 1 as the first element is less than, equal to, or greater than
the second element. If the comparator function returns other
values (including null), the function will fail and raise an error.
```

It's fine to return non -1, 0, 1 integers to follow the Java convention (still need to update the doc, though), but it should throw an exception for `null` result.

### Does this PR introduce _any_ user-facing change?

Yes, if a user uses a comparator that returns `null`, it will throw an error after this PR.

The legacy flag `spark.sql.legacy.allowNullComparisonResultInArraySort` can be used to restore the legacy behavior that handles `null` as `0` (equal).

### How was this patch tested?

Added some tests.

Closes #36834 from ueshin/issues/SPARK-39419/3.3/array_sort.

Authored-by: Takuya UESHIN <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Copy link
Member

Merged to branch-3.3. Thank you, @ueshin .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants