@cloud-fan
Contributor

What changes were proposed in this pull request?

In Spark SQL, some expressions may output values in the safe format, e.g. CreateArray, CreateStruct, Cast, etc. When we compare two values, we should be able to compare safe and unsafe formats.

The GreaterThan, LessThan, etc. in Spark SQL already handle this, but EqualTo doesn't. This PR fixes it.
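
To make the failure mode concrete, here is a minimal Java sketch. `IntArrayData`, `SafeInts` and `PackedInts` are hypothetical stand-ins for a safe (object-backed) and an unsafe (byte-backed) array; they are not Spark classes. An `equals` that is tied to one layout reports the same logical array as different, while an element-wise, type-aware comparison does not.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical stand-ins, not Spark classes: two physical layouts of an int array.
interface IntArrayData {
    int numElements();
    int getInt(int ordinal);
}

// "Safe" layout: values held in a plain Java array.
final class SafeInts implements IntArrayData {
    private final int[] values;
    SafeInts(int... values) { this.values = values.clone(); }
    public int numElements() { return values.length; }
    public int getInt(int ordinal) { return values[ordinal]; }
    @Override public boolean equals(Object o) {
        return o instanceof SafeInts && Arrays.equals(values, ((SafeInts) o).values);
    }
    @Override public int hashCode() { return Arrays.hashCode(values); }
}

// "Unsafe" layout: values packed into raw bytes, 4 bytes per element.
final class PackedInts implements IntArrayData {
    private final byte[] bytes;
    PackedInts(int... values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * 4);
        for (int v : values) buf.putInt(v);
        this.bytes = buf.array();
    }
    public int numElements() { return bytes.length / 4; }
    public int getInt(int ordinal) { return ByteBuffer.wrap(bytes).getInt(ordinal * 4); }
    @Override public boolean equals(Object o) {
        return o instanceof PackedInts && Arrays.equals(bytes, ((PackedInts) o).bytes);
    }
    @Override public int hashCode() { return Arrays.hashCode(bytes); }
}

final class TypeAwareEquality {
    // Compares by logical content, ignoring which layout backs each side.
    static boolean sameInts(IntArrayData a, IntArrayData b) {
        if (a.numElements() != b.numElements()) return false;
        for (int i = 0; i < a.numElements(); i++) {
            if (a.getInt(i) != b.getInt(i)) return false;
        }
        return true;
    }
}
```

With `new SafeInts(1, 2, 3)` and `new PackedInts(1, 2, 3)`, `equals` returns false in both directions, whereas `TypeAwareEquality.sameInts` returns true.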

How was this patch tested?

New unit test and regression test.

SparkQA commented Nov 18, 2016

Test build #68843 has finished for PR 15929 at commit 00c432a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val funcCode: String =
  s"""
    public int $compareFunc(ArrayData a, ArrayData b) {
      if (a instanceof UnsafeArrayData && b instanceof UnsafeArrayData && a == b) {

Contributor

@hvanhovell hvanhovell Nov 18, 2016

Why the instanceof checks? This is Java; `a == b` already compares by reference.
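
The point about reference comparison can be illustrated with a small Java sketch. `compareInts` is a hypothetical comparator, not the generated Spark code. On object references, `a == b` is an identity test that does not depend on the runtime class, so an identity fast path needs no `instanceof` guard.

```java
final class ReferenceFastPath {
    // Lexicographic comparison of two int arrays with an identity fast path.
    static int compareInts(int[] a, int[] b) {
        // Same reference means same object, whatever its runtime class.
        if (a == b) return 0;
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i], b[i]);
            if (c != 0) return c;
        }
        // Equal prefix: the shorter array sorts first.
        return Integer.compare(a.length, b.length);
    }
}
```

The fast path only skips work. Two distinct arrays with the same content still compare as equal through the element loop.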

case dt: DataType if isPrimitiveType(dt) => s"$c1 == $c2"
case dt: DataType if dt.isInstanceOf[AtomicType] => s"$c1.equals($c2)"
case array: ArrayType => genComp(array, c1, c2) + " == 0"
case struct: StructType => genComp(struct, c1, c2) + " == 0"

Member

How about MapType?

Contributor

MapType is not comparable. We currently support neither equals() nor hashCode() for MapData. See https://issues.apache.org/jira/browse/SPARK-18134 for a fun discussion on this.
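
One difficulty with map equality can be sketched as follows, assuming a map is stored as parallel key and value arrays with no guaranteed entry order. `ParallelMap` is a hypothetical class for illustration, not Spark's `MapData`. A position-by-position comparison treats two maps with the same entries in a different order as unequal.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical layout, not Spark's MapData: entries kept in insertion order.
final class ParallelMap {
    final String[] keys;
    final int[] values;

    ParallelMap(String[] keys, int[] values) {
        this.keys = keys;
        this.values = values;
    }

    // Naive comparison: position by position, so entry order matters.
    static boolean positionalEquals(ParallelMap a, ParallelMap b) {
        return Arrays.equals(a.keys, b.keys) && Arrays.equals(a.values, b.values);
    }

    // Order-insensitive comparison: rebuild a hash map on each side first.
    static boolean logicalEquals(ParallelMap a, ParallelMap b) {
        return a.toMap().equals(b.toMap());
    }

    private Map<String, Integer> toMap() {
        Map<String, Integer> m = new HashMap<>();
        for (int i = 0; i < keys.length; i++) m.put(keys[i], values[i]);
        return m;
    }
}
```

The order-insensitive variant has to hash or sort the keys on every comparison. That extra cost is one reason a design discussion may be needed before supporting it.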

Member

: ) That is a fun discussion.

Also, I happened to find how Presto did it in a PR: https://github.com/prestodb/presto/pull/2469/files

Member

If we do not plan to support MapType, could we add a negative test case?

Contributor Author

fixed in #15956

} else if (left.dataType != BinaryType) {
  input1 == input2
} else {
  java.util.Arrays.equals(input1.asInstanceOf[Array[Byte]], input2.asInstanceOf[Array[Byte]])

Member

@gatorsmile gatorsmile Nov 19, 2016

This is a bug, right? I mean the previous code.

Contributor Author

Yeah, previously we could end up comparing an unsafe row/array/map with a safe one.
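
The snippet in this thread special-cases `BinaryType` because Java arrays inherit identity-based `equals`. The minimal sketch below shows the difference; `BinaryEquality` is a hypothetical helper, not part of Spark.

```java
import java.util.Arrays;

final class BinaryEquality {
    // Content equality for binary values; == or equals() would test identity only.
    static boolean sameBytes(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    // Identity comparison, shown for contrast.
    static boolean sameReference(byte[] a, byte[] b) {
        return a == b;
    }
}
```

For two separately allocated arrays holding the same bytes, `sameReference` is false and `sameBytes` is true. That is why a generic equality check is not enough for binary values.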

SparkQA commented Nov 21, 2016

Test build #68915 has finished for PR 15929 at commit bf87125.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Contributor

hvanhovell commented Nov 23, 2016

LGTM. Merging to master/2.1/2.0. Thanks!

asfgit pushed a commit that referenced this pull request Nov 23, 2016
## What changes were proposed in this pull request?

In Spark SQL, some expressions may output values in the safe format, e.g. `CreateArray`, `CreateStruct`, `Cast`, etc. When we compare two values, we should be able to compare safe and unsafe formats.

The `GreaterThan`, `LessThan`, etc. in Spark SQL already handle this, but `EqualTo` doesn't. This PR fixes it.

## How was this patch tested?

new unit test and regression test

Author: Wenchen Fan <[email protected]>

Closes #15929 from cloud-fan/type-aware.

(cherry picked from commit 84284e8)
Signed-off-by: Herman van Hovell <[email protected]>
@asfgit asfgit closed this in 84284e8 Nov 23, 2016
robert3005 pushed a commit to palantir/spark that referenced this pull request Dec 2, 2016
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017