[SPARK-25368][SQL] Incorrect predicate pushdown returns wrong result #22368

wangyum · 2018-09-09T04:27:14Z

What changes were proposed in this pull request?

How to reproduce:

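import org.apache.spark.sql.functions.{lit, spark_partition_id}
import spark.implicits._  // enables the $"c" column syntax (already in scope in spark-shell)

// One row, plus a column c that is null in every row.
val df1 = spark.createDataFrame(Seq(
   (1, 1)
)).toDF("a", "b").withColumn("c", lit(null).cast("int"))
// c is always null, so this filter should return no rows.
val df2 = df1.union(df1).withColumn("d", spark_partition_id()).filter($"c".isNotNull)
df2.show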

+---+---+----+---+
|  a|  b|   c|  d|
+---+---+----+---+
|  1|  1|null|  0|
|  1|  1|null|  1|
+---+---+----+---+

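Column c is null in every row, so the filter should return no rows, yet both rows are returned. filter($"c".isNotNull) was transformed to (null <=> c#10) before #19201, but it has been transformed to (c#10 = null) since #20155. This PR reverts it to (null <=> c#10) to fix the issue.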
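
For context, the two predicate forms are not interchangeable when c is null. The following is a minimal sketch in plain Scala (no Spark dependency) that models SQL's three-valued logic with Option. The names NullSafeEquality, sqlEq, nullSafeEq, and keeps are hypothetical illustrations, not Spark APIs, and the sketch does not reproduce the optimizer rule itself.

```scala
object NullSafeEquality {
  // SQL `=`: the result is NULL (modeled as None) when either operand is NULL.
  def sqlEq(a: Option[Int], b: Option[Int]): Option[Boolean] =
    for (x <- a; y <- b) yield x == y

  // SQL `<=>` (null-safe equality): never NULL; two NULLs compare as equal.
  def nullSafeEq(a: Option[Int], b: Option[Int]): Boolean = a == b

  // A filter keeps a row only when its predicate evaluates to exactly TRUE.
  def keeps(predicate: Option[Boolean]): Boolean = predicate.contains(true)

  def main(args: Array[String]): Unit = {
    val c: Option[Int] = None // column c is NULL in every row of the example

    // (c = null) evaluates to NULL, so a filter on it drops the row.
    println(keeps(sqlEq(c, None)))

    // (null <=> c) evaluates to TRUE, so a filter on it keeps the row.
    println(keeps(Some(nullSafeEq(None, c))))
  }
}
```

Under this model, (c = null) can never be true for any row, whereas (null <=> c) is true exactly for the rows where c is null, which is why swapping one for the other changes what a plan is allowed to conclude.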

How was this patch tested?

unit tests

SparkQA · 2018-09-09T07:05:01Z

Test build #95841 has finished for PR 22368 at commit 865e0af.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

wangyum · 2018-09-09T07:28:01Z

retest this please

SparkQA · 2018-09-09T11:24:22Z

Test build #95844 has finished for PR 22368 at commit 865e0af.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile

LGTM

The PR title is not accurate: this is related to constraint inference rather than predicate pushdown.

Thanks! Merged to master/2.4/2.3

## What changes were proposed in this pull request?

How to reproduce:

```scala
val df1 = spark.createDataFrame(Seq(
   (1, 1)
)).toDF("a", "b").withColumn("c", lit(null).cast("int"))
val df2 = df1.union(df1).withColumn("d", spark_partition_id).filter($"c".isNotNull)
df2.show

+---+---+----+---+
|  a|  b|   c|  d|
+---+---+----+---+
|  1|  1|null|  0|
|  1|  1|null|  1|
+---+---+----+---+
```

`filter($"c".isNotNull)` was transformed to `(null <=> c#10)` before #19201, but it is transformed to `(c#10 = null)` since #20155. This pr revert it to `(null <=> c#10)` to fix this issue.

## How was this patch tested?

unit tests

Closes #22368 from wangyum/SPARK-25368.

Authored-by: Yuming Wang <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
(cherry picked from commit 77c9964)
Signed-off-by: gatorsmile <[email protected]>


wangyum added 2 commits September 9, 2018 11:46

Fix SPARK-25368

86b9b78

Fix InferFiltersFromConstraintsSuite test error

865e0af

gatorsmile reviewed Sep 9, 2018


asfgit closed this in 77c9964 Sep 9, 2018
