Skip to content

Conversation

@gengliangwang
Copy link
Member

What changes were proposed in this pull request?

Previously, PR #19201 fix the problem of non-converging constraints.
After that PR #19149 improve the loop and constraints is inferred only once.
So the problem of non-converging constraints is gone.

However, the case below will fail.


spark.range(5).write.saveAsTable("t")
val t = spark.read.table("t")
val left = t.withColumn("xid", $"id" + lit(1)).as("x")
val right = t.withColumnRenamed("id", "xid").as("y")
val df = left.join(right, "xid").filter("id = 3").toDF()
checkAnswer(df, Row(4, 3))

Because aliasMap replace all the aliased child. See the test case in PR for details.

This PR is to fix this bug by removing useless code for preventing non-converging constraints.
It can be also fixed with #20270, but this is much simpler and clean up the code.

How was this patch tested?

Unit test

@SparkQA
Copy link

SparkQA commented Jan 16, 2018

Test build #86181 has started for PR 20278 at commit c02d9b4.

@gengliangwang
Copy link
Member Author

retest this please

@SparkQA
Copy link

SparkQA commented Jan 17, 2018

Test build #86216 has finished for PR 20278 at commit c02d9b4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

comparePlans(optimized, correctAnswer)
}

test("inner join with alias: don't generate constraints for recursive functions") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These test cases are invalid after the code changes, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

PushDownPredicate,
InferFiltersFromConstraints,
CombineFilters,
SimplifyBinaryComparison,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this rule, the test cases will fail?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just different from the expected result because of this line:
https://github.com/gengliangwang/spark/blob/c02d9b4bdccafcdf4008dcdd4c2ad9509c9acd96/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala#L258
These predicates can be simplified by SimplifyBinaryComparison. So no need to modified the expected test results .

@cloud-fan
Copy link
Contributor

LGTM, merging to master/2.3!

asfgit pushed a commit that referenced this pull request Jan 17, 2018
## What changes were proposed in this pull request?

Previously, PR #19201 fix the problem of non-converging constraints.
After that PR #19149 improve the loop and constraints is inferred only once.
So the problem of non-converging constraints is gone.

However, the case below will fail.

```

spark.range(5).write.saveAsTable("t")
val t = spark.read.table("t")
val left = t.withColumn("xid", $"id" + lit(1)).as("x")
val right = t.withColumnRenamed("id", "xid").as("y")
val df = left.join(right, "xid").filter("id = 3").toDF()
checkAnswer(df, Row(4, 3))

```

Because `aliasMap` replace all the aliased child. See the test case in PR for details.

This PR is to fix this bug by removing useless code for preventing non-converging constraints.
It can be also fixed with #20270, but this is much simpler and clean up the code.

## How was this patch tested?

Unit test

Author: Wang Gengliang <[email protected]>

Closes #20278 from gengliangwang/FixConstraintSimple.

(cherry picked from commit 8598a98)
Signed-off-by: Wenchen Fan <[email protected]>
@asfgit asfgit closed this in 8598a98 Jan 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants