Closed
@@ -154,6 +154,11 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
case TrueLiteral Or _ => TrueLiteral
case _ Or TrueLiteral => TrueLiteral

case a And b if Not(a).semanticEquals(b) => FalseLiteral
case a Or b if Not(a).semanticEquals(b) => TrueLiteral
case a And b if a.semanticEquals(Not(b)) => FalseLiteral
Contributor

Logically this duplicates the case on line 156, but unfortunately Not is not smart enough to realise that. If you override semanticEquals in Not, you should be able to drop this line. The advantage is that the expression itself becomes smart enough to figure this out, instead of relying on outside code to handle it (possibly in several places).

The same applies to line 159.

Contributor

I meant something like this for Not:

  override def semanticEquals(other: Expression): Boolean = other match {
    case Not(otherChild) => child.semanticEquals(otherChild)
    case _ => child match {
      case Not(innerChild) =>
        // eliminate double negation
        innerChild.semanticEquals(other)
      case _ =>
        super.semanticEquals(other)
    }
  }
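
To make the effect of the suggested override concrete, here is a minimal, self-contained sketch. It uses a toy Expr hierarchy, not Spark's Expression classes: Attr, the default structural semanticEquals, and the wrapper object SemanticEqualsSketch are all assumptions made for illustration. Under those assumptions it shows the guard from line 156, Not(a).semanticEquals(b), matching both operand orders.

```scala
object SemanticEqualsSketch {
  // Toy stand-in for Catalyst expressions (hypothetical, not Spark's classes).
  sealed trait Expr {
    // Assumed default: plain structural equality.
    def semanticEquals(other: Expr): Boolean = this == other
  }

  final case class Attr(name: String) extends Expr

  final case class Not(child: Expr) extends Expr {
    // Same shape as the override proposed in the comment above.
    override def semanticEquals(other: Expr): Boolean = other match {
      case Not(otherChild) => child.semanticEquals(otherChild)
      case _ => child match {
        case Not(innerChild) =>
          // Treat Not(Not(x)) as equal to x.
          innerChild.semanticEquals(other)
        case _ =>
          super.semanticEquals(other)
      }
    }
  }

  // The guard of the first complementation rule: `a And b if Not(a).semanticEquals(b)`.
  def isComplement(a: Expr, b: Expr): Boolean = Not(a).semanticEquals(b)

  def main(args: Array[String]): Unit = {
    val x = Attr("x")
    println(isComplement(x, Not(x))) // x AND NOT x
    println(isComplement(Not(x), x)) // NOT x AND x, caught via double negation
    println(isComplement(x, x))      // x AND x is not a complement pair
  }
}
```

In this toy model the override is one-directional: Not(Not(x)).semanticEquals(x) holds, but x.semanticEquals(Not(Not(x))) falls back to the structural default and does not. Whether that asymmetry matters for the real Expression class is not settled by this sketch.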

Contributor

Do we need this? Not(Not(a)) will be simplified to a.

case a Or b if a.semanticEquals(Not(b)) => TrueLiteral

case a And b if a.semanticEquals(b) => a
case a Or b if a.semanticEquals(b) => a

@@ -26,6 +26,7 @@ import org.apache.spark.sql.catalyst.plans.PlanTest
import org.apache.spark.sql.catalyst.plans.logical._
import org.apache.spark.sql.catalyst.rules._
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.Row

class BooleanSimplificationSuite extends PlanTest with PredicateHelper {

@@ -42,6 +43,16 @@ class BooleanSimplificationSuite extends PlanTest with PredicateHelper {

val testRelation = LocalRelation('a.int, 'b.int, 'c.int, 'd.string)
Member

How about adding another boolean column?

  val testRelation = LocalRelation('a.int, 'b.int, 'c.int, 'd.string, 'e.boolean)

Filter conditions must be boolean predicates; non-boolean expressions are not accepted.


val testRelationWithData = LocalRelation.fromExternalRows(
testRelation.output, Seq(Row(1, 2, 3, "abc"))
)

private def checkCondition(input: Expression, expected: LogicalPlan): Unit = {
val plan = testRelationWithData.where(input).analyze
val actual = Optimize.execute(plan)
comparePlans(actual, expected)
}

private def checkCondition(input: Expression, expected: Expression): Unit = {
val plan = testRelation.where(input).analyze
val actual = Optimize.execute(plan)
@@ -160,4 +171,12 @@ class BooleanSimplificationSuite extends PlanTest with PredicateHelper {
testRelation.where('a > 2 || ('b > 3 && 'b < 5)))
comparePlans(actual, expected)
}

test("Complementation Laws") {
Contributor

How about double negation, i.e. 'a && !(!'a)?

Contributor Author

Is this really required for this PR?

checkCondition('a && !'a, testRelation)
checkCondition(!'a && 'a, testRelation)

checkCondition('a || !'a, testRelationWithData)
checkCondition(!'a || 'a, testRelationWithData)
}
}
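
For readers who want to try the complementation laws outside Spark, the following is a rough, self-contained sketch on a toy expression tree. Everything in it (the ComplementLawsSketch object, Attr, TrueLit, FalseLit, and the simplify/rewrite helpers) is hypothetical and only mirrors the shape of the rules in the diff. It assumes two-valued logic and plain structural equality in place of semanticEquals; SQL NULL semantics are out of scope here.

```scala
object ComplementLawsSketch {
  // Toy boolean expression tree (hypothetical, not Catalyst's classes).
  sealed trait Expr
  case object TrueLit extends Expr
  case object FalseLit extends Expr
  final case class Attr(name: String) extends Expr
  final case class Not(child: Expr) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr
  final case class Or(left: Expr, right: Expr) extends Expr

  // Single bottom-up pass: simplify the children, then rewrite the node itself.
  def simplify(e: Expr): Expr = e match {
    case And(l, r) => rewrite(And(simplify(l), simplify(r)))
    case Or(l, r)  => rewrite(Or(simplify(l), simplify(r)))
    case Not(c)    => rewrite(Not(simplify(c)))
    case leaf      => leaf
  }

  private def rewrite(e: Expr): Expr = e match {
    // Double negation: !(!a) => a
    case Not(Not(c)) => c
    // Complementation: a && !a => false, checked in both operand orders
    case And(a, b) if Not(a) == b || a == Not(b) => FalseLit
    // Complementation: a || !a => true, checked in both operand orders
    case Or(a, b) if Not(a) == b || a == Not(b) => TrueLit
    case other => other
  }

  def main(args: Array[String]): Unit = {
    val a = Attr("a")
    println(simplify(And(a, Not(a))))
    println(simplify(Or(Not(a), a)))
    // Double negation is removed in the child before the And node is inspected.
    println(simplify(And(Not(Not(a)), Not(a))))
  }
}
```

Because children are simplified before their parent is rewritten, a double negation that already sits in the tree is gone by the time the complementation guards run. A Not(...) constructed only inside a guard never passes through that step, which is why the sketch checks both operand orders explicitly.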