Skip to content

Conversation

@ptkool
Copy link
Contributor

@ptkool ptkool commented Mar 7, 2017

What changes were proposed in this pull request?

This pull request implements the EVERY and ANY aggregates.

How was this patch tested?

Testing was performed using unit tests, integration tests, and manual tests.

@hvanhovell
Copy link
Contributor

hvanhovell commented Mar 7, 2017

@ptkool code wise this looks pretty good. I have one high level question. We could also use the exisiting max aggregate for any and min aggregate for all; what is the benefit of adding these operators?

@hvanhovell
Copy link
Contributor

ok to test

@ptkool
Copy link
Contributor Author

ptkool commented Mar 7, 2017

@hvanhovell So you're saying max(x) == min(x) for every and max(x) == true for any?

@SparkQA
Copy link

SparkQA commented Mar 7, 2017

Test build #74136 has finished for PR 17194 at commit 65fd666.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class AnyAgg(child: Expression) extends DeclarativeAggregate with ImplicitCastInputTypes
  • case class Every(child: Expression) extends DeclarativeAggregate with ImplicitCastInputTypes

@hvanhovell
Copy link
Contributor

I am sorry I have flipped the pairs (I have updated the comment). We can use the ordering of booleans here, false is smaller than true. Let's say that we have column x which is a boolean. if max(x) returns true we know that the column contains at least one true value. If min(x) returns true we know that all column only contains true values.

@ptkool
Copy link
Contributor Author

ptkool commented Mar 7, 2017

I think every and any are more intuitive, particularly if the operand is a boolean expression. For example, every(t1.c1 > t2.c2) vs max(t1.c1 > t2.c2). Also, every and any return false for empty sets, while min and max return null.

@ptkool
Copy link
Contributor Author

ptkool commented Mar 7, 2017

every and any are also part of the SQL standard, so most SQL users will be familiar with them.

@SparkQA
Copy link

SparkQA commented Mar 8, 2017

Test build #74141 has finished for PR 17194 at commit babd280.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zjffdu
Copy link
Contributor

zjffdu commented Mar 9, 2017

@ptkool Please update the title to include the JIRA Id so that it can be linked to jira automatically.

@ptkool ptkool changed the title Add new aggregates EVERY and ANY (SOME). [SPARK-19851] Add new aggregates EVERY and ANY (SOME). Mar 9, 2017
@ptkool ptkool force-pushed the master branch 2 times, most recently from 021cc85 to 5764f21 Compare March 13, 2017 17:56
@SparkQA
Copy link

SparkQA commented Mar 13, 2017

Test build #74447 has finished for PR 17194 at commit 5949456.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 13, 2017

Test build #74450 has finished for PR 17194 at commit 40b1267.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 13, 2017

Test build #74457 has finished for PR 17194 at commit 5764f21.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 13, 2017

Test build #74456 has finished for PR 17194 at commit 021cc85.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 14, 2017

Test build #74478 has finished for PR 17194 at commit 1f49c7f.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 14, 2017

Test build #74513 has finished for PR 17194 at commit 5e894f2.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 14, 2017

Test build #74522 has finished for PR 17194 at commit aba8d40.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 15, 2017

Test build #74599 has finished for PR 17194 at commit 0642113.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 15, 2017

Test build #74600 has finished for PR 17194 at commit f77b50e.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 15, 2017

Test build #74601 has finished for PR 17194 at commit bdc05d6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants