[SPARK-29545][SQL] Add support for bit_xor aggregate function #26205
Conversation
Test build #112452 has finished for PR 26205 at commit
srowen
left a comment
I dunno, @cloud-fan what do you think of adding this? seems niche but reasonable.
sql/core/src/test/resources/sql-tests/results/postgreSQL/aggregates_part2.sql.out
...t/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/bitwiseAggregates.scala
Also do you need to update docs or Pyspark with this function?
The newly added bitwise functions are only available in SQL. This is a general question: shall we add scala/python/R functions for all the builtin SQL functions? I feel we only need the common ones, as users can always do
Test build #112503 has finished for PR 26205 at commit
Sounds fine to not add to Pyspark if other similar functions are not. I also don't see particular docs for these bitwise functions, so this may be all that's needed for consistency now.
  """,
  since = "3.0.0")
case class BitXorAgg(child: Expression) extends DeclarativeAggregate with ExpectsInputTypes {
BitAndAgg, BitOrAgg, and BitXorAgg have similar logic, so can we share it by using a new trait (e.g., BitwiseOpLike)?
sounds good, will update
I think you can make it much simpler, like this:
abstract class BitAggregate extends DeclarativeAggregate with ExpectsInputTypes {
  ...
  override lazy val updateExpressions: Seq[Expression] =
    If(IsNull(bitAgg),
      child,
      If(IsNull(child), bitAgg, bitOp(bitAgg, child))) :: Nil

  override lazy val mergeExpressions: Seq[Expression] =
    If(IsNull(bitAgg.left),
      bitAgg.right,
      If(IsNull(bitAgg.right), bitAgg.left, bitOp(bitAgg.left, bitAgg.right))) :: Nil
}

case class BitAndAgg(child: Expression) extends BitAggregate {
  override def nodeName: String = "bit_and"

  override def bitOp(left: Expression, right: Expression): BinaryArithmetic =
    BitwiseAnd(left, right)
}
yea, much cooler, thanks
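The null handling in the suggested BitAggregate refactor can be tried outside Catalyst with a small plain-Scala model. This is only a sketch and not Spark code: `BitAggModel`, `update`, `merge`, `aggregate`, and the three model objects are hypothetical names, and `Option[Long]` stands in for a nullable aggregation buffer.

```scala
// Plain-Scala model of the null-aware update/merge logic; not Spark code.
// All names here (BitAggModel, update, merge, aggregate) are hypothetical.
trait BitAggModel {
  // The only piece each concrete aggregate has to supply.
  def bitOp(left: Long, right: Long): Long

  // Mirrors updateExpressions: a null (None) on either side yields the other side.
  def update(buffer: Option[Long], input: Option[Long]): Option[Long] =
    (buffer, input) match {
      case (None, in)         => in
      case (buf, None)        => buf
      case (Some(b), Some(i)) => Some(bitOp(b, i))
    }

  // Mirrors mergeExpressions: two partial buffers combine by the same rule.
  def merge(left: Option[Long], right: Option[Long]): Option[Long] =
    update(left, right)

  // Fold a whole column, starting from an empty (null) buffer.
  def aggregate(values: Seq[Option[Long]]): Option[Long] =
    values.foldLeft(Option.empty[Long])(update)
}

object BitAndModel extends BitAggModel { def bitOp(l: Long, r: Long): Long = l & r }
object BitOrModel extends BitAggModel { def bitOp(l: Long, r: Long): Long = l | r }
object BitXorModel extends BitAggModel { def bitOp(l: Long, r: Long): Long = l ^ r }
```

With the shared logic in one place, each concrete aggregate reduces to a one-line `bitOp`, which is the point of the suggestion.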
@srowen I guess docs will be generated via
Test build #112605 has finished for PR 26205 at commit
Test build #112611 has finished for PR 26205 at commit
The tests in
just for curiosity, if add same test in
I meant the tests for this pr should be put only in
ok i got it
Test build #112643 has finished for PR 26205 at commit
retest this please
Test build #112653 has finished for PR 26205 at commit
maropu
left a comment
LGTM @cloud-fan @srowen can you recheck?
srowen
left a comment
Still looks reasonable to me.
Thanks, @srowen. Merged to master.
What changes were proposed in this pull request?
bit_xor(expr) - Returns the bitwise XOR of all non-null input values, or null if none
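As a quick worked illustration of that description, the following plain-Scala snippet computes the same result over a sequence of nullable integers. It is a sketch of the documented semantics only, not the Spark implementation, and `bitXorOf` is a hypothetical helper name.

```scala
// Plain-Scala illustration of the documented semantics; not the Spark implementation.
// bitXorOf is a hypothetical helper name; None models a SQL NULL.
def bitXorOf(values: Seq[Option[Int]]): Option[Int] =
  values.flatten.reduceOption(_ ^ _)

bitXorOf(Seq(Some(3), Some(5)))                // Some(6): 011 ^ 101 = 110
bitXorOf(Seq(Some(7), None, Some(7), Some(4))) // Some(4): the two 7s cancel, the null is skipped
bitXorOf(Seq(None, None))                      // None: no non-null input
```

Unlike AND and OR, XOR cancels values that appear an even number of times, so duplicates in the input do affect the result.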
Why are the changes needed?
As we support bit_and and bit_or now, we'd better support the related aggregate function bit_xor ahead of PostgreSQL, because many other popular databases support it.
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.help.sqlanywhere.12.0.1/dbreference/bit-xor-function.html
https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_bit-or
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/Functions/Aggregate/BIT_XOR.htm?TocPath=SQL%20Reference%20Manual%7CSQL%20Functions%7CAggregate%20Functions%7C_____10
Does this PR introduce any user-facing change?
Yes, it adds a new bitwise aggregate function, bit_xor.
How was this patch tested?
Unit tests added.