@dilipbiswal
Contributor

What changes were proposed in this pull request?

Enhances the parser and analyzer to support the ANSI-compliant syntax for GROUPING SETS. As part of this change, we derive the grouping expressions from the user-supplied groupings in the GROUPING SETS clause.

SELECT c1, c2, max(c3) 
FROM t1
GROUP BY GROUPING SETS ((c1), (c1, c2))
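
As an illustration of what the query above computes, here is a minimal in-memory sketch, not Spark code. It assumes the standard reading of `GROUPING SETS ((c1), (c1, c2))` as the union of `GROUP BY c1` and `GROUP BY c1, c2`, with grouping columns missing from a set reported as NULL. The names `T1Row`, `OutRow`, and `groupingSets`, and the sample rows, are hypothetical.

```scala
// Hypothetical in-memory stand-in for table t1; this is not Spark code.
final case class T1Row(c1: String, c2: String, c3: Int)

object GroupingSetsSketch {
  // One output row: a grouping column absent from the current set becomes None (SQL NULL).
  final case class OutRow(c1: Option[String], c2: Option[String], maxC3: Int)

  // Evaluate max(c3) once per grouping set; each set is a list of column names.
  def groupingSets(rows: Seq[T1Row], sets: Seq[Seq[String]]): Seq[OutRow] =
    sets.flatMap { set =>
      // Keep a column's value only if the column belongs to the current grouping set.
      def keep(name: String, value: String): Option[String] =
        if (set.contains(name)) Some(value) else None
      rows
        .groupBy(r => (keep("c1", r.c1), keep("c2", r.c2)))
        .toSeq
        .map { case ((k1, k2), group) => OutRow(k1, k2, group.map(_.c3).max) }
    }

  def main(args: Array[String]): Unit = {
    // Made-up sample data, chosen only for illustration.
    val t1 = Seq(T1Row("x", "a", 10), T1Row("x", "b", 20), T1Row("y", "b", 5))
    // Row order within each grouping set is unspecified, as in SQL without ORDER BY.
    groupingSets(t1, Seq(Seq("c1"), Seq("c1", "c2"))).foreach(println)
  }
}
```

With the sample rows, the `(c1)` set contributes one row per distinct `c1`, and the `(c1, c2)` set contributes one row per distinct pair.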

How was this patch tested?

Added tests in SQLQueryTestSuite and ResolveGroupingAnalyticsSuite.

Please review http://spark.apache.org/contributing.html before opening a pull request.

@dilipbiswal dilipbiswal changed the title [SPARK 24424] Support ANSI-SQL compliant syntax for GROUPING SET [SPARK 24424][SQL] Support ANSI-SQL compliant syntax for GROUPING SET Jul 19, 2018
@SparkQA

SparkQA commented Jul 19, 2018

Test build #93260 has finished for PR 21813 at commit b5ada3f.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dilipbiswal
Contributor Author

retest this please

@SparkQA

SparkQA commented Jul 19, 2018

Test build #93266 has finished for PR 21813 at commit b5ada3f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dilipbiswal dilipbiswal changed the title [SPARK 24424][SQL] Support ANSI-SQL compliant syntax for GROUPING SET [SPARK-24424][SQL] Support ANSI-SQL compliant syntax for GROUPING SET Jul 19, 2018
aggregationExprs: Seq[NamedExpression],
child: LogicalPlan): LogicalPlan = {


Member

These blank lines can be removed?

fromClause
: FROM relation (',' relation)* lateralView* pivotClause?
;

Member

revert this?

child: LogicalPlan): LogicalPlan = {
val gid = AttributeReference(VirtualColumn.groupingIdName, IntegerType, false)()

val finalGroupByExpressions = if (groupByExprs == Nil) {

Member

Shouldn't we do this in the `case x: GroupingSets if x.expressions.forall(_.resolved) =>` branch? I think this `constructAggregate` method is also used by other clauses like Cube and Rollup.

Contributor Author

@viirya Yeah, so for Cube and Rollup we will always have groupByExprs set up, right? So I felt it's better to keep the code consolidated here in this function. What do you think?

Member

OK. Would you mind adding a comment on this, like "SPARK-24424: this only happens for ANSI-SQL compliant syntax for GROUPING SET"?

Contributor Author

@viirya Sure, will do.


val originalPlan2 = GroupingSets(Seq(Seq(), Seq(unresolved_a), Seq(unresolved_a, unresolved_b)),
Nil, r1,
Seq(unresolved_a, unresolved_b, UnresolvedAlias(count(unresolved_c))))

Member

hmm, I think originalPlan2 looks the same as originalPlan?

Contributor Author

@viirya Thanks, you're right. I will remove it.

@SparkQA

SparkQA commented Jul 19, 2018

Test build #93291 has finished for PR 21813 at commit ac8f04f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 20, 2018

Test build #93299 has finished for PR 21813 at commit 7cf187d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

HAVING GROUPING__ID > 1;

-- Group sets without explicit group by
SELECT grouping(c1) FROM (VALUES ('x', 'a', 10), ('y', 'b', 20)) AS t (c1, c2, c3) GROUP BY c1,c2 GROUPING SETS (c1,c2);

Member

Doesn't this have explicit group by?

Contributor Author

@viirya Sorry, yes. I will remove the explicit GROUP BY columns. I had them for testing but forgot to take them out.

FROM (VALUES (1, 2), (3, 2)) t(c1, c2)
GROUP BY GROUPING SETS ( ( c1 ), ( c1, c2 ) )
HAVING col2 IS NOT NULL
ORDER BY -col1;

Member

I've manually verified that the results are correct.

Contributor Author

@viirya Sorry, Simon, do I have to do something for this comment?

Member

Not at all. @dilipbiswal :-)

// can be null. In such case, we derive the groupByExprs from the user supplied values for
// grouping sets.
val finalGroupByExpressions = if (groupByExprs == Nil) {
selectedGroupByExprs.flatten.foldLeft(Seq.empty[Expression]) { (result, currentExpr) =>

Member

@viirya viirya Jul 20, 2018

What if GROUP BY GROUPING SETS (())? Is it a valid query?

Contributor Author

@dilipbiswal dilipbiswal Jul 20, 2018

@viirya No. We should get an error, as we don't have a GROUP BY specification. I tried this scenario against Db2 to double-check.

Member

Can we have a test case for it too?

Contributor Author

@viirya Yeah, I was already adding it. I knew you would ask :-)

Member

@dilipbiswal Thanks! :-)
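
For readers following this thread, the derivation under discussion can be sketched roughly as below. This is a simplified illustration, not the code from the patch: plain strings stand in for Catalyst expressions, duplicates are detected with string equality (an assumption of this sketch), and the names `DeriveGroupBySketch` and `deriveGroupByExprs` are hypothetical. It shows only the order-preserving de-duplication over the flattened grouping sets, and that `GROUPING SETS (())` leaves nothing to derive.

```scala
object DeriveGroupBySketch {
  // Plain strings stand in for Catalyst expressions; duplicates are detected with
  // string equality, which is an assumption made only for this sketch.
  def deriveGroupByExprs(groupingSets: Seq[Seq[String]]): Seq[String] =
    groupingSets.flatten.foldLeft(Seq.empty[String]) { (result, current) =>
      // Append only the first occurrence so the user's column order is preserved.
      if (result.contains(current)) result else result :+ current
    }

  def main(args: Array[String]): Unit = {
    // GROUPING SETS ((c1), (c1, c2))
    println(deriveGroupByExprs(Seq(Seq("c1"), Seq("c1", "c2")))) // List(c1, c2)
    // GROUPING SETS (()): the derived list is empty
    println(deriveGroupByExprs(Seq(Seq())))                      // List()
  }
}
```

How an empty derived list should be handled (for example, by raising an analysis error) is a separate decision that this sketch does not model.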

@SparkQA

SparkQA commented Jul 20, 2018

Test build #93304 has finished for PR 21813 at commit e0c57f7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 20, 2018

Test build #93313 has finished for PR 21813 at commit 2ecf3e1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@gatorsmile gatorsmile left a comment

LGTM

@gatorsmile
Member

Thanks! Merged to master.

@asfgit asfgit closed this in 2b91d99 Jul 20, 2018
@dilipbiswal
Contributor Author

Thank you very much @gatorsmile @viirya
