@hvanhovell
Contributor

This PR is a 2nd follow-up for SPARK-9241. It contains the following improvements:

  • Fix for a potential bug in distinct child expression and attribute alignment.
  • Improved handling of duplicate distinct child expressions.
  • Added test for distinct UDAF with multiple children.

cc @yhuai

@SparkQA

SparkQA commented Nov 9, 2015

Test build #45361 has finished for PR 9566 at commit ea4ea69.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Contributor

yhuai commented Nov 10, 2015

#9556 is in. Let's rebase this one. Also, can you explain more about the potential bug? That will make it easier for a reader to get the context.

@hvanhovell
Contributor Author

I'll rebase.

The potential bug is that the attributes for the distinct columns and the expressions for the distinct columns can become misaligned. The current code relies on a (hash) map iterating in the same order as the sequence it was derived from. That happens to work today, but the Scala collections library does not guarantee it. I'd rather fix this than wait for something to go wrong.
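To make the alignment concern concrete, here is a minimal sketch. It is not the code from this PR: `Expr`, `Attr`, `pair`, `fragile` and `aligned` are hypothetical stand-ins, and the attribute naming is an assumption for illustration only. It contrasts reading one side from a map's iteration order with deriving both sides from a single ordered sequence of pairs.

```scala
object DistinctAlignment {
  // Hypothetical stand-ins for distinct child expressions and their attributes.
  final case class Expr(name: String)
  final case class Attr(name: String)

  // Drop duplicate children first, then pair each one with its own attribute.
  def pair(children: Seq[Expr]): Seq[(Expr, Attr)] =
    children.distinct.map(e => e -> Attr(s"attr_${e.name}"))

  // Fragile: expressions come from the sequence, attributes from the map's
  // iteration order. Nothing guarantees that the two orders agree.
  def fragile(children: Seq[Expr]): (Seq[Expr], Seq[Attr]) = {
    val lookup = pair(children).toMap
    (children.distinct, lookup.values.toSeq)
  }

  // Aligned: both sides are derived from the same ordered sequence of pairs,
  // so position i of one always corresponds to position i of the other.
  def aligned(children: Seq[Expr]): (Seq[Expr], Seq[Attr]) =
    pair(children).unzip
}
```

With `aligned`, a map built from the pairs can still be used for lookups by expression, but it is never iterated to recover ordering, so the result does not depend on how the map is implemented.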

…t & improve handling of duplicate distinct child expressions. Added test for distinct UDAF with multiple children.
@hvanhovell hvanhovell force-pushed the SPARK-9241-followup-2 branch from ea4ea69 to e61849b Compare November 10, 2015 22:06
Contributor Author

By calling toMap I am potentially breaking the alignment between distinctAggChildren and distinctAggChildAttrs.

@yhuai
Contributor

yhuai commented Nov 10, 2015

LGTM

@SparkQA

SparkQA commented Nov 11, 2015

Test build #45558 has finished for PR 9566 at commit e61849b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Contributor

yhuai commented Nov 11, 2015

Thanks! Merging to master and branch 1.6.

asfgit pushed a commit that referenced this pull request Nov 11, 2015
This PR is a 2nd follow-up for [SPARK-9241](https://issues.apache.org/jira/browse/SPARK-9241). It contains the following improvements:
* Fix for a potential bug in distinct child expression and attribute alignment.
* Improved handling of duplicate distinct child expressions.
* Added test for distinct UDAF with multiple children.

cc yhuai

Author: Herman van Hovell <[email protected]>

Closes #9566 from hvanhovell/SPARK-9241-followup-2.

(cherry picked from commit 21c562f)
Signed-off-by: Yin Huai <[email protected]>
@asfgit asfgit closed this in 21c562f Nov 11, 2015