Skip to content

Conversation

@davies
Copy link
Contributor

@davies davies commented Dec 1, 2015

Use try to match the behavior for single distinct aggregation with Spark 1.5, but that's not scalable, we should be robust by default, have a flag to address performance regression for low cardinality aggregation.

cc @yhuai @nongli

@SparkQA
Copy link

SparkQA commented Dec 1, 2015

Test build #46994 has finished for PR 10075 at commit 192a04d.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually prefer the old name better. Two years from now, I'm not sure how much context we will have about "1.5".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the discussion with @nongli @yhuai , we agreed that the only reason we have this flag here is to address the concern about performance regression for single distinct on low cardinality aggregation since 1.5, also this is the way Oracle usually did.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the big difference is that Oracle releases every x years, while we release every 3 month...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we are doing the same thing as maintaining compatibility and no (much) regressions, otherwise we should just remove this flag (it's not public right now).

@davies davies changed the title [SPARK-10277] [SQL] change the default plan for single distinct [SPARK-12077] [SQL] change the default plan for single distinct Dec 2, 2015
@yhuai
Copy link
Contributor

yhuai commented Dec 2, 2015

LGTM pending jenkins.

@SparkQA
Copy link

SparkQA commented Dec 2, 2015

Test build #2137 has finished for PR 10075 at commit 192a04d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 2, 2015

Test build #47023 has finished for PR 10075 at commit 0a237b9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 2, 2015

Test build #2142 has finished for PR 10075 at commit 893b327.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Copy link
Contributor

yhuai commented Dec 2, 2015

LGTM. Merging to master and branch 1.6.

asfgit pushed a commit that referenced this pull request Dec 2, 2015
Use try to match the behavior for single distinct aggregation with Spark 1.5, but that's not scalable, we should be robust by default, have a flag to address performance regression for low cardinality aggregation.

cc yhuai nongli

Author: Davies Liu <[email protected]>

Closes #10075 from davies/agg_15.

(cherry picked from commit 96691fe)
Signed-off-by: Yin Huai <[email protected]>
@asfgit asfgit closed this in 96691fe Dec 2, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants