[SPARK-12888][SQL] benchmark the new hash expression #10816

cloud-fan · 2016-01-18T21:56:41Z

Benchmarked the new hash expression on 4 different schemas. The results:

Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
Hash For simple:                   Avg Time(ms)    Avg Rate(M/s)  Relative Rate
-------------------------------------------------------------------------------
interpreted version                       31.47           266.54         1.00 X
codegen version                           64.52           130.01         0.49 X

Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
Hash For normal:                   Avg Time(ms)    Avg Rate(M/s)  Relative Rate
-------------------------------------------------------------------------------
interpreted version                     4068.11             0.26         1.00 X
codegen version                         1175.92             0.89         3.46 X

Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
Hash For array:                    Avg Time(ms)    Avg Rate(M/s)  Relative Rate
-------------------------------------------------------------------------------
interpreted version                     9276.70             0.06         1.00 X
codegen version                        14762.23             0.04         0.63 X

Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
Hash For map:                      Avg Time(ms)    Avg Rate(M/s)  Relative Rate
-------------------------------------------------------------------------------
interpreted version                    58869.79             0.01         1.00 X
codegen version                         9285.36             0.06         6.34 X
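
As a quick sanity check on the tables above, the Relative Rate column is consistent with the interpreted average time divided by each row's average time. The sketch below recomputes it from the reported numbers. The name `relative_rate` is illustrative and is not taken from the benchmark code.

```python
# Average times in ms, copied from the tables above: (interpreted, codegen).
AVG_TIME_MS = {
    "simple": (31.47, 64.52),
    "normal": (4068.11, 1175.92),
    "array": (9276.70, 14762.23),
    "map": (58869.79, 9285.36),
}


def relative_rate(baseline_ms: float, candidate_ms: float) -> float:
    """Speedup of the candidate over the baseline; below 1.0 means a slowdown."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    for schema, (interpreted, codegen) in AVG_TIME_MS.items():
        rate = relative_rate(interpreted, codegen)
        print(f"{schema:8s} codegen relative rate: {rate:.2f} X")
```

Read this way, codegen wins on the normal and map schemas and loses on the simple and array schemas.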

SparkQA · 2016-01-18T22:03:24Z

Test build #49621 has finished for PR 10816 at commit d753ce8.

This patch fails RAT tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- class ALS(@Since("1.4.0") override val uid: String) extends Estimator[ALSModel] with ALSParams

cloud-fan · 2016-01-18T22:09:40Z

The hash expression is slow on the simple schema because UnsafeProjection performs some unnecessary operations every round, which hurts performance when each round is very small (as with the simple schema). This can be fixed by #10809. The array type is still under investigation.

cc @nongli
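
The argument about small rounds can be pictured with a toy cost model: if every round pays a fixed overhead on top of the actual hashing work, that overhead dominates when the work per round is tiny. The sketch below is only an illustration. The nanosecond figures are hypothetical and were not measured, and `overhead_share` is not a Spark function.

```python
def overhead_share(fixed_overhead_ns: float, work_ns: float) -> float:
    """Fraction of one round's time spent on fixed overhead rather than hashing."""
    return fixed_overhead_ns / (fixed_overhead_ns + work_ns)


if __name__ == "__main__":
    # Hypothetical costs, chosen only to show the trend.
    fixed_overhead_ns = 4.0
    for label, work_ns in [("tiny round", 4.0), ("large round", 1000.0)]:
        share = overhead_share(fixed_overhead_ns, work_ns)
        print(f"{label}: {share:.1%} of the time is overhead")
```

Under these assumed numbers, half of a tiny round is overhead, while a large round barely notices it.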

cloud-fan · 2016-01-18T22:29:31Z

retest this please

nongli · 2016-01-18T23:59:22Z

I'm confused by the conclusion. We don't use UnsafeProjection anymore, right?

cloud-fan · 2016-01-19T00:09:48Z

We need UnsafeProjection to execute the codegen version of hash expression.

SparkQA · 2016-01-19T00:27:27Z

Test build #49629 has finished for PR 10816 at commit 1771733.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

nongli · 2016-01-19T19:10:48Z

sql/catalyst/src/test/scala/org/apache/spark/sql/HashBenchmark.scala

Can you include the results as comments for each test() case?

nongli · 2016-01-19T19:10:59Z

LGTM

cloud-fan · 2016-01-19T19:15:59Z

We should run this benchmark again after #10809 is merged.

rxin · 2016-01-20T23:07:48Z

I'm going to merge this. Please submit an update once #10809 is merged and put the result in the test case as comments.

cloud-fan added 3 commits January 15, 2016 21:21

benchmark for hash expression

a8d9433

Merge remote-tracking branch 'origin/master' into hash-benchmark

28763a1

Merge remote-tracking branch 'origin/master' into hash-benchmark

d753ce8

add license

1771733

asfgit closed this in f3934a8 Jan 20, 2016
