Conversation

@cloud-fan
Contributor

Benchmarked on 4 different schemas; the results:

Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
Hash For simple:                   Avg Time(ms)    Avg Rate(M/s)  Relative Rate
-------------------------------------------------------------------------------
interpreted version                       31.47           266.54         1.00 X
codegen version                           64.52           130.01         0.49 X
Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
Hash For normal:                   Avg Time(ms)    Avg Rate(M/s)  Relative Rate
-------------------------------------------------------------------------------
interpreted version                     4068.11             0.26         1.00 X
codegen version                         1175.92             0.89         3.46 X
Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
Hash For array:                    Avg Time(ms)    Avg Rate(M/s)  Relative Rate
-------------------------------------------------------------------------------
interpreted version                     9276.70             0.06         1.00 X
codegen version                        14762.23             0.04         0.63 X
Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
Hash For map:                      Avg Time(ms)    Avg Rate(M/s)  Relative Rate
-------------------------------------------------------------------------------
interpreted version                    58869.79             0.01         1.00 X
codegen version                         9285.36             0.06         6.34 X
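For reading the tables: the Relative Rate column appears to be the interpreted average time divided by the codegen average time, so a value below 1.00 means the codegen version is slower. A minimal sketch that recomputes the column from the average times above (the `results` dict and the `relative_rate` helper are names made up for this illustration, not part of the benchmark code):

```python
# Average times in ms, copied from the benchmark output above:
# schema -> (interpreted version, codegen version)
results = {
    "simple": (31.47, 64.52),
    "normal": (4068.11, 1175.92),
    "array": (9276.70, 14762.23),
    "map": (58869.79, 9285.36),
}


def relative_rate(interpreted_ms, codegen_ms):
    """Speed of codegen relative to interpreted; > 1 means codegen is faster."""
    return interpreted_ms / codegen_ms


for schema, (interpreted_ms, codegen_ms) in results.items():
    print(f"{schema:<8}{relative_rate(interpreted_ms, codegen_ms):.2f} X")
```

The recomputed values match the reported 0.49 X, 3.46 X, 0.63 X and 6.34 X, so codegen is slower on the simple and array schemas and faster on the other two.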

@SparkQA

SparkQA commented Jan 18, 2016

Test build #49621 has finished for PR 10816 at commit d753ce8.

  • This patch fails RAT tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ALS(@Since("1.4.0") override val uid: String) extends Estimator[ALSModel] with ALSParams

@cloud-fan
Contributor Author

The hash expression is slow on the simple schema because UnsafeProjection performs some unnecessary operations every round, which hurts performance when each round is quite small (as with the simple schema). This can be fixed by #10809. The array type is still under investigation.
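To illustrate the argument about per-round overhead, here is a toy cost model, not a measurement: if the codegen path pays a fixed cost every round, that cost dominates when the round itself is cheap and is amortized when the round is expensive. The function name and all numbers below are hypothetical, chosen only to show the shape of the effect.

```python
def relative_rate(interpreted_cost, codegen_cost, per_round_overhead):
    """Relative rate of codegen vs. interpreted for one round in a toy model.

    All three arguments are hypothetical cost units, not measured values.
    The codegen path is assumed to pay per_round_overhead on every round.
    """
    return interpreted_cost / (codegen_cost + per_round_overhead)


# Hypothetical cheap round (think: simple schema). The fixed overhead dominates.
small_round = relative_rate(interpreted_cost=2, codegen_cost=1, per_round_overhead=3)

# Hypothetical expensive round (think: wide schema). The overhead is amortized.
large_round = relative_rate(interpreted_cost=200, codegen_cost=50, per_round_overhead=3)

print(small_round)  # below 1: codegen loses despite doing less hashing work
print(large_round)  # above 1: codegen wins once the round is large enough
```

Under this model, removing the fixed per-round cost would help small rounds the most, which is consistent with expecting the simple-schema result to change after #10809.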

cc @nongli

@cloud-fan
Contributor Author

retest this please

@nongli
Contributor

nongli commented Jan 18, 2016

I'm confused by the conclusion. We don't use UnsafeProjection anymore, right?

@cloud-fan
Contributor Author

We need UnsafeProjection to execute the codegen version of hash expression.

@SparkQA

SparkQA commented Jan 19, 2016

Test build #49629 has finished for PR 10816 at commit 1771733.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

Can you include the results as comments for each test() case?

@nongli
Contributor

nongli commented Jan 19, 2016

LGTM

@cloud-fan
Contributor Author

We should run this benchmark again after #10809 is merged.

@rxin
Contributor

rxin commented Jan 20, 2016

I'm going to merge this. Please submit an update once #10809 is merged and put the results in the test cases as comments.

@asfgit asfgit closed this in f3934a8 Jan 20, 2016