[SPARK-5277][SQL] - SparkSqlSerializer doesn't always register user specified KryoRegistrators #5237

mhseiden · 2015-03-28T07:05:16Z

In a few places, new SparkSqlSerializer instances were created with new, empty SparkConfs, so user-specified registrators were sometimes not initialized.

The fix is to try to pull the conf from the SparkEnv, and to construct a new conf (one that loads defaults) if none can be found.
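
To illustrate the fallback described above, here is a minimal, self-contained sketch. It is not the code from this patch: `Conf`, `Env`, and `SerializerConf` are hypothetical stand-ins for SparkConf, SparkEnv, and the call sites that build a serializer, and the "load defaults" behaviour is modelled only as copying `spark.*` JVM system properties.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for SparkConf: models only "load defaults or not".
final class Conf {
    private final Map<String, String> settings = new HashMap<>();

    // With loadDefaults = true, copy every "spark.*" JVM system property.
    // With loadDefaults = false, start empty (the situation behind the bug).
    Conf(boolean loadDefaults) {
        if (loadDefaults) {
            for (String name : System.getProperties().stringPropertyNames()) {
                if (name.startsWith("spark.")) {
                    settings.put(name, System.getProperty(name));
                }
            }
        }
    }

    Conf set(String key, String value) {
        settings.put(key, value);
        return this;
    }

    Optional<String> get(String key) {
        return Optional.ofNullable(settings.get(key));
    }
}

// Hypothetical stand-in for SparkEnv: a process-wide holder that may be unset.
final class Env {
    private static volatile Env current; // null until an environment is installed

    final Conf conf;

    Env(Conf conf) {
        this.conf = conf;
    }

    static Env get() {
        return current;
    }

    static void set(Env env) {
        current = env;
    }
}

// Hypothetical helper showing the lookup order described in this PR.
final class SerializerConf {
    // Prefer the running environment's conf; otherwise build one that loads defaults.
    static Conf resolve() {
        return Optional.ofNullable(Env.get())
                .map(env -> env.conf)
                .orElseGet(() -> new Conf(true));
    }
}
```

In this sketch, a serializer built from `SerializerConf.resolve()` would see a `spark.kryo.registrator` setting either from the active environment or from the loaded defaults, whereas one built from `new Conf(false)` would see nothing.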

The changes touched:
1) SparkSqlSerializer's resource pool (this appears to fix the issue in the comment)
2) execution.Exchange (for all of the partitioners)
3) execution.Limit (for the HashPartitioner)

A few tests were added to ColumnTypeSuite to ensure that a custom registrator and serde are initialized and used when in-memory columns are written.

marmbrus · 2015-04-11T22:53:00Z

ok to test

SparkQA · 2015-04-12T00:26:57Z

Test build #30093 has finished for PR 5237 at commit e5011fb.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.
This patch does not change any dependencies.

marmbrus · 2015-04-12T00:53:57Z

sql/core/src/main/scala/org/apache/spark/sql/execution/Exchange.scala

All nodes should have a sqlContext which should have a sparkContext. Can we just get this from there instead?

marmbrus · 2015-04-12T00:54:36Z

Thanks for working on this. This LGTM with two minor comments.

yhuai · 2015-04-13T20:17:49Z

Seems new SparkSqlSerializer(new SparkConf(false)) is also used in org.apache.spark.sql.execution.Limit. Can you fix that as well?

mhseiden · 2015-04-14T21:47:23Z

@yhuai - The Limit case is covered as well.

@marmbrus - Not a problem! My most recent commit addresses your comments. Could this change be backported to 1.2.x / 1.3.x, or will it only live in master?

SparkQA · 2015-04-14T23:29:38Z

Test build #30279 has finished for PR 5237 at commit 3175c2f.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.
This patch adds the following new dependencies:
- RoaringBitmap-0.4.5.jar
- activation-1.1.jar
- akka-actor_2.10-2.3.4-spark.jar
- akka-remote_2.10-2.3.4-spark.jar
- akka-slf4j_2.10-2.3.4-spark.jar
- aopalliance-1.0.jar
- arpack_combined_all-0.1.jar
- avro-1.7.7.jar
- breeze-macros_2.10-0.11.2.jar
- breeze_2.10-0.11.2.jar
- chill-java-0.5.0.jar
- chill_2.10-0.5.0.jar
- commons-beanutils-1.7.0.jar
- commons-beanutils-core-1.8.0.jar
- commons-cli-1.2.jar
- commons-codec-1.10.jar
- commons-collections-3.2.1.jar
- commons-compress-1.4.1.jar
- commons-configuration-1.6.jar
- commons-digester-1.8.jar
- commons-httpclient-3.1.jar
- commons-io-2.1.jar
- commons-lang-2.5.jar
- commons-lang3-3.3.2.jar
- commons-math-2.1.jar
- commons-math3-3.4.1.jar
- commons-net-2.2.jar
- compress-lzf-1.0.0.jar
- config-1.2.1.jar
- core-1.1.2.jar
- curator-client-2.4.0.jar
- curator-framework-2.4.0.jar
- curator-recipes-2.4.0.jar
- gmbal-api-only-3.0.0-b023.jar
- grizzly-framework-2.1.2.jar
- grizzly-http-2.1.2.jar
- grizzly-http-server-2.1.2.jar
- grizzly-http-servlet-2.1.2.jar
- grizzly-rcm-2.1.2.jar
- groovy-all-2.3.7.jar
- guava-14.0.1.jar
- guice-3.0.jar
- hadoop-annotations-2.2.0.jar
- hadoop-auth-2.2.0.jar
- hadoop-client-2.2.0.jar
- hadoop-common-2.2.0.jar
- hadoop-hdfs-2.2.0.jar
- hadoop-mapreduce-client-app-2.2.0.jar
- hadoop-mapreduce-client-common-2.2.0.jar
- hadoop-mapreduce-client-core-2.2.0.jar
- hadoop-mapreduce-client-jobclient-2.2.0.jar
- hadoop-mapreduce-client-shuffle-2.2.0.jar
- hadoop-yarn-api-2.2.0.jar
- hadoop-yarn-client-2.2.0.jar
- hadoop-yarn-common-2.2.0.jar
- hadoop-yarn-server-common-2.2.0.jar
- ivy-2.4.0.jar
- jackson-annotations-2.4.0.jar
- jackson-core-2.4.4.jar
- jackson-core-asl-1.8.8.jar
- jackson-databind-2.4.4.jar
- jackson-jaxrs-1.8.8.jar
- jackson-mapper-asl-1.8.8.jar
- jackson-module-scala_2.10-2.4.4.jar
- jackson-xc-1.8.8.jar
- jansi-1.4.jar
- javax.inject-1.jar
- javax.servlet-3.0.0.v201112011016.jar
- javax.servlet-3.1.jar
- javax.servlet-api-3.0.1.jar
- jaxb-api-2.2.2.jar
- jaxb-impl-2.2.3-1.jar
- jcl-over-slf4j-1.7.10.jar
- jersey-client-1.9.jar
- jersey-core-1.9.jar
- jersey-grizzly2-1.9.jar
- jersey-guice-1.9.jar
- jersey-json-1.9.jar
- jersey-server-1.9.jar
- jersey-test-framework-core-1.9.jar
- jersey-test-framework-grizzly2-1.9.jar
- jets3t-0.7.1.jar
- jettison-1.1.jar
- jetty-util-6.1.26.jar
- jline-0.9.94.jar
- jline-2.10.4.jar
- jodd-core-3.6.3.jar
- json4s-ast_2.10-3.2.10.jar
- json4s-core_2.10-3.2.10.jar
- json4s-jackson_2.10-3.2.10.jar
- jsr305-1.3.9.jar
- jtransforms-2.4.0.jar
- jul-to-slf4j-1.7.10.jar
- kryo-2.21.jar
- log4j-1.2.17.jar
- lz4-1.2.0.jar
- management-api-3.0.0-b012.jar
- mesos-0.21.0-shaded-protobuf.jar
- metrics-core-3.1.0.jar
- metrics-graphite-3.1.0.jar
- metrics-json-3.1.0.jar
- metrics-jvm-3.1.0.jar
- minlog-1.2.jar
- netty-3.8.0.Final.jar
- netty-all-4.0.23.Final.jar
- objenesis-1.2.jar
- opencsv-2.3.jar
- oro-2.0.8.jar
- paranamer-2.6.jar
- parquet-column-1.6.0rc3.jar
- parquet-common-1.6.0rc3.jar
- parquet-encoding-1.6.0rc3.jar
- parquet-format-2.2.0-rc1.jar
- parquet-generator-1.6.0rc3.jar
- parquet-hadoop-1.6.0rc3.jar
- parquet-jackson-1.6.0rc3.jar
- protobuf-java-2.4.1.jar
- protobuf-java-2.5.0-spark.jar
- py4j-0.8.2.1.jar
- pyrolite-2.0.1.jar
- quasiquotes_2.10-2.0.1.jar
- reflectasm-1.07-shaded.jar
- scala-compiler-2.10.4.jar
- scala-library-2.10.4.jar
- scala-reflect-2.10.4.jar
- scalap-2.10.4.jar
- scalatest_2.10-2.2.1.jar
- slf4j-api-1.7.10.jar
- slf4j-log4j12-1.7.10.jar
- snappy-java-1.1.1.7.jar
- spark-bagel_2.10-1.4.0-SNAPSHOT.jar
- spark-catalyst_2.10-1.4.0-SNAPSHOT.jar
- spark-core_2.10-1.4.0-SNAPSHOT.jar
- spark-graphx_2.10-1.4.0-SNAPSHOT.jar
- spark-launcher_2.10-1.4.0-SNAPSHOT.jar
- spark-mllib_2.10-1.4.0-SNAPSHOT.jar
- spark-network-common_2.10-1.4.0-SNAPSHOT.jar
- spark-network-shuffle_2.10-1.4.0-SNAPSHOT.jar
- spark-repl_2.10-1.4.0-SNAPSHOT.jar
- spark-sql_2.10-1.4.0-SNAPSHOT.jar
- spark-streaming_2.10-1.4.0-SNAPSHOT.jar
- spire-macros_2.10-0.7.4.jar
- spire_2.10-0.7.4.jar
- stax-api-1.0.1.jar
- stream-2.7.0.jar
- tachyon-0.5.0.jar
- tachyon-client-0.5.0.jar
- uncommons-maths-1.2.2a.jar
- unused-1.0.0.jar
- xmlenc-0.52.jar
- xz-1.0.jar
- zookeeper-3.4.5.jar

marmbrus · 2015-04-15T23:15:44Z

Manually fixed conflicts while merging with master. Thanks!

[SPARK-5277][SQL] - address code review comments

3175c2f

asfgit closed this in 8a53de1 Apr 15, 2015
