[SPARK-5277][SQL] - SparkSqlSerializer doesn't always register user specified KryoRegistrators #5237
In a few places, new SparkSqlSerializer instances were created with new, empty SparkConfs, so user-specified registrators were sometimes not initialized.
The fix is to try to pull a conf from the SparkEnv, and to construct a new conf (one that loads defaults) if none can be found.
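The fallback described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not necessarily the exact code in this patch: the `ConfFallback` object and `currentConf` method are hypothetical names. It relies on `SparkEnv.get` returning `null` when no environment has been set, and on the no-argument `SparkConf` constructor loading `spark.*` system properties.

```scala
import org.apache.spark.{SparkConf, SparkEnv}

// Hypothetical helper, for illustration only.
object ConfFallback {
  // Prefer the conf of the active SparkEnv. SparkEnv.get is null when no
  // environment has been set, so wrap it in Option before dereferencing.
  // Otherwise fall back to a fresh conf that loads defaults.
  def currentConf(): SparkConf =
    Option(SparkEnv.get).map(_.conf).getOrElse(new SparkConf())
}
```

A serializer constructed from `ConfFallback.currentConf()` would then see settings such as `spark.kryo.registrator` whenever they are present in the environment's conf or in system properties.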
The changes touched:
1) SparkSqlSerializer's resource pool (this appears to fix the issue in the comment)
2) execution.Exchange (for all of the partitioners)
3) execution.Limit (for the HashPartitioner)
A few tests were added to ColumnTypeSuite, ensuring that a custom registrator and serde are initialized and used when in-memory columns are written.
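For readers unfamiliar with the mechanism, a user-specified registrator looks roughly like the sketch below. `CustomType`, `CustomRegistrator`, and `RegistratorExample` are hypothetical names chosen for illustration and are not taken from the tests in this patch; the sketch registers the class with Kryo's default serializer rather than a custom serde.

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.{KryoRegistrator, KryoSerializer}

// Hypothetical user type, for illustration only.
case class CustomType(id: Int, label: String)

// Hypothetical registrator: Spark instantiates it by class name and calls
// registerClasses on each Kryo instance it creates.
class CustomRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[CustomType])
  }
}

object RegistratorExample {
  // The registrator is picked up only if the conf handed to the serializer
  // carries the spark.kryo.registrator setting.
  def confWithRegistrator(): SparkConf =
    new SparkConf()
      .set("spark.serializer", classOf[KryoSerializer].getName)
      .set("spark.kryo.registrator", classOf[CustomRegistrator].getName)
}
```

A serializer built from a conf that lacks the `spark.kryo.registrator` entry never calls `CustomRegistrator`, which is the failure mode this change addresses.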
ok to test
Test build #30093 has finished for PR 5237 at commit
All nodes should have a sqlContext which should have a sparkContext. Can we just get this from there instead?
Thanks for working on this. This LGTM with two minor comments.
Seems
Test build #30279 has finished for PR 5237 at commit
Manually fixed conflicts while merging with master. Thanks!