[SPARK-6479][Block Manager]Create off-heap block storage API #5430
Conversation
|
To help me understand this patch, can you also put the Tachyon implementation into this? |
|
Test build #29903 has finished for PR 5430 at commit
|
|
@rxin I have attached the patch with the Tachyon migration code to the JIRA: https://issues.apache.org/jira/secure/attachment/12724088/spark-6479-tachyon.patch The patch is incomplete on purpose, because most of the diff (not included) just changes the term from tachyon to offheap. If you think it is better to do the Tachyon migration with this JIRA, please let me know and I will do it in one shot. By the way, there is a minor change in OffHeapStore and OffHeapBlockManager that is inconsistent with this PR; please ignore it. |
I think a default value is necessary.
Users may not want to use off-heap storage, and in that case the OffHeapBlockManager will be None.
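A minimal sketch of the "manager may be None" idea from the reply above. All names here (`OffHeapManagerSketch`, `InMemoryManagerSketch`, `StoreSketch`, the `enabled` flag) are hypothetical stand-ins for illustration and are not the classes in this PR:

```scala
import scala.collection.mutable

// Hypothetical stand-in for an off-heap block manager interface.
trait OffHeapManagerSketch {
  def putBytes(blockId: String, bytes: Array[Byte]): Unit
  def getBytes(blockId: String): Option[Array[Byte]]
}

// Toy implementation backed by a map, only so the sketch can run.
final class InMemoryManagerSketch extends OffHeapManagerSketch {
  private val blocks = mutable.Map.empty[String, Array[Byte]]
  def putBytes(blockId: String, bytes: Array[Byte]): Unit = blocks.update(blockId, bytes)
  def getBytes(blockId: String): Option[Array[Byte]] = blocks.get(blockId)
}

// `enabled` is an assumed switch; when it is false no manager is created.
final class StoreSketch(enabled: Boolean) {
  lazy val manager: Option[OffHeapManagerSketch] =
    if (enabled) Some(new InMemoryManagerSketch) else None

  // Returns false instead of failing when off-heap storage is not configured.
  def putBytes(blockId: String, bytes: Array[Byte]): Boolean =
    manager match {
      case Some(m) =>
        m.putBytes(blockId, bytes)
        true
      case None => false
    }

  def getBytes(blockId: String): Option[Array[Byte]] =
    manager.flatMap(_.getBytes(blockId))
}
```

With an `Option`, callers must handle the "not configured" case explicitly rather than relying on a default manager.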
|
Thanks - let's put the Tachyon patch with this. In this case, I think it will be easier to review and understand the API semantics. |
remove the extra blank line here when you update
|
@rxin This is the complete change for the off-heap API, plus the Tachyon migration code. There is no logical change in Tachyon (the code is just moved around), except for changing one System.exit to throw an exception in the TachyonBlockManager initialization part. |
|
Test build #29997 has finished for PR 5430 at commit
|
|
Test build #30017 has finished for PR 5430 at commit
|
This is an API breaking change. We need to keep the old one around (as an alias), and deprecate it.
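One way to keep an old name around as a deprecated alias can be sketched as follows. The class and member names (`BlockStatusSketch`, `externalStoreSize`, `legacyStoreSize`) and the version string are assumptions for illustration; the review comment does not say which API member is affected:

```scala
// Hypothetical class: the new member carries the value, the old name forwards to it.
class BlockStatusSketch(val externalStoreSize: Long) {

  // Old name kept as an alias so existing callers still compile,
  // but they get a deprecation warning pointing at the replacement.
  @deprecated("use externalStoreSize instead", "hypothetical-version")
  def legacyStoreSize: Long = externalStoreSize
}
```

Callers of `legacyStoreSize` keep working and see a compiler warning, which gives them a release cycle to migrate.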
|
@zhzhan I also left a high level comment on JIRA - it'd be better to call this external block store, rather than off-heap store. |
|
Test build #30203 has finished for PR 5430 at commit
|
|
@rxin Could you help review the patch and let me know if you have any concerns? |
|
Test build #30609 has finished for PR 5430 at commit
|
|
Test build #30614 has finished for PR 5430 at commit
|
|
Sorry for the delay - I will look at this again today. |
This needs to handle backwards compatibility, i.e. logs from older versions of Spark where it says Tachyon. If you look, there are other backwards-compatibility tests relating to this protocol too.
If the "Tachyon" version is present, I'd look for that and then just convert it.
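The "look for the Tachyon version and convert it" suggestion could look roughly like this. It is only a sketch: a plain `Map[String, Any]` stands in for the parsed JSON event, and both key names (`"Use ExternalStore"`, `"Use Tachyon"`) are assumed for illustration rather than taken from the actual protocol:

```scala
object StorageLevelJsonSketch {
  // Both key names are assumptions, not the real field names.
  val NewKey = "Use ExternalStore"
  val LegacyKey = "Use Tachyon"

  // If only the legacy key is present, rename it to the new key.
  def normalize(fields: Map[String, Any]): Map[String, Any] =
    fields.get(LegacyKey) match {
      case Some(value) if !fields.contains(NewKey) =>
        (fields - LegacyKey) + (NewKey -> value)
      case _ => fields
    }

  // Reads the flag after normalization; a missing or non-boolean value counts as false.
  def useExternalStore(fields: Map[String, Any]): Boolean =
    normalize(fields).get(NewKey) match {
      case Some(flag: Boolean) => flag
      case _ => false
    }
}
```

Converting at read time keeps the rest of the code working with a single field name while older logs remain readable.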
How is desc going to be used? Maybe we should just override toString?
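For reference, overriding `toString` instead of exposing a separate description field might look like the sketch below. The class name and its `baseDir` parameter are hypothetical:

```scala
// Hypothetical manager class; the description lives in toString rather than a `desc` field.
class ExternalManagerSketch(baseDir: String) {
  override def toString: String = s"ExternalManagerSketch(baseDir=$baseDir)"
}
```

Log statements that interpolate the manager then pick up the description without callers needing a dedicated accessor.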
|
Jenkins, retest this please. |
|
Jenkins, test it please. |
|
Jenkins, test this please. |
|
Test build #31175 has finished for PR 5430 at commit
|
|
Jenkins, retest this please. |
|
Test build #31197 has finished for PR 5430 at commit
|
|
I took a pass and this LGTM. However, it needs to be brought up to date. |
|
I think SPARK-5213 fails the MiMa test: [info] spark-sql: found 1 potential binary incompatibilities (filtered 129) |
|
Test build #31502 has finished for PR 5430 at commit
|
|
Jenkins, retest this please. Thanks @zhzhan I reverted the patch. |
|
Test build #31506 has finished for PR 5430 at commit
|
This is the classes for creating off-heap block storage API. It also includes the migration for Tachyon. The diff seems to be big, but it mainly just rename tachyon to offheap. New implementation for hdfs will be submit for review in spark-6112.

Author: Zhan Zhang <[email protected]>

Closes apache#5430 from zhzhan/SPARK-6479 and squashes the following commits:

60acd84 [Zhan Zhang] minor change to kickoff the test
12f54c9 [Zhan Zhang] solve merge conflicts
a54132c [Zhan Zhang] solve review comments
ffb8e00 [Zhan Zhang] rebase to sparkcontext change
6e121e0 [Zhan Zhang] resolve review comments and restructure blockmanasger code
a7aed6c [Zhan Zhang] add Tachyon migration code
186de31 [Zhan Zhang] initial commit for off-heap block storage api
These are the classes for creating the off-heap block storage API. The change also includes the migration for Tachyon. The diff looks big, but it mainly just renames tachyon to offheap. A new implementation for HDFS will be submitted for review in SPARK-6112.