[GraphX][SPARK-2245] add a materialize method to materialize VertexRDD by calling RDD's count #1177
|
Here's a simple piece of code that reproduces the problem. The first output is |
|
Thanks for pointing this out. See my comment on the JIRA issue -- the right solution is to override checkpoint() in VertexRDD. |
delegate checkpoint related method to partitionsRDD
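For readers following the thread, here is a minimal, Spark-free sketch of why delegating the checkpoint-related methods could help. Everything in it is an assumption made for illustration: `Dataset`, `Wrapper`, and `DelegatingWrapper` are hypothetical stand-ins, not Spark's `RDD` or `VertexRDD`, and the "checkpoint" is only a flag. It is not the diff in this PR.

```scala
// Toy model only: hypothetical stand-ins, not Spark's RDD, VertexRDD, or checkpoint API.
object CheckpointSketch {
  class Dataset[T](compute: () => Seq[T]) {
    private var checkpointRequested = false
    private var checkpointDone = false

    // Mark this dataset for checkpointing; nothing happens until an action evaluates it.
    def checkpoint(): Unit = checkpointRequested = true
    def isCheckpointed: Boolean = checkpointDone

    // Any action that evaluates this dataset completes a pending checkpoint.
    def collect(): Seq[T] = {
      val data = compute()
      if (checkpointRequested) checkpointDone = true
      data
    }

    def count(): Long = collect().size.toLong
  }

  // Wrapper that counts through its inner dataset, so count() never evaluates the wrapper itself.
  class Wrapper[T](val partitions: Dataset[Seq[T]])
      extends Dataset[T](() => partitions.collect().flatten) {
    override def count(): Long = partitions.collect().map(_.size.toLong).sum
  }

  // Same wrapper, with the checkpoint-related methods delegated to the inner dataset.
  class DelegatingWrapper[T](inner: Dataset[Seq[T]]) extends Wrapper[T](inner) {
    override def checkpoint(): Unit = partitions.checkpoint()
    override def isCheckpointed: Boolean = partitions.isCheckpointed
  }
}
```

In this model, calling `checkpoint()` and then `count()` on a `Wrapper` leaves `isCheckpointed` false, because the overridden `count()` only evaluates the inner dataset. With `DelegatingWrapper`, the same two calls leave `isCheckpointed` true, since both the request and the query are forwarded to the dataset that `count()` actually evaluates.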
|
I overrode the public checkpoint-related methods, but that leads to other exceptions. I hope you can help me with that. The detailed description is on SPARK-2245. |
|
@bxshi can you add |
|
Can one of the admins verify this patch? |
|
@ankurdave @bxshi What's the status of this PR? It hasn't been updated in a while, though it looks simple enough from a review standpoint. |
|
It's not as simple as I thought... Here is my reply on JIRA about this PR: I edited my original comment to add the updates, but I do not know whether you receive edits via email, so I am resubmitting it. I hope that won't bother you. Ankur Dave, am I on the right track? Or should it be changed some other way? |
|
Can one of the admins verify this patch? |
This reverts commit b9984350a3d2b706db92e50253d66c75f4304bb2.
It seems that one cannot materialize a VertexRDD simply by calling its count method, which VertexRDD overrides. Calling RDD's count, however, does materialize it.
Is this a feature designed to get the count without materializing the VertexRDD? If so, do you think it is necessary to add a materialize method to VertexRDD?
By the way, is count() the cheapest way to materialize an RDD, or does it cost the same resources as other actions?
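To make the question concrete, here is a small Spark-free sketch of the behaviour described above. The names `Dataset`, `VertexLike`, and `materialized` are hypothetical and exist only for this illustration; the `materialize` method mirrors the idea in the PR title (call the base class's count), not the actual patch.

```scala
// Toy model only: hypothetical stand-ins, not Spark's RDD or VertexRDD.
object CountSketch {
  // A lazily evaluated dataset that remembers whether it has ever been evaluated.
  class Dataset[T](compute: () => Seq[T]) {
    private var evaluated = false
    def materialized: Boolean = evaluated

    // Evaluating the dataset itself marks it as materialized.
    def collect(): Seq[T] = {
      evaluated = true
      compute()
    }

    // Generic count: evaluates this dataset, then counts the records.
    def count(): Long = collect().size.toLong
  }

  // A wrapper whose records are derived from an inner dataset of partitions.
  class VertexLike[T](val partitions: Dataset[Seq[T]])
      extends Dataset[T](() => partitions.collect().flatten) {

    // Overridden count: sums partition sizes without evaluating the wrapper itself.
    override def count(): Long = partitions.collect().map(_.size.toLong).sum

    // Forces evaluation of the wrapper by calling the base class's count.
    def materialize(): Unit = {
      super.count()
      ()
    }
  }
}
```

In this model, `count()` on a `VertexLike` returns the right number but leaves `materialized` false, while `materialize()` flips it to true. Whether the same holds for a real VertexRDD depends on the Spark version in use.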
Best,