[GraphX][SPARK-2245] add a materialize method to materialize VertexRDD by calling RDD's count #1177

bxshi · 2014-06-22T20:54:24Z

It seems one cannot materialize a VertexRDD by simply calling its count method, which VertexRDD overrides. Calling RDD's count, however, does materialize it.

Is this a feature designed to get the count without materializing the VertexRDD? If so, do you think it is necessary to add a materialize method to VertexRDD?

By the way, is count() the cheapest way to materialize an RDD, or does it cost the same resources as other actions?

Best,

bxshi · 2014-06-22T20:58:56Z

Here is a simple snippet that reproduces the problem:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx._

    val conf = new SparkConf().setAppName("HDTM")
      .setMaster("local[4]")

    val sc = new SparkContext(conf)

    sc.setCheckpointDir("./checkpoint")
    val v = sc.parallelize(Seq[(VertexId, Long)]((0L, 0L), (1L, 1L), (2L, 2L)))
    val e = sc.parallelize(Seq[Edge[Long]](Edge(0L, 1L, 0L), Edge(1L, 2L, 1L), Edge(2L, 0L, 2L)))
    val g = Graph(v, e)
    // Mark both RDDs for checkpointing, then run an action on each.
    g.vertices.checkpoint()
    g.edges.checkpoint()
    g.vertices.count()
    g.numEdges
    println(s"${g.vertices.isCheckpointed} ${g.edges.isCheckpointed}")

    // materialize() is the method added by this PR; it calls RDD's count.
    g.vertices.materialize()
    println(s"${g.vertices.isCheckpointed} ${g.edges.isCheckpointed}")

The first output is "false true", and after calling materialize() the output is "true true", which means the VertexRDD is then correctly checkpointed.

ankurdave · 2014-06-23T21:31:35Z

Thanks for pointing this out. See my comment on the JIRA issue -- the right solution is to override checkpoint() in VertexRDD.

delegate checkpoint related method to partitionsRDD

bxshi · 2014-06-26T16:16:57Z

I overrode the public checkpoint-related methods, but that leads to other exceptions. I hope you can help me with that. The detailed description is on SPARK-2245.

pwendell · 2014-09-02T01:23:20Z

@bxshi can you add [GraphX] to the title? This isn't getting sorted properly in our PR tool.

SparkQA · 2014-09-05T23:46:05Z

Can one of the admins verify this patch?

nchammas · 2015-02-25T22:00:59Z

@ankurdave @bxshi What's the status of this PR?

It hasn't been updated in a while, though it looks simple enough from a review standpoint.

bxshi · 2015-02-25T23:04:38Z

It's not as simple as I thought...

Here is my reply on JIRA about this PR:

I edited my original comment to add the updates, but I do not know whether you receive edits via email, so I am resubmitting it. I hope that won't bother you, Ankur Dave.

Hi Ankur Dave, I changed my pull request, but there is another exception: ShippableVertexPartition is not serializable. So I made it serializable, but then there was another exception: org.apache.spark.graphx.impl.RoutingTablePartition is not serializable. I made that serializable as well, but on iteration 2 there is an exception: org.apache.spark.graphx.impl.ShippableVertexPartition cannot be cast to scala.Tuple2

The code I'm using is:

    val conf = new SparkConf().setAppName("HDTM")
      .setMaster("local[4]")
    val sc = new SparkContext(conf)
    sc.setCheckpointDir("./checkpoint")
    val v = sc.parallelize(Seq[(VertexId, Long)]((0L, 0L), (1L, 1L), (2L, 2L)))
    val e = sc.parallelize(Seq[Edge[Long]](Edge(0L, 1L, 0L), Edge(1L, 2L, 1L), Edge(2L, 0L, 2L)))
    var g = Graph(v, e)
    val vertexIds = Seq(0L, 1L, 2L)
    var prevG: Graph[VertexId, Long] = null
    for (i <- 1 to 2000) {
      vertexIds.toStream.foreach(id => {
        prevG = g
        g = Graph(g.vertices, g.edges)
        g.vertices.cache()
        g.edges.cache()
        prevG.unpersistVertices(blocking = false)
        prevG.edges.unpersist(blocking = false)
      })
      g.vertices.checkpoint()
      g.edges.checkpoint()
      g.edges.count()
      g.vertices.count()
      println(s"${g.vertices.isCheckpointed} ${g.edges.isCheckpointed}")
      println(" iter " + i + " finished")
    }
    println(g.vertices.collect().mkString(" "))
    println(g.edges.collect().mkString(" "))

Am I on the right track, or is there another way to change it?

AmplabJenkins · 2015-07-13T21:54:46Z

Can one of the admins verify this patch?

add a materialize method to materialize VertexRDD by calling RDD's count method

3be5d6a

bxshi changed the title ~~add a materialize method to materialize VertexRDD by calling RDD's count~~ [SPARK-2245] add a materialize method to materialize VertexRDD by calling RDD's count Jun 23, 2014

fix SPARK-2245

f1ed7f3

bxshi changed the title ~~[SPARK-2245] add a materialize method to materialize VertexRDD by calling RDD's count~~ [GraphX][SPARK-2245] add a materialize method to materialize VertexRDD by calling RDD's count Sep 2, 2014

bxshi closed this Jul 14, 2015

mapr-devops pushed a commit to mapr/spark that referenced this pull request May 8, 2025

Revert "[EZAF-4620] Update Go to 1.19.10] (apache#1176)" (apache#1177)

297866a

This reverts commit b9984350a3d2b706db92e50253d66c75f4304bb2.
