Conversation

@tien-dungle
Contributor

The change here is to keep the cached RDDs in the graph object so that, when `graph.unpersist()` is called, these RDDs are correctly unpersisted.

```scala
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
import org.slf4j.LoggerFactory
import org.apache.spark.graphx.util.GraphGenerators

// Create an RDD for the vertices
val users: RDD[(VertexId, (String, String))] =
  sc.parallelize(Array((3L, ("rxin", "student")), (7L, ("jgonzal", "postdoc")),
                       (5L, ("franklin", "prof")), (2L, ("istoica", "prof"))))
// Create an RDD for edges
val relationships: RDD[Edge[String]] =
  sc.parallelize(Array(Edge(3L, 7L, "collab"),    Edge(5L, 3L, "advisor"),
                       Edge(2L, 5L, "colleague"), Edge(5L, 7L, "pi")))
// Define a default user in case there are relationships with missing users
val defaultUser = ("John Doe", "Missing")
// Build the initial Graph
val graph = Graph(users, relationships, defaultUser)
graph.cache().numEdges

graph.unpersist()

sc.getPersistentRDDs.foreach(r => println(r._2.toString))
```
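
For context, here is a minimal sketch of the pattern the description refers to (a hypothetical `CachedGraph` class, not the actual `GraphImpl` change): the object that caches RDDs also keeps references to them, so `unpersist()` can reach and release all of them. Without that bookkeeping, the `sc.getPersistentRDDs` call above still lists the graph's internal RDDs after `graph.unpersist()`.

```scala
import org.apache.spark.rdd.RDD

// Hypothetical illustration of the fix's idea: hold references to every
// RDD cached during construction so unpersist() can release them all.
class CachedGraph[V, E](val vertices: RDD[V], val edges: RDD[E]) {
  // Cache both underlying RDDs up front and remember them in the object.
  vertices.cache()
  edges.cache()

  // Unpersist exactly the RDDs this object pinned in the block manager.
  def unpersist(blocking: Boolean = true): this.type = {
    vertices.unpersist(blocking)
    edges.unpersist(blocking)
    this
  }
}
```

After calling `unpersist()` on such an object, `sc.getPersistentRDDs` should no longer report its vertex and edge RDDs.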

@srowen
Member

srowen commented Jul 17, 2015

@ankurdave @jegonzal

@SparkQA

SparkQA commented Jul 17, 2015

Test build #1095 has finished for PR 7469 at commit 8d87997.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • logDebug("isMulticlass = " + metadata.isMulticlass)
    • * (i.e., if isMulticlass && isSpaceSufficientForAllCategoricalSplits),
    • logDebug("isMulticlass = " + metadata.isMulticlass)
    • abstract class UnsafeProjection extends Projection
    • case class FromUnsafeProjection(fields: Seq[DataType]) extends Projection
    • abstract class BaseProjection extends Projection
    • class SpecificProjection extends $
    • class SpecificProjection extends $

@asfgit asfgit closed this in 587c315 Jul 17, 2015
asfgit pushed a commit that referenced this pull request Jul 17, 2015
The change here is to keep the cached RDDs in the graph object so that, when `graph.unpersist()` is called, these RDDs are correctly unpersisted.

```scala
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
import org.slf4j.LoggerFactory
import org.apache.spark.graphx.util.GraphGenerators

// Create an RDD for the vertices
val users: RDD[(VertexId, (String, String))] =
  sc.parallelize(Array((3L, ("rxin", "student")), (7L, ("jgonzal", "postdoc")),
                       (5L, ("franklin", "prof")), (2L, ("istoica", "prof"))))
// Create an RDD for edges
val relationships: RDD[Edge[String]] =
  sc.parallelize(Array(Edge(3L, 7L, "collab"),    Edge(5L, 3L, "advisor"),
                       Edge(2L, 5L, "colleague"), Edge(5L, 7L, "pi")))
// Define a default user in case there are relationships with missing users
val defaultUser = ("John Doe", "Missing")
// Build the initial Graph
val graph = Graph(users, relationships, defaultUser)
graph.cache().numEdges

graph.unpersist()

sc.getPersistentRDDs.foreach(r => println(r._2.toString))
```

Author: tien-dungle <[email protected]>

Closes #7469 from tien-dungle/SPARK-9109_Graphx-unpersist and squashes the following commits:

8d87997 [tien-dungle] Keep the cached edge in the graph

(cherry picked from commit 587c315)
Signed-off-by: Ankur Dave <[email protected]>
@ankurdave
Contributor

Thanks for finding the leak here. Merged into master and branch-1.4.

@srowen
Member

srowen commented Jul 19, 2015

Thanks @ankurdave -- you can follow this up by resolving the issue (already done now)

@ankurdave
Contributor

Oh, thanks @srowen.
