
Conversation

@squito (Contributor) commented Sep 27, 2019

RDD dependencies and partitions can be simultaneously
accessed and mutated by user threads and Spark's scheduler threads, so
access must be thread-safe. In particular, as partitions and
dependencies are lazily initialized, before this change they could get
initialized multiple times, which would lead to the scheduler having an
inconsistent view of the pending stages and get stuck.

Tested with existing unit tests.
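
For context, a minimal self-contained sketch (not Spark's actual code; the names are illustrative) of the synchronized lazy-initialization pattern this fix applies to RDD.partitions and RDD.dependencies -- the cached field is only ever written while holding stateLock, so concurrent callers can no longer initialize it twice:

class LazilyInitialized(compute: () => Array[Int]) {
  private val stateLock = new Object
  @volatile private var partitions_ : Array[Int] = _

  // Double-checked locking: the fast path skips the lock once the field
  // is initialized; the slow path initializes it exactly once.
  def partitions: Array[Int] = {
    if (partitions_ == null) {
      stateLock.synchronized {
        if (partitions_ == null) {
          partitions_ = compute()
        }
      }
    }
    partitions_
  }
}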

RDD dependencies, partitions, and storageLevel can be simultaneously
accessed and mutated by user threads and Spark's scheduler threads, so
access must be thread-safe. In particular, as partitions and
dependencies are lazily initialized, before this change they could get
initialized multiple times, which would lead to the scheduler having an
inconsistent view of the pending stages and get stuck.
def debugSelf(rdd: RDD[_]): Seq[String] = stateLock.synchronized {
import Utils.bytesToString

val persistence = if (storageLevel != StorageLevel.NONE) storageLevel.description else ""
Member

Hi, @squito. Shall we use getStorageLevel instead of accessing storageLevel here? Then, it seems that we don't need stateLock.synchronized for this debugSelf.

Contributor Author

good point, thanks
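
The resulting shape (a sketch abbreviated from the real debugSelf; only the storage-level line is shown) routes the read through the synchronized getter, so debugSelf itself no longer needs a stateLock.synchronized block:

import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

def debugSelf(rdd: RDD[_]): Seq[String] = {
  // Single synchronized read via the getter; no lock held in this method.
  val level = rdd.getStorageLevel
  val persistence = if (level != StorageLevel.NONE) level.description else ""
  Seq(persistence)
}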


/** Get the RDD's current storage level, or StorageLevel.NONE if none is set. */
def getStorageLevel: StorageLevel = storageLevel
def getStorageLevel: StorageLevel = stateLock.synchronized { storageLevel }
Contributor

Synchronizing calls on simple getters always raises flags for me. Is this really safe?

For example, in one of the places that calls this (getOrCompute): what happens if the storage level changes after this value is returned, but before the call to blockManager.getOrElseUpdate actually does its thing?

Contributor Author

I will be honest, I don't really have a good understanding of how the mutability of storageLevel could be an actual problem. I do know of an instance where moving a .cache() before the creation of a Future which submitted a job fixed a hanging job. Unfortunately I have very little info about what else was going on in that job.

I have audited the uses of storageLevel, like getOrCompute, and I think it's OK -- there it's passing in a local copy, and anyway that whole function is running on the executor, where the storageLevel is immutable.

Another tricky one is DAGScheduler.getCacheLocs, which reads both the storage level and rdd.partitions. That gets exposed via SparkContext.getPreferredLocs, and used in CoalescedRDD and PartitionerAwareUnionRDD. But I still can't come up with an explanation for the behavior we see. I will try to poke some more.

I could also submit a change which just covers partitions and dependencies for now, since that is clear, though I'd like to understand the problem w/ storageLevel as well.
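
A hedged illustration of the local-copy point above (hypothetical names, with a String standing in for StorageLevel; not Spark's real getOrCompute): the level is read once through the synchronized getter, and everything afterwards uses that snapshot, so a concurrent mutation cannot change it mid-call:

class CacheSketch {
  private val stateLock = new Object
  private var storageLevel: String = "NONE"

  def persist(level: String): Unit = stateLock.synchronized { storageLevel = level }

  def getStorageLevel: String = stateLock.synchronized { storageLevel }

  def getOrCompute(): String = {
    val level = getStorageLevel // single synchronized read; a local snapshot
    if (level == "NONE") "compute without caching" else s"cache at $level"
  }
}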

Member

Yeah, what is this racing with? By itself, I'm not sure this forces something to set this before something else needs to read it.

@SparkQA commented Sep 27, 2019

Test build #111485 has finished for PR 25951 at commit f83d1a7.

  • This patch fails from timeout after a configured wait of 400m.
  • This patch merges cleanly.
  • This patch adds no public classes.

// Kind of ugly: need to register RDDs with the cache and map output tracker here
// since we can't do it in the RDD constructor because # of partitions is unknown
logInfo("Registering RDD " + rdd.id + " (" + rdd.getCreationSite + ")")
logInfo("Registering RDD " + rdd.id + " (" + rdd.getCreationSite + s") as input to " +
Member

You could use interpolation for consistency



@dongjoon-hyun (Member)

Retest this please.

@SparkQA commented Oct 1, 2019

Test build #111633 has finished for PR 25951 at commit f83d1a7.

  • This patch fails from timeout after a configured wait of 400m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito (Contributor Author) commented Oct 2, 2019

I spent a while trying to figure out where a race on storageLevel could cause issues, but I couldn't think of anything. Sadly, I don't think I'll get more specifics from the case where moving a call to cache() helped; I suspect it was an indirect effect somehow. So I decided to undo all the changes related to storageLevel.

@SparkQA commented Oct 3, 2019

Test build #111698 has finished for PR 25951 at commit 34d794c.

  • This patch fails from timeout after a configured wait of 400m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 3, 2019

Test build #111713 has finished for PR 25951 at commit f0a662f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • * The only purpose of this class is to have something to lock (like a generic Object) but that

@SparkQA commented Oct 3, 2019

Test build #111751 has finished for PR 25951 at commit 432dc22.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor) left a comment

Looks good.

test("reference partitions inside a task") {
// Run a simple job which just makes sure there is no failure if we touch rdd.partitions
// inside a task. This requires the stateLock to be serializable. This is very convoluted
// use case, its just a check for backwards-compatibility after the fix for SPARK-28917.
Contributor

it's
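
A hypothetical shape of such a test (assuming a SparkFunSuite-style harness with an active SparkContext sc; not the exact Spark test): referencing rdd.partitions inside the task closure captures the RDD itself, so its stateLock must be serializable for closure serialization to succeed:

test("reference partitions inside a task") {
  val rdd = sc.parallelize(1 to 10, 2)
  // The closure captures `rdd`; a non-serializable stateLock would make
  // closure serialization fail before the job even starts.
  val counts = rdd.map(_ => rdd.partitions.length).collect()
  assert(counts.forall(_ == 2))
}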

@SparkQA commented Oct 4, 2019

Test build #111788 has finished for PR 25951 at commit b08bf0a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor) commented Oct 8, 2019

Merging to master / 2.4.

@vanzin vanzin closed this in 0da667d Oct 8, 2019
vanzin pushed a commit that referenced this pull request Oct 8, 2019
RDD dependencies and partitions can be simultaneously
accessed and mutated by user threads and Spark's scheduler threads, so
access must be thread-safe. In particular, as partitions and
dependencies are lazily initialized, before this change they could get
initialized multiple times, which would lead to the scheduler having an
inconsistent view of the pending stages and get stuck.

Tested with existing unit tests.

Closes #25951 from squito/SPARK-28917.

Authored-by: Imran Rashid <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
(cherry picked from commit 0da667d)
Signed-off-by: Marcelo Vanzin <[email protected]>
atronchi pushed a commit to atronchi/spark that referenced this pull request Oct 23, 2019
RDD dependencies and partitions can be simultaneously
accessed and mutated by user threads and Spark's scheduler threads, so
access must be thread-safe. In particular, as partitions and
dependencies are lazily initialized, before this change they could get
initialized multiple times, which would lead to the scheduler having an
inconsistent view of the pending stages and get stuck.

Tested with existing unit tests.

Closes apache#25951 from squito/SPARK-28917.

Authored-by: Imran Rashid <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
* The use of Integer is simply so this is serializable -- executors may reference the shared
* fields (though they should never mutate them, that only happens on the driver).
*/
private val stateLock = new Integer(0)
Member

The Integer constructor has been deprecated already, see https://docs.oracle.com/javase/9/docs/api/java/lang/Integer.html . This yields the warning:

RDD.scala:240: constructor Integer in class Integer is deprecated: see corresponding Javadoc for more information.

Is it possible to replace it with something else?

Member

I tried to eliminate the warning in #27399
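
One possible replacement, sketched here as an assumption (see #27399 for what was actually done): any small serializable object can serve as the monitor, avoiding the deprecated java.lang.Integer constructor:

class StateHolder extends Serializable {
  // A plain serializable anonymous object works as a lock; executors can
  // deserialize it, and only the driver ever mutates state through it.
  private val stateLock = new java.io.Serializable {}

  def withStateLock[T](body: => T): T = stateLock.synchronized { body }
}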

