@mingyukim
Contributor

This patch attempts to fix the Hadoop Configuration thread safety issue for NewHadoopRDD in the same way SPARK-2546 fixed the issue for HadoopRDD.

@srowen
Member

srowen commented Sep 15, 2015

Looks like a plausible variation of #2684 -- cc @JoshRosen

@SparkQA

SparkQA commented Sep 15, 2015

Test build #1755 has finished for PR 8763 at commit d516aae.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

If HADOOP-10456 is still an issue then I think that we need to be locking on the same lock in both NewHadoopRDD and HadoopRDD. Therefore, I'd be in favor of moving this lock to SparkHadoopUtil or a similar location, then updating both RDD implementations to use that lock.

Contributor Author

HADOOP-10456 was only fixed in Hadoop 2.4.1 (https://issues.apache.org/jira/browse/HADOOP-10456), so people running older Hadoop versions will still hit the issue.

The concurrency issue with Configuration/JobConf instantiation seems to be per object, not per class: it happens when two threads try to copy the same conf object, not when two threads instantiate Configuration/JobConf from two separate conf objects. The lock could even have been allocated per NewHadoopRDD or HadoopRDD instance.

So, I think this should be safe, although I can certainly move this to SparkHadoopUtil for clarity. Please let me know what you think!
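
To make the point above concrete, here is a minimal sketch of copying one shared, non-thread-safe config object under a dedicated lock. It is not the Spark or Hadoop code: `FakeConf`, `ConfCloner`, and `CONF_COPY_LOCK` are hypothetical stand-ins, and `FakeConf` only imitates the relevant behaviour (copying lazily mutates the source object).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a Hadoop Configuration: not thread-safe,
// and reading it (for example while copying) lazily mutates it.
final class FakeConf {
    private final Map<String, String> overrides = new HashMap<>();
    private Map<String, String> props; // built lazily on first read

    FakeConf() {}

    // Copy constructor: reads the source, which may trigger its lazy init.
    FakeConf(FakeConf other) {
        this.overrides.putAll(other.getProps());
    }

    void set(String key, String value) {
        overrides.put(key, value);
        props = null; // invalidate the lazily built view
    }

    Map<String, String> getProps() {
        if (props == null) {
            props = new HashMap<>(overrides);
        }
        return props;
    }

    String get(String key) {
        return getProps().get(key);
    }
}

final class ConfCloner {
    // Hypothetical lock: one per JVM, guarding every copy of any conf.
    private static final Object CONF_COPY_LOCK = new Object();

    static FakeConf cloneConf(FakeConf shared) {
        synchronized (CONF_COPY_LOCK) {
            return new FakeConf(shared);
        }
    }

    // Copies the same source object from several threads at once.
    static List<FakeConf> cloneConcurrently(FakeConf shared, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Callable<FakeConf> task = () -> cloneConf(shared);
            List<Future<FakeConf>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(task));
            }
            List<FakeConf> copies = new ArrayList<>();
            for (Future<FakeConf> f : futures) {
                copies.add(f.get());
            }
            return copies;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A single JVM-wide lock is more conservative than strictly necessary if the hazard really is per source object; a lock per owning instance would serialize less, at the cost of being harder to reason about.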

Contributor

Ah, right. In the original version of the fix for SPARK-1097, we actually synchronized on the Configuration being cloned. It looks like the HadoopRDD object lock was added in a followup patch to avoid a deadlock caused by this synchronization: #1409
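
As a generic illustration of why locking on the object being cloned can be riskier than locking on a dedicated object, the sketch below has one thread hold the conf's own monitor while it waits for a clone made on another thread. This is an assumed scenario for illustration only, not a reconstruction of the deadlock addressed in #1409; `LockChoiceDemo`, `CLONE_LOCK`, and the plain `Map` standing in for a conf are all hypothetical. A timeout is used so the demo terminates; without it, the monitor-based variant would wait forever.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class LockChoiceDemo {
    // Hypothetical dedicated lock, visible only to the cloning code.
    private static final Object CLONE_LOCK = new Object();

    // Variant 1: synchronize on the object being cloned.
    static Map<String, String> cloneOnConfMonitor(Map<String, String> conf) {
        synchronized (conf) {
            return new HashMap<>(conf);
        }
    }

    // Variant 2: synchronize on a lock nobody else can acquire.
    static Map<String, String> cloneOnDedicatedLock(Map<String, String> conf) {
        synchronized (CLONE_LOCK) {
            return new HashMap<>(conf);
        }
    }

    // Holds conf's monitor (as unrelated code might) while waiting for a
    // clone performed on a worker thread. Returns true if the clone
    // completed before the timeout, false if the worker was still blocked.
    static boolean cloneFinishesWhileConfIsHeld(Map<String, String> conf,
                                                boolean useDedicatedLock) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Map<String, String>> task = () -> useDedicatedLock
                    ? cloneOnDedicatedLock(conf)
                    : cloneOnConfMonitor(conf);
            synchronized (conf) {
                Future<Map<String, String>> result = pool.submit(task);
                try {
                    result.get(1000, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    return false; // worker is stuck waiting for conf's monitor
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

With the dedicated lock the call returns `true`; with the conf's own monitor it returns `false`, because the worker cannot enter the `synchronized (conf)` block until the waiting thread releases it.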

@sryza
Contributor

sryza commented Sep 15, 2015

This seems like it could have a pretty serious perf impact. Are we able to do some benchmarking to assess this?

@sryza
Contributor

sryza commented Sep 15, 2015

Oh, never mind, sorry, this is off by default.

@mccheah
Contributor

mccheah commented Sep 15, 2015

Jenkins, test this please

@mingyukim
Contributor Author

Can someone kick off the build? Also, please let me know if there are any other comments!

@SparkQA

SparkQA commented Sep 17, 2015

Test build #1769 has finished for PR 8763 at commit 4ba2acd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor

LGTM, since this is off by default. I'm going to merge this into master, but let me know how far back you think this fix should be backported.

@asfgit asfgit closed this in 8074208 Sep 18, 2015
@mingyukim
Contributor Author

Thanks @JoshRosen. Backporting this to Spark 1.5 would be good enough for me. Would that be reasonable?

@JoshRosen
Contributor

SGTM; I'll pull it in now for inclusion in 1.5.1.

asfgit pushed a commit that referenced this pull request Sep 18, 2015
This patch attempts to fix the Hadoop Configuration thread safety issue for NewHadoopRDD in the same way SPARK-2546 fixed the issue for HadoopRDD.

Author: Mingyu Kim <[email protected]>

Closes #8763 from mingyukim/mkim/SPARK-10611.

(cherry picked from commit 8074208)
Signed-off-by: Josh Rosen <[email protected]>
@mingyukim
Contributor Author

Thanks!

mingyukim added a commit to palantir/spark that referenced this pull request Sep 18, 2015
This patch attempts to fix the Hadoop Configuration thread safety issue for NewHadoopRDD in the same way SPARK-2546 fixed the issue for HadoopRDD.

Author: Mingyu Kim <[email protected]>

Closes apache#8763 from mingyukim/mkim/SPARK-10611.
@robert3005 robert3005 deleted the mkim/SPARK-10611 branch September 24, 2016 04:09
ashangit pushed a commit to ashangit/spark that referenced this pull request Oct 19, 2016
This patch attempts to fix the Hadoop Configuration thread safety issue for NewHadoopRDD in the same way SPARK-2546 fixed the issue for HadoopRDD.

Author: Mingyu Kim <[email protected]>

Closes apache#8763 from mingyukim/mkim/SPARK-10611.

(cherry picked from commit 8074208)
Signed-off-by: Josh Rosen <[email protected]>
(cherry picked from commit a6c3153)