
Conversation

@JoshRosen
Contributor

This patch upgrades Py4J from 0.9.1 to 0.9.2 in order to include a patch which modifies Py4J to use the current thread's ContextClassLoader when performing reflection / class loading. This is necessary in order to fix SPARK-5185, a longstanding issue affecting the use of --jars and --packages in PySpark.

In order to demonstrate that the fix works, I removed the workarounds which were added as part of SPARK-6027 / #4779 and other patches.
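The classloading problem behind SPARK-5185 can be illustrated with a toy Python model (illustrative only; the class and loader names below are hypothetical, not Py4J or JVM APIs): jars added via `--jars` are visible to the handling thread's context classloader, but not necessarily to the classloader that loaded Py4J itself, so reflection must consult the former.

```python
# Toy model of JVM class lookup (illustrative only; not Py4J's actual
# Java implementation). Each "classloader" maps class names to classes.

class ToyClassLoader:
    def __init__(self, classes, parent=None):
        self.classes = dict(classes)
        self.parent = parent

    def load_class(self, name):
        # Standard parent-delegation: ask the parent first, then ourselves.
        if self.parent is not None:
            try:
                return self.parent.load_class(name)
            except KeyError:
                pass
        return self.classes[name]

# The loader that loaded Py4J itself: only sees core classes.
py4j_loader = ToyClassLoader({"java.lang.String": str})

# The thread's context classloader additionally sees --jars classes.
context_loader = ToyClassLoader(
    {"org.example.Helper": object}, parent=py4j_loader
)

# Pre-0.9.2 behavior (conceptually): reflect via Py4J's own loader -> fails.
try:
    py4j_loader.load_class("org.example.Helper")
except KeyError:
    print("ClassNotFoundException")  # what SPARK-5185 users saw

# Post-0.9.2 behavior: reflect via the context classloader -> succeeds.
cls = context_loader.load_class("org.example.Helper")
print(cls is object)  # True
```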

Py4J diff: py4j/py4j@0.9.1...0.9.2

/cc @zsxwing @tdas @davies @brkyvz

@JoshRosen
Contributor Author

I suppose we could also backport this change to Spark 1.6.x if there's interest, but for now I'm mostly concerned with fixing this in 2.0.0 because I believe that the bug that this fixes was causing other issues while trying to remove our tests' reliance on the Spark assembly.

@SparkQA

SparkQA commented Mar 13, 2016

Test build #53038 has finished for PR 11687 at commit 40d22cc.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 14, 2016

Test build #53042 has finished for PR 11687 at commit 15e23f8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

        return helperClass.newInstance()
    except Py4JJavaError as e:
        # TODO: use --jars once it also works on the driver
        if 'ClassNotFoundException' in str(e.java_exception):
Member

Let's still keep this check. For other errors (e.g., the py4j java server is down), we should not call _printErrorMsg as it's confusing.

Contributor Author

I made this change because the call now fails with a different set of exceptions (such as "attempting to call a package") and wanted to err on the side of over-displaying the warning message. Let me try to figure out a narrower exception pattern match.

@JoshRosen
Contributor Author

@zsxwing, updated, please take another look.

@SparkQA

SparkQA commented Mar 14, 2016

Test build #53048 has finished for PR 11687 at commit 73344df.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@brkyvz
Contributor

brkyvz commented Mar 14, 2016

I'm so happy this issue is finally resolved on the Py4J side. Thanks @JoshRosen for the update.

@SparkQA

SparkQA commented Mar 14, 2016

Test build #53074 has finished for PR 11687 at commit 650d589.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Member

zsxwing commented Mar 14, 2016

LGTM

@JoshRosen
Contributor Author

Thanks @zsxwing. Merging this into master now.

@asfgit asfgit closed this in 07cb323 Mar 14, 2016
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Mar 17, 2016
…oading issue

This patch upgrades Py4J from 0.9.1 to 0.9.2 in order to include a patch which modifies Py4J to use the current thread's ContextClassLoader when performing reflection / class loading. This is necessary in order to fix [SPARK-5185](https://issues.apache.org/jira/browse/SPARK-5185), a longstanding issue affecting the use of `--jars` and `--packages` in PySpark.

In order to demonstrate that the fix works, I removed the workarounds which were added as part of [SPARK-6027](https://issues.apache.org/jira/browse/SPARK-6027) / apache#4779 and other patches.

Py4J diff: py4j/py4j@0.9.1...0.9.2

/cc zsxwing tdas davies brkyvz

Author: Josh Rosen <[email protected]>

Closes apache#11687 from JoshRosen/py4j-0.9.2.
roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
…oading issue

@JoshRosen JoshRosen deleted the py4j-0.9.2 branch August 29, 2016 19:27