Commit 7227bd5
[SPARK-23517][PYTHON] Make
## What changes were proposed in this pull request?
This PR proposes for `pyspark.util._exception_message` to produce the trace from Java side by `Py4JJavaError`.
Currently, in Python 2, it uses `message` attribute which `Py4JJavaError` didn't happen to have:
```python
>>> from pyspark.util import _exception_message
>>> try:
... sc._jvm.java.lang.String(None)
... except Exception as e:
... pass
...
>>> e.message
''
```
Seems we should use `str` instead for now:
https://github.com/bartdag/py4j/blob/aa6c53b59027925a426eb09b58c453de02c21b7c/py4j-python/src/py4j/protocol.py#L412
but this doesn't address the problem with non-ascii string from Java side -
`https://github.com/bartdag/py4j/issues/306`
So, we could directly call `__str__()`:
```python
>>> e.__str__()
u'An error occurred while calling None.java.lang.String.\n: java.lang.NullPointerException\n\tat java.lang.String.<init>(String.java:588)\n\tat sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)\n\tat sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)\n\tat sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)\n\tat java.lang.reflect.Constructor.newInstance(Constructor.java:422)\n\tat py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)\n\tat py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\n\tat py4j.Gateway.invoke(Gateway.java:238)\n\tat py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)\n\tat py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)\n\tat py4j.GatewayConnection.run(GatewayConnection.java:214)\n\tat java.lang.Thread.run(Thread.java:745)\n'
```
which doesn't type coerce unicodes to `str` in Python 2.
This can be actually a problem:
```python
from pyspark.sql.functions import udf
spark.conf.set("spark.sql.execution.arrow.enabled", True)
spark.range(1).select(udf(lambda x: [[]])()).toPandas()
```
**Before**
```
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/.../spark/python/pyspark/sql/dataframe.py", line 2009, in toPandas
raise RuntimeError("%s\n%s" % (_exception_message(e), msg))
RuntimeError:
Note: toPandas attempted Arrow optimization because 'spark.sql.execution.arrow.enabled' is set to true. Please set it to false to disable this.
```
**After**
```
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/.../spark/python/pyspark/sql/dataframe.py", line 2009, in toPandas
raise RuntimeError("%s\n%s" % (_exception_message(e), msg))
RuntimeError: An error occurred while calling o47.collectAsArrowToPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 0.0 failed 1 times, most recent failure: Lost task 7.0 in stage 0.0 (TID 7, localhost, executor driver): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
File "/.../spark/python/pyspark/worker.py", line 245, in main
process()
File "/.../spark/python/pyspark/worker.py", line 240, in process
...
Note: toPandas attempted Arrow optimization because 'spark.sql.execution.arrow.enabled' is set to true. Please set it to false to disable this.
```
## How was this patch tested?
Manually tested and unit tests were added.
Author: hyukjinkwon <[email protected]>
Closes apache#20680 from HyukjinKwon/SPARK-23517.
(cherry picked from commit fab563b)
Signed-off-by: hyukjinkwon <[email protected]>pyspark.util._exception_message produce the trace from Java side by Py4JJavaError1 parent 7f07685 commit 7227bd5
2 files changed
+18
-0
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2286 | 2286 | | |
2287 | 2287 | | |
2288 | 2288 | | |
| 2289 | + | |
| 2290 | + | |
| 2291 | + | |
| 2292 | + | |
| 2293 | + | |
| 2294 | + | |
| 2295 | + | |
| 2296 | + | |
| 2297 | + | |
| 2298 | + | |
| 2299 | + | |
2289 | 2300 | | |
2290 | 2301 | | |
2291 | 2302 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
15 | 15 | | |
16 | 16 | | |
17 | 17 | | |
| 18 | + | |
18 | 19 | | |
19 | 20 | | |
20 | 21 | | |
| |||
33 | 34 | | |
34 | 35 | | |
35 | 36 | | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
36 | 43 | | |
37 | 44 | | |
38 | 45 | | |
| |||
0 commit comments