Commit 4e8907a

soxofaan authored and HyukjinKwon committed
[SPARK-41989][PYTHON] Avoid breaking logging config from pyspark.pandas
See https://issues.apache.org/jira/browse/SPARK-41989 for in depth explanation.

Short summary: `pyspark/pandas/__init__.py` uses, at import time, `logging.warning()` which might silently call `logging.basicConfig()`. So by importing `pyspark.pandas` (directly or indirectly) a user might unknowingly break their own logging setup (e.g. when based on `logging.basicConfig()` or related). `logging.getLogger(...).warning()` does not trigger this behavior.

User-defined logging setups will be more predictable.

Manual testing so far. I'm not sure it's worthwhile to cover this with a unit test.

Closes apache#39516 from soxofaan/SPARK-41989-pyspark-pandas-logging-setup.

Authored-by: Stefaan Lippens <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
(cherry picked from commit 04836ba)
Signed-off-by: Hyukjin Kwon <[email protected]>
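The side effect described in the summary can be reproduced without Spark. The sketch below is illustrative only and is not part of the patch: the helper names `reset_root_logger` and `root_handler_count` are made up for this example. It compares a module-level `logging.warning()` call with `warnings.warn()` (the call the diff switches to) and then checks whether a later, user-issued `logging.basicConfig(level=logging.DEBUG)` still takes effect.

```python
import logging
import warnings


def root_handler_count() -> int:
    """Number of handlers currently attached to the root logger."""
    return len(logging.getLogger().handlers)


def reset_root_logger() -> None:
    """Hypothetical helper: return the root logger to an unconfigured state."""
    root = logging.getLogger()
    for handler in list(root.handlers):
        root.removeHandler(handler)
    root.setLevel(logging.WARNING)


# Scenario 1: a library calls logging.warning() while the root logger has no handlers.
reset_root_logger()
logging.warning("emitted at import time")  # implicitly runs logging.basicConfig()
handlers_after_logging_warning = root_handler_count()

# The user's own setup now does nothing, because the root logger already has a handler.
logging.basicConfig(level=logging.DEBUG)
level_after_late_basicconfig = logging.getLogger().level

# Scenario 2: the library uses warnings.warn() instead.
reset_root_logger()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn("emitted at import time")
handlers_after_warnings_warn = root_handler_count()

# The user's setup is applied as intended.
logging.basicConfig(level=logging.DEBUG)
level_after_user_basicconfig = logging.getLogger().level

print(handlers_after_logging_warning, logging.getLevelName(level_after_late_basicconfig))
print(handlers_after_warnings_warn, logging.getLevelName(level_after_user_basicconfig))
```

In the first scenario the root logger ends up with a default handler and keeps its `WARNING` level, so the user's later `basicConfig()` call is silently ignored. In the second scenario the root logger is untouched until the user configures it.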
1 parent 1421811 commit 4e8907a

1 file changed: +1 -3 lines changed

python/pyspark/pandas/__init__.py

Lines changed: 1 addition & 3 deletions
@@ -44,9 +44,7 @@
     LooseVersion(pyarrow.__version__) >= LooseVersion("2.0.0")
     and "PYARROW_IGNORE_TIMEZONE" not in os.environ
 ):
-    import logging
-
-    logging.warning(
+    warnings.warn(
         "'PYARROW_IGNORE_TIMEZONE' environment variable was not set. It is required to "
         "set this environment variable to '1' in both driver and executor sides if you use "
         "pyarrow>=2.0.0. "
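Because the message now goes through the `warnings` module, callers can handle it with the standard warning filters, or avoid it by setting the variable before the check runs. The following is a rough sketch under stated assumptions: `check_pyarrow_timezone_env` is a hypothetical stand-in for the import-time check in `pyspark/pandas/__init__.py` (it omits the pyarrow version test and shortens the message), so the example runs without Spark or pyarrow installed.

```python
import os
import warnings


def check_pyarrow_timezone_env() -> None:
    """Hypothetical stand-in for the import-time check shown in the diff."""
    if "PYARROW_IGNORE_TIMEZONE" not in os.environ:
        warnings.warn("'PYARROW_IGNORE_TIMEZONE' environment variable was not set.")


# Remember the caller's value so the demonstration leaves the environment unchanged.
saved = os.environ.pop("PYARROW_IGNORE_TIMEZONE", None)
try:
    # Without any filter, the check emits exactly one warning.
    with warnings.catch_warnings(record=True) as unfiltered:
        warnings.simplefilter("always")
        check_pyarrow_timezone_env()

    # A message-based filter suppresses only this warning.
    with warnings.catch_warnings(record=True) as filtered:
        warnings.simplefilter("always")
        warnings.filterwarnings("ignore", message=".*PYARROW_IGNORE_TIMEZONE.*")
        check_pyarrow_timezone_env()

    # Setting the variable first means the check has nothing to report.
    os.environ["PYARROW_IGNORE_TIMEZONE"] = "1"
    with warnings.catch_warnings(record=True) as with_env:
        warnings.simplefilter("always")
        check_pyarrow_timezone_env()
finally:
    os.environ.pop("PYARROW_IGNORE_TIMEZONE", None)
    if saved is not None:
        os.environ["PYARROW_IGNORE_TIMEZONE"] = saved

print(len(unfiltered), len(filtered), len(with_env))
```

Setting the variable is the remedy the warning text itself asks for. The filter only hides the message and does not change pyarrow's behavior.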
