Commit 04836ba

soxofaan authored and HyukjinKwon committed
[SPARK-41989][PYTHON] Avoid breaking logging config from pyspark.pandas
### What changes were proposed in this pull request?

See https://issues.apache.org/jira/browse/SPARK-41989 for an in-depth explanation.

Short summary: `pyspark/pandas/__init__.py` uses, at import time, `logging.warning()`, which might silently call `logging.basicConfig()`. So by importing `pyspark.pandas` (directly or indirectly) a user might unknowingly break their own logging setup (e.g. when based on `logging.basicConfig()` or related). `logging.getLogger(...).warning()` does not trigger this behavior.

### Does this PR introduce _any_ user-facing change?

User-defined logging setups will be more predictable.

### How was this patch tested?

Manual testing so far. I'm not sure it's worthwhile to cover this with a unit test.

Closes #39516 from soxofaan/SPARK-41989-pyspark-pandas-logging-setup.

Authored-by: Stefaan Lippens <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
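The behavior described in the summary can be reproduced with the standard library alone. The following is a minimal sketch, not code from this commit: the helper `probe` and the three `via_*` functions are hypothetical names introduced for illustration. It temporarily empties the root logger's handlers to simulate a fresh process, emits a message in one of three ways, and then runs a user-style `logging.basicConfig(level=logging.DEBUG)` to see whether that call still takes effect.

```python
import logging
import warnings


def probe(emit):
    """Return (root handlers installed by emit, whether a later basicConfig took effect)."""
    root = logging.getLogger()
    saved_handlers, saved_level = root.handlers[:], root.level
    root.handlers.clear()           # simulate a process with no logging setup yet
    root.setLevel(logging.WARNING)  # the root logger's default level
    try:
        emit()
        installed = len(root.handlers)
        # The user's own setup, executed after the library was imported.
        logging.basicConfig(level=logging.DEBUG)
        return installed, root.level == logging.DEBUG
    finally:
        # Restore whatever logging state the caller had before the probe.
        root.handlers[:] = saved_handlers
        root.setLevel(saved_level)


def via_module_function():
    # Module-level helper: calls basicConfig() if the root logger has no handlers.
    logging.warning("emitted via logging.warning()")


def via_named_logger():
    # Named logger: falls back to the last-resort handler, installs nothing.
    logging.getLogger("demo").warning("emitted via a named logger")


def via_warnings():
    # warnings.warn() does not touch the logging configuration by default.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        warnings.warn("emitted via warnings.warn()")


if __name__ == "__main__":
    for emit in (via_module_function, via_named_logger, via_warnings):
        print(emit.__name__, probe(emit))
```

Only `via_module_function` leaves a handler on the root logger, which is why the user's later `basicConfig()` call becomes a no-op in that case (unless it is called with `force=True`).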
1 parent f5924cf commit 04836ba

1 file changed: +1 -3 lines changed


python/pyspark/pandas/__init__.py

Lines changed: 1 addition & 3 deletions
@@ -47,9 +47,7 @@
     LooseVersion(pyarrow.__version__) >= LooseVersion("2.0.0")
     and "PYARROW_IGNORE_TIMEZONE" not in os.environ
 ):
-    import logging
-
-    logging.warning(
+    warnings.warn(
         "'PYARROW_IGNORE_TIMEZONE' environment variable was not set. It is required to "
         "set this environment variable to '1' in both driver and executor sides if you use "
         "pyarrow>=2.0.0. "
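One practical consequence of emitting the message through `warnings.warn()` is that callers can intercept or filter it with the standard `warnings` machinery. The sketch below illustrates this under stated assumptions: `check_timezone_flag` and `captured_messages` are hypothetical helpers, the message text is shortened, and the environment is passed in as a plain mapping rather than read from `os.environ`. It is not the code in `pyspark/pandas/__init__.py`.

```python
import warnings


def check_timezone_flag(environ):
    """Warn when the flag is missing (hypothetical stand-in for the import-time check)."""
    if "PYARROW_IGNORE_TIMEZONE" not in environ:
        warnings.warn(
            "'PYARROW_IGNORE_TIMEZONE' environment variable was not set.",
            stacklevel=2,  # attribute the warning to the caller of this helper
        )


def captured_messages(environ):
    """Run the check and return the warning messages it produced."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is suppressed while recording
        check_timezone_flag(environ)
    return [str(w.message) for w in caught]


if __name__ == "__main__":
    print(captured_messages({}))
    print(captured_messages({"PYARROW_IGNORE_TIMEZONE": "1"}))
```

Because no category is passed, the warning defaults to `UserWarning`, so a user who wants to silence it can do so with an ordinary warnings filter instead of changing their logging configuration.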
