Conversation

@cloud-fan
Contributor

What changes were proposed in this pull request?

While working on #24129, I realized that I missed some documentation fixes in #24285. This PR covers all of them.

How was this patch tested?

N/A

@SparkQA

SparkQA commented Apr 4, 2019

Test build #104272 has finished for PR 24295 at commit 461e4ad.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member

viirya commented Apr 4, 2019

retest this please.

@SparkQA

SparkQA commented Apr 4, 2019

Test build #104282 has finished for PR 24295 at commit 461e4ad.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Member

kiszk commented Apr 4, 2019

I think the following errors are not related to this PR:

======================================================================
ERROR: test_create_dataframe_from_pandas_with_timestamp (pyspark.sql.tests.test_dataframe.DataFrameTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/tests/test_dataframe.py", line 562, in test_create_dataframe_from_pandas_with_timestamp
    df = self.spark.createDataFrame(pdf)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/session.py", line 763, in createDataFrame
    data = self._convert_from_pandas(data, schema, timezone)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/session.py", line 506, in _convert_from_pandas
    s = _check_series_convert_timestamps_tz_local(series, timezone)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/types.py", line 1902, in _check_series_convert_timestamps_tz_local
    return _check_series_convert_timestamps_localize(s, timezone, None)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/types.py", line 1877, in _check_series_convert_timestamps_localize
    lambda ts: ts.tz_localize(from_tz, ambiguous=False).tz_convert(to_tz).tz_localize(None)
  File "/home/anaconda/lib/python2.7/site-packages/pandas/core/series.py", line 2294, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/src/inference.pyx", line 1207, in pandas.lib.map_infer (pandas/lib.c:66124)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/types.py", line 1878, in <lambda>
    if ts is not pd.NaT else pd.NaT)
  File "pandas/tslib.pyx", line 609, in pandas.tslib.Timestamp.tz_localize (pandas/tslib.c:13468)
  File "pandas/tslib.pyx", line 1768, in pandas.tslib.maybe_get_tz (pandas/tslib.c:32362)
  File "/home/anaconda/lib/python2.7/site-packages/pytz/__init__.py", line 178, in timezone
    raise UnknownTimeZoneError(zone)
UnknownTimeZoneError: 'US/Pacific-New'

======================================================================
ERROR: test_create_dateframe_from_pandas_with_dst (pyspark.sql.tests.test_dataframe.DataFrameTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/tests/test_dataframe.py", line 590, in test_create_dateframe_from_pandas_with_dst
    df = self.spark.createDataFrame(pdf)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/session.py", line 763, in createDataFrame
    data = self._convert_from_pandas(data, schema, timezone)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/session.py", line 506, in _convert_from_pandas
    s = _check_series_convert_timestamps_tz_local(series, timezone)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/types.py", line 1902, in _check_series_convert_timestamps_tz_local
    return _check_series_convert_timestamps_localize(s, timezone, None)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/types.py", line 1877, in _check_series_convert_timestamps_localize
    lambda ts: ts.tz_localize(from_tz, ambiguous=False).tz_convert(to_tz).tz_localize(None)
  File "/home/anaconda/lib/python2.7/site-packages/pandas/core/series.py", line 2294, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/src/inference.pyx", line 1207, in pandas.lib.map_infer (pandas/lib.c:66124)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/types.py", line 1878, in <lambda>
    if ts is not pd.NaT else pd.NaT)
  File "pandas/tslib.pyx", line 609, in pandas.tslib.Timestamp.tz_localize (pandas/tslib.c:13468)
  File "pandas/tslib.pyx", line 1768, in pandas.tslib.maybe_get_tz (pandas/tslib.c:32362)
  File "/home/anaconda/lib/python2.7/site-packages/pytz/__init__.py", line 178, in timezone
    raise UnknownTimeZoneError(zone)
UnknownTimeZoneError: 'US/Pacific-New'

======================================================================
ERROR: test_to_pandas (pyspark.sql.tests.test_dataframe.DataFrameTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/tests/test_dataframe.py", line 522, in test_to_pandas
    pdf = self._to_pandas()
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/tests/test_dataframe.py", line 517, in _to_pandas
    return df.toPandas()
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/dataframe.py", line 2189, in toPandas
    _check_series_convert_timestamps_local_tz(pdf[field.name], timezone)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/types.py", line 1891, in _check_series_convert_timestamps_local_tz
    return _check_series_convert_timestamps_localize(s, None, timezone)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/types.py", line 1877, in _check_series_convert_timestamps_localize
    lambda ts: ts.tz_localize(from_tz, ambiguous=False).tz_convert(to_tz).tz_localize(None)
  File "/home/anaconda/lib/python2.7/site-packages/pandas/core/series.py", line 2294, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/src/inference.pyx", line 1207, in pandas.lib.map_infer (pandas/lib.c:66124)
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/types.py", line 1878, in <lambda>
    if ts is not pd.NaT else pd.NaT)
  File "pandas/tslib.pyx", line 649, in pandas.tslib.Timestamp.tz_convert (pandas/tslib.c:13923)
  File "pandas/tslib.pyx", line 407, in pandas.tslib.Timestamp.__new__ (pandas/tslib.c:10447)
  File "pandas/tslib.pyx", line 1467, in pandas.tslib.convert_to_tsobject (pandas/tslib.c:27504)
  File "pandas/tslib.pyx", line 1768, in pandas.tslib.maybe_get_tz (pandas/tslib.c:32362)
  File "/home/anaconda/lib/python2.7/site-packages/pytz/__init__.py", line 178, in timezone
    raise UnknownTimeZoneError(zone)
UnknownTimeZoneError: 'US/Pacific-New'
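
All three tracebacks end in the same timezone lookup failure, which suggests the cause is the build machine's timezone data rather than the documentation changes in this PR. The snippet below is only a rough sketch of how one might check whether a zone name resolves on a given machine. It uses Python 3.9+'s standard-library `zoneinfo` as a stand-in for the `pytz` lookup shown in the tracebacks (the Jenkins job ran Python 2.7 with `pytz`, so this is not the code that failed), and `is_known_timezone` is a hypothetical helper name.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def is_known_timezone(name):
    """Return True if the local tz database can resolve `name`.

    Hypothetical helper for illustration; not part of Spark or pytz.
    """
    try:
        ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError):
        # The zone is missing from the installed tz data, or the key is malformed.
        return False
    return True


# Whether 'US/Pacific-New' resolves depends on the tz database installed
# on the machine, so no particular result is assumed for it here.
for zone in ("US/Pacific-New", "No/Such_Zone"):
    print(zone, is_known_timezone(zone))
```

If a check like this gives different answers on different workers, that would be consistent with the failures being environmental.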

@kiszk
Member

kiszk commented Apr 4, 2019

retest this please

@SparkQA

SparkQA commented Apr 4, 2019

Test build #104289 has finished for PR 24295 at commit 461e4ad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan closed this in f7bd1ab Apr 4, 2019

@cloud-fan
Contributor Author

Thanks, merging to master!

mccheah pushed a commit to palantir/spark that referenced this pull request May 24, 2019
## What changes were proposed in this pull request?

While working on apache#24129, I realized that I missed some documentation fixes in apache#24285. This PR covers all of them.

## How was this patch tested?

N/A

Author: Wenchen Fan <[email protected]>

Closes apache#24295 from cloud-fan/doc.