diff --git a/doc/source/whatsnew/v0.17.0.txt b/doc/source/whatsnew/v0.17.0.txt
index 5d9e415b85acf..2c192ee33061b 100644
--- a/doc/source/whatsnew/v0.17.0.txt
+++ b/doc/source/whatsnew/v0.17.0.txt
@@ -21,9 +21,14 @@ users upgrade to this version.
 
    After installing pandas-datareader, you can easily change your imports:
 
-   .. code-block:: Python
+   .. code-block:: python
+
+     from pandas.io import data, wb
+
+   becomes
+
+   .. code-block:: python
 
-     from pandas.io import data, wb # becomes
      from pandas_datareader import data, wb
 
 Highlights include:
@@ -53,44 +58,60 @@ Check the :ref:`API Changes ` and :ref:`deprecations ` for more details. (:issue:`8260`, :issue:`10763`, :issue:`11034`).
 
-     df1 = pd.DataFrame({'col1':[0,1], 'col_left':['a','b']})
-     df2 = pd.DataFrame({'col1':[1,2,2],'col_right':[2,2,2]})
-     pd.merge(df1, df2, on='col1', how='outer', indicator=True)
+The new implementation allows for having a single-timezone across all rows, with operations in a performant manner.
 
-  For more, see the :ref:`updated docs `
+.. ipython:: python
 
-- ``DataFrame`` has gained the ``nlargest`` and ``nsmallest`` methods (:issue:`10393`)
-- SQL io functions now accept a SQLAlchemy connectable. (:issue:`7877`)
-- Enable writing complex values to HDF stores when using table format (:issue:`10447`)
-- Enable reading gzip compressed files via URL, either by explicitly setting the compression parameter or by inferring from the presence of the HTTP Content-Encoding header in the response (:issue:`8685`)
-- Add a ``limit_direction`` keyword argument that works with ``limit`` to enable ``interpolate`` to fill ``NaN`` values forward, backward, or both (:issue:`9218` and :issue:`10420`)
+   df = DataFrame({'A' : date_range('20130101',periods=3),
+                   'B' : date_range('20130101',periods=3,tz='US/Eastern'),
+                   'C' : date_range('20130101',periods=3,tz='CET')})
+   df
+   df.dtypes
 
-  .. ipython:: python
+.. ipython:: python
 
-     ser = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, np.nan, 13])
-     ser.interpolate(limit=1, limit_direction='both')
+   df.B
+   df.B.dt.tz_localize(None)
 
-- Round DataFrame to variable number of decimal places (:issue:`10568`).
+This uses a new-dtype representation as well, that is very similar in look-and-feel to its numpy cousin ``datetime64[ns]``
 
-  .. ipython :: python
+.. ipython:: python
 
-     df = pd.DataFrame(np.random.random([3, 3]), columns=['A', 'B', 'C'],
-                       index=['first', 'second', 'third'])
-     df
-     df.round(2)
-     df.round({'A': 0, 'C': 2})
+   df['B'].dtype
+   type(df['B'].dtype)
+
+.. note::
+
+   There is a slightly different string repr for the underlying ``DatetimeIndex`` as a result of the dtype changes, but
+   functionally these are the same.
+
+   Previous Behavior:
+
+   .. code-block:: python
+
+      In [1]: pd.date_range('20130101',periods=3,tz='US/Eastern')
+      Out[1]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00',
+                             '2013-01-03 00:00:00-05:00'],
+                            dtype='datetime64[ns]', freq='D', tz='US/Eastern')
+
+      In [2]: pd.date_range('20130101',periods=3,tz='US/Eastern').dtype
+      Out[2]: dtype('`
+
+- ``DataFrame`` has gained the ``nlargest`` and ``nsmallest`` methods (:issue:`10393`)
+- SQL io functions now accept a SQLAlchemy connectable. (:issue:`7877`)
+- Enable writing complex values to HDF stores when using table format (:issue:`10447`)
+- Enable reading gzip compressed files via URL, either by explicitly setting the compression parameter or by inferring from the presence of the HTTP Content-Encoding header in the response (:issue:`8685`)
+- Add a ``limit_direction`` keyword argument that works with ``limit`` to enable ``interpolate`` to fill ``NaN`` values forward, backward, or both (:issue:`9218` and :issue:`10420`)
+
+  .. ipython:: python
+
+     ser = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, np.nan, 13])
+     ser.interpolate(limit=1, limit_direction='both')
+
+- Round DataFrame to variable number of decimal places (:issue:`10568`).
+
+  .. ipython :: python
+
+     df = pd.DataFrame(np.random.random([3, 3]), columns=['A', 'B', 'C'],
+                       index=['first', 'second', 'third'])
+     df
+     df.round(2)
+     df.round({'A': 0, 'C': 2})
+
 - ``pd.read_sql`` and ``to_sql`` can accept database URI as ``con`` parameter (:issue:`10214`)
 - Enable ``pd.read_hdf`` to be used without specifying a key when the HDF file contains a single dataset (:issue:`10443`)
 - Enable writing Excel files in :ref:`memory <_io.excel_writing_buffer>` using StringIO/BytesIO (:issue:`7074`)
@@ -321,13 +382,15 @@ Other enhancements
      Timestamp('2014')
      DatetimeIndex(['2012Q2', '2014'])
 
-  .. note:: If you want to perform calculations based on today's date, use ``Timestamp.now()`` and ``pandas.tseries.offsets``.
+  .. note::
 
-    .. ipython:: python
+     If you want to perform calculations based on today's date, use ``Timestamp.now()`` and ``pandas.tseries.offsets``.
 
-      import pandas.tseries.offsets as offsets
-      Timestamp.now()
-      Timestamp.now() + offsets.DateOffset(years=1)
+     .. ipython:: python
+
+        import pandas.tseries.offsets as offsets
+        Timestamp.now()
+        Timestamp.now() + offsets.DateOffset(years=1)
 
 - ``to_datetime`` can now accept ``yearfirst`` keyword (:issue:`7599`)
 
@@ -411,6 +474,9 @@ Other enhancements
 
      pd.concat([foo, bar, baz], 1)
 
+- Allow passing `kwargs` to the interpolation methods (:issue:`10378`).
+- Improved error message when concatenating an empty iterable of dataframes (:issue:`9157`)
+
 
 .. _whatsnew_0170.api:
 
@@ -516,60 +582,6 @@ To keep the previous behaviour, you can use ``errors='ignore'``:
 Furthermore, ``pd.to_timedelta`` has gained a similar API, of ``errors='raise'|'ignore'|'coerce'``, and the ``coerce`` keyword
 has been deprecated in favor of ``errors='coerce'``.
 
-.. _whatsnew_0170.tz:
-
-Datetime with TZ
-~~~~~~~~~~~~~~~~
-
-We are adding an implementation that natively supports datetime with timezones. A ``Series`` or a ``DataFrame`` column previously
-*could* be assigned a datetime with timezones, and would work as an ``object`` dtype. This had performance issues with a large
-number rows. See the :ref:`docs ` for more details. (:issue:`8260`, :issue:`10763`, :issue:`11034`).
-
-The new implementation allows for having a single-timezone across all rows, with operations in a performant manner.
-
-.. ipython:: python
-
-   df = DataFrame({'A' : date_range('20130101',periods=3),
-                   'B' : date_range('20130101',periods=3,tz='US/Eastern'),
-                   'C' : date_range('20130101',periods=3,tz='CET')})
-   df
-   df.dtypes
-
-.. ipython:: python
-
-   df.B
-   df.B.dt.tz_localize(None)
-
-This uses a new-dtype representation as well, that is very similar in look-and-feel to its numpy cousin ``datetime64[ns]``
-
-.. ipython:: python
-
-   df['B'].dtype
-   type(df['B'].dtype)
-
-.. note::
-
-   There is a slightly different string repr for the underlying ``DatetimeIndex`` as a result of the dtype changes, but
-   functionally these are the same.
-
-   Previous Behavior:
-
-   .. code-block:: python
-
-      In [1]: pd.date_range('20130101',periods=3,tz='US/Eastern')
-      Out[1]: DatetimeIndex(['2013-01-01 00:00:00-05:00', '2013-01-02 00:00:00-05:00',
-                             '2013-01-03 00:00:00-05:00'],
-                            dtype='datetime64[ns]', freq='D', tz='US/Eastern')
-
-      In [2]: pd.date_range('20130101',periods=3,tz='US/Eastern').dtype
-      Out[2]: dtype('