Merged
Changes from 3 commits
3 changes: 2 additions & 1 deletion doc/source/categorical.rst
@@ -1145,7 +1145,8 @@ dtype in apply

Pandas currently does not preserve the dtype in apply functions: If you apply along rows you get
a `Series` of ``object`` `dtype` (same as getting a row -> getting one element will return a
-basic type) and applying along columns will also convert to object.
+basic type) and applying along columns will also convert to object. ``NaN`` values are unaffected.
+You can use ``fillna`` to handle missing values before applying a function.
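
A minimal sketch of the ``fillna`` suggestion above, assuming a small hypothetical frame: the column names `a` and `cats` and the placeholder category `"missing"` are illustrative and do not come from the docs. A categorical column only accepts fill values that are already categories, so the placeholder is registered first.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one numeric column and one categorical column
# that contains a missing value.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "cats": pd.Categorical(["x", "y", np.nan]),
})

# Register the placeholder as a category, then fill the missing entry with it.
df["cats"] = df["cats"].cat.add_categories(["missing"]).fillna("missing")

# Row-wise apply now sees a plain string instead of NaN in every row.
labels = df.apply(lambda row: "{}-{}".format(row["a"], row["cats"]), axis=1)
print(labels.tolist())
```

Whether a placeholder category is appropriate depends on the function being applied; dropping the rows with `dropna` is an alternative when missing values should not take part at all.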

.. ipython:: python

1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.24.0.rst
@@ -1273,6 +1273,7 @@ Categorical
- Bug when resampling :meth:`Dataframe.resample()` and aggregating on categorical data, the categorical dtype was getting lost. (:issue:`23227`)
- Bug in many methods of the ``.str``-accessor, which always failed on calling the ``CategoricalIndex.str`` constructor (:issue:`23555`, :issue:`23556`)
- Bug in :meth:`Series.where` losing the categorical dtype for categorical data (:issue:`24077`)
+- Bug in :meth:`Categorical.apply` where ``NaN`` values could be handled unpredictably. They now remain unchanged (:issue:`24241`)

Datetimelike
^^^^^^^^^^^^
5 changes: 4 additions & 1 deletion pandas/core/arrays/categorical.py
@@ -1166,7 +1166,7 @@ def map(self, mapper):
         Maps the categories to new categories. If the mapping correspondence is
         one-to-one the result is a :class:`~pandas.Categorical` which has the
         same order property as the original, otherwise a :class:`~pandas.Index`
-        is returned.
+        is returned. NaN values are unaffected.

         If a `dict` or :class:`~pandas.Series` is used any unmapped category is
         mapped to `NaN`. Note that if this happens an :class:`~pandas.Index`
@@ -1234,6 +1234,9 @@ def map(self, mapper):
                                    categories=new_categories,
                                    ordered=self.ordered)
         except ValueError:
+            if any(self._codes == -1):

Review comment (Contributor): Should this use np.any? I think any will short-circuit, but np.any will likely be faster.

+                new_categories = new_categories.insert(len(new_categories),
+                                                       np.nan)
             return np.take(new_categories, self._codes)

     __eq__ = _cat_compare_op('__eq__')
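
The branch added in this hunk relies on missing values being stored as code `-1`. Appending `NaN` as the last category lets `take` resolve `-1` to `NaN` through ordinary negative indexing. The sketch below shows that idea, together with the reviewer's `any` versus `np.any` point, outside pandas internals. The `codes` and `new_categories` values are made-up stand-ins, and `Index.take` is used here in place of the `np.take` call in the diff.

```python
import numpy as np
import pandas as pd

# Illustrative stand-ins for self._codes and the mapped categories.
codes = np.array([0, 1, -1])               # -1 marks a missing value
new_categories = pd.Index([False, False])  # result of a many-to-one mapping

# Both calls answer "is any code -1?". np.any evaluates the boolean array
# in one vectorized call; the builtin iterates over it element by element.
has_missing_builtin = any(codes == -1)
has_missing_numpy = bool(np.any(codes == -1))

if has_missing_numpy:
    # Put NaN in the last slot so that code -1 selects it.
    new_categories = new_categories.insert(len(new_categories), np.nan)

result = new_categories.take(codes)
print(result)
```

Without the `insert` step, code `-1` would silently pick the last real category, which is presumably the unpredictable handling the whatsnew entry describes.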
23 changes: 23 additions & 0 deletions pandas/tests/indexes/test_category.py
@@ -311,6 +311,29 @@ def test_map_with_categorical_series(self):
         exp = pd.Index(["odd", "even", "odd", np.nan])
         tm.assert_index_equal(a.map(c), exp)

+    @pytest.mark.parametrize(
+        (
+            'data',
+            'f'
+        ),
+        (
+            ([1, 1, np.nan], pd.isna),
+            ([1, 2, np.nan], pd.isna),
+            ([1, 1, np.nan], {1: False}),
+            ([1, 2, np.nan], {1: False, 2: False}),
+            ([1, 1, np.nan], pd.Series([False, False])),
+            ([1, 2, np.nan], pd.Series([False, False, False]))
+        ))
+    def test_map_with_nan(self, data, f): # GH 24241
+        values = pd.Categorical(data)
+        result = values.map(f)
+        if data[1] == 1:
+            expected = pd.Categorical([False, False, np.nan])
+            tm.assert_categorical_equal(result, expected)
+        else:
+            expected = pd.Index([False, False, np.nan])
+            tm.assert_index_equal(result, expected)
+
     @pytest.mark.parametrize('klass', [list, tuple, np.array, pd.Series])
     def test_where(self, klass):
         i = self.create_index()
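
The behaviour that `test_map_with_nan` pins down can also be sketched outside the test suite. This version assumes a dict mapper with no `NaN` key and uses string data for readability instead of the integers in the test. Newer pandas releases may emit a `FutureWarning` about the default NaN handling of `Categorical.map`, so the sketch silences it.

```python
import warnings

import numpy as np
import pandas as pd

with warnings.catch_warnings():
    warnings.simplefilter("ignore", FutureWarning)

    # One-to-one mapping: the result stays a Categorical and NaN is kept.
    one_to_one = pd.Categorical(["a", "a", np.nan]).map({"a": False})

    # Many-to-one mapping: the result falls back to an Index, NaN still kept.
    many_to_one = pd.Categorical(["a", "b", np.nan]).map(
        {"a": False, "b": False}
    )

print(type(one_to_one).__name__, type(many_to_one).__name__)
```

Callable mappers such as `pd.isna` are left out on purpose, since how they treat missing values can depend on the pandas version in use.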