Merged
Changes from 5 commits
14 changes: 14 additions & 0 deletions doc/source/whatsnew/v1.1.0.rst
@@ -13,6 +13,20 @@ including other versions of pandas.
Enhancements
~~~~~~~~~~~~

.. _whatsnew_110.specify_missing_labels:

KeyErrors raised by loc specify missing labels
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Previously, if labels were missing for a ``.loc`` call, a ``KeyError`` was raised stating that this was no longer supported.

Now the error message also includes a list of the missing labels. For example,

Contributor:

Can you add the issue number here?

Contributor Author:

done

Contributor:

Can you format it like the other issue references, e.g. :issue:`34272`?

Contributor Author (@timhunderwood, Jun 26, 2020):

@jreback - sure, this is now done.

.. code-block:: ipython

s = pd.Series({"a": 1, "b": 2, "c": 3})
s.loc[["a", "b", "missing_0", "c", "missing_1", "missing_2"]]

Contributor:

You don't need a section for this, just a note in the what's new.

Contributor Author:

sure, I've removed the code block and just left the text entry in the what's new.

...
KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported. The following labels were missing: ['missing_0', 'missing_1', 'missing_2']. See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike"
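
For illustration, the behaviour described in this entry can be reproduced with a short script. This is a minimal sketch rather than part of the PR. It assumes pandas 1.1 or later is installed, and it only checks that the missing labels appear somewhere in the message, since the exact wording may differ between releases.

```python
import pandas as pd

s = pd.Series({"a": 1, "b": 2, "c": 3})
requested = ["a", "b", "missing_0", "c", "missing_1", "missing_2"]

# Selecting with a list that contains unknown labels raises KeyError.
try:
    s.loc[requested]
except KeyError as err:
    message = str(err)
else:
    message = ""

# Labels that are not in the index; these should be named in the message.
missing_labels = [label for label in requested if label not in s.index]
print(missing_labels)
print(message)
```

Catching the exception and inspecting `str(err)` mirrors what the test added in this PR does with `pytest.raises`.
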
.. _whatsnew_110.astype_string:

All dtypes can now be converted to ``StringDtype``
9 changes: 6 additions & 3 deletions pandas/core/indexing.py
@@ -1283,7 +1283,8 @@ def _validate_read_indexer(
return

# Count missing values:
- missing = (indexer < 0).sum()
+ missing_mask = indexer < 0
+ missing = (missing_mask).sum()

if missing:
if missing == len(indexer):
@@ -1302,10 +1303,12 @@ def _validate_read_indexer(
# code, so we want to avoid warning & then
# just raising
if not ax.is_categorical():
+ not_found = list(key[missing_mask])

Contributor:

Instead of doing this, you can create an Index, which has the natural formatters.

Contributor Author (@timhunderwood, Jun 21, 2020):

Done, and I now also use the option context to limit the number of items displayed.

Note, the if raise_missing block just above this uses a list to print the missing keys here. Independently of this PR, if raise_missing==False, an error would be raised directly below regardless (unless the axis is categorical). Not sure if this is the expected/desired behaviour? Perhaps one for another ticket/PR.
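
As a rough sketch of the idea discussed here, an `Index` can be formatted inside an option context so that long lists of labels are abbreviated. The thread does not say which option the PR ended up using, so `display.max_seq_items` and the value 4 below are assumptions chosen for illustration only.

```python
import pandas as pd

# Fifty labels, fewer than the default display.max_seq_items limit,
# so the default repr lists every one of them.
labels = pd.Index([f"missing_{i}" for i in range(50)])
full = repr(labels)

# Assumed option: limit how many items the Index repr shows.
with pd.option_context("display.max_seq_items", 4):
    truncated = repr(labels)

print(truncated)
print(len(full), len(truncated))
```

The option only applies inside the `with` block, so the user's global display settings are left untouched.
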

raise KeyError(
"Passing list-likes to .loc or [] with any missing labels "
- "is no longer supported, see "
- "https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike" # noqa:E501
+ "is no longer supported. "
+ f"The following labels were missing: {not_found}. "
+ "See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike" # noqa:E501
)
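
The change in this hunk can be sketched without pandas. The helper below is hypothetical (`build_missing_labels_message` is not a pandas function) and uses plain lists in place of the NumPy boolean mask. It assumes, as in the diff, that a negative position in `indexer` marks a label that was not found.

```python
DOCS_URL = (
    "https://pandas.pydata.org/pandas-docs/stable/user_guide/"
    "indexing.html#deprecate-loc-reindex-listlike"
)


def build_missing_labels_message(key, indexer):
    """Return the error text for missing labels, or None if all were found."""
    # Plain-list stand-in for `missing_mask = indexer < 0`.
    missing_mask = [position < 0 for position in indexer]
    # Plain-list stand-in for `key[missing_mask]`.
    not_found = [label for label, is_missing in zip(key, missing_mask) if is_missing]
    if not not_found:
        return None
    return (
        "Passing list-likes to .loc or [] with any missing labels "
        "is no longer supported. "
        f"The following labels were missing: {not_found}. "
        f"See {DOCS_URL}"
    )


key = ["a", "b", "missing_0", "c", "missing_1"]
indexer = [0, 1, -1, 2, -1]  # -1 marks a label that is absent from the axis
message = build_missing_labels_message(key, indexer)
print(message)
```

Keeping the mask as its own variable is what lets the same comparison serve both the count of missing labels and the list of which ones they were.
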


9 changes: 9 additions & 0 deletions pandas/tests/indexing/test_indexing.py
@@ -1075,3 +1075,12 @@ def test_setitem_with_bool_mask_and_values_matching_n_trues_in_length():
result = ser
expected = pd.Series([None] * 3 + list(range(5)) + [None] * 2).astype("object")
tm.assert_series_equal(result, expected)


def test_missing_labels_inside_loc():

Contributor:

Can you also try with a lot of missing labels (testing that the message is constrained)?

You will need to set the max line width via an option context.

Contributor Author:

Good point, done. I added a test for many labels and a test for very long labels.

Note that because we are raising a KeyError, the message is wrapped in quotes (see Stack Overflow and https://bugs.python.org/issue2651). This means the \n characters appear as literal escape sequences in the message.

The pre-existing error message text here is also quite long as it includes a link to the docs and text on "no longer supported". Let me know if you think it should be shortened.
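
The quoting behaviour mentioned above can be seen with the standard library alone. In this small sketch the message text is made up for the example: `str()` on a `KeyError` with a single argument returns the `repr` of that argument, so a newline shows up as a backslash followed by `n`.

```python
message = "The following labels were missing:\n['missing_0']"
err = KeyError(message)

# With a single argument, KeyError renders as the repr of that argument.
rendered = str(err)
print(rendered)

# The original, unescaped text is still available on the exception.
original = err.args[0]
print(original)
```
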

# GH34272
s = pd.Series({"a": 1, "b": 2, "c": 3})
with pytest.raises(KeyError) as e:
s.loc[["a", "b", "missing_0", "c", "missing_1", "missing_2"]]
missing_labels = ["missing_0", "missing_1", "missing_2"]
assert all(missing_label in str(e.value) for missing_label in missing_labels)