@phofl commented Feb 14, 2022

@phofl added the Index (Related to the Index class or subclasses) and Indexing (Related to indexing on series/frames, not to indexes themselves) labels Feb 14, 2022
# time below from 3.8 ms to 496 µs
# if we already have ndarray[bool], the overhead is 1.4 µs or .25%
key = np.asarray(key, dtype=bool)
if is_extension_array_dtype(getattr(key, "dtype", None)):
Member

Would it be clearer or more performant to do isinstance(key, ExtensionArray)?

Member Author

I thought about this too, but unfortunately it does not work, since key is a Series here.
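
To illustrate the reply above: a Series with a nullable boolean dtype is not itself an ExtensionArray instance, although its dtype is an extension dtype. This is a minimal sketch; the variable names are illustrative and not taken from the PR.

```python
import pandas as pd
from pandas.api.extensions import ExtensionArray
from pandas.api.types import is_extension_array_dtype

# A Series key with the nullable "boolean" dtype, as in the test for GH#45806.
key = pd.Series([True, False, pd.NA], dtype="boolean")

# The Series wraps an ExtensionArray but is not one itself.
is_ea_instance = isinstance(key, ExtensionArray)

# The dtype-based check works regardless of the Series wrapper.
has_ea_dtype = is_extension_array_dtype(getattr(key, "dtype", None))

# The underlying values, exposed via .array, are an ExtensionArray.
values_are_ea = isinstance(key.array, ExtensionArray)

print(is_ea_instance, has_ea_dtype, values_are_ea)  # False True True
```

The getattr guard also keeps the check safe for keys without a dtype attribute, such as a plain list of bools.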

# GH#45806
ser = pd.Series([True, False, pd.NA], dtype="boolean")
result = ser.index[ser]
expected = Index([0])
Member

@jbrockmendel Feb 15, 2022

Is there a discussion where it was decided that this is the correct behavior? It could plausibly raise instead.

Could we also use ser._values or Index(ser) for the key?

Member Author

This works for a regular bool Series, so I thought this should work too. We can add those test cases as well, but if a Series key should work, we have to keep this one.
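
For comparison, here is a small sketch of the two cases mentioned above. It converts the keys explicitly rather than relying on Index.__getitem__, so it does not depend on the pandas version. Treating pd.NA as False is an assumption made for illustration, chosen to match the expected value in the test.

```python
import numpy as np
import pandas as pd

idx = pd.Index([0, 1, 2])

# A regular NumPy-backed bool Series converts cleanly to ndarray[bool].
plain = pd.Series([True, False, False])
plain_result = idx[np.asarray(plain, dtype=bool)]

# With the nullable dtype, pd.NA has no boolean value, so the conversion
# needs an explicit fill. Using False for NA is an assumption here.
nullable = pd.Series([True, False, pd.NA], dtype="boolean")
filled = nullable.to_numpy(dtype=bool, na_value=False)
nullable_result = idx[filled]

print(plain_result.tolist(), nullable_result.tolist())  # [0] [0]
```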

Member

@phofl just so I'm clear, your comment is responding to the "could also use..." part of mine? If so, that's fine. I'm still unclear on the other part: why aren't we raising on pd.NA?

Member Author

Ah, now I get you.

Why should we raise on pd.NA?

Member

> Why should we raise on pd.NA?

I think of idx[mask] as akin to [idx[n] for n in range(len(mask)) if mask[n]], which would raise on pd.NA.
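
The analogy above can be checked directly. This is a rough sketch with illustrative names: the comprehension raises because pd.NA has no truth value, and filling NA with False first (an assumption, not necessarily the semantics everyone in the thread agrees on) yields the value the test expects.

```python
import pandas as pd

idx = pd.Index([0, 1, 2])
mask = pd.Series([True, False, pd.NA], dtype="boolean")

# Element-wise selection evaluates `if mask[n]`, which calls bool(pd.NA)
# for the last element and raises TypeError.
try:
    picked = [idx[n] for n in range(len(mask)) if mask[n]]
    raised = False
except TypeError:
    raised = True

# Filling NA with False first makes the same comprehension succeed.
filled = mask.fillna(False)
picked_filled = [int(idx[n]) for n in range(len(filled)) if filled[n]]

print(raised, picked_filled)  # True [0]
```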

@jreback added this to the 1.5 milestone Feb 16, 2022
@jreback merged commit d633abd into pandas-dev:main Feb 16, 2022
@phofl deleted the 45806 branch February 16, 2022 18:47
Development

Successfully merging this pull request may close these issues.

BUG: Cannot slice index using boolean type series

3 participants