PERF: improve perf. of Categorical.searchsorted #28795
doc/source/whatsnew/v1.0.0.rst
Outdated
- Performance improvement in :meth:`DataFrame.corr` when ``method`` is ``"spearman"`` (:issue:`28139`)
- Performance improvement in :meth:`DataFrame.replace` when provided a list of values to replace (:issue:`28099`)
- Performance improvement in :meth:`DataFrame.select_dtypes` by using vectorization instead of iterating over a loop (:issue:`28317`)
- Performance improvement in :meth:`Categorical.searchsorted` and :meth:`CategoricalIndex.searchsorted` when searching for a single scalar value (:issue:`XXXXX`)
Just reference the PR as the issue
Yeah, fixed.
0f46d60 to 27bd6f7
jreback left a comment
lgtm, small comment, ping on green.
codes = codes[0] if is_scalar(value) else codes
if is_scalar(value):
lgtm, i would add a comment here that this is perf sensitive
Comments addressed.
thanks @topper-123
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff

Improves performance of Categorical.searchsorted by avoiding expensive data conversions.

Also, CategoricalIndex.searchsorted now calls self.values.searchsorted directly instead of going through algorithms.searchsorted, which always ends up calling self.values.searchsorted anyway. This brings the time down to 5.5 µs from 12 µs.
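
For readers unfamiliar with the idea, here is a rough pure-Python sketch of a scalar fast path for a sorted, code-backed container. It is not the pandas implementation: `ToyCategorical`, `_code_of`, and the `isinstance` scalar check are hypothetical stand-ins, and `bisect` takes the place of the array search. The sketch assumes the values are already sorted by category order.

```python
from bisect import bisect_left, bisect_right


class ToyCategorical:
    """Hypothetical stand-in: sorted integer codes plus a category lookup."""

    def __init__(self, values, categories):
        self.categories = list(categories)
        # Map each category to its integer code once, up front.
        self._code_of = {cat: i for i, cat in enumerate(self.categories)}
        # Assumed to be sorted, as any searchsorted-style lookup requires.
        self.codes = [self._code_of[v] for v in values]

    def searchsorted(self, value, side="left"):
        search = bisect_left if side == "left" else bisect_right
        if isinstance(value, (str, int, float)):
            # Scalar fast path: a single code lookup, no intermediate list.
            return search(self.codes, self._code_of[value])
        # List-like path: translate every value to its code, then search each.
        return [search(self.codes, self._code_of[v]) for v in value]


cat = ToyCategorical(["a", "b", "b", "c"], categories=["a", "b", "c"])
scalar_pos = cat.searchsorted("b")               # leftmost insertion point
right_pos = cat.searchsorted("b", side="right")  # rightmost insertion point
list_pos = cat.searchsorted(["a", "c"])          # one position per value
```

The scalar branch skips building a temporary container for a single value, which is the kind of conversion overhead the description refers to. The list-like branch still pays for one lookup per element.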