@jbrockmendel
Member

In [1]: from asv_bench.benchmarks.frame_ctor import *

In [2]: cls = FromArrays

In [3]: self = cls()

In [4]: self.setup()

In [5]: %timeit self.time_frame_from_arrays_int()
6.21 ms ± 184 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # <- master
3.27 ms ± 182 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # <- PR
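For anyone reproducing this outside IPython, here is a rough sketch of how "mean ± std. dev. of 7 runs, 100 loops each" could be computed with the standard library. The `build_columns` workload is a hypothetical stand-in, not the asv benchmark, and `per_loop_stats` is an illustrative helper name; the last line only works out the ratio of the two timings reported above.

```python
import statistics
import timeit


def per_loop_stats(func, repeat=7, number=100):
    """Return (mean, std dev) of per-loop time in seconds.

    Hypothetical helper: roughly mirrors the "7 runs, 100 loops each"
    summary, but is not IPython's implementation.
    """
    # Each entry is the total time for `number` calls of `func`.
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_loop = [total / number for total in totals]
    return statistics.mean(per_loop), statistics.pstdev(per_loop)


def build_columns():
    # Stand-in workload (assumption): NOT the pandas benchmark itself.
    return {i: list(range(100)) for i in range(10)}


mean_s, std_s = per_loop_stats(build_columns)
print(f"{mean_s * 1e6:.1f} µs ± {std_s * 1e6:.1f} µs per loop")

# Ratio of the two timings quoted above (master / PR).
speedup = 6.21 / 3.27
print(f"speedup: {speedup:.2f}x")
```

With the numbers quoted above, the ratio comes out to roughly 1.9x.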

@jreback jreback added the Performance Memory or execution speed performance label Jul 22, 2021
@jreback jreback added this to the 1.4 milestone Jul 22, 2021
@jreback jreback added the Constructors Series/DataFrame/Index/pd.array Constructors label Jul 22, 2021
@jreback
Contributor

jreback commented Jul 22, 2021

cool, i would actually add a release note about this, as this is a significant perf improvement. does this hold more generally with the other construction asvs?

also do you think it's removing the asserts or the changes in not using is_sparse that are the big wins here? (guessing it's the asserts)

@jbrockmendel
Member Author

> also do you think it's removing the asserts or the changes in not using is_sparse that are the big wins here? (guessing it's the asserts)

I forget the exact numbers, but remember being surprised how big a difference the is_sparse change made

@jreback jreback merged commit 9d6f8fe into pandas-dev:master Jul 23, 2021
@jbrockmendel jbrockmendel deleted the perf-construction branch July 23, 2021 23:24
@rhshadrach
Member

I'm seeing a failure for the groupby.apply doctest on master. The condition that leads to raising an error is

if newb.shape != self.shape:

Looks like shape is being cached here - might be related?
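To illustrate how a cached shape could trip a check like that, here is a minimal sketch using a toy class. `ToyBlock` and `live_shape` are hypothetical names, and `functools.cached_property` is used as a stand-in for whatever caching the PR applied; this is not the pandas internals.

```python
from functools import cached_property


class ToyBlock:
    """Hypothetical stand-in for a block wrapping a 2D list of values."""

    def __init__(self, values):
        self.values = values

    @cached_property
    def shape(self):
        # Computed on first access, then stored on the instance.
        return self.live_shape()

    def live_shape(self):
        # Recomputed from the current values on every call.
        ncols = len(self.values[0]) if self.values else 0
        return (len(self.values), ncols)


blk = ToyBlock([[1, 2, 3], [4, 5, 6]])
first = blk.shape            # caches (2, 3)

blk.values = [[1, 2, 3]]     # values replaced after the shape was cached
stale = blk.shape            # still the cached (2, 3)
fresh = blk.live_shape()     # (1, 3), the actual current shape

print(first, stale, fresh)
```

In this toy, any comparison against `blk.shape` after the values change uses the stale cached value, so a shape-mismatch check can misfire even though the current data is consistent.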

@jbrockmendel
Member Author

good catch, that caching should be reverted (and ideally a non-doctest test implemented)

simonjayhawkins added a commit that referenced this pull request Jul 24, 2021
simonjayhawkins added a commit that referenced this pull request Jul 24, 2021
@jbrockmendel jbrockmendel mentioned this pull request Jul 24, 2021