- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 19.2k
BUG: Redefine IndexOpsMixin.size, fix #25580. #25584
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: HE, Tao <[email protected]>
Signed-off-by: HE, Tao <[email protected]>
| Codecov Report
 @@            Coverage Diff             @@
##           master   #25584      +/-   ##
==========================================
+ Coverage   91.26%   91.26%   +<.01%     
==========================================
  Files         173      173              
  Lines       52966    52968       +2     
==========================================
+ Hits        48337    48340       +3     
+ Misses       4629     4628       -1
 
 Continue to review full report at Codecov. 
 | 
| Codecov Report
 @@             Coverage Diff             @@
##           master   #25584       +/-   ##
===========================================
- Coverage   91.26%   41.71%   -49.55%     
===========================================
  Files         173      173               
  Lines       52966    52968        +2     
===========================================
- Hits        48337    22094    -26243     
- Misses       4629    30874    +26245
 
 Continue to review full report at Codecov. 
 | 
Signed-off-by: HE, Tao <[email protected]>
| @sighingnow can you avoid force pushing? Makes it a bit easier to review incrementally. It'd be good to include a release note for the bug this is fixing too. @jorisvandenbossche do you remember if we discussed a  | 
| Noted, thanks @TomAugspurger. The  | 
| Whoops, yes it should return an integer (I was mixing it up with shape). That said, in the name of keeping a smaller API footprint, I'm leaning towards redefining IndexOpsMixin.size to use  I believe that this will work for MultiIndex (have to be careful about unused levels) def size(self):
    return np.prod(self.remove_unused_levels().levshape) | 
| I agree to move redefine  >>> a = pd.MultiIndex.from_arrays(np.array([[1,2,3], [4,5,6], [7,8,9]]))
>>> a.size
3
>>> np.prod(a.remove_unused_levels().levshape)
27I just want to confirm whether current behavior of  Edit: if redefine  | 
| Oh, hmm that's surprising. But I don't think we can easily change it at this point... I guess that means that MultiIndex.size can continue to use IndexOpsMixin.size. | 
Signed-off-by: HE, Tao <[email protected]>
| +1 on for now not adding  The  so a MultiIndex is regarded as a 1D object, not 2D (for sure can be confusing :-)) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add a test for the actual bug in size? (because it was surfaced first through resample, but size itself is also public API which currently fails)
| """ | ||
| Return the number of bytes in the underlying data. | ||
| """ | ||
| return self._values.nbytes | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the wrong property that is being changed? (nbytes, not size)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, will fix that soon.
        
          
                doc/source/whatsnew/v0.24.2.rst
              
                Outdated
          
        
      | - Fixed regression in ``IntervalDtype`` construction where passing an incorrect string with 'Interval' as a prefix could result in a ``RecursionError``. (:issue:`25338`) | ||
| - Fixed regression in :class:`Categorical`, where constructing it from a categorical ``Series`` and an explicit ``categories=`` that differed from that in the ``Series`` created an invalid object which could trigger segfaults. (:issue:`25318`) | ||
| - Fixed pip installing from source into an environment without NumPy (:issue:`25193`) | ||
| - Fixed bug where :meth:`core.base.IndexOpsMixin.size` could not return correct result for :class:`api.extensions.ExtensionArray` (:issue:`25580`) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's best to mention public API, so instead of core.base.IndexOpsMixin.size, I would use Series.size or Index.size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| I think we should add .size to ea generally this has come a number of times (and to be honest it’s a trivial addition to the base class ) | 
Signed-off-by: HE, Tao <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this lgtm. small whatsnew change; ping on green.
        
          
                doc/source/whatsnew/v0.24.2.rst
              
                Outdated
          
        
      | - Fixed regression in ``IntervalDtype`` construction where passing an incorrect string with 'Interval' as a prefix could result in a ``RecursionError``. (:issue:`25338`) | ||
| - Fixed regression in :class:`Categorical`, where constructing it from a categorical ``Series`` and an explicit ``categories=`` that differed from that in the ``Series`` created an invalid object which could trigger segfaults. (:issue:`25318`) | ||
| - Fixed pip installing from source into an environment without NumPy (:issue:`25193`) | ||
| - Fixed bug where :meth:`Index.size` could not return correct result for :class:`api.extensions.ExtensionArray` (:issue:`25580`) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not correct, it raised previously ot 'returned an incorrect value'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is also a bug fix and not a regression, can you move to the appopriate section (bug fixes numeric is ok)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually move to resampling bug fixes is good (you could mention that in the whatsnew note)
| lgtm. though something failing. | 
| It's the same sparse failure that should be fixed on master. | 
| @sighingnow thanks! | 
| MeeseeksDev doesn't seem to be picking this one up (and I also assume there will be a conflict in the whatsnew), so will do a manual backport | 
…5584) Signed-off-by: HE, Tao <[email protected]> (cherry picked from commit 6375549)
| @jorisvandenbossche many thanks for helping me get this PR merged into master! Sorry for the late response. BTW, I'm wondering if there were conflicts between pull request and master, how should I resolve that? Should I merge the master into the pull request, or rebase the pull request on top of master? | 
| @sighingnow you should merge master into the branch | 
| Noted, thanks! @WillAyd | 
git diff upstream/master -u -- "*.py" | flake8 --diff