-
-
Notifications
You must be signed in to change notification settings - Fork 19.3k
BUG: Fix bug with symmetric difference of two equal MultiIndexes GH12490 #13504
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Fix bug with symmetric difference of two equal MultiIndexes GH12490 #13504
Conversation
Current coverage is 84.34%@@ master #13504 diff @@
==========================================
Files 138 138
Lines 51107 51122 +15
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43103 43117 +14
- Misses 8004 8005 +1
Partials 0 0
|
pandas/indexes/multi.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make a constructor that creates an empty MultiIndex._create_as_empty() and just call here for both of these
doc/source/whatsnew/v0.18.2.txt
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put your whatsnew higher up (in an empty space in Bug Fixes) this way you dont have conflicts
|
If Travis passes again I think I got all your comments fixed
|
pandas/indexes/multi.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that this is internal. add a Parameters section
| return self._create_as_empty(names=result_name) | ||
|
|
||
| difference = sorted(set((self.difference(other)). | ||
| union(other.difference(self)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment:
This sorting breaks with mixed-int values (same issue as in #13432). I suppose this line may be safely replaced with the code from PR #13514, slightly changed, namely:
this = self._get_unique_index()
other = other._get_unique_index()
indexer = this.get_indexer(other)
# {this} minus {other}
common_indexer = indexer.take((indexer != -1).nonzero()[0])
left_indexer = np.setdiff1d(np.arange(this.size), common_indexer,
assume_unique=True)
left_diff = this.values.take(left_indexer)
# {other} minus {this}
right_indexer = (indexer == -1).nonzero()[0]
right_diff = other.values.take(right_indexer)
the_diff = _concat._concat_compat([left_diff, right_diff])Then if the_diff is empty, an empty MI should be returned, otherwise:
return self._shallow_copy(the_diff, names=result_name).sortlevel()(In comparison to Index.symmetric_difference sorting is postponed until the end.)
Similar change may be applied to MultiIndex.difference, I guess.
|
can you rebase / update? |
|
closing, but pls reopen / comment if you can continue |
git diff upstream/master | flake8 --diffFixes a bug where the symmetric difference of two equal MultiIndexes would raise a TypeError. MultiIndex used to use the
Index.symmetric_difference. With this PR it it's own implementation that is more likeMultiIndex.difference.