Skip to content

Conversation

@lstout
Copy link

@lstout lstout commented Jun 23, 2016

Fixes a bug where the symmetric difference of two equal MultiIndexes would raise a TypeError. MultiIndex used to use the Index.symmetric_difference. With this PR it it's own implementation that is more like MultiIndex.difference.

@codecov-io
Copy link

codecov-io commented Jun 23, 2016

Current coverage is 84.34%

Merging #13504 into master will increase coverage by <.01%

@@             master     #13504   diff @@
==========================================
  Files           138        138          
  Lines         51107      51122    +15   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43103      43117    +14   
- Misses         8004       8005     +1   
  Partials          0          0          

Powered by Codecov. Last updated by bf4786a...8cb445f

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a constructor that creates an empty MultiIndex._create_as_empty() and just call here for both of these

@jreback jreback added Bug Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex labels Jun 23, 2016
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put your whatsnew higher up (in an empty space in Bug Fixes) this way you dont have conflicts

@lstout
Copy link
Author

lstout commented Jun 24, 2016

If Travis passes again I think I got all your comments fixed

  • Move whatsnew
  • Move tests to test_multi.py
  • Constructor for empty MultiIndex (MultiIndex._create_as_empty(nlevels))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that this is internal. add a Parameters section

return self._create_as_empty(names=result_name)

difference = sorted(set((self.difference(other)).
union(other.difference(self))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a comment:

This sorting breaks with mixed-int values (same issue as in #13432). I suppose this line may be safely replaced with the code from PR #13514, slightly changed, namely:

        this = self._get_unique_index()
        other = other._get_unique_index()
        indexer = this.get_indexer(other)

        # {this} minus {other}
        common_indexer = indexer.take((indexer != -1).nonzero()[0])
        left_indexer = np.setdiff1d(np.arange(this.size), common_indexer,
                                    assume_unique=True)
        left_diff = this.values.take(left_indexer)

        # {other} minus {this}
        right_indexer = (indexer == -1).nonzero()[0]
        right_diff = other.values.take(right_indexer)

        the_diff = _concat._concat_compat([left_diff, right_diff])

Then if the_diff is empty, an empty MI should be returned, otherwise:

        return self._shallow_copy(the_diff, names=result_name).sortlevel()

(In comparison to Index.symmetric_difference sorting is postponed until the end.)

Similar change may be applied to MultiIndex.difference, I guess.

@jreback
Copy link
Contributor

jreback commented Sep 9, 2016

can you rebase / update?

@jreback
Copy link
Contributor

jreback commented Nov 16, 2016

closing, but pls reopen / comment if you can continue

@jreback jreback closed this Nov 16, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Symmetric difference on equal MultiIndexes raises TypeError

4 participants