@evanpw evanpw commented May 14, 2015

Example:

>>> import pandas as pd
>>> df = pd.DataFrame({'a' : pd.Categorical('xyxy'), 'b' : 1, 'c' : 2})
>>> df.groupby(['a', 'b']).get_group(('x', 1))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/evanpw/Workspace/pandas/pandas/core/groupby.py", line 601, in get_group
    inds = self._get_index(name)
  File "/home/evanpw/Workspace/pandas/pandas/core/groupby.py", line 429, in _get_index
    sample = next(iter(self.indices))
  File "/home/evanpw/Workspace/pandas/pandas/core/groupby.py", line 414, in indices
    return self.grouper.indices
  File "pandas/src/properties.pyx", line 34, in pandas.lib.cache_readonly.__get__ (pandas/lib.c:41912)
  File "/home/evanpw/Workspace/pandas/pandas/core/groupby.py", line 1305, in indices
    return _get_indices_dict(label_list, keys)
  File "/home/evanpw/Workspace/pandas/pandas/core/groupby.py", line 3762, in _get_indices_dict
    return lib.indices_fast(sorter, group_index, keys, sorted_labels)
  File "pandas/lib.pyx", line 1385, in pandas.lib.indices_fast (pandas/lib.c:23843)
TypeError: Cannot convert Categorical to numpy.ndarray

The problem is that Grouping.group_index is a CategoricalIndex, so calling get_values() on it returns a Categorical, which needs one more application of get_values() to yield an ndarray.
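
The same two-layer wrapping can be sketched against a recent pandas release. This is only an illustration, not the code touched by this PR: it assumes `.values` and `np.asarray` as stand-ins for the two `get_values()` applications described above, and the variable names are made up for the example.

```python
import numpy as np
import pandas as pd

# A CategoricalIndex similar to the group_index described above.
group_index = pd.CategoricalIndex(list("xyxy"))

# First unwrapping: the index hands back a Categorical, not an ndarray.
first = group_index.values
print(type(first).__name__)

# Second unwrapping: only now do we get a plain NumPy array, which is
# what a routine expecting an ndarray needs.
second = np.asarray(first)
print(type(second).__name__, second.tolist())
```

Code that needs an ndarray therefore has to unwrap twice, or the index has to do the full unwrapping itself.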

jreback commented May 14, 2015

Can you put the self-contained example at the top of the PR?

@jreback jreback added Bug Groupby Categorical Categorical Data Type labels May 14, 2015
Contributor

Instead of this, I think that .get_values() needs to be defined for CategoricalIndex (.get_values() is defined for a regular Index, and CategoricalIndex is just inheriting it).
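
A toy model of this suggestion, using hypothetical stand-in classes rather than the real pandas ones (the `Toy*` names and the use of a plain list in place of an ndarray are assumptions made for illustration):

```python
class ToyCategorical:
    """Hypothetical stand-in for a categorical container."""

    def __init__(self, values):
        self._values = list(values)

    def get_values(self):
        # Unwrap to a plain list (standing in for an ndarray).
        return list(self._values)


class ToyIndex:
    """Hypothetical stand-in for a regular index."""

    def __init__(self, data):
        self._data = data

    def get_values(self):
        # Base behaviour: hand back whatever the index stores.
        return self._data


class InheritingCategoricalIndex(ToyIndex):
    """Only inherits get_values(), so callers receive a ToyCategorical."""


class OverridingCategoricalIndex(ToyIndex):
    """Defines its own get_values() that unwraps all the way down."""

    def get_values(self):
        return self._data.get_values()


cat = ToyCategorical("xyxy")
inherited = InheritingCategoricalIndex(cat).get_values()
overridden = OverridingCategoricalIndex(cat).get_values()

print(type(inherited).__name__)
print(overridden)
```

With the override in place, callers get the fully unwrapped values from a single call and do not need to know that the index wraps another container.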

jreback commented Jun 2, 2015

Can you update according to the comments?

evanpw commented Jun 3, 2015

Done, and moved to 0.16.2 whatsnew. Does this need an entry in the API changes section now, or is this change too minor?

Member

you removed the wrong character :-)

Contributor Author

Incremental progress :)

@jorisvandenbossche
Member

This seems like a bug fix to me, so the whatsnew entry is fine!

jreback commented Jun 3, 2015

yep, pls squash. ping when green.

evanpw commented Jun 4, 2015

Tests are green

jreback commented Jun 4, 2015

@evanpw thanks. Soon, waiting on travis to finish up its builds of a bunch of stuff.

@jreback jreback added this to the 0.16.2 milestone Jun 5, 2015
jreback added a commit that referenced this pull request Jun 5, 2015
BUG: get_group fails when multi-grouping with a categorical
@jreback jreback merged commit 08b1511 into pandas-dev:master Jun 5, 2015
jreback commented Jun 5, 2015

@evanpw thanks!

@evanpw evanpw deleted the cat_multigroup branch June 5, 2015 22:08
@evanpw evanpw restored the cat_multigroup branch September 19, 2015 00:34
@evanpw evanpw deleted the cat_multigroup branch September 19, 2015 00:38