@topper-123
Contributor

@topper-123 topper-123 commented May 24, 2020

This proposes changing the SparseDtype repr to show the fill_value's repr rather than its string representation:

>>> arr = pd.arrays.SparseArray([0, 1])
>>> arr.dtype
Sparse[int64, 0]  # master & this PR, unchanged
>>> arr.astype(str).dtype
Sparse[object, 0]  # master
Sparse[object, '0']  # this PR, notice quotes
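The difference comes down to formatting the fill value with `repr()` instead of `str()`. A minimal sketch of the idea, using a hypothetical `ToySparseDtype` class that is only a stand-in for illustration and not the pandas implementation:

```python
class ToySparseDtype:
    """Hypothetical stand-in for a sparse dtype; not the pandas class."""

    def __init__(self, subtype, fill_value):
        self.subtype = subtype
        self.fill_value = fill_value

    def name_with_str(self):
        # str() formatting: a string fill value loses its quotes.
        return f"Sparse[{self.subtype}, {self.fill_value}]"

    def name_with_repr(self):
        # repr() formatting via !r: a string fill value keeps its quotes.
        return f"Sparse[{self.subtype}, {self.fill_value!r}]"


int_dtype = ToySparseDtype("int64", 0)
str_dtype = ToySparseDtype("object", "0")

print(int_dtype.name_with_str())   # Sparse[int64, 0]
print(int_dtype.name_with_repr())  # Sparse[int64, 0]
print(str_dtype.name_with_str())   # Sparse[object, 0]
print(str_dtype.name_with_repr())  # Sparse[object, '0']
```

With `str()`, the integer `0` and the string `'0'` render identically, so the dtype repr alone cannot tell them apart; with `repr()` they differ.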

Member

@jorisvandenbossche jorisvandenbossche left a comment

Good idea!

    def test_array_repr(self, data, size):
        super().test_array_repr(data, size)

    def test_fillna_repr(self):
Member

Can you put this test in one of the files in tests/arrays/sparse/.. (eg in test_dtype.py)?

(trying to keep those extension tests to only the shared common ones inherited from the base tests; any custom test goes into tests/arrays instead of tests/extension/)

@topper-123
Contributor Author

Ok, I've moved the tests.

@topper-123 topper-123 added the Output-Formatting __repr__ of pandas objects, to_string label May 25, 2020
@topper-123 topper-123 added this to the 1.1 milestone May 25, 2020
@jreback jreback added the Sparse Sparse Data Type label May 25, 2020
@jreback
Contributor

jreback commented May 25, 2020

looks good. can you add a release note (the sparse bug fix section is ok)? ping on green.

@topper-123 topper-123 merged commit f5ab5a8 into pandas-dev:master May 25, 2020
@topper-123 topper-123 deleted the SparseDtype_repr branch May 25, 2020 19:47
@jorisvandenbossche
Member

Thanks!

@jreback
Contributor

jreback commented May 25, 2020

btw it would be helpful to survey other dtypes (mainly category) to see if we are consistent for various types

@jorisvandenbossche
Member

jorisvandenbossche commented May 25, 2020

None of the other dtypes have the concept of a "fill_value" (or print another value in their dtype repr), so I am not sure what should be checked?

@jreback
Contributor

jreback commented May 25, 2020

of course; i am speaking generally

we expose other things in the repr eg categories

@jorisvandenbossche
Member

The CategoricalDtype is fine:

In [9]: pd.Categorical([1, 2, '3']).dtype 
Out[9]: CategoricalDtype(categories=[1, 2, '3'], ordered=False)

Categorical itself is not:

In [11]: pd.Categorical([1, 2, '3']) 
Out[11]: 
[1, 2, 3]
Categories (3, object): [1, 2, 3]

but that is actually already being handled in #34222
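The same `str()` versus `repr()` distinction explains the two outputs above. A rough sketch, using hypothetical helper functions that are not part of pandas, of how a list of mixed values renders under each choice:

```python
def format_with_str(values):
    # Hypothetical helper: joins elements using str(), so '3' prints as 3.
    return "[" + ", ".join(str(v) for v in values) + "]"


def format_with_repr(values):
    # Hypothetical helper: joins elements using repr(), so '3' keeps its quotes.
    return "[" + ", ".join(repr(v) for v in values) + "]"


categories = [1, 2, "3"]

print(format_with_str(categories))   # [1, 2, 3]
print(format_with_repr(categories))  # [1, 2, '3']
```

The `repr()` variant matches what the CategoricalDtype output shows, while the `str()` variant matches the ambiguous Categorical output.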
