Skip to content

Conversation

toobaz
Copy link
Member

@toobaz toobaz commented Jul 17, 2017

This completes what was attempted in #14656 with tests and docstring.

In principle, set_axis could be mentioned somewhere in the docs (set_index is)... but I'm dubious because the API is horrible and inconsistent, in particular with Series, where it obliges the user to do s.set_axis(0, [...]). In other words, while I tended to agree with the suggestion that once .set_axis supports inplace=False, we wouldn't need any .relabel(), now that I see how it looks, I changed my mind (this is not the best place to discuss this, just explaining why I'm against "publicizing" set_axis too much).

Notice that in order to alleviate the problem, I changed the signature: axis and labels are now named parameters, so that one can do s.set_axis(labels=[...]). I admit the utility of this is debatable.

@gfyoung gfyoung added Enhancement Indexing Related to indexing on series/frames, not to indexes themselves Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Jul 17, 2017
- :func:`Series.to_dict` and :func:`DataFrame.to_dict` now support an ``into`` keyword which allows you to specify the ``collections.Mapping`` subclass that you would like returned. The default is ``dict``, which is backwards compatible. (:issue:`16122`)
- :func:`RangeIndex.append` now returns a ``RangeIndex`` object when possible (:issue:`16212`)
- :func:`Series.rename_axis` and :func:`DataFrame.rename_axis` with ``inplace=True`` now return ``None`` while renaming the axis inplace. (:issue:`15704`)
- :func:`Series.set_axis` now supports the ``inplace=`` parameter. (:issue:`14656`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the "=" after "inplace"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

labels: list-like or Index
The values for the new index
inplace : boolean, default False
Whether to return a new %(klass)s. If True then value of copy is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. But there is no copy parameter, so you can remove that second sentence.

  2. "%(klass)s" --> "%(klass)s instance"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Whether to return a new %(klass)s. If True then value of copy is
ignored.
WARNING: inplace=None currently falls back to to True, but
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually would set inplace=False as the default explicitly, but this is fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commentary for future reference

Returns
-------
renamed : %(klass)s (new object) if inplace=False, None otherwise
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the description (after the "klass") to a newline (and reword slightly so that it makes sense by itself)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

0 1 4
1 2 5
2 3 6
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an example for "inplace=True"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK


# omitting the "axis=" parameter:
result = df.copy()
with tm.assert_produces_warning(warn):
Copy link
Member

@gfyoung gfyoung Jul 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's be more explicit about this warning value, as it is not part of the for loop.

result = df.set_axis(axis, list('abc'), inplace=False)
tm.assert_frame_equal(expected[axis], result)

# omitting the "axis=" parameter:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the "=" and ":"

tm.assert_frame_equal(expected[axis], result)

# omitting the "axis=" parameter:
result = df.copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: add newline beneath this (just for readability)

tm.assert_series_equal(expected, result)

# omitting the "axis=" parameter:
result = s.copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above

result = s.set_axis(0, list('abcd'), inplace=False)
tm.assert_series_equal(expected, result)

# omitting the "axis=" parameter:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as comment as above


# omitting the "axis=" parameter:
result = s.copy()
with tm.assert_produces_warning(warn):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above

result.set_axis(axis, list('abc'), **kwargs)
tm.assert_frame_equal(result, expected[axis])

# GH14636 (inplace=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just reference the issue below the test definition line. This comment can be reduced to "inplace=False"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? The scope of the test is broader (since set_axis() wasn't tested at all) than just the issue, which specifically refers to inplace=False.

Copy link
Member

@gfyoung gfyoung Jul 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it's the issue that triggered the entire test, which is why I proposed that it be moved to the top of the function definition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it's the issue that triggered the entire test

Well, in fact I just decided to test something which wasn't tested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you are adding a new parameter to the signature based on the diff.

result.set_axis(0, list('abcd'), **kwargs)
tm.assert_series_equal(result, expected)

# GH14636 (inplace=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above

expected['columns'] = expected[1]

for axis in expected:
# inplace=True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A sentence (on a newline) about why we issue FutureWarning when inplace=None would be good here.

expected.index = list('abcd')

for axis in 0, 'index':
# inplace=True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above

@codecov
Copy link

codecov bot commented Jul 17, 2017

Codecov Report

Merging #16994 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16994      +/-   ##
==========================================
- Coverage   90.99%   90.97%   -0.02%     
==========================================
  Files         161      161              
  Lines       49290    49299       +9     
==========================================
  Hits        44851    44851              
- Misses       4439     4448       +9
Flag Coverage Δ
#multiple 88.75% <100%> (ø) ⬆️
#single 40.19% <27.27%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 92.33% <100%> (+0.03%) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.76% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0bd871f...02c370f. Read the comment docs.

@codecov
Copy link

codecov bot commented Jul 17, 2017

Codecov Report

Merging #16994 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16994      +/-   ##
==========================================
- Coverage   91.02%   91.01%   -0.02%     
==========================================
  Files         161      161              
  Lines       49351    49363      +12     
==========================================
+ Hits        44924    44927       +3     
- Misses       4427     4436       +9
Flag Coverage Δ
#multiple 88.78% <100%> (ø) ⬆️
#single 40.25% <15%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/core/groupby.py 92.07% <100%> (ø) ⬆️
pandas/core/generic.py 92.05% <100%> (+0.05%) ⬆️
pandas/core/reshape/pivot.py 95.18% <100%> (ø) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/core/frame.py 97.75% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 811bcbc...8fb9d0f. Read the comment docs.

@toobaz toobaz force-pushed the set_axis_inplace branch from 02c370f to d2816ea Compare July 17, 2017 22:50
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this looks fine. We could in principle switch the order of the parameters, e.g.

def set_axis(self, labels, axis, inplace=None)

to make this much more consistent (and Series could default axis=0).

I think this could done in a completely backward compat manner (as the axis must be a scalar and the labels must be a list).

axis : int or string, default 0
labels: list-like or Index
The values for the new index
inplace : boolean, default False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default is None (currently), it is set to True if its None (and the warning happens)

df = df.reset_index()

def test_set_axis(self):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the issue number

def test_set_axis(self):

s = Series(np.arange(4), index=[1, 3, 5, 7], dtype='int64')

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue number

@toobaz
Copy link
Member Author

toobaz commented Jul 18, 2017

We could in principle switch the order of the parameters, e.g.

Aha, right! Would I emit a DeprecationWarning with the old signature?

@jreback
Copy link
Contributor

jreback commented Jul 18, 2017

Aha, right! Would I emit a DeprecationWarning with the old signature?

FutureWarning

@toobaz toobaz force-pushed the set_axis_inplace branch from d2816ea to 06e03c7 Compare July 18, 2017 16:37
@pep8speaks
Copy link

pep8speaks commented Jul 18, 2017

Hello @toobaz! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on July 23, 2017 at 20:58 Hours UTC

@toobaz
Copy link
Member Author

toobaz commented Jul 18, 2017

Should I document the signature change in the "Other API Changes" section of the whatsnew (despite the old version still being supported)?

@toobaz toobaz force-pushed the set_axis_inplace branch from 06e03c7 to 51912fb Compare July 18, 2017 16:44
@gfyoung
Copy link
Member

gfyoung commented Jul 18, 2017

Should I document the signature change in the "Other API Changes" section of the whatsnew (despite the old version still being supported)?

Yes, you should. Users might pass in these arguments positionally, so this signature change would break their code unless they were informed.

@toobaz
Copy link
Member Author

toobaz commented Jul 18, 2017

Users might pass in these arguments positionally, so this signature change would break their code unless they were informed.

We don't break anything (for now). This is precisely why I'm asking.

@gfyoung
Copy link
Member

gfyoung commented Jul 18, 2017

We don't break anything (for now). This is precisely why I'm asking.

True, but we still should let people know because there are many people who use the library, and we have no idea whose code we would break.

@toobaz
Copy link
Member Author

toobaz commented Jul 18, 2017

True, but we still should let people know because there are many people who use the library, and we have no idea whose code we would break.

The number of users is not a measure of the quality of refactoring... the question is just "should we mention an API change despite it not breaking anything? Or should we mention it only in the future, when the compatibility mode is removed?"

You're welcome to suggest a sentence.

@gfyoung
Copy link
Member

gfyoung commented Jul 18, 2017

the question is just "should we mention an API change despite it not breaking anything?

You should mention it now because you changed the signature by switching the position of the arguments (that's all you need to mention that it's a signature change). That's going to break anyone's code whose function calls use positional arguments.

@toobaz
Copy link
Member Author

toobaz commented Jul 18, 2017

you changed the signature by switching the position of the arguments

Maybe I see the misunderstanding: have you actually seen the code, the tests or the discussion above?

@gfyoung
Copy link
Member

gfyoung commented Jul 18, 2017

@toobaz : I have in fact, and my point still stands. You may have added deprecation warnings, but you should still mention the change as an API change because the new signature is the recommended way to pass in arguments.

@toobaz
Copy link
Member Author

toobaz commented Jul 18, 2017

the new signature is the recommended way

Great, now we are on the same page. Will try to add a sentence.

@toobaz toobaz force-pushed the set_axis_inplace branch from 51912fb to bfa7466 Compare July 18, 2017 18:10
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small changes. If this is going to complement remove the need for .relabel. Then I would

  • add to api.rst
  • add docs on using this (with the new API), near set_index section in indexing
  • add sub-section in whatsnew where you show signature changing AND how it could/should be used, with ref to section above.

""" public verson of axis assignment """
setattr(self, self._get_axis_name(axis), labels)
_shared_docs['set_axis'] = """Assign desired index to given axis
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a note that the signature changed in 0.21.0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(done: line 483)


@Appender(_shared_docs['set_axis'] % dict(klass='NDFrame'))
def set_axis(self, labels, axis=0, inplace=None):
if isinstance(labels, (int, str)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use is_scalar

# inplace=False
result = df.set_axis(list('abc'), axis=axis, inplace=False)
tm.assert_frame_equal(expected[axis], result)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a test for invalid axis (e.g. 'foo', or 2)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test with axis ='index' and axis='columns' (should work)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a test for invalid axis (e.g. 'foo', or 2)

(done: line 947)

s = Series(np.arange(4), index=[1, 3, 5, 7], dtype='int64')

expected = s.copy()
expected.index = list('abcd')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test for invalid axis (1 or 'foo')

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test with axis='index' (should work)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test for invalid axis (1 or 'foo')

(done: line 266)

@toobaz
Copy link
Member Author

toobaz commented Jul 19, 2017

test with axis ='index' and axis='columns' (should work)

Already there (see lines 923-924)

@toobaz
Copy link
Member Author

toobaz commented Jul 19, 2017

If this is going to complement remove the need for .relabel

Despite the new signature, I'm not 100% convinced. Can we wait for a bit more discussion? I will add there a summary of pros and cons and, once a decision is made, take care of the docs you mention.

@toobaz
Copy link
Member Author

toobaz commented Jul 20, 2017

@jreback Some change requests still show up, but I should have solved them all in the last rebase.

(except the additional docs, for which I will wait #14829 and then provide another PR)

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment. pls add to api.rst as well where appropriate.

Returns
-------
renamed : %(klass)s or None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs updating

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean... anyway, I just made the following line more explicit

def set_axis(self, labels, axis=0, inplace=None):
if is_scalar(labels):
warnings.warn(
"set_axis now takes \"labels\" as first argument, and "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just use r to avoid needing the quotes, or we simply use ' as the quote marker.

@jreback jreback added this to the 0.21.0 milestone Jul 21, 2017
@toobaz toobaz force-pushed the set_axis_inplace branch from 95d6093 to 8fb9d0f Compare July 23, 2017 20:58
@toobaz
Copy link
Member Author

toobaz commented Jul 23, 2017

minor comment. pls add to api.rst as well where appropriate.

Done for Series, was already present for DataFrame.

@jreback
Copy link
Contributor

jreback commented Jul 24, 2017

thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Indexing Related to indexing on series/frames, not to indexes themselves Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API: set_axis does not have inplace=False
4 participants