-
-
Notifications
You must be signed in to change notification settings - Fork 19.1k
ENH: provide "inplace" argument to set_axis() #16994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
doc/source/whatsnew/v0.21.0.txt
Outdated
- :func:`Series.to_dict` and :func:`DataFrame.to_dict` now support an ``into`` keyword which allows you to specify the ``collections.Mapping`` subclass that you would like returned. The default is ``dict``, which is backwards compatible. (:issue:`16122`) | ||
- :func:`RangeIndex.append` now returns a ``RangeIndex`` object when possible (:issue:`16212`) | ||
- :func:`Series.rename_axis` and :func:`DataFrame.rename_axis` with ``inplace=True`` now return ``None`` while renaming the axis inplace. (:issue:`15704`) | ||
- :func:`Series.set_axis` now supports the ``inplace=`` parameter. (:issue:`14656`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the "=" after "inplace"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
pandas/core/generic.py
Outdated
labels: list-like or Index | ||
The values for the new index | ||
inplace : boolean, default False | ||
Whether to return a new %(klass)s. If True then value of copy is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
But there is no copy parameter, so you can remove that second sentence.
-
"%(klass)s" --> "%(klass)s instance"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
Whether to return a new %(klass)s. If True then value of copy is | ||
ignored. | ||
WARNING: inplace=None currently falls back to to True, but |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually would set inplace=False
as the default explicitly, but this is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commentary for future reference
pandas/core/generic.py
Outdated
Returns | ||
------- | ||
renamed : %(klass)s (new object) if inplace=False, None otherwise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move the description (after the "klass") to a newline (and reword slightly so that it makes sense by itself)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
0 1 4 | ||
1 2 5 | ||
2 3 6 | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add an example for "inplace=True
"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
|
||
# omitting the "axis=" parameter: | ||
result = df.copy() | ||
with tm.assert_produces_warning(warn): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's be more explicit about this warning value, as it is not part of the for
loop.
result = df.set_axis(axis, list('abc'), inplace=False) | ||
tm.assert_frame_equal(expected[axis], result) | ||
|
||
# omitting the "axis=" parameter: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the "=" and ":"
tm.assert_frame_equal(expected[axis], result) | ||
|
||
# omitting the "axis=" parameter: | ||
result = df.copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: add newline beneath this (just for readability)
tm.assert_series_equal(expected, result) | ||
|
||
# omitting the "axis=" parameter: | ||
result = s.copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above
result = s.set_axis(0, list('abcd'), inplace=False) | ||
tm.assert_series_equal(expected, result) | ||
|
||
# omitting the "axis=" parameter: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as comment as above
|
||
# omitting the "axis=" parameter: | ||
result = s.copy() | ||
with tm.assert_produces_warning(warn): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above
result.set_axis(axis, list('abc'), **kwargs) | ||
tm.assert_frame_equal(result, expected[axis]) | ||
|
||
# GH14636 (inplace=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just reference the issue below the test definition line. This comment can be reduced to "inplace=False"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure? The scope of the test is broader (since set_axis()
wasn't tested at all) than just the issue, which specifically refers to inplace=False
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it's the issue that triggered the entire test, which is why I proposed that it be moved to the top of the function definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it's the issue that triggered the entire test
Well, in fact I just decided to test something which wasn't tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you are adding a new parameter to the signature based on the diff.
result.set_axis(0, list('abcd'), **kwargs) | ||
tm.assert_series_equal(result, expected) | ||
|
||
# GH14636 (inplace=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above
expected['columns'] = expected[1] | ||
|
||
for axis in expected: | ||
# inplace=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A sentence (on a newline) about why we issue FutureWarning
when inplace=None
would be good here.
expected.index = list('abcd') | ||
|
||
for axis in 0, 'index': | ||
# inplace=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above
Codecov Report
@@ Coverage Diff @@
## master #16994 +/- ##
==========================================
- Coverage 90.99% 90.97% -0.02%
==========================================
Files 161 161
Lines 49290 49299 +9
==========================================
Hits 44851 44851
- Misses 4439 4448 +9
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #16994 +/- ##
==========================================
- Coverage 91.02% 91.01% -0.02%
==========================================
Files 161 161
Lines 49351 49363 +12
==========================================
+ Hits 44924 44927 +3
- Misses 4427 4436 +9
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this looks fine. We could in principle switch the order of the parameters, e.g.
def set_axis(self, labels, axis, inplace=None)
to make this much more consistent (and Series could default axis=0).
I think this could done in a completely backward compat manner (as the axis must be a scalar and the labels must be a list).
pandas/core/generic.py
Outdated
axis : int or string, default 0 | ||
labels: list-like or Index | ||
The values for the new index | ||
inplace : boolean, default False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default is None (currently), it is set to True if its None (and the warning happens)
df = df.reset_index() | ||
|
||
def test_set_axis(self): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the issue number
def test_set_axis(self): | ||
|
||
s = Series(np.arange(4), index=[1, 3, 5, 7], dtype='int64') | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue number
Aha, right! Would I emit a |
|
Hello @toobaz! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on July 23, 2017 at 20:58 Hours UTC |
Should I document the signature change in the "Other API Changes" section of the whatsnew (despite the old version still being supported)? |
Yes, you should. Users might pass in these arguments positionally, so this signature change would break their code unless they were informed. |
We don't break anything (for now). This is precisely why I'm asking. |
True, but we still should let people know because there are many people who use the library, and we have no idea whose code we would break. |
The number of users is not a measure of the quality of refactoring... the question is just "should we mention an API change despite it not breaking anything? Or should we mention it only in the future, when the compatibility mode is removed?" You're welcome to suggest a sentence. |
You should mention it now because you changed the signature by switching the position of the arguments (that's all you need to mention that it's a signature change). That's going to break anyone's code whose function calls use positional arguments. |
Maybe I see the misunderstanding: have you actually seen the code, the tests or the discussion above? |
@toobaz : I have in fact, and my point still stands. You may have added deprecation warnings, but you should still mention the change as an API change because the new signature is the recommended way to pass in arguments. |
Great, now we are on the same page. Will try to add a sentence. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small changes. If this is going to complement remove the need for .relabel
. Then I would
- add to api.rst
- add docs on using this (with the new API), near set_index section in indexing
- add sub-section in whatsnew where you show signature changing AND how it could/should be used, with ref to section above.
""" public verson of axis assignment """ | ||
setattr(self, self._get_axis_name(axis), labels) | ||
_shared_docs['set_axis'] = """Assign desired index to given axis | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a note that the signature changed in 0.21.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(done: line 483)
pandas/core/generic.py
Outdated
|
||
@Appender(_shared_docs['set_axis'] % dict(klass='NDFrame')) | ||
def set_axis(self, labels, axis=0, inplace=None): | ||
if isinstance(labels, (int, str)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use is_scalar
# inplace=False | ||
result = df.set_axis(list('abc'), axis=axis, inplace=False) | ||
tm.assert_frame_equal(expected[axis], result) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a test for invalid axis (e.g. 'foo', or 2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test with axis ='index' and axis='columns' (should work)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a test for invalid axis (e.g. 'foo', or 2)
(done: line 947)
s = Series(np.arange(4), index=[1, 3, 5, 7], dtype='int64') | ||
|
||
expected = s.copy() | ||
expected.index = list('abcd') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test for invalid axis (1 or 'foo')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test with axis='index' (should work)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test for invalid axis (1 or 'foo')
(done: line 266)
Already there (see lines 923-924) |
Despite the new signature, I'm not 100% convinced. Can we wait for a bit more discussion? I will add there a summary of pros and cons and, once a decision is made, take care of the docs you mention. |
@jreback Some change requests still show up, but I should have solved them all in the last rebase. (except the additional docs, for which I will wait #14829 and then provide another PR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comment. pls add to api.rst as well where appropriate.
Returns | ||
------- | ||
renamed : %(klass)s or None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs updating
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean... anyway, I just made the following line more explicit
pandas/core/generic.py
Outdated
def set_axis(self, labels, axis=0, inplace=None): | ||
if is_scalar(labels): | ||
warnings.warn( | ||
"set_axis now takes \"labels\" as first argument, and " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just use r
to avoid needing the quotes, or we simply use '
as the quote marker.
Done for |
thanks! |
git diff master -u -- "*.py" | flake8 --diff
This completes what was attempted in #14656 with tests and docstring.
In principle,
set_axis
could be mentioned somewhere in the docs (set_index
is)... but I'm dubious because the API is horrible and inconsistent, in particular withSeries
, where it obliges the user to dos.set_axis(0, [...])
. In other words, while I tended to agree with the suggestion that once.set_axis
supportsinplace=False
, we wouldn't need any.relabel()
, now that I see how it looks, I changed my mind (this is not the best place to discuss this, just explaining why I'm against "publicizing"set_axis
too much).Notice that in order to alleviate the problem, I changed the signature:
axis
andlabels
are now named parameters, so that one can dos.set_axis(labels=[...])
. I admit the utility of this is debatable.