@mroeschke
Member

@mroeschke mroeschke added Bug Groupby Window rolling, ewma, expanding labels Apr 5, 2021
@mroeschke mroeschke added this to the 1.3 milestone Apr 5, 2021
@phofl
Member

phofl commented Apr 6, 2021

Does this fix #31007? I haven't looked closely, but I've seen this in the past.

@mroeschke
Member Author

@phofl unfortunately it doesn't look like it

In [1]: import pandas as pd
   ...:
   ...: data = {
   ...:     'groupby_col': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', ],
   ...:     'agg_col': [1, 1, 0, 1, 0, 0, 0, 0, 1, 0],
   ...: }
   ...: df = pd.DataFrame(data)
   ...: df.groupby(['groupby_col'], as_index=False).rolling(4).agg({'agg_col': 'mean'})
Out[1]:
               agg_col
groupby_col
A           0      NaN
            1      NaN
            2      NaN
            3     0.75
            4     0.50
B           5      NaN
            6      NaN
            7      NaN
            8     0.25
            9     0.25

I think agg still goes partially through the groupby code to calculate the result index.

@phofl
Member

phofl commented Apr 6, 2021

Pity, thanks for checking


result.index = result_index
if not self._as_index:
    result = result.reset_index(level=list(range(len(groupby_keys))))
Member

What happens here when the groupby is on an explicit list, e.g. if your test uses groupby(["A", "A", "B", "B"]) instead? What is groupby_keys in this case?

Member Author

@mroeschke mroeschke Apr 8, 2021

Here's the result

In [4]: df.groupby(["A", "A", "B", "B"], as_index=False).rolling(window=2, min_periods=1).mean()
> /Users/matthewroeschke/pandas-mroeschke/pandas/core/window/rolling.py(580)_apply()
-> result_index_names = groupby_keys + grouped_index_name
(Pdb) groupby_keys
[None]
(Pdb) c
Out[4]:
           level_0    num
date
2018-01-01       A  100.0
2018-01-02       A  150.0
2018-01-01       B  150.0
2018-01-02       B  200.0

In [5]: df.groupby(["A", "A", "B", "B"], as_index=False).mean()
Out[5]:
     num
0  150.0
1  200.0

I'm not sure whether the normal groupby gives the expected result, but it appears that groupby.rolling brings the list into the DataFrame as a column (level_0).
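For anyone following along, here is a minimal sketch of what the reset_index call in the diff does to a grouped rolling result. The frame is an assumption: an "id" column and "num" values picked so that the rolling means match the output quoted above. `n_keys` is a hypothetical stand-in for `len(groupby_keys)`, and this is not the code from the PR itself.

```python
import pandas as pd

# Assumed frame: values chosen so the rolling means match the output quoted above.
df = pd.DataFrame(
    {"id": ["A", "A", "B", "B"], "num": [100.0, 200.0, 150.0, 250.0]},
    index=pd.DatetimeIndex(
        ["2018-01-01", "2018-01-02", "2018-01-01", "2018-01-02"], name="date"
    ),
)

# Default as_index=True: the group key becomes the leading index level.
indexed = df.groupby("id").rolling(window=2, min_periods=1).mean()

# Move only the leading group-key level(s) back into columns,
# leaving the original "date" index in place.
n_keys = 1  # hypothetical stand-in for len(groupby_keys)
manual = indexed.reset_index(level=list(range(n_keys)))

# The same call with as_index=False, for comparison on your pandas version.
flat = df.groupby("id", as_index=False).rolling(window=2, min_periods=1).mean()

print(manual)
print(flat)
```

On a pandas version that includes this change, the two printed frames should look alike: "id" as a regular column and "date" still the index. With an unnamed list as the key there is no name to use, which is presumably why the column shows up as level_0 above.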

)
tm.assert_series_equal(result, expected)

def test_as_index_false(self):
Contributor

can you add the multi-key groupby here as well (parameterize if you can)

df_res2 = df.groupby([df.id, df.index.weekday], as_index=False).rolling(window=2, min_periods=1).mean()
df_res3 = df.groupby([df.id]).rolling(window=2, min_periods=1).mean()
df_res4 = df.groupby([df.id], as_index=False).rolling(window=2, min_periods=1).mean()

e.g. 2 & 4 (we likely have 1 & 3 covered, but wouldn't object to those being included as well)

Member Author

Hmm, all of these are tested (except the as_index=True cases, which are tested everywhere else). So you just want it parameterized?

Contributor

If they are tested elsewhere then don't add them (just the cases covered in this PR are fine). Parameterize if you can.

@jreback jreback merged commit cd2aaa3 into pandas-dev:master Apr 9, 2021
@jreback
Contributor

jreback commented Apr 9, 2021

thanks @mroeschke

@mroeschke mroeschke deleted the bug/groupby_rolling_as_index branch April 10, 2021 05:24
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
Labels

Bug Groupby Window rolling, ewma, expanding

Development

Successfully merging this pull request may close these issues.

BUG: groupby.rolling: Originial index is not being preserved when using date_part of DatetimeIndex and as_index key word seems to have no effect

4 participants