RI-Jeyaprathap

Bug Reference: closes #62443

Bug Description

DataFrame.aggregate has inconsistent behavior for empty DataFrames:

  • When aggregating along columns (axis='columns') on an empty DataFrame, it raises a ValueError instead of returning an empty Series.
  • Aggregating a non-empty DataFrame works as expected, returning a Series.
  • This inconsistency breaks workflows where the result is expected to be a single aggregated column.
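A minimal sketch for checking the behavior described above on a given pandas install. The helper `describe_agg`, the example frames, and the choice of `"sum"` as the aggregation are assumptions for illustration; the linked issue may use a different function, and the outcome can differ between pandas versions, so the helper reports what happens rather than assuming it.

```python
import pandas as pd


def describe_agg(df, func, axis):
    """Describe what df.agg(func, axis=axis) does without assuming the outcome."""
    try:
        result = df.agg(func, axis=axis)
    except Exception as exc:  # the exception type may vary across pandas versions
        return f"raises {type(exc).__name__}"
    return f"returns {type(result).__name__}"


# Hypothetical frames: same columns, one with rows and one without.
non_empty = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
empty = pd.DataFrame(columns=["a", "b"])

for name, frame in [("non-empty", non_empty), ("empty", empty)]:
    for axis in (0, "columns"):
        print(name, axis, describe_agg(frame, "sum", axis))
```

Swapping `"sum"` for the callable used in the issue report should make it possible to compare the empty and non-empty cases side by side.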

Fix

  • Updated DataFrame.aggregate to handle empty DataFrames consistently.

  • Now, empty DataFrames return an empty Series with the correct index, regardless of axis.

  • This aligns the behavior with the documented return types:

    Returns:
    scalar, Series or DataFrame

    • scalar: when Series.agg is called with a single function
    • Series: when DataFrame.agg is called with a single function
    • DataFrame: when DataFrame.agg is called with several functions
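The three documented return types can be illustrated on non-empty data with a short sketch. The variable names and sample values are made up for the example.

```python
import pandas as pd

ser = pd.Series([1, 2, 3])
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

scalar_result = ser.agg("sum")          # single function on a Series
series_result = df.agg("sum")           # single function on a DataFrame
frame_result = df.agg(["sum", "min"])   # several functions on a DataFrame

for label, value in [
    ("Series.agg, one function", scalar_result),
    ("DataFrame.agg, one function", series_result),
    ("DataFrame.agg, several functions", frame_result),
]:
    print(label, "->", type(value).__name__)
```

With several functions, the result is indexed by function name, one row per function.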

Test

  • Added test_aggregate_empty_dataframe_returns_series in test_aggregate.py to ensure empty DataFrames return an empty Series along both axes (axis=0 and axis='columns').

Checklist

@skalwaghe-56
Contributor

Please run pre-commit locally first and fix any issues there.

Member

@rhshadrach rhshadrach left a comment

Thanks for the PR, however I'm negative on this approach as it will break certain working cases for users today. I think what is first necessary is a solid proposal of how pandas treats empty objects across the API. This is #47959.

@skalwaghe-56
Contributor

> Thanks for the PR, however I'm negative on this approach as it will break certain working cases for users today. I think what is first necessary is a solid proposal of how pandas treats empty objects across the API. This is #47959.

Hmm, I agree.

@RI-Jeyaprathap
Author

> Please run pre-commit locally first and fix any issues there.

Thanks for the feedback! I've now run pre-commit locally and addressed the issues flagged. Let me know if anything else needs adjustment.

@RI-Jeyaprathap
Author

> Thanks for the PR, however I'm negative on this approach as it will break certain working cases for users today. I think what is first necessary is a solid proposal of how pandas treats empty objects across the API. This is #47959.

Thanks for the thoughtful review and for pointing me to #47959. I understand the concern about breaking existing behavior, and I agree that a consistent, well-defined approach to handling empty objects across the API is essential before making changes like this.

I’ll take a closer look at the ongoing discussion in that issue and see how this case fits into the broader proposal. Happy to revisit this PR once there's more clarity or alignment on the expected behavior. Appreciate your guidance!

Development

Successfully merging this pull request may close these issues.

BUG: Dataframe.aggregate has inconsistent behaviour for empty dataframe