[SPARK-1915] [SQL] AverageFunction should not count if the evaluated value is null. #862
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
Hi @ueshin, would you please give some pointer about the semantics of For a quick proof, here is a sample session I ran under Hive 0.12.0: The
@liancheng Thank you for your comment.
Thanks. I've merged this into master & branch-1.0.
…value is null.
Average values are difference between the calculation is done partially or not partially. Because `AverageFunction` (in not-partially calculation) counts even if the evaluated value is null.
Author: Takuya UESHIN <[email protected]>
Closes #862 from ueshin/issues/SPARK-1915 and squashes the following commits:
b1ff3c0 [Takuya UESHIN] Modify AverageFunction not to count if the evaluated value is null.
(cherry picked from commit 3b0baba)
Signed-off-by: Reynold Xin <[email protected]>
@ueshin Sorry, my bad, misunderstood your PR description. And I think you are right. On the other hand, it seems that It seems that currently
Ah, realized what's wrong, I need at least 1 non-partial aggregation: And it does lead to the wrong answer.
Average values differ depending on whether the calculation is done partially or non-partially, because `AverageFunction` (used in the non-partial calculation) increments its count even if the evaluated value is null.
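The discrepancy described above can be illustrated with a small sketch. This is not Spark's code: the function names `partial_avg`, `buggy_non_partial_avg`, and `fixed_non_partial_avg` are hypothetical, and plain Python lists stand in for partitions of rows. It only assumes what the description states, namely that the non-partial path counted rows whose evaluated value was null while the partial path did not.

```python
from typing import List, Optional, Sequence


def partial_avg(partitions: Sequence[Sequence[Optional[float]]]) -> Optional[float]:
    """Partial aggregation: each partition emits (sum, count) over non-null
    values only, and the partial results are merged at the end."""
    total = 0.0
    count = 0
    for part in partitions:
        non_null = [v for v in part if v is not None]
        total += sum(non_null)
        count += len(non_null)
    return total / count if count else None


def buggy_non_partial_avg(values: Sequence[Optional[float]]) -> Optional[float]:
    """Non-partial aggregation as described in the report: the count is
    incremented for every row, even when the evaluated value is null."""
    total = 0.0
    count = 0
    for v in values:
        count += 1  # bug: null rows are counted too
        if v is not None:
            total += v
    return total / count if count else None


def fixed_non_partial_avg(values: Sequence[Optional[float]]) -> Optional[float]:
    """Non-partial aggregation with the fix: null rows are skipped entirely."""
    total = 0.0
    count = 0
    for v in values:
        if v is not None:
            total += v
            count += 1
    return total / count if count else None


rows: List[Optional[float]] = [1.0, None, 3.0]
print(partial_avg([[1.0, None], [3.0]]))  # average over the two non-null values
print(buggy_non_partial_avg(rows))        # divides by 3 instead of 2
print(fixed_non_partial_avg(rows))        # agrees with the partial result
```

With the fix, both paths divide the sum by the number of non-null values, so the result no longer depends on whether the aggregation was split into partial steps.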