Skip to content

Conversation

@jiayue-zhang
Copy link
Contributor

@jiayue-zhang jiayue-zhang commented Dec 9, 2016

What changes were proposed in this pull request?

Allow DataFrame.replace() to replace with None/null values.

How was this patch tested?

Python doctest and unit test. Scala unit test.

@HyukjinKwon
Copy link
Member

Would we need a test for Scala too? I checked this by myself and it seems working fine with Scala though. I could argue that this affects the language-specific functions of both Python and Scala as a not strong opinion.

case _ => replacement.map { case (k, v) => (convertToDouble(k), convertToDouble(v)) }
}
val replacementMap: Map[_, _] =
if (replacement.head._2 == null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just write case null => here and thereby avoid the if else

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that but failed. Scala doesn't allow pattern matching with null, scala.Nothing and scala.Null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wait case null => works. I tried case v null which doesn't. Let me modify this. Thanks!

@SparkQA
Copy link

SparkQA commented Dec 10, 2016

Test build #3490 has finished for PR 16225 at commit 0b15c8f.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

ok to test

@gatorsmile
Copy link
Member

@bravo-zhang Could you resolve the conflicts? I will review it then. Thanks!

@SparkQA
Copy link

SparkQA commented May 23, 2017

Test build #77261 has finished for PR 16225 at commit 2c532c3.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@jiayue-zhang
Copy link
Contributor Author

Thanks for taking a look, @gatorsmile The conflicts have been resolved.
I appreciate if @zero323 can take a look as well since you made improvement on this function recently.

@SparkQA
Copy link

SparkQA commented May 24, 2017

Test build #77273 has finished for PR 16225 at commit 43fb6bd.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 24, 2017

Test build #77282 has finished for PR 16225 at commit b5424d9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@holdenk holdenk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, sorry its taken so long to review. I did a first read through the Python side and I've got two minor questions. Hopefully @zero323, @davie , or @gatorsmile can also have some time once the current release is finished to take a look.

"Got {0}".format(type(to_replace)))

if not isinstance(value, valid_types) and not isinstance(to_replace, dict):
if not isinstance(value, valid_types) and value is not None \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So slightly jet lagged style question: would this be clearer if we just add type(None) to L1398? I know PEP8 says we should only use is is not for checking if something is none rather than depending on the implicit conversion to boolean -- but since really checking the type here we aren't really in danger of that. (This is just a suggestion to make it easier to read - if others think its easier to read this way thats fine :)). @davies ?

to_replace = [to_replace]

if isinstance(value, (float, int, long, basestring)):
if isinstance(value, (float, int, long, basestring)) or value is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

if not any(all_of_type(rep_dict.keys()) and all_of_type(rep_dict.values())
if not any(all_of_type(rep_dict.keys())
and (all_of_type(rep_dict.values())
or list(rep_dict.values()).count(None) == len(rep_dict))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Scala code null is allowed in to be the replacement value for some of the elements but not in the Python code. Is this intentional? If so we should document it clearly and expand on the error message bellow (otherwise we should make it more flexible).

@gatorsmile
Copy link
Member

We are closing it due to inactivity. please do reopen if you want to push it forward. Thanks!

@asfgit asfgit closed this in b32bd00 Jun 27, 2017
@jiayue-zhang
Copy link
Contributor Author

@holdenk Thanks for review. I'll combine type(None) in the isinstance. I also made Scala and Python to accept null more generally and in the same way.
PR is reopened at: #18820
@gatorsmile please also take a look.

zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
## What changes were proposed in this pull request?

This PR proposes to close stale PRs, mostly the same instances with apache#18017

I believe the author in apache#14807 removed his account.

Closes apache#7075
Closes apache#8927
Closes apache#9202
Closes apache#9366
Closes apache#10861
Closes apache#11420
Closes apache#12356
Closes apache#13028
Closes apache#13506
Closes apache#14191
Closes apache#14198
Closes apache#14330
Closes apache#14807
Closes apache#15839
Closes apache#16225
Closes apache#16685
Closes apache#16692
Closes apache#16995
Closes apache#17181
Closes apache#17211
Closes apache#17235
Closes apache#17237
Closes apache#17248
Closes apache#17341
Closes apache#17708
Closes apache#17716
Closes apache#17721
Closes apache#17937

Added:
Closes apache#14739
Closes apache#17139
Closes apache#17445
Closes apache#18042
Closes apache#18359

Added:
Closes apache#16450
Closes apache#16525
Closes apache#17738

Added:
Closes apache#16458
Closes apache#16508
Closes apache#17714

Added:
Closes apache#17830
Closes apache#14742

## How was this patch tested?

N/A

Author: hyukjinkwon <[email protected]>

Closes apache#18417 from HyukjinKwon/close-stale-pr.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants