
Conversation

@HyukjinKwon
Member

What changes were proposed in this pull request?

Scala:

```
scala> spark.conf.get("hey", null)
res1: String = null

scala> spark.conf.get("spark.sql.sources.partitionOverwriteMode", null)
res2: String = null
```

Python:

Before

```
>>> spark.conf.get("hey", None)
...
py4j.protocol.Py4JJavaError: An error occurred while calling o30.get.
: java.util.NoSuchElementException: hey
...

>>> spark.conf.get("spark.sql.sources.partitionOverwriteMode", None)
u'STATIC'
```

After

```
>>> spark.conf.get("hey", None) is None
True

>>> spark.conf.get("spark.sql.sources.partitionOverwriteMode", None) is None
True
```

Note that this PR preserves the behaviour below:

```
>>> spark.conf.get("spark.sql.sources.partitionOverwriteMode")
u'STATIC'
```

How was this patch tested?

Manually tested and unit tests were added.
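The fix hinges on a sentinel default (PySpark's `_NoValue`) so that an explicitly passed `None` can be told apart from "no default supplied". A minimal standalone sketch of that pattern, for reference only: `RuntimeConf` below is illustrative and not the real `spark.conf` implementation.

```python
_NoValue = object()  # unique marker meaning "no default was supplied"

class RuntimeConf:
    """Illustrative stand-in for spark.conf, not the real API."""

    def __init__(self):
        self._entries = {}

    def set(self, key, value):
        self._entries[key] = value

    def get(self, key, default=_NoValue):
        if default is _NoValue:
            # No default given: a missing key is an error, matching
            # java.util.NoSuchElementException on the JVM side.
            if key not in self._entries:
                raise KeyError(key)
            return self._entries[key]
        # An explicit default, including None, is honoured, so
        # get("hey", None) returns None instead of raising.
        return self._entries.get(key, default)

conf = RuntimeConf()
conf.set("spark.sql.sources.partitionOverwriteMode", "STATIC")
print(conf.get("spark.sql.sources.partitionOverwriteMode"))  # STATIC
print(conf.get("hey", None))                                 # None
```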

@HyukjinKwon
Member Author

cc @cloud-fan, @viirya, @ueshin, @BryanCutler, the reviewers I can directly think of for now.

@SparkQA

SparkQA commented Mar 16, 2018

Test build #88291 has finished for PR 20841 at commit 1a6cfce.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

retest this please

@SparkQA

SparkQA commented Mar 16, 2018

Test build #88294 has finished for PR 20841 at commit 1a6cfce.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

def getConf(self, key, defaultValue=_NoValue):
"""Returns the value of Spark SQL configuration property for the given key.
If the key is not set and defaultValue is not None, return
Member


We should update this doc.

@SparkQA

SparkQA commented Mar 16, 2018

Test build #88298 has finished for PR 20841 at commit d8ee18f.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member

viirya commented Mar 16, 2018

retest this please.

@SparkQA

SparkQA commented Mar 16, 2018

Test build #88300 has finished for PR 20841 at commit d8ee18f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@ueshin left a comment


LGTM except for one comment.


self.assertRaisesRegexp(Exception, "hyukjin", lambda: spark.conf.get("hyukjin"))
self.assertEqual(spark.conf.get("hyukjin", None), None)
self.assertEqual(spark.conf.get("spark.sql.sources.partitionOverwriteMode", None), None)
Member


Can you add one more test that gets "spark.sql.sources.partitionOverwriteMode" without None, to show the difference between calling it with and without None? Ideally with a short comment.
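A self-contained sketch of the distinction that extra test would cover. `FakeSQLConf` is a stand-in, not the real SQLConf: it models a conf entry that has a built-in default ("STATIC"), as `spark.sql.sources.partitionOverwriteMode` does.

```python
_MISSING = object()  # sentinel, analogous to PySpark's _NoValue

class FakeSQLConf:
    """Illustrative stand-in for the SQL conf, not the real class."""

    _builtin_defaults = {"spark.sql.sources.partitionOverwriteMode": "STATIC"}

    def __init__(self):
        self._set = {}

    def set(self, key, value):
        self._set[key] = value

    def get(self, key, default=_MISSING):
        if key in self._set:
            return self._set[key]
        if default is not _MISSING:
            return default  # an explicit default (even None) wins
        if key in self._builtin_defaults:
            return self._builtin_defaults[key]  # fall back to built-in default
        raise KeyError(key)

conf = FakeSQLConf()
# Without a user-supplied default, the entry's built-in default is returned.
print(conf.get("spark.sql.sources.partitionOverwriteMode"))        # STATIC
# With an explicit None, None is returned even though a built-in default exists.
print(conf.get("spark.sql.sources.partitionOverwriteMode", None))  # None
```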

Member Author


Sure.

@BryanCutler
Member

Shouldn't it be the same as in the application conf here https://github.com/apache/spark/blob/master/python/pyspark/conf.py#L174?

Here the default is None, and it returns None if the conf doesn't contain the key and the user doesn't explicitly set a default. There isn't really any value in getting a Py4JError, right?

@HyukjinKwon
Member Author

HyukjinKwon commented Mar 17, 2018

I think we'd better match the Python-side behaviour to the Scala side in general, and throw a better exception if possible. I think the application conf was done like that because the _NoValue concept didn't exist at that time. The SQL conf and the application conf were already not quite consistent with each other.

BTW, could I ask that we fix that separately? I have actually found several places where we might have to consider _NoValue.

@BryanCutler
Member

BryanCutler commented Mar 17, 2018

Yeah, true, that is how the Scala side works, so I suppose that is best. But I kind of view this API as similar to Python's dict.get, which returns None and won't raise an error; that would act like getOption in Scala. I don't feel strongly though, so it's up to you.

@HyukjinKwon
Member Author

Ah, I got the point. So you mean a dictionary-like approach: get("non-existent-key") should always return None instead of throwing an error, language-specifically? I think we might consider allowing other dictionary-like operations together similarly, but otherwise it might be fine to leave it consistent with the Scala side.

Strictly, I believe this PR doesn't change that behaviour, if I understood correctly, but just preserves it anyway.
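For reference, the dict-like behaviour under discussion: Python's dict.get never raises for a missing key (roughly Scala's getOption), while indexing does (roughly conf.get with no default).

```python
d = {"spark.sql.sources.partitionOverwriteMode": "STATIC"}

print(d.get("hey"))       # None: no error, like getOption in Scala
print(d.get("hey", "x"))  # x
try:
    d["hey"]              # indexing does raise, like conf.get with no default
except KeyError:
    print("KeyError raised")
```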

spark.conf.unset("bogo")
self.assertEqual(spark.conf.get("bogo", "colombia"), "colombia")

self.assertRaisesRegexp(Exception, "hyukjin", lambda: spark.conf.get("hyukjin"))
Member Author

@HyukjinKwon commented Mar 17, 2018


@BryanCutler, this is the test case that prompted your comment. It's not related to this PR:

Before

```
>>> spark.conf.get("hyukjin")
...
: java.util.NoSuchElementException: hyukjin
...
```

After

```
>>> spark.conf.get("hyukjin")
...
: java.util.NoSuchElementException: hyukjin
...
```

Let me take this out.

@SparkQA

SparkQA commented Mar 17, 2018

Test build #88336 has finished for PR 20841 at commit 3a8de04.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

retest this please

@SparkQA

SparkQA commented Mar 17, 2018

Test build #88339 has finished for PR 20841 at commit 3a8de04.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@BryanCutler left a comment


LGTM

@viirya
Member

viirya commented Mar 18, 2018

LGTM.

@asfgit closed this in 61487b3 Mar 18, 2018
@HyukjinKwon
Member Author

Merged to master and branch-2.3.

Thank you @ueshin, @BryanCutler and @viirya for reviewing this.

asfgit pushed a commit that referenced this pull request Mar 18, 2018
…uce None in PySpark

Scala:

```
scala> spark.conf.get("hey", null)
res1: String = null
```

```
scala> spark.conf.get("spark.sql.sources.partitionOverwriteMode", null)
res2: String = null
```

Python:

**Before**

```
>>> spark.conf.get("hey", None)
...
py4j.protocol.Py4JJavaError: An error occurred while calling o30.get.
: java.util.NoSuchElementException: hey
...
```

```
>>> spark.conf.get("spark.sql.sources.partitionOverwriteMode", None)
u'STATIC'
```

**After**

```
>>> spark.conf.get("hey", None) is None
True
```

```
>>> spark.conf.get("spark.sql.sources.partitionOverwriteMode", None) is None
True
```

*Note that this PR preserves the case below:

```
>>> spark.conf.get("spark.sql.sources.partitionOverwriteMode")
u'STATIC'
```

Manually tested and unit tests were added.

Author: hyukjinkwon <[email protected]>

Closes #20841 from HyukjinKwon/spark-conf-get.

(cherry picked from commit 61487b3)
Signed-off-by: hyukjinkwon <[email protected]>
mstewart141 pushed a commit to mstewart141/spark that referenced this pull request Mar 24, 2018
…uce None in PySpark

peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
…uce None in PySpark

@HyukjinKwon deleted the spark-conf-get branch October 16, 2018 12:45
