Skip to content

Conversation

@gatorsmile
Copy link
Member

@gatorsmile gatorsmile commented Jun 28, 2016

What changes were proposed in this pull request?

@koertkuipers identified the PR #13727 changed the behavior of load API. After the change, the load API does not add the value of path into the options. Thank you!

This PR is to add the option path back to load() API in DataFrameReader, if and only if users specify one and only one path in the load API. For example, users can see the path option after the following API call,

spark.read
  .format("parquet")
  .load("/test")

How was this patch tested?

Added test cases.

@gatorsmile
Copy link
Member Author

cc @rxin @tdas @koertkuipers

@SparkQA
Copy link

SparkQA commented Jun 28, 2016

Test build #61333 has finished for PR 13933 at commit 15eb528.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 28, 2016

Test build #61342 has finished for PR 13933 at commit bf1c9b5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member Author

cc @rxin The code is ready for review. Thanks!

@rxin
Copy link
Contributor

rxin commented Jun 28, 2016

LGTM -- cc @tdas to take a look since he wrote the original patch.

@liancheng
Copy link
Contributor

LGTM.

@rxin
Copy link
Contributor

rxin commented Jun 28, 2016

Merging in master/2.0.

asfgit pushed a commit that referenced this pull request Jun 28, 2016
#### What changes were proposed in this pull request?
koertkuipers identified the PR #13727 changed the behavior of `load` API. After the change, the `load` API does not add the value of `path` into the `options`. Thank you!

This PR is to add the option `path` back to `load()` API in `DataFrameReader`, if and only if users specify one and only one `path` in the `load` API. For example, users can see the `path` option after the following API call,
```Scala
spark.read
  .format("parquet")
  .load("/test")
```

#### How was this patch tested?
Added test cases.

Author: gatorsmile <[email protected]>

Closes #13933 from gatorsmile/optionPath.

(cherry picked from commit 25520e9)
Signed-off-by: Reynold Xin <[email protected]>
@asfgit asfgit closed this in 25520e9 Jun 28, 2016
@zsxwing
Copy link
Member

zsxwing commented Jun 28, 2016

I noticed that the Python API is inconsistent here:

if type(path) != list:

It always calls def load(paths: String*).

@gatorsmile
Copy link
Member Author

@zsxwing Let me try to fix it. Thanks!

@gatorsmile
Copy link
Member Author

The problem also exists in the other APIs:
like parquet, json and others.

def parquet(path: String): DataFrame = {
// This method ensures that calls that explicit need single argument works, see SPARK-16009
parquet(Seq(path): _*)
}

Let me fix them in the same PR. Thanks!

@zsxwing
Copy link
Member

zsxwing commented Jun 29, 2016

@gatorsmile parquet, json or other file formats support both path and paths options. So that's not a problem.

@gatorsmile
Copy link
Member Author

@zsxwing If we just provide one path in the function input, it will not put path into the options. The API parquet(path: String)still calls load(paths : _*), instead of load(path). Thus, we will introduce inconsistent behavior, compared with load(path: String).

Could you review the new PR I just submitted? Let me know if anything is not appropriate. #13965. Thanks!

@koertkuipers
Copy link
Contributor

For parquet, json etc. path not being put in options is not an issue since
they don't retrieve it from the options
On Jun 29, 2016 2:31 AM, "Xiao Li" [email protected] wrote:

@zsxwing https://github.com/zsxwing If we just provide one path in the
function input, it will not put path into the options. The API parquet(path:
String)still calls load(paths : _*), instead of load(path). Thus, we will
introduce inconsistent behavior, compared with load(path: String).

Could you review the new PR I just submitted? Let me know if anything is
not appropriate. #13965 #13965.
Thanks!


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#13933 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAyIJPT3agmWQyqBUqrzFW5F2eE095Wtks5qQhFPgaJpZM4I_nrj
.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants