[SPARK-16126] [SQL] Better Error Message When using DataFrameReader without path
#13837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
8d021e4
5e4a3c6
2643715
cfc0188
3007fe6
a1ae724
635046a
6bf0779
b6bdf92
4511037
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -35,6 +35,7 @@ import org.apache.spark.sql.execution.datasources.csv.CSVFileFormat | |
| import org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider | ||
| import org.apache.spark.sql.execution.datasources.json.JsonFileFormat | ||
| import org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat | ||
| import org.apache.spark.sql.execution.datasources.text.TextFileFormat | ||
| import org.apache.spark.sql.execution.streaming._ | ||
| import org.apache.spark.sql.sources._ | ||
| import org.apache.spark.sql.streaming.OutputMode | ||
|
|
@@ -322,6 +323,9 @@ case class DataSource( | |
| val equality = sparkSession.sessionState.conf.resolver | ||
| StructType(schema.filterNot(f => partitionColumns.exists(equality(_, f.name)))) | ||
| }.orElse { | ||
| if (allPaths.isEmpty && !format.isInstanceOf[TextFileFormat]) { | ||
|
Member
Hi @gatorsmile, would it be better to explain here that the text data source is excluded because it always uses a schema consisting of a string field when the schema is not explicitly given? BTW, should we maybe change
||
| throw new IllegalArgumentException("'path' is not specified") | ||
| } | ||
| format.inferSchema( | ||
| sparkSession, | ||
| caseInsensitiveOptions, | ||
|
|
@@ -369,6 +373,8 @@ case class DataSource( | |
| val path = new Path(allPaths.head) | ||
| val fs = path.getFileSystem(sparkSession.sessionState.newHadoopConf()) | ||
| path.makeQualified(fs.getUri, fs.getWorkingDirectory) | ||
| } else if (allPaths.length < 1) { | ||
| throw new IllegalArgumentException("'path' is not specified") | ||
| } else { | ||
| throw new IllegalArgumentException("Expected exactly one path to be specified, but " + | ||
| s"got: ${allPaths.mkString(", ")}") | ||
|
|
||
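The two checks added to `DataSource` above can be sketched outside Spark as follows. This is a minimal illustration, not the PR's code: the object `PathChecks`, its method names, and the `hasFixedSchema` flag are hypothetical stand-ins (the flag plays the role of `format.isInstanceOf[TextFileFormat]`); only the error messages are taken from the diff.

```scala
// Hypothetical stand-in for the path checks in the diff; names are illustrative only.
object PathChecks {

  // Mirrors the schema-inference guard: a format with a fixed schema
  // (the text format, per the diff) skips the empty-path check.
  def requirePathForInference(allPaths: Seq[String], hasFixedSchema: Boolean): Unit = {
    if (allPaths.isEmpty && !hasFixedSchema) {
      throw new IllegalArgumentException("'path' is not specified")
    }
  }

  // Mirrors the branch that expects exactly one path.
  def singlePath(allPaths: Seq[String]): String = allPaths match {
    case Seq(only) => only
    case Seq() =>
      throw new IllegalArgumentException("'path' is not specified")
    case many =>
      throw new IllegalArgumentException(
        "Expected exactly one path to be specified, but " +
          s"got: ${many.mkString(", ")}")
  }
}
```

With this sketch, an empty path list fails early with the short message, while two or more paths still produce the longer "Expected exactly one path" message.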
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -40,7 +40,7 @@ private[parquet] class ParquetOptions( | |||||||
| if (!shortParquetCompressionCodecNames.contains(codecName)) { | ||||||||
| val availableCodecs = shortParquetCompressionCodecNames.keys.map(_.toLowerCase) | ||||||||
| throw new IllegalArgumentException(s"Codec [$codecName] " + | ||||||||
| s"is not available. Available codecs are ${availableCodecs.mkString(", ")}.") | ||||||||
| s"is not available. Known codecs are ${availableCodecs.mkString(", ")}.") | ||||||||
|
Contributor
Why this change?
Member
Author
Just to make it consistent with the output of the other cases. See the code: spark/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/CompressionCodecs.scala, lines 49 to 51 in d6dc12e.
|
||||||||
| } | ||||||||
| shortParquetCompressionCodecNames(codecName).name() | ||||||||
| } | ||||||||
|
|
||||||||
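The codec check in `ParquetOptions` can be approximated by the sketch below. It is an assumption-laden illustration: the object `CodecLookup`, the three-entry table, the lowercasing of the input, and the sorting of the known names are all hypothetical choices made here for a deterministic message; only the message wording ("is not available. Known codecs are ...") comes from the diff.

```scala
import java.util.Locale

// Hypothetical codec lookup; the real table lives in ParquetOptions
// and maps short names to Parquet codec enum values.
object CodecLookup {

  // Illustrative entries only, not the full list Spark supports.
  private val shortNames: Map[String, String] = Map(
    "none" -> "UNCOMPRESSED",
    "snappy" -> "SNAPPY",
    "gzip" -> "GZIP")

  def resolve(name: String): String = {
    // Assumption: names are matched case-insensitively.
    val key = name.toLowerCase(Locale.ROOT)
    shortNames.getOrElse(key, {
      // Sorted here (an addition of this sketch) so the message is stable.
      val known = shortNames.keys.toSeq.sorted
      throw new IllegalArgumentException(
        s"Codec [$key] is not available. Known codecs are ${known.mkString(", ")}.")
    })
  }
}
```

Wording the message as "Known codecs" matches the phrasing the author points to in `CompressionCodecs.scala`, so both code paths report unknown codecs the same way.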
I recall this test is intentionally testing without path argument?
cc @HyukjinKwon
Thanks for cc'ing me. Yes, I did. The changes seem reasonable, since this check applies to the data sources that need a path.