Commit 5ae1c65

HyukjinKwon authored and srowen committed
[SPARK-19707][SPARK-18922][TESTS][SQL][CORE] Fix test failures/the invalid path check for sc.addJar on Windows
## What changes were proposed in this pull request?

This PR proposes two things:

- A follow up for SPARK-19707 (Improving the invalid path check for sc.addJar on Windows as well).

```
org.apache.spark.SparkContextSuite:
 - add jar with invalid path *** FAILED *** (32 milliseconds)
   2 was not equal to 1 (SparkContextSuite.scala:309)
   ...
```

- Fix path vs URI related test failures on Windows.

```
org.apache.spark.storage.LocalDirsSuite:
 - SPARK_LOCAL_DIRS override also affects driver *** FAILED *** (0 milliseconds)
   new java.io.File("/NONEXISTENT_PATH").exists() was true (LocalDirsSuite.scala:50)
   ...
 - Utils.getLocalDir() throws an exception if any temporary directory cannot be retrieved *** FAILED *** (15 milliseconds)
   Expected exception java.io.IOException to be thrown, but no exception was thrown. (LocalDirsSuite.scala:64)
   ...
```

```
org.apache.spark.sql.hive.HiveSchemaInferenceSuite:
 - orc: schema should be inferred and saved when INFER_AND_SAVE is specified *** FAILED *** (203 milliseconds)
   java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-dae61ab3-a851-4dd3-bf4e-be97c501f254
   ...
 - parquet: schema should be inferred and saved when INFER_AND_SAVE is specified *** FAILED *** (203 milliseconds)
   java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-fa3aff89-a66e-4376-9a37-2a9b87596939
   ...
 - orc: schema should be inferred but not stored when INFER_ONLY is specified *** FAILED *** (141 milliseconds)
   java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-fb464e59-b049-481b-9c75-f53295c9fc2c
   ...
 - parquet: schema should be inferred but not stored when INFER_ONLY is specified *** FAILED *** (125 milliseconds)
   java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-9487568e-80a4-42b3-b0a5-d95314c4ccbc
   ...
 - orc: schema should not be inferred when NEVER_INFER is specified *** FAILED *** (156 milliseconds)
   java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-0d2dfa45-1b0f-4958-a8be-1074ed0135a
   ...
 - parquet: schema should not be inferred when NEVER_INFER is specified *** FAILED *** (547 milliseconds)
   java.net.URISyntaxException: Illegal character in opaque part at index 2: C:\projects\spark\target\tmp\spark-6d95d64e-613e-4a59-a0f6-d198c5aa51ee
   ...
```

```
org.apache.spark.sql.execution.command.DDLSuite:
 - create temporary view using *** FAILED *** (15 milliseconds)
   org.apache.spark.sql.AnalysisException: Path does not exist: file:/C:projectsspark arget mpspark-3881d9ca-561b-488d-90b9-97587472b853 mp;
   ...
 - insert data to a data source table which has a non-existing location should succeed *** FAILED *** (109 milliseconds)
   file:/C:projectsspark%09arget%09mpspark-4cad3d19-6085-4b75-b407-fe5e9d21df54 did not equal file:///C:/projects/spark/target/tmp/spark-4cad3d19-6085-4b75-b407-fe5e9d21df54 (DDLSuite.scala:1869)
   ...
 - insert into a data source table with a non-existing partition location should succeed *** FAILED *** (94 milliseconds)
   file:/C:projectsspark%09arget%09mpspark-4b52e7de-e3aa-42fd-95d4-6d4d58d1d95d did not equal file:///C:/projects/spark/target/tmp/spark-4b52e7de-e3aa-42fd-95d4-6d4d58d1d95d (DDLSuite.scala:1910)
   ...
 - read data from a data source table which has a non-existing location should succeed *** FAILED *** (93 milliseconds)
   file:/C:projectsspark%09arget%09mpspark-f8c281e2-08c2-4f73-abbf-f3865b702c34 did not equal file:///C:/projects/spark/target/tmp/spark-f8c281e2-08c2-4f73-abbf-f3865b702c34 (DDLSuite.scala:1937)
   ...
 - read data from a data source table with non-existing partition location should succeed *** FAILED *** (110 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
 - create datasource table with a non-existing location *** FAILED *** (94 milliseconds)
   file:/C:projectsspark%09arget%09mpspark-387316ae-070c-4e78-9b78-19ebf7b29ec8 did not equal file:///C:/projects/spark/target/tmp/spark-387316ae-070c-4e78-9b78-19ebf7b29ec8 (DDLSuite.scala:1982)
   ...
 - CTAS for external data source table with a non-existing location *** FAILED *** (16 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
 - CTAS for external data source table with a existed location *** FAILED *** (15 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
 - data source table:partition column name containing a b *** FAILED *** (125 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
 - data source table:partition column name containing a:b *** FAILED *** (143 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
 - data source table:partition column name containing a%b *** FAILED *** (109 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
 - data source table:partition column name containing a,b *** FAILED *** (109 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
 - location uri contains a b for datasource table *** FAILED *** (94 milliseconds)
   file:/C:projectsspark%09arget%09mpspark-5739cda9-b702-4e14-932c-42e8c4174480a%20b did not equal file:///C:/projects/spark/target/tmp/spark-5739cda9-b702-4e14-932c-42e8c4174480/a%20b (DDLSuite.scala:2084)
   ...
 - location uri contains a:b for datasource table *** FAILED *** (78 milliseconds)
   file:/C:projectsspark%09arget%09mpspark-9bdd227c-840f-4f08-b7c5-4036638f098da:b did not equal file:///C:/projects/spark/target/tmp/spark-9bdd227c-840f-4f08-b7c5-4036638f098d/a:b (DDLSuite.scala:2084)
   ...
 - location uri contains a%b for datasource table *** FAILED *** (78 milliseconds)
   file:/C:projectsspark%09arget%09mpspark-62bb5f1d-fa20-460a-b534-cb2e172a3640a%25b did not equal file:///C:/projects/spark/target/tmp/spark-62bb5f1d-fa20-460a-b534-cb2e172a3640/a%25b (DDLSuite.scala:2084)
   ...
 - location uri contains a b for database *** FAILED *** (16 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
 - location uri contains a:b for database *** FAILED *** (15 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
 - location uri contains a%b for database *** FAILED *** (0 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
```

```
org.apache.spark.sql.hive.execution.HiveDDLSuite:
 - create hive table with a non-existing location *** FAILED *** (16 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
 - CTAS for external hive table with a non-existing location *** FAILED *** (16 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
 - CTAS for external hive table with a existed location *** FAILED *** (16 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
 - partition column name of parquet table containing a b *** FAILED *** (156 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
 - partition column name of parquet table containing a:b *** FAILED *** (94 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
 - partition column name of parquet table containing a%b *** FAILED *** (125 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
 - partition column name of parquet table containing a,b *** FAILED *** (110 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
 - partition column name of hive table containing a b *** FAILED *** (15 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
 - partition column name of hive table containing a:b *** FAILED *** (16 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
 - partition column name of hive table containing a%b *** FAILED *** (16 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
 - partition column name of hive table containing a,b *** FAILED *** (0 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
 - hive table: location uri contains a b *** FAILED *** (0 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
 - hive table: location uri contains a:b *** FAILED *** (0 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
 - hive table: location uri contains a%b *** FAILED *** (0 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);
   ...
```

```
org.apache.spark.sql.sources.PathOptionSuite:
 - path option also exist for write path *** FAILED *** (94 milliseconds)
   file:/C:projectsspark%09arget%09mpspark-2870b281-7ac0-43d6-b6b6-134e01ab6fdc did not equal file:///C:/projects/spark/target/tmp/spark-2870b281-7ac0-43d6-b6b6-134e01ab6fdc (PathOptionSuite.scala:98)
   ...
```

```
org.apache.spark.sql.CachedTableSuite:
 - SPARK-19765: UNCACHE TABLE should un-cache all cached plans that refer to this table *** FAILED *** (110 milliseconds)
   java.lang.IllegalArgumentException: Can not create a Path from an empty string
   ...
```

```
org.apache.spark.sql.execution.DataSourceScanExecRedactionSuite:
 - treeString is redacted *** FAILED *** (250 milliseconds)
   "file:/C:/projects/spark/target/tmp/spark-3ecc1fa4-3e76-489c-95f4-f0b0500eae28" did not contain "C:\projects\spark\target\tmp\spark-3ecc1fa4-3e76-489c-95f4-f0b0500eae28" (DataSourceScanExecRedactionSuite.scala:46)
   ...
```

## How was this patch tested?

Tested via AppVeyor for each and checked it passed once each. These should be retested via AppVeyor in this PR.

Author: hyukjinkwon <[email protected]>

Closes #17987 from HyukjinKwon/windows-20170515.

(cherry picked from commit e9f983d)
Signed-off-by: Sean Owen <[email protected]>
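The `URISyntaxException` failures listed in the commit message can be reproduced outside Spark. The following is a minimal sketch, not code from this commit: the object name `WindowsPathVsUri`, its helpers, and the sample path are made up for illustration. It shows that a raw Windows path is not a valid URI string, while `java.io.File.toURI` produces one (percent-encoding characters such as spaces).

```scala
import java.io.File
import java.net.{URI, URISyntaxException}

// Hypothetical helper object; none of these names come from the Spark code base.
object WindowsPathVsUri {
  // A Windows-style path, hard-coded so the sketch behaves the same on any OS.
  val windowsPath: String = "C:\\projects\\spark\\target\\tmp\\spark-example"

  // Try to parse a string as a URI. For the path above, "C" is read as the
  // scheme and the backslash at index 2 is rejected as an illegal character.
  def parseAsUri(s: String): Either[String, URI] =
    try Right(new URI(s))
    catch { case e: URISyntaxException => Left(e.getMessage) }

  // Convert a file to its URI form; File.toURI percent-encodes illegal characters.
  def toUriString(f: File): String = f.toURI.toString

  def main(args: Array[String]): Unit = {
    println(parseAsUri(windowsPath))               // a Left carrying the parser message
    println(toUriString(new File("dir with space"))) // a file: URI with %20 for spaces
  }
}
```

This is why the test changes in this commit pass `path.toURI` rather than the raw path wherever a location string is later parsed as a URI.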
1 parent 022a495 commit 5ae1c65

9 files changed

+145
-73
lines changed

core/src/main/scala/org/apache/spark/SparkContext.scala

Lines changed: 23 additions & 24 deletions
@@ -1801,40 +1801,39 @@ class SparkContext(config: SparkConf) extends Logging {
    * an HTTP, HTTPS or FTP URI, or local:/path for a file on every worker node.
    */
   def addJar(path: String) {
+    def addJarFile(file: File): String = {
+      try {
+        if (!file.exists()) {
+          throw new FileNotFoundException(s"Jar ${file.getAbsolutePath} not found")
+        }
+        if (file.isDirectory) {
+          throw new IllegalArgumentException(
+            s"Directory ${file.getAbsoluteFile} is not allowed for addJar")
+        }
+        env.rpcEnv.fileServer.addJar(file)
+      } catch {
+        case NonFatal(e) =>
+          logError(s"Failed to add $path to Spark environment", e)
+          null
+      }
+    }
+
     if (path == null) {
       logWarning("null specified as parameter to addJar")
     } else {
-      var key = ""
-      if (path.contains("\\")) {
+      val key = if (path.contains("\\")) {
         // For local paths with backslashes on Windows, URI throws an exception
-        key = env.rpcEnv.fileServer.addJar(new File(path))
+        addJarFile(new File(path))
       } else {
         val uri = new URI(path)
         // SPARK-17650: Make sure this is a valid URL before adding it to the list of dependencies
         Utils.validateURL(uri)
-        key = uri.getScheme match {
+        uri.getScheme match {
           // A JAR file which exists only on the driver node
-          case null | "file" =>
-            try {
-              val file = new File(uri.getPath)
-              if (!file.exists()) {
-                throw new FileNotFoundException(s"Jar ${file.getAbsolutePath} not found")
-              }
-              if (file.isDirectory) {
-                throw new IllegalArgumentException(
-                  s"Directory ${file.getAbsoluteFile} is not allowed for addJar")
-              }
-              env.rpcEnv.fileServer.addJar(new File(uri.getPath))
-            } catch {
-              case NonFatal(e) =>
-                logError(s"Failed to add $path to Spark environment", e)
-                null
-            }
+          case null | "file" => addJarFile(new File(uri.getPath))
           // A JAR file which exists locally on every worker node
-          case "local" =>
-            "file:" + uri.getPath
-          case _ =>
-            path
+          case "local" => "file:" + uri.getPath
+          case _ => path
         }
       }
       if (key != null) {
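The effect of the `addJar` change above is that paths containing backslashes now go through the same existence and directory checks as `file:` and scheme-less paths. A rough standalone sketch of that dispatch follows. It is an illustration under assumptions, not Spark's implementation: `AddJarSketch`, `checkedLocalJar`, and `resolveJarKey` are hypothetical names, the file server is replaced by returning the file's URI string, failures are modelled as `None` instead of a logged error and `null`, and `Utils.validateURL` is omitted.

```scala
import java.io.File
import java.net.URI

// Hypothetical sketch of the branching in addJar; not the Spark code itself.
object AddJarSketch {
  // Stand-in for the shared check: reject missing files and directories.
  // (The real method registers the file with the RPC file server instead.)
  private def checkedLocalJar(file: File): Option[String] =
    if (!file.exists() || file.isDirectory) None
    else Some(file.toURI.toString)

  def resolveJarKey(path: String): Option[String] =
    if (path.contains("\\")) {
      // Windows-style local path: skip URI parsing, but still validate the file.
      checkedLocalJar(new File(path))
    } else {
      val uri = new URI(path)
      uri.getScheme match {
        // A JAR that exists only on the driver node.
        case null | "file" => checkedLocalJar(new File(uri.getPath))
        // A JAR that exists locally on every worker node.
        case "local" => Some("file:" + uri.getPath)
        // Any other scheme is passed through unchanged.
        case _ => Some(path)
      }
    }
}
```

With this shape, `resolveJarKey("C:\\no\\such.jar")` is rejected just like `resolveJarKey("dummy.jar")`, which corresponds to the `add jar with invalid path` test expecting exactly one registered jar.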

core/src/test/scala/org/apache/spark/SparkContextSuite.scala

Lines changed: 3 additions & 3 deletions
@@ -300,13 +300,13 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext with Eventu
     sc = new SparkContext(new SparkConf().setAppName("test").setMaster("local"))
     sc.addJar(tmpJar.getAbsolutePath)
 
-    // Invaid jar path will only print the error log, will not add to file server.
+    // Invalid jar path will only print the error log, will not add to file server.
     sc.addJar("dummy.jar")
     sc.addJar("")
     sc.addJar(tmpDir.getAbsolutePath)
 
-    sc.listJars().size should be (1)
-    sc.listJars().head should include (tmpJar.getName)
+    assert(sc.listJars().size == 1)
+    assert(sc.listJars().head.contains(tmpJar.getName))
   }
 
   test("Cancelling job group should not cause SparkContext to shutdown (SPARK-6414)") {

core/src/test/scala/org/apache/spark/storage/LocalDirsSuite.scala

Lines changed: 30 additions & 3 deletions
@@ -37,27 +37,50 @@ class LocalDirsSuite extends SparkFunSuite with BeforeAndAfter {
     Utils.clearLocalRootDirs()
   }
 
+  private def assumeNonExistentAndNotCreatable(f: File): Unit = {
+    try {
+      assume(!f.exists() && !f.mkdirs())
+    } finally {
+      Utils.deleteRecursively(f)
+    }
+  }
+
   test("Utils.getLocalDir() returns a valid directory, even if some local dirs are missing") {
     // Regression test for SPARK-2974
-    assert(!new File("/NONEXISTENT_PATH").exists())
+    val f = new File("/NONEXISTENT_PATH")
+    assumeNonExistentAndNotCreatable(f)
+
     val conf = new SparkConf(false)
       .set("spark.local.dir", s"/NONEXISTENT_PATH,${System.getProperty("java.io.tmpdir")}")
     assert(new File(Utils.getLocalDir(conf)).exists())
+
+    // This directory should not be created.
+    assert(!f.exists())
   }
 
   test("SPARK_LOCAL_DIRS override also affects driver") {
-    // Regression test for SPARK-2975
-    assert(!new File("/NONEXISTENT_PATH").exists())
+    // Regression test for SPARK-2974
+    val f = new File("/NONEXISTENT_PATH")
+    assumeNonExistentAndNotCreatable(f)
+
     // spark.local.dir only contains invalid directories, but that's not a problem since
     // SPARK_LOCAL_DIRS will override it on both the driver and workers:
     val conf = new SparkConfWithEnv(Map("SPARK_LOCAL_DIRS" -> System.getProperty("java.io.tmpdir")))
       .set("spark.local.dir", "/NONEXISTENT_PATH")
     assert(new File(Utils.getLocalDir(conf)).exists())
+
+    // This directory should not be created.
+    assert(!f.exists())
   }
 
   test("Utils.getLocalDir() throws an exception if any temporary directory cannot be retrieved") {
     val path1 = "/NONEXISTENT_PATH_ONE"
     val path2 = "/NONEXISTENT_PATH_TWO"
+    val f1 = new File(path1)
+    val f2 = new File(path2)
+    assumeNonExistentAndNotCreatable(f1)
+    assumeNonExistentAndNotCreatable(f2)
+
     assert(!new File(path1).exists())
     assert(!new File(path2).exists())
     val conf = new SparkConf(false).set("spark.local.dir", s"$path1,$path2")
@@ -67,5 +90,9 @@ class LocalDirsSuite extends SparkFunSuite with BeforeAndAfter {
     // If any temporary directory could not be retrieved under the given paths above, it should
     // throw an exception with the message that includes the paths.
     assert(message.contains(s"$path1,$path2"))
+
+    // These directories should not be created.
+    assert(!f1.exists())
+    assert(!f2.exists())
   }
 }
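The new `assumeNonExistentAndNotCreatable` helper guards these tests against environments where a supposedly invalid directory can in fact be created, which appears to be the case on the Windows CI runs quoted in the commit message. The precondition can be sketched without ScalaTest or Spark's `Utils` as below. `NotCreatableCheck` and its methods are hypothetical names, and unlike the helper in the diff this sketch only removes a directory that it created itself.

```scala
import java.io.File

// Hypothetical, dependency-free sketch of the precondition used by the tests.
object NotCreatableCheck {
  // Simple recursive delete, standing in for Spark's Utils.deleteRecursively.
  private def deleteRecursively(f: File): Unit = {
    Option(f.listFiles()).foreach(_.foreach(deleteRecursively))
    f.delete()
  }

  // True only if `f` does not exist and cannot be created as a directory.
  // If the directory could be created, it is removed again before returning.
  def nonExistentAndNotCreatable(f: File): Boolean =
    if (f.exists()) {
      false
    } else {
      val created = f.mkdirs()
      if (created) deleteRecursively(f)
      !created
    }
}
```

In a ScalaTest suite the result would typically be passed to `assume(...)`, so that the test is cancelled rather than failed when the environment does not satisfy the precondition.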

sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala

Lines changed: 1 addition & 1 deletion
@@ -647,7 +647,7 @@ class CachedTableSuite extends QueryTest with SQLTestUtils with SharedSQLContext
     withTable("t") {
       withTempPath { path =>
         Seq(1 -> "a").toDF("i", "j").write.parquet(path.getCanonicalPath)
-        sql(s"CREATE TABLE t USING parquet LOCATION '$path'")
+        sql(s"CREATE TABLE t USING parquet LOCATION '${path.toURI}'")
         spark.catalog.cacheTable("t")
         spark.table("t").select($"i").cache()
         checkAnswer(spark.table("t").select($"i"), Row(1))

sql/core/src/test/scala/org/apache/spark/sql/execution/DataSourceScanExecRedactionSuite.scala

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ class DataSourceScanExecRedactionSuite extends QueryTest with SharedSQLContext {
 
       val rootPath = df.queryExecution.sparkPlan.find(_.isInstanceOf[FileSourceScanExec]).get
         .asInstanceOf[FileSourceScanExec].relation.location.rootPaths.head
-      assert(rootPath.toString.contains(basePath.toString))
+      assert(rootPath.toString.contains(dir.toURI.getPath.stripSuffix("/")))
 
       assert(!df.queryExecution.sparkPlan.treeString(verbose = true).contains(rootPath.getName))
       assert(!df.queryExecution.executedPlan.treeString(verbose = true).contains(rootPath.getName))
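The replacement assertion compares against `dir.toURI.getPath.stripSuffix("/")` instead of a platform-specific path string. A small sketch of that expression follows, assuming `dir` is a `java.io.File` pointing at an existing directory (its definition is outside the hunk shown). `UriPathOfDir` and `uriPathWithoutTrailingSlash` are hypothetical names.

```scala
import java.io.File
import java.nio.file.Files

// Hypothetical sketch of the expression used in the updated assertion.
object UriPathOfDir {
  // File.toURI appends a trailing "/" for an existing directory and always uses
  // forward slashes, so the result is comparable with a Hadoop-style path string.
  def uriPathWithoutTrailingSlash(dir: File): String =
    dir.toURI.getPath.stripSuffix("/")

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("redaction-sketch").toFile
    println(dir.toURI.getPath)                // ends with "/"
    println(uriPathWithoutTrailingSlash(dir)) // same path without the trailing "/"
    dir.delete()
  }
}
```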
