You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
[SPARK-26794][SQL] SparkSession enableHiveSupport does not point to hive but in-memory while the SparkContext exists
## What changes were proposed in this pull request?
```java
public class SqlDemo {
public static void main(final String[] args) throws Exception {
SparkConf conf = new SparkConf().setAppName("spark-sql-demo");
JavaSparkContext sc = new JavaSparkContext(conf);
SparkSession ss = SparkSession.builder().enableHiveSupport().getOrCreate();
ss.sql("show databases").show();
}
}
```
Before https://issues.apache.org/jira/browse/SPARK-20946, the demo above point to the right hive metastore if the hive-site.xml is present. But now it can only point to the default in-memory one.
Catalog is now as a variable shared across SparkSessions, it is instantiated with SparkContext's conf. After https://issues.apache.org/jira/browse/SPARK-20946, Session level configs are not pass to SparkContext's conf anymore, so the enableHiveSupport API takes no affect on the catalog instance.
You can set spark.sql.catalogImplementation=hive application wide to solve the problem, or never create a sc before you call SparkSession.builder().enableHiveSupport().getOrCreate()
Here we respect the SparkSession level configuration at the first time to generate catalog within SharedState
## How was this patch tested?
1. add ut
2. manually
```scala
test("enableHiveSupport has right to determine the catalog while using an existing sc") {
val conf = new SparkConf().setMaster("local").setAppName("SharedState Test")
val sc = SparkContext.getOrCreate(conf)
val ss = SparkSession.builder().enableHiveSupport().getOrCreate()
assert(ss.sharedState.externalCatalog.unwrapped.isInstanceOf[HiveExternalCatalog],
"The catalog should be hive ")
val ss2 = SparkSession.builder().getOrCreate()
assert(ss2.sharedState.externalCatalog.unwrapped.isInstanceOf[HiveExternalCatalog],
"The catalog should be shared across sessions")
}
```
Without this fix, the above test will fail.
You can apply it to `org.apache.spark.sql.hive.HiveSharedStateSuite`,
and run,
```sbt
./build/sbt -Phadoop-2.7 -Phive "hive/testOnly org.apache.spark.sql.hive.HiveSharedStateSuite"
```
to verify.
Closes#23709 from yaooqinn/SPARK-26794.
Authored-by: Kent Yao <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
0 commit comments