Closed
@@ -0,0 +1,29 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.sql.jdbc

import java.util.Locale

private case class HiveDialect() extends JdbcDialect {
  override def canHandle(url: String): Boolean =
    url.toLowerCase(Locale.ROOT).startsWith("jdbc:hive2")
Member

Although I understand your proposal, I'm not sure HiveDialect is a valid name in the Apache Spark community, because the Apache Spark Thrift Server also uses jdbc:hive2. You want to introduce Hive-specific syntax via this HiveDialect, rather than target Spark Thrift Server, right?

Member Author

@xleoken xleoken Mar 21, 2024

Hi @dongjoon-hyun, thanks for your review.
Your considerations are correct, but this patch is applicable to both Hive Thrift Server and Spark Thrift Server.

> You want to introduce Hive-specific syntax via this HiveDialect, rather than target Spark Thrift Server, right?

Actually, it's not. I used sbin/start-thriftserver.sh in the production environment.

> I'm not sure HiveDialect is a valid name in Apache Spark community

OK, HiveDialect seems better for jdbc:hive2. In the future, if we encounter Hive-specific or Spark SQL-specific syntax issues, we can distinguish between Hive and Spark in the specific methods.

Member

If you are trying to use this for Spark Thrift Server, it should be named SparkDialect in the Spark community. However, that would look very weird, because Apache Spark would need a dialect to access itself. That's why we don't want to add a SparkDialect or a HiveDialect.

> Actually, it's not. I used sbin/start-thriftserver.sh in the production environment.

Member Author

@xleoken xleoken Mar 21, 2024

Hi @dongjoon-hyun, we want to query data from two independent data centers, so we use multiple Spark JDBC catalogs.

image
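
For readers without the screenshot, the multi-catalog setup described in this comment might look roughly like the sketch below. It only builds the configuration keys as a plain map, so it runs without Spark. The catalog names `dc1` and `dc2` and both host names are hypothetical; the catalog class, the `url` and `driver` options, and the Hive JDBC driver class are existing names, but whether the author's deployment uses exactly these settings is an assumption.

```scala
object CatalogConfSketch {
  // Class name of Spark's built-in JDBC table catalog.
  val JdbcCatalogClass: String =
    "org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCTableCatalog"

  // Builds the spark.sql.catalog.* entries for one JDBC catalog.
  def catalogConf(name: String, url: String): Map[String, String] = {
    val prefix = s"spark.sql.catalog.$name"
    Map(
      prefix -> JdbcCatalogClass,
      s"$prefix.url" -> url,
      s"$prefix.driver" -> "org.apache.hive.jdbc.HiveDriver"
    )
  }

  // Hypothetical catalog names and hosts, one per data center.
  val conf: Map[String, String] =
    catalogConf("dc1", "jdbc:hive2://dc1-host:10000/default") ++
      catalogConf("dc2", "jdbc:hive2://dc2-host:10000/default")
}
```

Each entry could be passed with `--conf key=value` or `SparkSession.builder.config(key, value)`. Because both URLs start with `jdbc:hive2`, both catalogs would resolve to whichever dialect claims that prefix.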


  override def quoteIdentifier(colName: String): String = {
    s"`$colName`"
  }
}
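
The behavior of the two overrides above can be tried without a Spark build using the standalone sketch below. `SketchDialect`, `HiveLikeDialect`, and `quoteEscaped` are hypothetical names introduced only for illustration; the trait is a stand-in and not Spark's `JdbcDialect`. The backtick-doubling variant is an assumed hardening that is not part of this patch.

```scala
import java.util.Locale

object DialectSketch {
  // Hypothetical stand-in for Spark's JdbcDialect; only the two methods
  // touched by this patch are modeled.
  trait SketchDialect {
    def canHandle(url: String): Boolean
    // Assumed default: wrap the identifier in double quotes.
    def quoteIdentifier(colName: String): String = "\"" + colName + "\""
  }

  // Mirrors the patch: case-insensitive URL prefix match, backtick quoting.
  case object HiveLikeDialect extends SketchDialect {
    override def canHandle(url: String): Boolean =
      url.toLowerCase(Locale.ROOT).startsWith("jdbc:hive2")

    override def quoteIdentifier(colName: String): String =
      s"`$colName`"
  }

  // Assumed hardening, not in the patch: double any embedded backtick
  // so the quoted identifier stays well-formed.
  def quoteEscaped(colName: String): String = {
    val escaped = colName.replace("`", "``")
    s"`$escaped`"
  }
}
```

Note that the prefix check accepts any `jdbc:hive2` URL, so it cannot tell Hive Thrift Server from Spark Thrift Server, which is the naming concern raised in the review thread.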
@@ -787,6 +787,7 @@ class JDBCSuite extends QueryTest with SharedSparkSession {

  test("Default jdbc dialect registration") {
    assert(JdbcDialects.get("jdbc:mysql://127.0.0.1/db") === MySQLDialect())
    assert(JdbcDialects.get("jdbc:hive2://127.0.0.1/db") === HiveDialect())
    assert(JdbcDialects.get("jdbc:postgresql://127.0.0.1/db") === PostgresDialect())
    assert(JdbcDialects.get("jdbc:db2://127.0.0.1/db") === DB2Dialect())
    assert(JdbcDialects.get("jdbc:sqlserver://127.0.0.1/db") === MsSqlServerDialect())