Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,10 @@ trait Catalog {

def lookupRelation(tableIdent: TableIdentifier, alias: Option[String] = None): LogicalPlan

def setCurrentDatabase(databaseName: String): Unit = {
throw new UnsupportedOperationException
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: Why create a dummy implementation here? This is not consistent with the way Catalog is currently written. The other option creates more overhead though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not all catalog support database concept. So, inherited catalog can choose to implement it or not. If we don't provide a dummy implementation here, all catalogs need to implement it even it doesn't support database.

}

/**
* Returns tuples of (tableName, isTemporary) for all tables in the given database.
* isTemporary is a Boolean value indicates if a table is a temporary or not.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,9 @@ private[sql] class SparkQl(conf: ParserConf = SimpleParserConf()) extends Cataly
getClauses(Seq("TOK_QUERY", "FORMATTED", "EXTENDED"), explainArgs)
ExplainCommand(nodeToPlan(query), extended = extended.isDefined)

case Token("TOK_SWITCHDATABASE", Token(database, Nil) :: Nil) =>
SetDatabaseCommand(cleanIdentifier(database))

case Token("TOK_DESCTABLE", describeArgs) =>
// Reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
val Some(tableType) :: formatted :: extended :: pretty :: Nil =
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -408,3 +408,13 @@ case class DescribeFunction(
}
}
}

case class SetDatabaseCommand(databaseName: String) extends RunnableCommand {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are introducing the concept of a database change in Catalyst. Shouldn't we also have SetDatabaseCommand in Catalyst? Much like ShowFunctions and DescribeFunction?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean should we move parsing TOK_SWITCHDATABASE to CatalystQl, as currently I see SQLContext use SparkQl not CatalystQl, this should not be a problem now. I have no strong opinion about this. We can move it now or when Catalyst supports database. How do you think?


override def run(sqlContext: SQLContext): Seq[Row] = {
sqlContext.catalog.setCurrentDatabase(databaseName)
Seq.empty[Row]
}

override val output: Seq[Attribute] = Seq.empty
}
Original file line number Diff line number Diff line change
Expand Up @@ -183,7 +183,7 @@ class CliSuite extends SparkFunSuite with BeforeAndAfterAll with Logging {
"CREATE DATABASE hive_test_db;"
-> "OK",
"USE hive_test_db;"
-> "OK",
-> "",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a dumb idea: we could return Ok...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return OK will break hive compatibility test. I've tried in previous commits.

"CREATE TABLE hive_test(key INT, val STRING);"
-> "OK",
"SHOW TABLES;"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -705,6 +705,10 @@ private[hive] class HiveMetastoreCatalog(val client: ClientInterface, hive: Hive
}

override def unregisterAllTables(): Unit = {}

override def setCurrentDatabase(databaseName: String): Unit = {
client.setCurrentDatabase(databaseName)
}
}

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -156,8 +156,6 @@ private[hive] class HiveQl(conf: ParserConf) extends SparkQl(conf) with Logging
"TOK_SHOWLOCKS",
"TOK_SHOWPARTITIONS",

"TOK_SWITCHDATABASE",

"TOK_UNLOCKTABLE"
)

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,9 @@ private[hive] trait ClientInterface {
/** Returns the name of the active database. */
def currentDatabase: String
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we also have a currentDatabase function in the Catalog?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we should - the only thing is that the simple catalog in non-Hive doesn't support databases yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I think we don't need to address database support for all catalogs in this PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that when an interface allows us to set the current database, it should also allow us read it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add a default implementation.


/** Sets the name of current database. */
def setCurrentDatabase(databaseName: String): Unit

/** Returns the metadata for specified database, throwing an exception if it doesn't exist */
def getDatabase(name: String): HiveDatabase = {
getDatabaseOption(name).getOrElse(throw new NoSuchDatabaseException)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ import org.apache.hadoop.hive.shims.{HadoopShims, ShimLoader}
import org.apache.hadoop.security.UserGroupInformation

import org.apache.spark.{Logging, SparkConf, SparkException}
import org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.execution.QueryExecutionException
import org.apache.spark.util.{CircularBuffer, Utils}
Expand Down Expand Up @@ -229,6 +230,14 @@ private[hive] class ClientWrapper(
state.getCurrentDatabase
}

override def setCurrentDatabase(databaseName: String): Unit = withHiveState {
if (getDatabaseOption(databaseName).isDefined) {
state.setCurrentDatabase(databaseName)
} else {
throw new NoSuchDatabaseException
}
}

override def createDatabase(database: HiveDatabase): Unit = withHiveState {
client.createDatabase(
new Database(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ import org.scalatest.BeforeAndAfter

import org.apache.spark.{SparkException, SparkFiles}
import org.apache.spark.sql.{AnalysisException, DataFrame, Row}
import org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException
import org.apache.spark.sql.catalyst.expressions.Cast
import org.apache.spark.sql.catalyst.plans.logical.Project
import org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoin
Expand Down Expand Up @@ -1262,6 +1263,21 @@ class HiveQuerySuite extends HiveComparisonTest with BeforeAndAfter {

}

test("use database") {
val currentDatabase = sql("select current_database()").first().getString(0)

sql("CREATE DATABASE hive_test_db")
sql("USE hive_test_db")
assert("hive_test_db" == sql("select current_database()").first().getString(0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current_database() command is currently added in the HiveContext.

We could move the command to catalyst, and add the command itself in the SQLContext.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we already have database support in SQLContext?


intercept[NoSuchDatabaseException] {
sql("USE not_existing_db")
}

sql(s"USE $currentDatabase")
assert(currentDatabase == sql("select current_database()").first().getString(0))
}

test("lookup hive UDF in another thread") {
val e = intercept[AnalysisException] {
range(1).selectExpr("not_a_udf()")
Expand Down