Commit b959dab

drewrobb authored and yanboliang committed
[SPARK-17986][ML] SQLTransformer should remove temporary tables
## What changes were proposed in this pull request?

A call to the method `SQLTransformer.transform` previously would create a temporary table and never delete it. This change adds a call to `dropTempView()` that deletes this temporary table before returning the result so that the table will not remain in Spark's table catalog. Because `tableName` is randomized and not exposed, there should be no expected use of this table outside of the `transform` method.

## How was this patch tested?

A single new assertion was added to the existing test of the `SQLTransformer.transform` method that all temporary tables are removed. Without the corresponding code change, this new assertion fails. I am not aware of any circumstances in which removing this temporary view would be bad for performance or correctness in other ways, but some expertise here would be helpful.

Author: Drew Robb <[email protected]>

Closes #15526 from drewrobb/SPARK-17986.

(cherry picked from commit ab3363e)
Signed-off-by: Yanbo Liang <[email protected]>
1 parent a0c03c9 commit b959dab

2 files changed: 4 additions, 1 deletion
mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala

Lines changed: 3 additions & 1 deletion
@@ -67,7 +67,9 @@ class SQLTransformer @Since("1.6.0") (@Since("1.6.0") override val uid: String)
     val tableName = Identifiable.randomUID(uid)
     dataset.createOrReplaceTempView(tableName)
     val realStatement = $(statement).replace(tableIdentifier, tableName)
-    dataset.sparkSession.sql(realStatement)
+    val result = dataset.sparkSession.sql(realStatement)
+    dataset.sparkSession.catalog.dropTempView(tableName)
+    result
   }
 
   @Since("1.6.0")
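One case the hunk above does not cover is a statement that fails inside `sql(...)`: the exception would skip the `dropTempView` call and leave the view registered. The following is a minimal sketch of a `try`/`finally` variant, not part of this commit. The object `TempViewScope`, the helper `sqlOverTempView`, and the UUID-based view name are hypothetical names introduced only for illustration; the placeholder default `__THIS__` is assumed to match the transformer's `tableIdentifier`.

```scala
import java.util.UUID

import org.apache.spark.sql.{DataFrame, Dataset}

// Hypothetical helper, not part of the commit: same register / query / drop
// sequence as the patched transform, but the drop also runs when sql() throws.
object TempViewScope {
  def sqlOverTempView(
      dataset: Dataset[_],
      statement: String,
      placeholder: String = "__THIS__"): DataFrame = {
    // Assumption: any unique, valid identifier works as the view name.
    val tableName = "tmp_" + UUID.randomUUID().toString.replace("-", "")
    dataset.createOrReplaceTempView(tableName)
    try {
      // The statement is analyzed here, so an invalid query fails at this call.
      dataset.sparkSession.sql(statement.replace(placeholder, tableName))
    } finally {
      // Runs on both the success and the failure path.
      dataset.sparkSession.catalog.dropTempView(tableName)
    }
  }
}
```

Whether the extra guard is worth it depends on how often callers pass statements that can fail; the committed version keeps the change as small as possible.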

mllib/src/test/scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@ class SQLTransformerSuite
     assert(result.schema.toString == resultSchema.toString)
     assert(resultSchema == expected.schema)
     assert(result.collect().toSeq == expected.collect().toSeq)
+    assert(original.sparkSession.catalog.listTables().count() == 0)
   }
 
   test("read/write") {
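The new assertion can also be reproduced outside the test suite. The sketch below assumes a local Spark build that includes this patch and a fresh session with no other temporary views; the object name `TempViewCheck`, the sample data, and the SQL statement are illustrative and not taken from the suite. It counts only temporary entries, which is slightly narrower than the suite's `listTables().count() == 0`.

```scala
import org.apache.spark.ml.feature.SQLTransformer
import org.apache.spark.sql.SparkSession

// Illustrative check, not part of the commit.
object TempViewCheck {
  // Runs one SQLTransformer.transform call and returns how many temporary
  // views the session catalog lists afterwards.
  def tempViewsAfterTransform(spark: SparkSession): Int = {
    import spark.implicits._
    val df = Seq((0, 1.0, 3.0), (2, 2.0, 5.0)).toDF("id", "v1", "v2")
    val sqlTrans = new SQLTransformer()
      .setStatement("SELECT *, (v1 + v2) AS v3 FROM __THIS__")
    val result = sqlTrans.transform(df)
    // Evaluate the result after transform has returned, as the suite does.
    result.collect()
    spark.catalog.listTables().collect().count(_.isTemporary)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("temp-view-check")
      .getOrCreate()
    try println(s"temporary views left: ${tempViewsAfterTransform(spark)}")
    finally spark.stop()
  }
}
```

On a build without the patch, the same check would be expected to report the leftover randomly named view instead.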
