[SPARK-17986] [ML] SQLTransformer should remove temporary tables #15526
assert(result.schema.toString == resultSchema.toString)
assert(resultSchema == expected.schema)
assert(result.collect().toSeq == expected.collect().toSeq)
assert(original.sparkSession.catalog.listTables().count() == 0)
Does assert(original.sparkSession.catalog.listTables().isEmpty) work here?
No, I believe `isEmpty` is currently defined only for an RDD, not for a Dataset. You could use `collect().isEmpty`, but I think that would be equally verbose.
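For reference, the emptiness checks discussed in this thread could be written as follows. This is only a sketch, assuming a fresh local `SparkSession` with no tables or temporary views registered; the object name `CatalogEmptyCheck` is hypothetical and is not part of the patch.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical standalone check, not code from this pull request.
object CatalogEmptyCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("catalog-empty-check")
      .getOrCreate()

    // listTables() returns a Dataset of catalog entries, including temporary views.
    val tables = spark.catalog.listTables()

    // Form used in the patch: count the rows of the Dataset.
    assert(tables.count() == 0)

    // Alternative from the reply: collect to a local Array, which does have isEmpty.
    assert(tables.collect().isEmpty)

    // Another option: fetch at most one row and check the resulting Array.
    assert(tables.head(1).isEmpty)

    spark.stop()
  }
}
```

All three forms express the same condition, so the choice is mostly about readability.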
@yanboliang would you be able to review this patch?

cc @srowen @MLnick @jkbradley @rxin Could you add @drewrobb to the whitelist? I cannot trigger the Jenkins job. Thanks.

Jenkins add to whitelist
Test build #67327 has finished for PR 15526 at commit
Good catch! LGTM, merged into master and branch-2.0. Thanks! @drewrobb @leoromanovsky
## What changes were proposed in this pull request?

A call to the method `SQLTransformer.transform` previously would create a temporary table and never delete it. This change adds a call to `dropTempView()` that deletes this temporary table before returning the result so that the table will not remain in spark's table catalog. Because `tableName` is randomized and not exposed, there should be no expected use of this table outside of the `transform` method.

## How was this patch tested?

A single new assertion was added to the existing test of the `SQLTransformer.transform` method that all temporary tables are removed. Without the corresponding code change, this new assertion fails. I am not aware of any circumstances in which removing this temporary view would be bad for performance or correctness in other ways, but some expertise here would be helpful.

Author: Drew Robb <[email protected]>

Closes #15526 from drewrobb/SPARK-17986.

(cherry picked from commit ab3363e)
Signed-off-by: Yanbo Liang <[email protected]>
## What changes were proposed in this pull request?

A call to the method `SQLTransformer.transform` previously created a temporary table and never deleted it. This change adds a call to `dropTempView()` that deletes the temporary table before the result is returned, so that the table does not remain in Spark's table catalog. Because `tableName` is randomized and not exposed, there should be no expected use of this table outside of the `transform` method.

## How was this patch tested?

A single new assertion was added to the existing test of the `SQLTransformer.transform` method, checking that all temporary tables are removed. Without the corresponding code change, this new assertion fails. I am not aware of any circumstances in which removing this temporary view would be bad for performance or correctness in other ways, but some expertise here would be helpful.
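
The pattern described above can be sketched roughly as follows. This is an illustration only, not the actual `SQLTransformer` source: the object `TempViewCleanup`, the helper `runStatement`, the `tmp_` name prefix, and the sample data are all hypothetical, and the `__THIS__` placeholder is assumed to behave as in the `SQLTransformer` documentation.

```scala
import java.util.UUID

import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical sketch of the register / query / drop pattern; not the patch itself.
object TempViewCleanup {

  // Registers df under a random view name, runs the statement, then drops the view.
  def runStatement(df: DataFrame, statement: String): DataFrame = {
    val spark = df.sparkSession
    // Random name, so callers have no reason to rely on the view afterwards.
    val viewName = "tmp_" + UUID.randomUUID().toString.replace("-", "")
    df.createOrReplaceTempView(viewName)
    // sql() resolves the view while building the result's plan,
    // so the result does not need the view to stay registered.
    val result = spark.sql(statement.replace("__THIS__", viewName))
    // The cleanup step this pull request adds: remove the view from the catalog.
    spark.catalog.dropTempView(viewName)
    result
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("temp-view-cleanup")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((0, 1.0, 3.0), (2, 2.0, 5.0)).toDF("id", "v1", "v2")
    val out = runStatement(df, "SELECT *, (v1 + v2) AS v3 FROM __THIS__")

    // The result is still usable after the view is dropped.
    assert(out.columns.toSeq == Seq("id", "v1", "v2", "v3"))
    assert(out.count() == 2)
    // Same style of check as the new test assertion: nothing is left in the catalog.
    assert(spark.catalog.listTables().count() == 0)

    spark.stop()
  }
}
```

In this sketch the view is dropped only on the success path, as the description states. Wrapping the query in `try`/`finally` would be a possible variation if the view should also be removed when the statement fails.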