Skip to content

Conversation

@drewrobb
Copy link
Contributor

@drewrobb drewrobb commented Oct 18, 2016

What changes were proposed in this pull request?

A call to the method SQLTransformer.transform previously would create a temporary table and never delete it. This change adds a call to dropTempView() that deletes this temporary table before returning the result so that the table will not remain in spark's table catalog. Because tableName is randomized and not exposed, there should be no expected use of this table outside of the transform method.

How was this patch tested?

A single new assertion was added to the existing test of the SQLTransformer.transform method that all temporary tables are removed. Without the corresponding code change, this new assertion fails. I am not aware of any circumstances in which removing this temporary view would be bad for performance or correctness in other ways, but some expertise here would be helpful.

assert(result.schema.toString == resultSchema.toString)
assert(resultSchema == expected.schema)
assert(result.collect().toSeq == expected.collect().toSeq)
assert(original.sparkSession.catalog.listTables().count() == 0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does assert(original.sparkSession.catalog.listTables().isEmpty) work here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I believe isEmpty is only currently defined for an RDD, not a Dataset, you could do collect().isEmpty but that would be equally verbose I think.

@drewrobb
Copy link
Contributor Author

@yanboliang would you be able to review this patch?

@yanboliang
Copy link
Contributor

cc @srowen @MLnick @jkbradley @rxin Could you add @drewrobb to whitelist? I can not trigger the Jenkins job. Thanks.

@MLnick
Copy link
Contributor

MLnick commented Oct 21, 2016

Jenkins add to whitelist

@SparkQA
Copy link

SparkQA commented Oct 21, 2016

Test build #67327 has finished for PR 15526 at commit d5c3b41.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yanboliang
Copy link
Contributor

Good catch! LGTM, merged into master and branch-2.0. Thanks! @drewrobb @leoromanovsky

asfgit pushed a commit that referenced this pull request Oct 22, 2016
## What changes were proposed in this pull request?

A call to the method `SQLTransformer.transform` previously would create a temporary table and never delete it. This change adds a call to `dropTempView()` that deletes this temporary table before returning the result so that the table will not remain in spark's table catalog. Because `tableName` is randomized and not exposed, there should be no expected use of this table outside of the `transform` method.

## How was this patch tested?

A single new assertion was added to the existing test of the `SQLTransformer.transform` method that all temporary tables are removed. Without the corresponding code change, this new assertion fails. I am not aware of any circumstances in which removing this temporary view would be bad for performance or correctness in other ways, but some expertise here would be helpful.

Author: Drew Robb <[email protected]>

Closes #15526 from drewrobb/SPARK-17986.

(cherry picked from commit ab3363e)
Signed-off-by: Yanbo Liang <[email protected]>
@asfgit asfgit closed this in ab3363e Oct 22, 2016
robert3005 pushed a commit to palantir/spark that referenced this pull request Nov 1, 2016
## What changes were proposed in this pull request?

A call to the method `SQLTransformer.transform` previously would create a temporary table and never delete it. This change adds a call to `dropTempView()` that deletes this temporary table before returning the result so that the table will not remain in spark's table catalog. Because `tableName` is randomized and not exposed, there should be no expected use of this table outside of the `transform` method.

## How was this patch tested?

A single new assertion was added to the existing test of the `SQLTransformer.transform` method that all temporary tables are removed. Without the corresponding code change, this new assertion fails. I am not aware of any circumstances in which removing this temporary view would be bad for performance or correctness in other ways, but some expertise here would be helpful.

Author: Drew Robb <[email protected]>

Closes apache#15526 from drewrobb/SPARK-17986.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
## What changes were proposed in this pull request?

A call to the method `SQLTransformer.transform` previously would create a temporary table and never delete it. This change adds a call to `dropTempView()` that deletes this temporary table before returning the result so that the table will not remain in spark's table catalog. Because `tableName` is randomized and not exposed, there should be no expected use of this table outside of the `transform` method.

## How was this patch tested?

A single new assertion was added to the existing test of the `SQLTransformer.transform` method that all temporary tables are removed. Without the corresponding code change, this new assertion fails. I am not aware of any circumstances in which removing this temporary view would be bad for performance or correctness in other ways, but some expertise here would be helpful.

Author: Drew Robb <[email protected]>

Closes apache#15526 from drewrobb/SPARK-17986.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants