[SPARK-15367] [SQL] Add refreshTable back #13156
   */
  def refreshTable(tableName: String): Unit = {
    sparkSession.catalog.refreshTable(tableName)
  }
This method did not exist in SQLContext. It's in HiveContext.
Sure, will remove it. Thanks!
Test build #58725 has finished for PR 13156 at commit
      .saveAsTable("arrayInParquet")

-     sessionState.refreshTable("arrayInParquet")
+     sparkSession.catalog.refreshTable("arrayInParquet")
As we don't call refreshTable through SessionState, do we still need to keep SessionState.refreshTable?
Actually, I also want to remove invalidateTable, which is a duplicate name of refreshTable
Actually, invalidateTable and refreshTable do have different meanings. The current implementation of HiveMetastoreCatalog.refreshTable is HiveMetastoreCatalog.invalidateTable (and then we retrieve the new metadata lazily). But that does not mean that refreshTable and invalidateTable have the same semantics. Whether we should remove either invalidateTable or refreshTable should be discussed in a separate thread.
Got it. Thanks!
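To illustrate the distinction drawn in this thread, here is a minimal sketch in plain Scala. It is not Spark's code: `ToyCatalog`, `loadMetadata`, and `loads` are hypothetical names, and the sketch assumes only what the comment above states, namely that the current `refreshTable` drops the cached metadata and the next lookup reloads it lazily.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a metastore-backed catalog; NOT Spark's HiveMetastoreCatalog.
final class ToyCatalog(loadMetadata: String => String) {
  private val cached = mutable.Map.empty[String, String]

  // Counts how often the (expensive) metadata load actually runs.
  var loads = 0

  // Returns cached metadata, loading it only on a cache miss.
  def lookup(table: String): String =
    cached.getOrElseUpdate(table, { loads += 1; loadMetadata(table) })

  // Drops the cached entry; nothing is reloaded here.
  def invalidateTable(table: String): Unit = {
    cached.remove(table)
    ()
  }

  // Assumption taken from the review comment: refresh is currently implemented
  // as invalidate, with the reload deferred to the next lookup.
  def refreshTable(table: String): Unit = invalidateTable(table)
}

object ToyCatalogDemo {
  def main(args: Array[String]): Unit = {
    var version = 1
    val catalog = new ToyCatalog(t => s"${t}@v${version}")

    println(catalog.lookup("t")) // first lookup loads the metadata
    version = 2                  // the underlying table changes
    println(catalog.lookup("t")) // stale: still served from the cache
    catalog.refreshTable("t")    // drops the entry, does not reload yet
    println(catalog.lookup("t")) // reloads lazily and sees the change
  }
}
```

The two methods share an implementation here, yet their contracts differ: a caller of `refreshTable` expects up-to-date metadata afterwards, while `invalidateTable` only promises that the stale entry is gone. That is why one could later be reimplemented (for example, to reload eagerly) without touching the other.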
Test build #58740 has finished for PR 13156 at commit
      sparkSession.cacheManager.tryUncacheQuery(df, blocking = true)
      // Cache it again.
      sparkSession.cacheManager.cacheQuery(df, Some(tableIdent.table))
    }
The logic above has been moved to sparkSession.catalog.refreshTable.
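The uncache-then-recache step in the snippet above can be sketched with toy types. This is only an illustration under stated assumptions: `ToyCacheManager` and `ToySessionCatalog` are hypothetical, their method names merely echo the snippet, and the "only re-cache if already cached" guard is an assumption rather than something shown in the diff.

```scala
import scala.collection.mutable

// Hypothetical toy cache; method names echo the diff, but this is NOT Spark's CacheManager.
final class ToyCacheManager {
  private val cachedData = mutable.Map.empty[String, Seq[Int]]

  def isCached(name: String): Boolean = cachedData.contains(name)
  def cacheQuery(name: String, data: Seq[Int]): Unit = cachedData.update(name, data)
  def tryUncacheQuery(name: String): Boolean = cachedData.remove(name).isDefined
  def cached(name: String): Option[Seq[Int]] = cachedData.get(name)
}

// Hypothetical catalog that owns the refresh logic, so callers need a single entry point.
final class ToySessionCatalog(cache: ToyCacheManager, readTable: String => Seq[Int]) {
  def refreshTable(name: String): Unit = {
    // Assumption: only tables that were cached before get re-cached.
    if (cache.isCached(name)) {
      cache.tryUncacheQuery(name)
      // Cache it again, this time from the current underlying data.
      cache.cacheQuery(name, readTable(name))
    }
  }
}

object RefreshDemo {
  def main(args: Array[String]): Unit = {
    var files = Seq(1, 2)
    val cache = new ToyCacheManager
    val catalog = new ToySessionCatalog(cache, _ => files)

    cache.cacheQuery("t", files)
    files = Seq(1, 2, 3)       // data changes behind the cache
    println(cache.cached("t")) // stale cached contents
    catalog.refreshTable("t")
    println(cache.cached("t")) // contents after the refresh
  }
}
```

Keeping this sequence inside one `refreshTable` method means test code and compatibility wrappers can delegate to it instead of repeating the uncache and cache calls.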
Test build #58805 has finished for PR 13156 at commit
Test build #58808 has finished for PR 13156 at commit
    val _hc = hc
    import _hc.implicits._

    withTempPath { tempDir =>
Do we still need this test?
Sure, let me remove it. : )
LGTM except the test stuff, thanks for working on it! I agree that we should remove
Test build #58831 has finished for PR 13156 at commit
   *
   * @since 1.3.0
   */
  def refreshTable(tableName: String): Unit = {
If invalidateTable has a different meaning than refreshTable, should we also add it to HiveContext? cc @yhuai
This class exists for compatibility purposes. Let's leave it as is.
+1
LGTM other than the renaming. We shouldn't have
I see.
Test build #58905 has finished for PR 13156 at commit
retest this please
Test build #58934 has finished for PR 13156 at commit
retest this please
LGTM, pending jenkins
retest this please
Test build #58940 has finished for PR 13156 at commit
#### What changes were proposed in this pull request?
`refreshTable` was a method in `HiveContext`. It was deleted accidentally while we were migrating the APIs. This PR is to add it back to `HiveContext`. In addition, in `SparkSession`, we put it under the catalog namespace (`SparkSession.catalog.refreshTable`).

#### How was this patch tested?
Changed the existing test cases to use the function `refreshTable`. Also added a test case for refreshTable in `hivecontext-compatibility`

Author: gatorsmile <[email protected]>

Closes #13156 from gatorsmile/refreshTable.

(cherry picked from commit 39fd469)
Signed-off-by: Wenchen Fan <[email protected]>
thanks, merging to master and 2.0!
Thank you!
What changes were proposed in this pull request?
`refreshTable` was a method in `HiveContext`. It was deleted accidentally while we were migrating the APIs. This PR is to add it back to `HiveContext`.
In addition, in `SparkSession`, we put it under the catalog namespace (`SparkSession.catalog.refreshTable`).

How was this patch tested?
Changed the existing test cases to use the function `refreshTable`. Also added a test case for `refreshTable` in `hivecontext-compatibility`.