[SPARK-34039][SQL][3.1] ReplaceTable should invalidate cache #31100
What changes were proposed in this pull request?
This is a backport of #31081 to branch-3.1.
This changes `ReplaceTableExec`/`AtomicReplaceTableExec` to uncache the target table before it is dropped. In addition, this includes some refactoring by moving the `uncacheTable` method to `DataSourceV2Strategy` so that we don't need to pass a Spark session to the v2 exec.
Why are the changes needed?
Similar to SPARK-33492 (#30429). When a table is refreshed, the associated cache should be invalidated to avoid potential incorrect results.
Does this PR introduce any user-facing change?
Yes. Now, when a data source v2 table is cached (either directly or indirectly), all the relevant caches are refreshed or invalidated if the table is replaced.
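To illustrate the behavior change, here is a minimal toy model in plain Python. It is not Spark code: `ToyCatalog`, `cache_table`, `replace`, and the `invalidate_cache` flag are hypothetical names invented for this sketch. It only shows why replacing a table without dropping its cache entry can return stale rows, and how uncaching before the drop avoids that.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for a catalog plus a cache manager (not Spark's API)."""
    tables: dict = field(default_factory=dict)  # table name -> current rows
    cache: dict = field(default_factory=dict)   # table name -> cached snapshot

    def create(self, name, rows):
        self.tables[name] = list(rows)

    def cache_table(self, name):
        # Snapshot the current rows, loosely like caching a relation.
        self.cache[name] = list(self.tables[name])

    def read(self, name):
        # Reads are served from the cache whenever an entry exists.
        if name in self.cache:
            return self.cache[name]
        return self.tables[name]

    def replace(self, name, rows, invalidate_cache=True):
        if name in self.tables:
            if invalidate_cache:
                # Uncache the target table before it is dropped.
                self.cache.pop(name, None)
            del self.tables[name]
        self.tables[name] = list(rows)


def demo():
    # Old behavior (assumed for illustration): the cache entry survives the replace.
    stale = ToyCatalog()
    stale.create("t", [1, 2])
    stale.cache_table("t")
    stale.replace("t", [3], invalidate_cache=False)

    # New behavior: the cache entry is dropped as part of the replace.
    fixed = ToyCatalog()
    fixed.create("t", [1, 2])
    fixed.cache_table("t")
    fixed.replace("t", [3])

    return stale.read("t"), fixed.read("t")


if __name__ == "__main__":
    print(demo())
```

In this sketch, the first catalog keeps serving the rows cached before the replace, while the second serves the rows of the replaced table.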
How was this patch tested?
Added a new unit test.