Conversation

@sunchao (Member) commented Nov 24, 2020

What changes were proposed in this pull request?

This removes the Spark session and DataSourceV2Relation from V2 write plans, replacing them with an afterWrite callback.

Why are the changes needed?

Per the discussion in #30429, it's better not to pass the Spark session and DataSourceV2Relation through Spark plans. Instead, we can use a callback, which makes the interface cleaner.

Does this PR introduce any user-facing change?

No

How was this patch tested?

N/A
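
For context, here is a minimal sketch of the shape of the change, assembled from the diff snippets reviewed below (signatures follow the PR's own snippets; surrounding code and imports are elided):

    // Before: the exec node carried the v2 relation (and, in other nodes, the
    // Spark session) solely so it could refresh caches after the write.
    case class AppendDataExecV1(
        table: SupportsWrite,
        writeOptions: CaseInsensitiveStringMap,
        plan: LogicalPlan,
        v2Relation: DataSourceV2Relation) extends V1FallbackWriters

    // After: the planner injects a callback instead, so the node depends on
    // neither the session nor the relation.
    case class AppendDataExecV1(
        table: SupportsWrite,
        writeOptions: CaseInsensitiveStringMap,
        plan: LogicalPlan,
        afterWrite: () => Unit) extends V1FallbackWriters

The planner, which already has the session and relation in scope, builds the callback and hands it to the node; the node simply invokes it once the write commits.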

@sunchao force-pushed the SPARK-33492-followup branch from de468e5 to 7a9b72d on November 24, 2020 20:26
@github-actions bot added the SQL label Nov 24, 2020
@SparkQA commented Nov 24, 2020

Test build #131696 has started for PR 30491 at commit 7a9b72d.

@rdblue (Contributor) left a comment

Looks good to me. I'll merge this in a few days, unless there are objections from others.

  writeOptions: CaseInsensitiveStringMap,
- query: SparkPlan) extends V2TableWriteExec with BatchWriteHelper {
+ query: SparkPlan,
+ afterWrite: () => Unit = () => ()) extends V2TableWriteExec with BatchWriteHelper {
Contributor: I prefer not to have the default parameter value when unnecessary.

Member: +1 for @cloud-fan's comment.
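
Following this suggestion, the signature would presumably drop the default and require every call site to pass the callback explicitly; a sketch (class name assumed here for illustration):

    // Sketch only: the default value is removed, making afterWrite mandatory.
    case class AppendDataExec(
        table: SupportsWrite,
        writeOptions: CaseInsensitiveStringMap,
        query: SparkPlan,
        afterWrite: () => Unit) extends V2TableWriteExec with BatchWriteHelper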

  r.table.asWritable match {
    case v1 if v1.supports(TableCapability.V1_BATCH_WRITE) =>
-     AppendDataExecV1(v1, writeOptions.asOptions, query, r) :: Nil
+     AppendDataExecV1(v1, writeOptions.asOptions, query, afterWrite = refreshCache) :: Nil
Member: Shall we omit afterWrite =?
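
Dropping the named argument, the call site would presumably read:

    // Sketch: the callback passed positionally instead of as afterWrite = ...
    AppendDataExecV1(v1, writeOptions.asOptions, query, refreshCache) :: Nil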

@sunchao changed the title [SPARK-33492][SQL][FOLLOWUP] Use callback instead of passing Spark session and v2 relation to [SPARK-33567][SQL] DSv2: Use callback instead of passing Spark session and v2 relation for refreshing cache Nov 25, 2020
@SparkQA commented Nov 26, 2020

Test build #131817 has finished for PR 30491 at commit c754b85.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

  writeOptions: CaseInsensitiveStringMap,
  plan: LogicalPlan,
- v2Relation: DataSourceV2Relation) extends V1FallbackWriters {
+ afterWrite: () => Unit) extends V1FallbackWriters {
Contributor: Shall we use a consistent name, "refreshCache"?

  writeOptions: CaseInsensitiveStringMap,
  plan: LogicalPlan,
- v2Relation: DataSourceV2Relation) extends V1FallbackWriters {
+ afterWrite: () => Unit) extends V1FallbackWriters {
Contributor: Ditto.

-     session, r.table.asWritable, r, writeOptions.asOptions, planLater(query)) :: Nil
+     r.table.asWritable, writeOptions.asOptions, planLater(query), refreshCache(r)) :: Nil

      case DeleteFromTable(relation, condition) =>
Member: For DeleteFromTable, do we need to invalidate the cache too? Doesn't this command also update table data?

Member (Author): Yes, good catch! I think we should. I'll work on this in a separate PR.
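
For reference, a rough sketch of how the callback could be built in the planner and, hypothetically, wired into DeleteFromTable as well (the helper name refreshCache matches the diff above; the cache-manager call and the DeleteFromTable wiring are assumptions, since that follow-up was deferred to a separate PR):

    // Built in the strategy, where the session and relation are in scope;
    // partially applying refreshCache(r) yields the () => Unit the exec
    // nodes expect.
    private def refreshCache(r: DataSourceV2Relation)(): Unit = {
      session.sharedState.cacheManager.recacheByPlan(session, r)
    }

    // Hypothetical follow-up: DeleteFromTable also mutates table data, so
    // its exec node could receive the same callback (translation of
    // `condition` into source `filters` elided).
    case DeleteFromTable(r: DataSourceV2Relation, condition) =>
      DeleteFromTableExec(r.table.asDeletable, filters, refreshCache(r)) :: Nil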

@SparkQA commented Nov 28, 2020

Test build #131911 has finished for PR 30491 at commit 98e9ede.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member): retest this please

catalog.invalidateTable(ident)

// invalidate all caches referencing the given table
// TODO(SPARK-33437): re-cache the table itself once we support caching a DSv2 table
Contributor: @sunchao let's also fix this TODO in a separate PR.

Member (Author): Sure. Will do that soon.
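
If it helps, a speculative sketch of what resolving the TODO might look like once a DSv2 table itself can be cached (the CacheManager method names exist in Spark; the wiring here is an assumption):

    // Today: drop all cache entries that reference the table's plan.
    session.sharedState.cacheManager.uncacheQuery(session, relation, cascade = true)

    // With SPARK-33437: rebuild those entries instead, so a cached DSv2
    // table stays cached with fresh data after the operation.
    session.sharedState.cacheManager.recacheByPlan(session, relation)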

@cloud-fan (Contributor): GA passed, merging to master!

@cloud-fan closed this in feda729 Nov 30, 2020
@SparkQA commented Nov 30, 2020

Test build #131951 has finished for PR 30491 at commit 98e9ede.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sunchao deleted the SPARK-33492-followup branch November 30, 2020 17:59