Skip to content

Conversation

@aokolnychyi
Copy link
Contributor

What changes were proposed in this pull request?

This PR makes StreamExecution use the Write abstraction introduced in SPARK-33779.

Note: we will need separate plans for streaming writes in order to support the required distribution and ordering in SS. This change only migrates to the Write abstraction.

Why are the changes needed?

These changes prevent exceptions from data sources that implement only the build method in WriteBuilder.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests.

@SparkQA
Copy link

SparkQA commented Jan 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38432/

@SparkQA
Copy link

SparkQA commented Jan 8, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38432/

Copy link
Contributor

@jaceklaskowski jaceklaskowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (non-binding)

@SparkQA
Copy link

SparkQA commented Jan 8, 2021

Test build #133843 has finished for PR 31093 at commit 0eb559b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

this
}

override def buildForBatch(): BatchWrite = writer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this change is optional (shouldn't break anything even this is not change), as we're trying to guarantee backward compatibility, right? I think it's better to change this - just wanted to check about compatibility.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a test module, it's okay to change.

Copy link
Contributor Author

@aokolnychyi aokolnychyi Jan 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we could leave it as is. I changed because it would previously fail if a data source implemented the build method only.

In the next PR, I am going to implement RequiredDistributionAndOrdering.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion. I didn't mean we should leave it as it is. I just meant to confirm whether the test won't fail even without the fix, to make sure we guarantee compatibility.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I confirm it is compatible, @HeartSaVioR.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @aokolnychyi and all.
Merged to master for Apache Spark 3.2.0.

@aokolnychyi
Copy link
Contributor Author

Thanks everyone!

flyrain pushed a commit to flyrain/spark that referenced this pull request Sep 21, 2021
…tion

### What changes were proposed in this pull request?

This PR makes `StreamExecution` use the `Write` abstraction introduced in SPARK-33779.

Note: we will need separate plans for streaming writes in order to support the required distribution and ordering in SS. This change only migrates to the `Write` abstraction.

### Why are the changes needed?

These changes prevent exceptions from data sources that implement only the `build` method in `WriteBuilder`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

Closes apache#31093 from aokolnychyi/spark-34049.

Authored-by: Anton Okolnychyi <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants