Conversation

@gatorsmile (Member) commented Mar 6, 2020

What changes were proposed in this pull request?

Based on the discussion in the mailing-list thread [[Proposal] Modification to Spark's Semantic Versioning Policy](http://apache-spark-developers-list.1001551.n3.nabble.com/Proposal-Modification-to-Spark-s-Semantic-Versioning-Policy-td28938.html), this PR adds back the following APIs, whose maintenance costs are relatively small.

  • SQLContext.applySchema
  • SQLContext.parquetFile
  • SQLContext.jsonFile
  • SQLContext.jsonRDD
  • SQLContext.load
  • SQLContext.jdbc

Why are the changes needed?

Avoid breaking the APIs that are commonly used.

Does this PR introduce any user-facing change?

Adding back the APIs that were removed in the 3.0 branch does not introduce user-facing changes, because Spark 3.0 has not been released yet.

How was this patch tested?

The existing tests.
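For illustration, each add-back amounts to a thin deprecated alias that forwards to the current `DataFrameReader` API. A minimal Scala sketch of the pattern (simplified signatures and assumed deprecation versions, not the actual Spark source):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Sketch of the alias pattern: each restored method forwards to its
// non-deprecated DataFrameReader equivalent, so the maintenance cost
// is a one-line delegation plus the @deprecated annotation.
class SQLContext(sparkSession: SparkSession) {

  @deprecated("Use read.parquet() instead.", "1.4.0")
  def parquetFile(paths: String*): DataFrame =
    sparkSession.read.parquet(paths: _*)

  @deprecated("Use read.json() instead.", "1.4.0")
  def jsonFile(path: String): DataFrame =
    sparkSession.read.json(path)
}
```

Because the aliases carry no logic of their own, changes to the underlying readers flow through automatically, which is why the PR argues the maintenance cost is small.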

```scala
/**
 * @deprecated As of 1.3.0, replaced by `createDataFrame()`.
 */
@deprecated("Use createDataFrame instead.", "1.3.0")
```
Member

Hi, @marmbrus and @srowen.
Do we really need to keep the 1.3.0 API?

Member

I agree; I also don't feel strongly about it as there isn't much overhead to keeping it, beyond API noise.
@gatorsmile may have some additional info or data about this.

Contributor

The rubric deliberately says nothing about how old the deprecation warning is. It says we should think about usage.

@gatorsmile do you have any reason to believe these are commonly used functions?


@dongjoon-hyun (Member) left a comment

Hi, All (including reviewers).
Although this is clearly marked as a Draft, I want to make sure that we hold off on this until we have a vote.

@srowen (Member) commented Mar 6, 2020

Yes; wouldn't it be simpler to first list just the things being proposed to change, and why (i.e., any data or arguments to support them)? It does seem like a whole lot is being added back, which is at odds with what I understand to be the conclusion of the last thread on this.

@SparkQA commented Mar 7, 2020

Test build #119482 has finished for PR 27839 at commit d3b5845.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-25821][SQL][FOLLOW-UP] Add Back the Deprecated SQLContext methods [SPARK-31086][SQL] Add Back the Deprecated SQLContext methods Mar 9, 2020
@dongjoon-hyun (Member)

Also, cc @marmbrus .
And, cc @rxin since he is the release manager for 3.0.0.

@gatorsmile gatorsmile marked this pull request as ready for review March 24, 2020 06:06
@gatorsmile (Member Author)

CC @dongjoon-hyun @marmbrus @rxin @srowen This PR is ready for review.

@SparkQA commented Mar 24, 2020

Test build #120243 has finished for PR 27839 at commit de7811d.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Mar 24, 2020

Test build #120233 has finished for PR 27839 at commit 2b1dcb6.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class DeprecatedAPISuite extends QueryTest with SharedSparkSession

@srowen (Member) left a comment

I don't feel strongly about this either way. I think these APIs are going to be here forever if kept here now, and I don't see evidence they are used, so, why were they even deprecated and replaced? Deprecation loses meaning. But, they impose almost no maintenance burden as aliases, so "who cares?" can cut either way.

@SparkQA commented Mar 24, 2020

Test build #4999 has finished for PR 27839 at commit de7811d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor) left a comment

LGTM as these are easy to maintain.

gatorsmile added a commit that referenced this pull request Mar 27, 2020
Closes #27839 from gatorsmile/addAPIBackV3.

Lead-authored-by: gatorsmile <[email protected]>
Co-authored-by: yi.wu <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
(cherry picked from commit b7e4cc7)
Signed-off-by: gatorsmile <[email protected]>
@gatorsmile (Member Author)

Thanks! Merged to master/3.0

sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020

Closes apache#27839 from gatorsmile/addAPIBackV3.

Lead-authored-by: gatorsmile <[email protected]>
Co-authored-by: yi.wu <[email protected]>
Signed-off-by: gatorsmile <[email protected]>