Skip to content

Conversation

@MaxGekk
Copy link
Member

@MaxGekk MaxGekk commented Jul 31, 2020

What changes were proposed in this pull request?

Replace Jdbc by JDBC in classes and files. I leaved as is JdbcRDD and JdbcDialects because they are a part of semi-public APIs.

Why are the changes needed?

To eliminate inconsistency in naming JDBC files and classes. There are 7 files with the Jdbc prefix:

$  apache-spark git:(master) find . -name "Jdbc*.scala" -type f
./core/src/test/scala/org/apache/spark/rdd/JdbcRDDSuite.scala
./core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala
./sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtilsSuite.scala
./sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcRelationProvider.scala
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala
./sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/JdbcConnectionUriSuite.scala

and 8 files start from the JDBC:

$  apache-spark git:(master) find . -name "JDBC*.scala" -type f
./sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala
./sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCWriteSuite.scala
./sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCTableCatalogSuite.scala
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCOptions.scala
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRelation.scala
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCTableCatalog.scala
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCTable.scala

Does this PR introduce any user-facing change?

Should not.

How was this patch tested?

By existing test suites like JDBCSuite.

@SparkQA
Copy link

SparkQA commented Jul 31, 2020

Test build #126898 has finished for PR 29323 at commit fa2489b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class JDBCConnectionUriSuite extends HiveThriftServer2Test

@MaxGekk
Copy link
Member Author

MaxGekk commented Aug 1, 2020

@maropu Please, review this PR.

@SparkQA
Copy link

SparkQA commented Aug 1, 2020

Test build #126914 has finished for PR 29323 at commit 7251ba2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@maropu maropu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion though, it looks good in terms of naming consistency. cc: @HyukjinKwon

HyukjinKwon
HyukjinKwon previously approved these changes Aug 2, 2020
@HyukjinKwon
Copy link
Member

Same as @maropu. There's inconsistency in JSON vs Json, CSV vs Csv. Orc might have to be ORC too. The change itself looks fine as one time thing but not sure if we should do it for all. Also, JdbcDialects seems an API that we can't change as well so this PR happened to be unable to address the issues all.

@HyukjinKwon HyukjinKwon dismissed their stale review August 2, 2020 04:28

I am going to dismiss my approval since I don't have a strong opinion here.

* Util functions for JDBC tables.
*/
object JdbcUtils extends Logging {
object JDBCUtils extends Logging {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaxGekk . As you see the change on PostgreSQL dialect, this will break the downstream custom dialect. Shall we avoid to change this?

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer not to change overall because we cannot be completed in any way after this PR.

I leaved as is JdbcRDD and JdbcDialects because they are a part of semi-public APIs.

Also, this PR can break the downstream 3rd party JDBC driver easily although it's under execution package. As we see, one example is JdbcUtils.scala which already breaks PostgreDialect and this PR needed to adapt from the source code.

After removing JdbcUtils.scala from this PR's scope, we may re-consider this PR again.

BTW, cc @gatorsmile and @cloud-fan .

@maropu
Copy link
Member

maropu commented Aug 3, 2020

Ah, I missed that consideration..., thanks, @dongjoon-hyun. As you suggested above, we cannot rename the class names in the main codebase.

@HyukjinKwon
Copy link
Member

@dongjoon-hyun, as you said JdbcUtils isn't an API and under the private package. I don't think we should block a PR for this reason. But sure I don't have a strong feeling on this. Let's don't change it for now.

@MaxGekk
Copy link
Member Author

MaxGekk commented Aug 3, 2020

After off-line discussion with @HyukjinKwon, I am closing this PR.

@MaxGekk MaxGekk closed this Aug 3, 2020
@MaxGekk MaxGekk deleted the rename-Jdbc-to-JDBC branch December 11, 2020 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants