This repository was archived by the owner on Feb 27, 2025. It is now read-only.

Conversation

@pp-akursar (Contributor)

Changes are similar to those of the previous upgrade to 3.3 in #197:

  • Bumped Spark to 3.4.0
  • Bumped the version of the library to 1.4.0
  • Updated the README accordingly
    • Added the 1.4.0 artifact to the version table

Fixes #227, specifically `java.lang.NoSuchMethodError: 'org.apache.spark.sql.types.StructType org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(java.sql.ResultSet, org.apache.spark.sql.jdbc.JdbcDialect, boolean)'`
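For context on the error above: the Spark 3.3 build of the connector was compiled against the three-argument `JdbcUtils.getSchema`, and Spark 3.4 appears to have added a fourth Boolean parameter (`isTimestampNTZ`, by my reading of the 3.4 source), which changes the method's binary signature; bytecode compiled against 3.3 therefore fails at runtime on 3.4 until the connector is rebuilt. A minimal sketch of the call as recompiled against 3.4, with the new parameter name treated as an assumption:

```scala
import java.sql.ResultSet

import org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils
import org.apache.spark.sql.jdbc.JdbcDialect
import org.apache.spark.sql.types.StructType

// Spark 3.3 exposed getSchema(ResultSet, JdbcDialect, Boolean); Spark 3.4 adds a
// fourth Boolean (isTimestampNTZ here is an assumption from reading the 3.4 API),
// so a connector compiled against 3.3 hits NoSuchMethodError on a 3.4 runtime.
def schemaOf(rs: ResultSet, dialect: JdbcDialect): StructType =
  JdbcUtils.getSchema(rs, dialect, alwaysNullable = true, isTimestampNTZ = false)
```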

@pp-akursar (Contributor, Author)

@microsoft-github-policy-service agree company="PulsePoint"

@shivsood (Collaborator)

@pp-akursar The change looks good. Can you attach test results on 3.4.0?

@shivsood (Collaborator) left a comment

Please add test results for a 3.4 run. After that, this can be merged.

@SteffenMangold commented Sep 22, 2023

Please review and build! :)
It's highly needed.

@shivsood (Collaborator) commented Sep 22, 2023

Test results on Spark 3.4.0 - Pass.

Run:

```scala
/*
 * SparkConnTestMain
 * Main class for the test jar.
 * @arg0  : database user name
 * @arg1  : database user password
 * @arg2  : AD user principal
 * @arg3  : AD user keytab name
 * @arg4  : connector type to use ("JDBC" or "com.microsoft.sqlserver.jdbc.spark")
 * @arg5  : database name
 * @arg6  : data source for data pool
 * @arg7  : test suite to run - GCI(0), CI(1), Perf(2), All(9)
 * @arg8  : run datapool test (true or false)
 * @arg9  : sql server URL
 * @arg10 : sql server port
 * @arg11 : AD domain
 */
val test_obj = new SparkConnTest("connectoradmin", "", "", "", "com.microsoft.sqlserver.jdbc.spark", "testconn", "", "0", "false".toBoolean, "database.windows.net", 1433, "")
test_obj.test_sqlmaster()
```

Results: Passed
test_gci_twoPartName_owar : Entered
Tablename is mssqlspark.test_gci_twoPartName_owar
Operation Overwrite, append and read
test_gci_twoPartName_owar : Passed
test_gci_tbNameInBracket_owar : Entered
Table name is [test_gci_tbNameInBracket_owar]
Operation Overwrite, append and read
test_gci_tbNameInBracket_owar : Passed
test_gci_tabLock_write : Entered
test_gci_tabLock_write : Passed
test_gci_secureURL_write : Entered
test_gci_secureURL_write : Passed
test_gci_reordered_columns : Entered
test_gci_reordered_columns : Created table
test_gci_reordered_columns : Append succcessful
test_gci_reordered_columns : Read back table and confirmed data is added succcessful
test_gci_reordered_columns : Reordered Write overwrite with truncate
test_gci_reordered_columns : Reordered write append
test_gci_reordered_columns : Reordered Write overwrite without truncate Passed
test_write_parallel : Entered
test_write_parallel : Passed
test_gci_empty_dataframe : Entered
test_gci_empty_dataframe : Passed
test_gci_read_write : Entered
test_basic_read_write : Passed
test_gci_read_write : Passed
test_gci_null_values : Entered
test_gci_null_values : Passed
test_gci_append_rows : Entered
test_gci_append_rows : Passed
test_gci_truncate_table : Entered
test_gci_truncate_table : Passed
test_gci_case_sensitivity : Entered
test_gci_case_sensitivity : exit
test_gci_precision_scale : Entered
test_gci_precision_scale : exit
test_isolation_level : Entered
test_isolation_level : READ_UNCOMMITTED succeded
test_isolation_level : READ_COMMITTED succeded
test_isolation_level : REPEATABLE_READ succeded
test_isolation_level : SNAPShort write start
test_isolation_level : SNAPShort write done
test_isolation_level : SNAPShort read done
test_isolation_level : SNAPShort 5 5
Assert counts
test_isolation_level : SNAPSHOT succeded
test_isolation_level : isoLevel = ONE
test_isolation_level : isoLevel = NONE Exception
test_isolation_level : all done
test_isolation_level : exit
test_gci_limit_escape : Multiple read test Entered
test_gci_limit_escape : Passed
test_gci_threePartName_owar : Entered
Tablename is testconn.mssqlspark.test_gci_threePartName_owar
Operation Overwrite, append and read
test_gci_threePartName_owar : Passed

@shivsood (Collaborator)

Test Pass: Reliability Mode on.

Test:

```scala
// Same SparkConnTest arguments as in the run above.
test_obj.test_sqlmaster_reliable_connector()
```

Results: Pass
test_isolation_level : SNAPSHOT succeded
test_isolation_level : isoLevel = NONE
test_isolation_level : isoLevel = NONE Exception
test_isolation_level : all done
test_isolation_level : exit
test_gci_limit_escape : Multiple read test Entered
test_gci_limit_escape : Passed
test_gci_threePartName_owar : Entered
Tablename is testconn.mssqlspark.test_gci_threePartName_owar
Operation Overwrite, append and read
test_gci_threePartName_owar : Passed
test_gci_twoPartName_owar : Entered
Tablename is mssqlspark.test_gci_twoPartName_owar
Operation Overwrite, append and read
test_gci_twoPartName_owar : Passed
test_gci_tbNameInBracket_owar : Entered
Table name is [test_gci_tbNameInBracket_owar]
Operation Overwrite, append and read
test_gci_tbNameInBracket_owar : Passed
test_gci_tabLock_write : Entered
test_gci_tabLock_write : Passed
test_gci_secureURL_write : Entered
test_gci_secureURL_write : Passed
test_gci_reordered_columns : Entered
test_gci_reordered_columns : Created table
test_gci_reordered_columns : Append succcessful
test_gci_reordered_columns : Read back table and confirmed data is added succcessful
test_gci_reordered_columns : Reordered Write overwrite with truncate
test_gci_reordered_columns : Reordered write append
test_gci_reordered_columns : Reordered Write overwrite without truncate Passed
test_write_parallel : Entered
test_write_parallel : Passed
test_gci_empty_dataframe : Entered
test_gci_empty_dataframe : Passed
test_gci_read_write : Entered
test_basic_read_write : Passed
test_gci_read_write : Passed
test_gci_null_values : Entered
test_gci_null_values : Passed
test_gci_append_rows : Entered
test_gci_append_rows : Passed
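For readers wondering what "Reliability Mode" exercises: the connector documents a `reliabilityLevel` write option (`BEST_EFFORT` by default, `NO_DUPLICATES` for idempotent retries on executor restarts), and the reliable-connector suite presumably runs writes through it. A minimal sketch with placeholder connection details:

```scala
// Sketch: writing with the connector's NO_DUPLICATES reliability level.
// <server>, <db>, and df are placeholders; "reliabilityLevel" is the option
// name documented in this repo's README.
df.write
  .format("com.microsoft.sqlserver.jdbc.spark")
  .mode("append")
  .option("url", "jdbc:sqlserver://<server>:1433;databaseName=<db>")
  .option("dbtable", "dbo.example_table")
  .option("reliabilityLevel", "NO_DUPLICATES")
  .save()
```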

| Connector | Maven Coordinate | Scala Version |
| --- | --- | --- |
| Spark 3.0.x compatible connector | `com.microsoft.azure:spark-mssql-connector_2.12:1.1.0` | 2.12 |
| Spark 3.1.x compatible connector | `com.microsoft.azure:spark-mssql-connector_2.12:1.2.0` | 2.12 |
| Spark 3.3.x compatible connector | `com.microsoft.azure:spark-mssql-connector_2.12:1.3.0` | 2.12 |
| Spark 3.4.x compatible connector | `com.microsoft.azure:spark-mssql-connector_2.12:1.4.0` | 2.12 |
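For anyone consuming the table above, the coordinates plug into a build or a Spark session in the standard ways; a sketch using the 1.4.0 artifact (everything besides the coordinate itself is generic sbt/spark-submit usage):

```scala
// build.sbt (sketch): the Spark 3.4-compatible artifact from the table above.
libraryDependencies += "com.microsoft.azure" % "spark-mssql-connector_2.12" % "1.4.0"

// Or resolve it at submit time:
//   spark-submit --packages com.microsoft.azure:spark-mssql-connector_2.12:1.4.0 ...
```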
Collaborator:

Might also need to add Spark 3.4 under "Versions Supported".

Collaborator:

Yes, as we release the beta.

```diff
 <profiles>
   <profile>
-    <id>spark33</id>
+    <id>spark34</id>
```
@luxu1-ms (Collaborator) commented Sep 22, 2023

@shivsood Could you confirm that a Spark 3.4 DBR environment cannot use the 3.3 connector? If 3.4 DBR could still use the 3.3 connector, then maybe there is no need for a new release?

Collaborator:

The 1.3.0-BETA release (Spark 3.3) throws an exception for the get-schema call with DBR Spark 3.4.0.

@shivsood merged commit eb60462 into microsoft:master on Sep 22, 2023
@SteffenMangold

Nice! Thanks for reacting so quickly.

@RolandASc

@shivsood, can you tell us when the beta will be released and available through Maven? We are wondering whether we should revert to DBR 12.2 for now or whether we can just wait a couple more days and switch to the new connector.

And can we assume that for 3.3 it will stay 1.3.0-BETA?

Also, it would be nice if the README could be updated in these two places (even for the previous release):

  • There are three version sets of the connector available through Maven, a 2.4.x, a 3.0.x and a 3.1.x compatible version.
  • Current Releases

And 👍 for your work!

@JessicaBL

Hello,
I'm also wondering when the newest version, 1.4.0 for Spark 3.4, will be available on Maven Central. I can't see it on Databricks for download, and it's highly needed!

Thanks :)

@atul-delphix commented May 1, 2024

@pp-akursar / @shivsood When are we planning the GA release of version 1.4.0 to support Apache Spark 3.4? And when can we expect support for Apache Spark 3.5.x, since all previous versions have many vulnerabilities?

It's highly needed. You can also redirect me to someone who can better answer this.

@dbeavon commented Oct 8, 2024

@shivsood

Should Azure Databricks customers contact Azure support? It would be great to get GA versions of the connector. The default JDBC connector is not as fast (even with the batchsize option customized).

What team at Microsoft sponsors this project? Is it HDInsight, Synapse, or Azure Databricks?
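For context on the batchsize remark above: `batchsize` is the stock Spark JDBC writer option (default 1000 rows per batch), and tuning it is the usual first lever before switching to the dedicated connector. A sketch with placeholder connection details:

```scala
// Sketch: the built-in JDBC sink with a larger batch size, for comparison
// with the dedicated connector. <server>, <db>, and df are placeholders.
df.write
  .format("jdbc")
  .mode("append")
  .option("url", "jdbc:sqlserver://<server>:1433;databaseName=<db>")
  .option("dbtable", "dbo.example_table")
  .option("batchsize", "10000") // rows per JDBC batch insert; Spark's default is 1000
  .save()
```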

@grihabor mentioned this pull request Dec 12, 2024