@EnricoMi
Contributor

@EnricoMi EnricoMi commented Jun 15, 2023

What changes were proposed in this pull request?

Implements upsert mode for SaveMode.Append of the MsSql, Postgres, Derby, H2 and Oracle JDBC sources.

This uses MERGE INTO in combination with a temporary table. A batch of rows is inserted into the temporary table (rather than the target table) and merged into the target table with one MERGE INTO command per batch.
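As a rough illustration of the per-batch merge step, the sketch below builds a generic `MERGE INTO` statement from a staging (temporary) table into the target table. It is not the code in this PR: the function name `build_merge_statement`, the aliases `t` and `s`, and the table and column names are hypothetical, identifiers are not quoted, and real dialects differ in details (for example statement terminators and alias syntax).

```python
def build_merge_statement(target, staging, columns, key_columns):
    """Build a generic MERGE INTO statement that upserts staging rows into target.

    Hypothetical helper for illustration only; identifiers are used verbatim
    (no quoting) and no dialect-specific syntax is applied.
    """
    if not key_columns:
        raise ValueError("at least one key column is required")
    missing = [k for k in key_columns if k not in columns]
    if missing:
        raise ValueError(f"key columns not in column list: {missing}")

    # Rows match when all key columns are equal.
    on = " AND ".join(f"t.{k} = s.{k}" for k in key_columns)
    # Only non-key columns are updated on a match.
    non_keys = [c for c in columns if c not in key_columns]
    insert_cols = ", ".join(columns)
    insert_vals = ", ".join(f"s.{c}" for c in columns)

    parts = [f"MERGE INTO {target} t", f"USING {staging} s", f"ON ({on})"]
    if non_keys:
        updates = ", ".join(f"{c} = s.{c}" for c in non_keys)
        parts.append(f"WHEN MATCHED THEN UPDATE SET {updates}")
    parts.append(
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )
    return "\n".join(parts)


# Example: one statement like this would run once per batch, after the batch
# has been inserted into the staging table.
sql = build_merge_statement("people", "people_tmp", ["id", "name", "age"], ["id"])
```

Staging the batch first keeps the row inserts as plain batched `INSERT`s, and the merge itself is a single statement per batch.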

See #41518 for an alternative for databases not supporting MERGE INTO syntax.

Why are the changes needed?

The JDBC writer only supports either truncating the existing table or inserting rows. Duplicates, i.e. rows with identical values in the primary key or unique index columns, cause an exception, so there is no way to update existing rows while inserting new ones.

Re-evaluating a partition due to executor loss re-inserts rows that were already inserted in an earlier attempt, which kills the entire Spark job.
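The failure mode can be reproduced in miniature without Spark. The sketch below uses an in-memory SQLite database purely as a stand-in (SQLite is not one of the databases covered by this PR), and the table `target` and helper `write_partition` are hypothetical. Writing the same batch twice with plain `INSERT`s, as a re-evaluated partition would, violates the primary key on the second attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")

batch = [(1, "a"), (2, "b")]


def write_partition(rows):
    """Insert one partition's rows with plain INSERTs, committed as a unit."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany("INSERT INTO target (id, name) VALUES (?, ?)", rows)


write_partition(batch)  # first attempt succeeds

try:
    write_partition(batch)  # simulated re-evaluation of the same partition
    retry_error = None
except sqlite3.IntegrityError as exc:
    retry_error = exc  # duplicate primary key values are rejected

row_count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

With an upsert, the second attempt would instead update the existing rows and the write would be safe to repeat.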

Does this PR introduce any user-facing change?

This adds upsert and upsertKeyColumns options for SaveMode.Append of the JDBC source.
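A possible way to use these options from PySpark is sketched below. Only the option names `upsert` and `upsertKeyColumns` come from this PR; the value formats (`"true"` and a comma-separated column list) are assumptions, and the helper `upsert_write_options`, the URL, and the table name are hypothetical. The options exist only on this PR's branch.

```python
def upsert_write_options(url, table, key_columns):
    """Collect JDBC writer options for an upsert write.

    Hypothetical helper. "url" and "dbtable" are standard Spark JDBC options;
    the value formats of "upsert" and "upsertKeyColumns" are assumed here.
    """
    if not key_columns:
        raise ValueError("upsert needs at least one key column")
    return {
        "url": url,
        "dbtable": table,
        "upsert": "true",
        # Assumption: key columns are passed as a comma-separated list.
        "upsertKeyColumns": ",".join(key_columns),
    }


opts = upsert_write_options(
    "jdbc:postgresql://localhost/db",  # placeholder URL
    "people",                          # placeholder table name
    ["id"],
)

# With a DataFrame `df` and a Spark build that includes this PR, the write
# would then look like:
#   df.write.format("jdbc").options(**opts).mode("append").save()
```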

How was this patch tested?

Tests in JdbcSuite and integration suites.

@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from 7a0e4d2 to 9cc6b39 Compare June 15, 2023 09:39
@EnricoMi EnricoMi changed the title [SPARK-38200][SQL] JDBC upsert MERGE INTO using temp table [SPARK-38200][SQL] Add upserts for writing to JDBC using MERGE INTO with temp table Jun 15, 2023
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch 3 times, most recently from 1ade6b4 to 76d0429 Compare June 16, 2023 06:57
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from 76d0429 to 0439e5d Compare June 23, 2023 10:23
@EnricoMi EnricoMi changed the title [SPARK-38200][SQL] Add upserts for writing to JDBC using MERGE INTO with temp table [SPARK-19335][SPARK-38200][SQL] Add upserts for writing to JDBC using MERGE INTO with temp table Jun 23, 2023
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from 95dc877 to 265dd1f Compare June 30, 2023 09:34
@github-actions github-actions bot removed the CORE label Jun 30, 2023
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from 265dd1f to 2c3faec Compare July 18, 2023 12:46
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from cdb889d to db2d78d Compare July 26, 2023 12:32
@github-actions github-actions bot removed the INFRA label Jul 26, 2023
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from db2d78d to 464a19a Compare October 9, 2023 13:39
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch 2 times, most recently from 7658bbc to 80a2a9c Compare October 26, 2023 10:21
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from 80a2a9c to d064da1 Compare December 1, 2023 15:31
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from d064da1 to 5e69410 Compare January 10, 2024 19:46
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch 3 times, most recently from 38ce6b9 to 60e41ca Compare January 23, 2024 09:04
@github-actions github-actions bot removed the CONNECT label Jan 23, 2024
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from 10fd22b to e3ade6e Compare April 20, 2024 16:19
@github-actions github-actions bot removed the DOCS label Apr 20, 2024
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from e3ade6e to 4edb30a Compare June 7, 2024 20:46
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from 4edb30a to 3c4f7e9 Compare July 30, 2024 08:36
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch 3 times, most recently from 637ca4f to 72527be Compare September 2, 2024 16:43
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch 2 times, most recently from 9d72693 to acb6131 Compare November 22, 2024 15:35
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from acb6131 to 5146227 Compare January 16, 2025 14:20
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from 5146227 to d1f32ef Compare March 5, 2025 13:16
@EnricoMi EnricoMi force-pushed the jdbc-upsert-merge-temp-table branch from d1f32ef to 583d349 Compare April 28, 2025 04:49
@github-actions

github-actions bot commented Aug 7, 2025

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Aug 7, 2025
@github-actions github-actions bot closed this Aug 8, 2025