
Conversation

Contributor

@gh-yzou gh-yzou commented Jun 11, 2025

We previously added a special check in PublishingHelperPlugin.kt specifically for the jar task of the polaris-spark project, to publish the artifact output of the added ShadowJar task. However, we already have shadowJar infrastructure that takes care of the Maven publish.
In this PR, we switch to reusing the shadowJar infrastructure and revert the change we added before.

Member

testJar??

Contributor Author

Sorry for the confusion, that is a bad name. I am just referring to the original default jar task; I updated the classifier to "defaultJar" to make it clearer.

Contributor

What is it when we do not override it to null?

Contributor Author

The original name is something like polaris-spark-3.5_2.12-0.11.0-beta-incubating-SNAPSHOT-bundle.jar. We did that because the name without a classifier was taken by the default jar task, which produces polaris-spark-3.5_2.12-0.11.0-beta-incubating-SNAPSHOT.jar. However, Spark does not support using a classifier in the package config, so we make this jar the project's jar. Since this is the jar Spark actually needs, I think it should be the project's jar without any classifier.
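For context, the arrangement described here can be sketched in Gradle Kotlin DSL roughly as follows (a sketch, not the actual Polaris build file; the plugin version is illustrative):

```kotlin
// build.gradle.kts (sketch, not the actual Polaris build)
plugins {
    `java-library`
    id("com.github.johnrengelman.shadow") version "8.1.1" // version illustrative
}

// Give the plain jar a classifier so the shaded jar can take over the
// unclassified artifact name, which is what Spark's --packages resolves.
tasks.jar {
    archiveClassifier.set("defaultJar")
}

tasks.shadowJar {
    // No classifier: output becomes polaris-spark-3.5_2.12-<version>.jar
    archiveClassifier.set("")
}
```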

Contributor

Yes, I understand the intent :) My question is about the need to set archiveClassifier to null... Do we have to use null here?

Contributor Author

Oh, sorry, we don't have to; the default is null. I was putting it there to be clear. I can remove it if preferred, but I think it might be better to be explicit in the code.

Contributor

From my POV removing the assignment is preferable since the value is the same as default.

Contributor

I'd prefer to have a comment about adding a classifier to the jar task instead.

Contributor Author

sg! I removed the specification of the classifier and added a comment at the place where I add the classifier for the jar task.

Contributor Author

@gh-yzou gh-yzou Jun 12, 2025

@dimas-b after switching to use the shadowJarPlugin, I need to specify the classifier here; otherwise it is configured to generate a jar with the classifier "all". I was also able to get rid of the other jar change.
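For reference, the Shadow plugin defaults the shadowJar task's archiveClassifier to "all", so without an override the artifact would come out as ...-all.jar. A minimal sketch of the override being discussed:

```kotlin
// Sketch: override the Shadow plugin's default "all" classifier so the
// shaded jar is published without a classifier.
tasks.shadowJar {
    archiveClassifier.set("")
}
```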

Contributor Author

Actually, sorry, it seems it is still needed; my previous gradlew build result seems to have come from the cache. Added it back!

Contributor

sgtm

Contributor

Why not remove the plain jar artifact from this module completely?

Contributor Author

I tried that before; however, the test task depends on the jar task in the default configuration. I tried to switch the test task to depend on createPolarisSparkJar, but because that jar task relocates the com.fasterxml module, one of our deserialization tests fails, since it now looks for the shaded classes, not the original ones.
So far I haven't found a good solution, so I kept the original jar. Wondering if you have a better solution for this problem?
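The relocation mentioned above would look roughly like this in the Shadow plugin DSL (the shaded package name is an assumption for illustration). Once com.fasterxml classes are rewritten into the shaded package, any test that deserializes against the original package names fails to find them:

```kotlin
tasks.shadowJar {
    // Rewrite com.fasterxml.* into a shaded package to avoid conflicts
    // with Spark's own Jackson on the runtime classpath. Tests compiled
    // against the original com.fasterxml classes break on this jar.
    // (Destination package name is illustrative.)
    relocate("com.fasterxml", "org.apache.polaris.shaded.com.fasterxml")
}
```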

Contributor

Thanks for the detailed analysis, @gh-yzou ! Unfortunately, I do not have a better solution off the top of my head.

Contributor

@dimas-b dimas-b Jun 11, 2025

How about using the internal classifier for this jar? I suppose it is not meant for reuse.

Contributor Author

Yes, it is not intended for reuse. The name "internal" makes sense to me, updated.
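The agreed naming could be expressed as (a sketch under the assumptions above):

```kotlin
// Sketch: mark the plain jar as internal-only; the shaded jar keeps the
// unclassified name that Spark consumes via --packages.
tasks.jar {
    archiveClassifier.set("internal")
}
```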

@gh-yzou gh-yzou force-pushed the yzou-test-plugin branch from 00ca1b7 to b6f25a7 Compare June 11, 2025 23:09
dimas-b
dimas-b previously approved these changes Jun 11, 2025
Contributor

@dimas-b dimas-b left a comment

LGTM 👍 Thanks, @gh-yzou !

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jun 11, 2025
Contributor Author

gh-yzou commented Jun 11, 2025

@dimas-b I think you asked a question somewhere, but it doesn't show up in the PR for some reason. For the artifact, I don't think we have "client" in the artifact name; the Iceberg one is called iceberg-spark-runtime-xxxx.jar, and our Polaris one is called polaris-spark-xxx.jar. For Iceberg, I guess the reason is that iceberg-spark was already taken by another project, but I don't think we need to be exactly the same as Iceberg.
Some of the doc descriptions might introduce confusion; I went over them one more time to make sure the descriptions are consistent.

Contributor

dimas-b commented Jun 11, 2025

Re: polaris-spark-xxx.jar: it is not really related to this PR :)

I value short jar names, but at the same time it might be worth clarifying whether this jar applies to the whole of Polaris integration with Spark or just to Generic Tables.

In other words, do we foresee making any other Polaris jars to be put on the Spark class path?

If no, the current name is fine from my POV, if yes, let's discuss that naming convention on the dev ML (since it's not about this build change really).

@gh-yzou gh-yzou force-pushed the yzou-test-plugin branch from 4fecc92 to 841bcc4 Compare June 18, 2025 18:30
Contributor

@dimas-b dimas-b left a comment

Changes LGTM, but I believe the PR description is a bit off WRT actual changes now 🤔 WDYT?

Contributor Author

gh-yzou commented Jun 18, 2025

@dimas-b sorry, I updated the title but forgot to update the description; the description is updated now too.

@gh-yzou gh-yzou merged commit 1f7f127 into apache:main Jun 18, 2025
12 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Jun 18, 2025
flyrain pushed a commit that referenced this pull request Jun 18, 2025
* fix spark client

* fix test failure and address feedback

* fix error

* update regression test

* update classifier name

* address comment

* add change

* update doc

* update build and readme

* add back jr

* udpate dependency

* add change

* update

* update tests

* remove merge service file

* update readme

* update readme
gh-yzou added a commit to gh-yzou/polaris that referenced this pull request Jun 21, 2025
eric-maynard pushed a commit that referenced this pull request Jun 22, 2025
Revert "Reuse shadowJar for spark client bundle jar maven publish (#1857)" (#1921)

This reverts commit 1f7f127.

The shadowJar plugin actually stops publishing the original jar, which is not what the spark client intends to publish for the --packages usage.

Revert it for now; will follow up with a better way to reuse the shadow jar plugin, likely with a separate bundle project.
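For illustration, the --packages usage the revert note refers to: Spark resolves plain group:artifact:version Maven coordinates with no field for a classifier, which is why the unclassified jar is the one that matters (the version below is illustrative):

```shell
# Sketch: --packages accepts group:artifact:version coordinates only;
# there is no way to request a Maven classifier, so Spark downloads the
# unclassified jar. (Version shown is illustrative.)
spark-shell --packages org.apache.polaris:polaris-spark-3.5_2.12:1.0.0
```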
gh-yzou added a commit to gh-yzou/polaris that referenced this pull request Jun 23, 2025
gh-yzou added a commit to gh-yzou/polaris that referenced this pull request Jun 23, 2025
gh-yzou added a commit to gh-yzou/polaris that referenced this pull request Jun 23, 2025
gh-yzou added a commit that referenced this pull request Jun 23, 2025
…untime to avoid spark compatibilities issue (#1908)

* add change

* add comment

* update change

* add comment

* add change

* add tests

* add comment

* clean up style check

* update build

* Revert "Reuse shadowJar for spark client bundle jar maven publish (#1857)"

This reverts commit 1f7f127.

* Reuse shadowJar for spark client bundle jar maven publish (#1857)

* fix spark client

* fix test failure and address feedback

* fix error

* update regression test

* update classifier name

* address comment

* add change

* update doc

* update build and readme

* add back jr

* udpate dependency

* add change

* update

* update tests

* remove merge service file

* update readme

* update readme

* update checkstyl

* rebase with main

* Revert "Reuse shadowJar for spark client bundle jar maven publish (#1857)"

This reverts commit 40f4d36.

* update checkstyle

* revert change

* address comments

* trigger tests
flyrain pushed a commit that referenced this pull request Jun 23, 2025
flyrain pushed a commit that referenced this pull request Jun 23, 2025
6 participants