@RussellSpitzer
Member

Previously the build code would add LICENSE and NOTICE files from other shaded dependencies, some of which could shadow the project's LICENSE or NOTICE or cause confusion as to which file was correct.

To fix this, a post-processing task is added which removes all potentially confusing files and manually adds the project LICENSE and NOTICE files back to the jar.

I tried fixing this via exclusion and inclusion rules, but for whatever reason the shadowJar plugin didn't seem to behave well. The duplication strategy apparently applies only to the final jar, so unless I could individually exclude the licenses from every dependency I was a bit stuck. To avoid the issue entirely, I just post-process the jar. This adds about 10 seconds to the task on my machine, but I figure that is not a huge cost for something that won't be run that often.
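
For readers who want to see the post-processing idea outside of Gradle, here is a minimal sketch in plain Kotlin. It is not the task from this PR: `rewriteJar`, its parameters, and the temp-file naming are hypothetical, and it streams entries with `java.util.zip` instead of extracting the jar to a temp directory as the PR's task log suggests.

```kotlin
import java.io.File
import java.util.zip.ZipEntry
import java.util.zip.ZipFile
import java.util.zip.ZipOutputStream

// Hypothetical helper (not the PR's Gradle task): rewrites `jar` in place,
// dropping every entry for which `isUnwanted` returns true and then adding
// the given replacement files (for example the project LICENSE and NOTICE).
fun rewriteJar(jar: File, replacements: Map<String, ByteArray>, isUnwanted: (String) -> Boolean) {
    val tmp = File.createTempFile("jar-cleanup-", ".jar", jar.absoluteFile.parentFile)
    ZipFile(jar).use { source ->
        ZipOutputStream(tmp.outputStream()).use { out ->
            val written = HashSet<String>()
            for (entry in source.entries()) {
                val name = entry.name
                // Skip entries that will be replaced, that are unwanted, or that repeat a name.
                if (name in replacements || isUnwanted(name) || !written.add(name)) continue
                out.putNextEntry(ZipEntry(name))
                if (!entry.isDirectory) source.getInputStream(entry).use { it.copyTo(out) }
                out.closeEntry()
            }
            // Add the project's own files last.
            for ((name, bytes) in replacements) {
                out.putNextEntry(ZipEntry(name))
                out.write(bytes)
                out.closeEntry()
            }
        }
    }
    // Swap the cleaned archive in only after both streams are closed.
    tmp.copyTo(jar, overwrite = true)
    tmp.delete()
}
```

A call such as `rewriteJar(bundleJar, mapOf("LICENSE" to licenseBytes, "NOTICE" to noticeBytes)) { it.startsWith("META-INF/LICENSE") }` would be one way to use it; `bundleJar`, `licenseBytes`, and `noticeBytes` are placeholders.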

@RussellSpitzer
Member Author

Output from task -

➜  polaris git:(CleanupSparkPluginLicense) ✗ ./gradlew :polaris-spark-3.5_2.12:clean :polaris-spark-3.5_2.12:createPolarisSparkJar
Configuration on demand is an incubating feature.

> Task :polaris-spark-3.5_2.12:addLicenseFilesToJar
Processing jar: /Users/rspitzer/repos/polaris/plugins/spark/v3.5/spark/build/2.12/libs/polaris-spark-3.5_2.12-1.1.0-incubating-SNAPSHOT-bundle.jar
Using temp directory: /Users/rspitzer/repos/polaris/plugins/spark/v3.5/spark/build/2.12/tmp/jar-cleanup-polaris-spark-3.5_2.12-bundle
Removing license file: LICENSE-EDL-1.0.txt
Removing license file: LICENSE-EPL-1.0.txt
Removing license file: LICENSE
Removing license file: META-INF/LICENSE
Removing license file: META-INF/NOTICE
Removing license file: META-INF/LICENSE.txt
Removing license file: NOTICE
Added project LICENSE file
Added project NOTICE file
Recreated jar with only project LICENSE and NOTICE files

[Incubating] Problems report is available at: file:///Users/rspitzer/repos/polaris/build/reports/problems/problems-report.html

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.14.2/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD SUCCESSFUL in 14s
23 actionable tasks: 2 executed, 3 from cache, 18 up-to-date
➜  polaris git:(CleanupSparkPluginLicense) ✗ jar tf ./plugins/spark/v3.5/spark/build/2.12/libs/polaris-spark-3.5_2.12-1.1.0-incubating-SNAPSHOT-bundle.jar | grep -iE "LICENSE|NOTICE"
META-INF/licenses/
META-INF/licenses/org.projectnessie.nessie/
META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/
META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/
LICENSE
META-INF/FastDoubleParser-LICENSE
META-INF/FastDoubleParser-NOTICE
META-INF/bigint-LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/NOTICE
META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/NOTICE
META-INF/thirdparty-LICENSE
NOTICE
codegen/includes/license.ftl
org/eclipse/microprofile/openapi/annotations/info/License.class
org/eclipse/microprofile/openapi/models/info/License.class

include("META-INF/NOTICE*")
}
.forEach { file ->
println("Removing license file: ${file.relativeTo(tempDir)}")
Contributor

Shall we remove the print statement in the build?

Member Author

Sure, let me switch that to debug logging.

Member Author

On second thought, Gradle DEBUG is very noisy; I'll put it at INFO.

gh-yzou previously approved these changes Jun 26, 2025
Contributor

@gh-yzou gh-yzou left a comment

@RussellSpitzer Thanks for helping with the license for the Spark client!

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jun 26, 2025
Change to loggers from println
Use unique directory for jar extraction
gh-yzou previously approved these changes Jun 26, 2025
@flyrain
Contributor

flyrain commented Jun 26, 2025

Thanks a lot for the quick fix @RussellSpitzer! Should we also remove these?

META-INF/FastDoubleParser-LICENSE
META-INF/FastDoubleParser-NOTICE
META-INF/bigint-LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/NOTICE
META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/NOTICE
META-INF/thirdparty-LICENSE

Or at least these?

META-INF/FastDoubleParser-LICENSE
META-INF/FastDoubleParser-NOTICE
META-INF/bigint-LICENSE
META-INF/thirdparty-LICENSE

Comment on lines 143 to 144
include("META-INF/LICENSE*")
include("META-INF/NOTICE*")
Contributor

@flyrain flyrain Jun 26, 2025

I guess we could do this to remove more:

        include("META-INF/*LICENSE")
        include("META-INF/*NOTICE")
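
To see how the two pattern shapes differ, here is a small sketch. It uses the JDK's NIO glob matcher as a stand-in, which is an assumption: Gradle's `include` takes Ant-style patterns, but for a single `*` inside one path segment the two should agree. `matchesAny` is a hypothetical helper, not part of the build file.

```kotlin
import java.nio.file.FileSystems
import java.nio.file.Paths

// Hypothetical helper: true if `path` matches at least one of the glob patterns.
// NIO globs stand in here for Gradle's Ant-style include patterns (an assumption).
fun matchesAny(path: String, vararg globs: String): Boolean {
    val fs = FileSystems.getDefault()
    return globs.any { glob -> fs.getPathMatcher("glob:$glob").matches(Paths.get(path)) }
}
```

Under that assumption, `META-INF/LICENSE.txt` matches `META-INF/LICENSE*` but not `META-INF/*LICENSE`, while `META-INF/thirdparty-LICENSE` is the reverse, so the two pattern families catch different files. Neither shape reaches the nested `META-INF/licenses/.../LICENSE` files, because `*` does not cross a directory boundary.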

@RussellSpitzer
Member Author

Thanks a lot for the quick fix @RussellSpitzer! Should we also remove these?

META-INF/FastDoubleParser-LICENSE
META-INF/FastDoubleParser-NOTICE
META-INF/bigint-LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/NOTICE
META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/NOTICE
META-INF/thirdparty-LICENSE

Or at least these?

META-INF/FastDoubleParser-LICENSE
META-INF/FastDoubleParser-NOTICE
META-INF/bigint-LICENSE
META-INF/thirdparty-LICENSE

I thought we probably should keep those? This is where I was hoping @jbonofre could chime in. I think we could clean them all out since we consolidated them, right? But I also thought they were unlikely to be confused with our core license, so I left them there just in case.


// Post-processing task to add our project's LICENSE and NOTICE files to the jar and remove any
// other LICENSE or NOTICE files that were shaded in.
tasks.register("addLicenseFilesToJar") {
Contributor

suggestion: Would this work?

  1. Run shadowJar with exclude("LICENSE")
  2. Run a simple jar task depending on shadowJar and add our own LICENSE
  3. Use the output of step 2 as the published artifact.

It doubles the fat jar, but hopefully it's not too much overhead.

Member Author

@RussellSpitzer RussellSpitzer Jun 26, 2025

Nope :( I tried that first

Either I don't understand Gradle or I don't understand Kotlin or some combination of both, which is probably true. Adding exclude("LICENSE") with or without wildcards would never actually exclude anything.

Contributor

Another option: Custom transformer: https://gradleup.com/shadow/configuration/merging/

Contributor

@dimas-b I can follow this up by excluding LICENSE and NOTICE from the other dependencies. My previous experience with exclusions is that they may not work well in some situations, so I will need to dig into it. There is also a plan to just package it as an uber jar project; I can take care of all of those together later.

Member Author

Quick demonstration -

diff --git a/plugins/spark/v3.5/spark/build.gradle.kts b/plugins/spark/v3.5/spark/build.gradle.kts
index 58ac8c98..6a9defb3 100644
--- a/plugins/spark/v3.5/spark/build.gradle.kts
+++ b/plugins/spark/v3.5/spark/build.gradle.kts
@@ -89,6 +89,9 @@ tasks.register<ShadowJar>("createPolarisSparkJar") {
   from(sourceSets.main.get().output)
   configurations = listOf(project.configurations.runtimeClasspath.get())

+  exclude("LICENSE")
+  exclude("NOTICE")
+
   // Optimization: Minimize the JAR (remove unused classes from dependencies)
   // The iceberg-spark-runtime plugin is always packaged along with our polaris-spark plugin,
   // therefore excluded from the optimization.
> Task :polaris-spark-3.5_2.12:addLicenseFilesToJar
Custom actions are attached to task ':polaris-spark-3.5_2.12:addLicenseFilesToJar'.
Caching disabled for task ':polaris-spark-3.5_2.12:addLicenseFilesToJar' because:
Gradle would require more information to cache this task
Task ':polaris-spark-3.5_2.12:addLicenseFilesToJar' is not up-to-date because:
Task has not declared any outputs despite executing actions.
Processing jar: /Users/rspitzer/repos/polaris/plugins/spark/v3.5/spark/build/2.12/libs/polaris-spark-3.5_2.12-1.1.0-incubating-SNAPSHOT-bundle.jar
Using temp directory: /Users/rspitzer/repos/polaris/plugins/spark/v3.5/spark/build/2.12/tmp/jar-cleanup-polaris-spark-3.5_2.12-bundle
Removing license file: LICENSE-EDL-1.0.txt
Removing license file: LICENSE-EPL-1.0.txt
Removing license file: LICENSE. <------------------Not Excluded :_
Removing license file: META-INF/LICENSE
Removing license file: META-INF/NOTICE
Removing license file: META-INF/LICENSE.txt
Removing license file: NOTICE . <------------------Not Excluded :_
Added project LICENSE file
Added project NOTICE file
[ant:jar] Building jar: /Users/rspitzer/repos/polaris/plugins/spark/v3.5/spark/build/2.12/libs/polaris-spark-3.5_2.12-1.1.0-incubating-SNAPSHOT-bundle.jar
Recreated jar with only project LICENSE and NOTICE files

Contributor

Well, I'm fine with whatever works for 1.0 🤷‍♂️

... but I really hope there's a simpler solution for later :)

Member Author

The solution here is very simple :) It's just not elegant

@dimas-b
Contributor

dimas-b commented Jun 26, 2025

Ah, I guess what I mentioned in my previous comment had already been addressed 🤦

dimas-b previously approved these changes Jun 26, 2025
Member

@jbonofre jbonofre left a comment

Since we "consolidate" the LICENSE/NOTICE of all bundled dependencies into our LICENSE, we should exclude:

META-INF/licenses/
META-INF/licenses/org.projectnessie.nessie/
META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/
META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/
META-INF/FastDoubleParser-LICENSE
META-INF/bigint-LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/NOTICE
META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/LICENSE
META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/NOTICE
META-INF/thirdparty-LICENSE

We should have only "our" LICENSE and NOTICE.

NB: there's no need to include DISCLAIMER in jar files (only in source and binary tarballs).

@RussellSpitzer RussellSpitzer dismissed stale reviews from dimas-b and gh-yzou via 111e0bb June 27, 2025 12:22
@RussellSpitzer
Member Author

RussellSpitzer commented Jun 27, 2025

@jbonofre Cleaned up all the licenses -


> Task :polaris-spark-3.5_2.12:addLicenseFilesToJar
Processing jar: /Users/rspitzer/repos/polaris/plugins/spark/v3.5/spark/build/2.12/libs/polaris-spark-3.5_2.12-1.1.0-incubating-SNAPSHOT-bundle.jar
Using temp directory: /Users/rspitzer/repos/polaris/plugins/spark/v3.5/spark/build/2.12/tmp/jar-cleanup-polaris-spark-3.5_2.12-bundle
Removing license file: LICENSE-EDL-1.0.txt
Removing license file: LICENSE-EPL-1.0.txt
Removing license file: LICENSE
Removing license file: META-INF/FastDoubleParser-NOTICE
Removing license file: META-INF/LICENSE
Removing license file: META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/LICENSE
Removing license file: META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/NOTICE
Removing license file: META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/LICENSE
Removing license file: META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/NOTICE
Removing license file: META-INF/FastDoubleParser-LICENSE
Removing license file: META-INF/thirdparty-LICENSE
Removing license file: META-INF/bigint-LICENSE
Removing license file: META-INF/NOTICE
Removing license file: META-INF/LICENSE.txt
Removing license file: codegen/includes/license.ftl
Removing license file: NOTICE
Removed META-INF/licenses directory
Added project LICENSE file
Added project NOTICE file

➜  polaris git:(CleanupSparkPluginLicense) ✗ jar tf ./plugins/spark/v3.5/spark/build/2.12/libs/polaris-spark-3.5_2.12-1.1.0-incubating-SNAPSHOT-bundle.jar | grep -iE "LICENSE|NOTICE"
LICENSE
NOTICE
org/eclipse/microprofile/openapi/annotations/info/License.class
org/eclipse/microprofile/openapi/models/info/License.class

Note to anyone in the future (@gh-yzou): be careful, since the two class files listed above have "License" in their names.

@jbonofre
Member

@RussellSpitzer awesome, thanks! I'll do a new pass.

@RussellSpitzer @flyrain maybe it's worth including this in a 1.0 rc3? We can reduce the voting period and close the vote as soon as we have 3 binding votes, in order to start a new vote for IPMC soon. Thoughts?

@RussellSpitzer
Member Author

I got worried about the codegen license template, so I decided to not touch lower-case license files.

Removing license file: LICENSE-EDL-1.0.txt
Removing license file: LICENSE-EPL-1.0.txt
Removing license file: LICENSE
Removing license file: META-INF/FastDoubleParser-NOTICE
Removing license file: META-INF/LICENSE
Removing license file: META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/LICENSE
Removing license file: META-INF/licenses/org.projectnessie.nessie/nessie-client-0.103.3/NOTICE
Removing license file: META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/LICENSE
Removing license file: META-INF/licenses/org.projectnessie.nessie/nessie-model-0.103.3/NOTICE
Removing license file: META-INF/FastDoubleParser-LICENSE
Removing license file: META-INF/thirdparty-LICENSE
Removing license file: META-INF/bigint-LICENSE
Removing license file: META-INF/NOTICE
Removing license file: META-INF/LICENSE.txt
Removing license file: NOTICE
Removed META-INF/licenses directory
Added project LICENSE file
Added project NOTICE file
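
As a rough illustration of the case-sensitive rule described above, the predicate below reproduces the decisions visible in this log. It is a hypothetical sketch: `isBundledLicenseFile` is not a function from the build file, and the real task uses Gradle include patterns that are only partly shown in this thread.

```kotlin
// Hypothetical predicate inferred from the log output above, not the PR's code.
// Matching is case-sensitive on purpose, so lower-case files such as
// codegen/includes/license.ftl are left alone.
fun isBundledLicenseFile(entryName: String): Boolean {
    // Everything under META-INF/licenses/ goes, including the directory entries.
    if (entryName.startsWith("META-INF/licenses/")) return true
    // Other directory entries and compiled classes (e.g. .../info/License.class) are never touched.
    if (entryName.endsWith("/") || entryName.endsWith(".class")) return false
    val fileName = entryName.substringAfterLast('/')
    return fileName.contains("LICENSE") || fileName.contains("NOTICE")
}
```

With this rule, `META-INF/FastDoubleParser-NOTICE` and `LICENSE-EDL-1.0.txt` are selected for removal, while `codegen/includes/license.ftl` and `org/eclipse/microprofile/openapi/models/info/License.class` are kept.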

@flyrain
Contributor

flyrain commented Jun 27, 2025

@RussellSpitzer @flyrain maybe it's worth including this in a 1.0 rc3? We can reduce the voting period and close the vote as soon as we have 3 binding votes, in order to start a new vote for IPMC soon. Thoughts?

@jbonofre, I will cut RC3 once this is merged.

@jbonofre
Member

I found that Google Flatbuffers (https://github.com/google/flatbuffers) is missing from LICENSE (it seems it changed). It is shaded in Apache Iceberg, so it has to be in LICENSE. I propose including it in this PR.

Member

@jbonofre jbonofre left a comment

The exclude is good (it's OK from this standpoint).

However, the LICENSE has to be updated to document Google Flatbuffers.

@flyrain flyrain merged commit c004728 into apache:main Jun 27, 2025
11 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Jun 27, 2025
@flyrain
Contributor

flyrain commented Jun 27, 2025

Thanks @RussellSpitzer for the fix! Thanks everyone for the review.

snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* Exclude unused dependency for polaris spark client dependency (apache#1933)

* enable ETag integration tests (apache#1935)

tests were added in 8b5dfa9 and afaict supposed to get enabled after ec97c1b

* Fix Pagination for Catalog Federation (apache#1849)

Details can be found in this issue: apache#1848

* Update doc to fix docker build inconsistency issue (apache#1946)

* Simplify install dependency doc (apache#1941)

* Simplify getting started doc

* Simplify install dependency doc

* Minor words change

* Fix admin tool for quick start (apache#1945)

When attempting to use the `polaris-admin-tool.jar` to bootstrap a realm, the application fails with a `jakarta.enterprise.inject.UnsatisfiedResolutionException` because it cannot find a `javax.sql.DataSource` bean. Detail in apache#1943

This issue occurs because `quarkus.datasource.db-kind` is a build-time property in Quarkus. Its value must be defined during the application's build process to enable the datasource extension and generate the necessary CDI bean producer (ref: https://quarkus.io/guides/all-config#quarkus-datasource_quarkus-datasource-db-kind).

I think we only support Postgres for now, so I set `quarkus.datasource.db-kind=postgresql`. This can be problematic if we later want to support data sources other than Postgres. There are a couple of options for this, such as using multiple named datasources in the config at build time, but that may be out of scope for this PR. I am open to more discussion on this, but for the time being, it may be better to unblock people who are trying to use the quick start doc.

Sample output for the bootstrap container after the fix:
```
➜  polaris git:(1943) docker logs polaris-polaris-bootstrap-1
Realm 'POLARIS' successfully bootstrapped.
Bootstrap completed successfully.
```

* fix(build): Fix deprecation warnings in FeatureConfiguration (apache#1894)

* Fix NPE in listCatalogs (apache#1949)

listCatalogs is non-atomic. It first atomically lists all entities and then iterates through each one and does an individual loadEntity call. This causes an NPE when calling `CatalogEntity::new`.

I don't think it's ever useful for listCatalogsUnsafe to return null since the caller isn't expecting a certain length of elements, so I just filtered it there.

* Fix doc for sample log and default password (apache#1951)

Minor updates for the quick start doc:
1. update sample output to reflect with the latest code
2. update default password to the right value
3. remove trailing space

* Optimize the location overlap check with an index (apache#1686)

The location overlap check for "sibling" tables (those which share a parent) has been a performance bottleneck since its introduction, but we haven't historically had a good way around this other than just disabling the check. 

<hr>

### Current Behavior

The current logic is that when we create a table, we list all sibling tables and check each and every one to ensure there is no location overlap. This results in O(N^2) checks when adding N tables to a namespace, quickly becoming untenable.

With the `CreateTreeDataset` [benchmark](https://github.com/eric-maynard/polaris-tools/blob/main/benchmarks/src/gatling/scala/org/apache/polaris/benchmarks/simulations/CreateTreeDataset.scala) I tested creating 5000 sibling tables using the current code:

It is apparent that latency increases over time. Runs took between 90 and 200+ seconds, and Polaris instances with a small memory allocation were prone to crashing due to OOMs:


### Proposed change

This PR adds a new persistence API, `hasOverlappingSiblings`, which if implemented can be used to directly check for the presence of siblings at the metastore layer.

This API is implemented for the JDBC metastore in a new schema version, and some changes are made to account for an evolving schema version now and in the future.

This implementation breaks a location down into components and queries for a sibling at each of those locations, so a new table at location `s3://bucket/root/n1/nA/t1/` will require checking for an entity with location `s3://bucket/`, `s3://bucket/root/`, `s3://bucket/root/n1/`, `s3://bucket/root/n1/nA/`, and finally `s3://bucket/root/n1/nA/t1/%`. All of this can be done in a single query which makes a single pass over the data. 
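
The decomposition described above can be sketched as follows. This is a hypothetical illustration in Kotlin, not the code from apache#1686: `overlapCandidates` is an invented name, and the SQL that consumes these values, including any escaping of `%` or `_` inside a location, is not shown in this message.

```kotlin
// Hypothetical sketch: lists the exact parent locations to look up, followed by
// a LIKE-style pattern (trailing %) for anything at or under the new location.
fun overlapCandidates(location: String): List<String> {
    val schemeEnd = location.indexOf("://")
    require(schemeEnd >= 0) { "expected a URI-style location such as s3://bucket/path/" }
    val scheme = location.substring(0, schemeEnd + 3)
    val parts = location.substring(schemeEnd + 3).split('/').filter { it.isNotEmpty() }
    require(parts.isNotEmpty()) { "location has no path components" }
    // One prefix per component: s3://bucket/, s3://bucket/root/, ...
    val prefixes = parts.indices.map { i -> scheme + parts.subList(0, i + 1).joinToString("/") + "/" }
    return prefixes.dropLast(1) + (prefixes.last() + "%")
}
```

For `s3://bucket/root/n1/nA/t1/` this yields the four parent locations listed above followed by `s3://bucket/root/n1/nA/t1/%`.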

The query is optimized by the introduction of a new index over a new _location_ column.

With the changes enabled, I tested creating 5000 sibling tables:

Latency is stable over time, and runs consistently completed in less than 30 seconds. I did not observe any OOMs when testing with the feature enabled.

* Add SUPPORTED_EXTERNAL_CATALOG_AUTHENTICATION_TYPES feature configuration (apache#1931)

* Add SUPPORTED_FEDERATION_AUTHENTICATION_TYPES feature configuration

* Add unit tests

* Update Helm chart version (apache#1957)

* Remove the maintainer list in Helm Chart README (apache#1962)

* Use multi-lines instead of single line (apache#1961)

* Fix invalid sample script in CLI doc (apache#1964)

* Fix hugo blockquote (apache#1967)

* Fix hugo blockquote

* Add license header

* Fix lint rules (apache#1953)

* Mutable objects used for immutable values (apache#1596)

* fix: Only include project LICENSE and NOTICE in Spark Client Jar (apache#1950)

* Add Sushant as a collaborator (apache#1956)

* Adds missing Google Flatbuffers license information (apache#1968)

* fix: Typo in Spark Client Build File (apache#1969)

debugrmation

* Python code format (apache#1954)

* test(integration): refactor PolarisRestCatalogIntegrationTest to run against any cloud provider (apache#1934)

* Make Catalog Integration Test suite cloud native

* Fix admin tool doc (apache#1977)

* Fix admin tool doc

* Fix admin tool doc

* Update release-guide.md (apache#1927)

* Add relational-jdbc to helm (apache#1937)


Motivation for the Change

Polaris needs to support relational-jdbc as the default persistence type for simpler database configuration and better cloud-native deployment experience.

Description of the Status Quo (Current Behavior)

Currently, the Helm chart only supports eclipse-link persistence type as the default, which requires complex JPA configuration with persistence.xml files.

Desired Behavior

    Add relational-jdbc persistence type support to Helm chart
    Use relational-jdbc as the default persistence type
    Inject JDBC configuration (username, password, jdbc_url) through Kubernetes Secrets as environment variables
    Maintain backward compatibility with eclipse-link

Additional Details

    Updated persistence-values.yaml for CI testing
    Updated test coverage for relational-jdbc configuration
    JDBC credentials are injected via QUARKUS_DATASOURCE_* environment variables from Secret
    Secret keys: username, password, jdbc_url

* Add CHANGELOG (apache#1952)

* Add rudimentary CHANGELOG.md

* Add the Jetbrains Changelog Gradle plugin to help managing CHANGELOG.md

* Share Polaris Community Meeting for 2025-06-26 (apache#1978)

* Correct javadoc text in generateOverlapQuery() (apache#1975)

* Fix javadoc warning: invalid input: '&'
* Correct javadoc text in generateOverlapQuery()

* Do not serialize null properties in the management model (apache#1955)

* Ignore null values in JSON output

* This may have an impact on existing client, but it is not
  likely to be substantial because normally absent properties
  should be treated the same as having `null` values.

* This change enables adding new optional fields to the
  Management API while maintaining backward compatibility in
  the future: New properties will not be exposed to clients
  unless a value for them is explicitly set.

* Add OpenHFT in Spark plugin LICENSE (apache#1979)

* Add additional unit and integration tests for etag functionality (apache#1972)

* Additional unit test for Etags

* Added a few corner case IT tests for testing etags with schema changes.

* Added IT tests to test changes after DDL and DML

* Add options to the bootstrap command to specify a schema file (apache#1942)

Instead of always using the hardcoded `schema-v1.sql` file, it would be nice if users could specify a file to bootstrap from. This is especially relevant after apache#1686 which proposes to add a new "version" of the schema.

* Added support for `s3a` scheme (apache#1932)

* Fix the sign failure (apache#1926)

* Fix doc to remove outdated note about fine-grained access controls support (apache#1983)

Minor update for the access control doc:

1. Remove the misleading section stating that privileges can only be granted at the catalog level. I've tested the fine-grained access controls and confirmed that privileges can be applied to an individual table in the catalog.

* Add support for catalog federation in the CLI (apache#1912)

The CLI currently only supports the version of EXTERNAL catalogs that was present in 0.9.0. Now, EXTERNAL catalogs can be configured with various configurations relating to federation. This PR updates the CLI to better match the REST API so that federated catalogs can be easily set up in the CLI.

* fix: Remove db-kind in helm chart (apache#1987)

* Add a Spark session builder for the tests (apache#1985)

* Fix doc for CLI update (apache#1994)

PR for apache#1866

* Improve createPrincipal example in API docs (apache#1992)

In apache#1929 it was pointed out that the example in the Polaris docs suggests that users can provide a client ID during principal creation:

. . .


This PR attempts to fix this by adding an explicit example to the spec.

* Add doc for repair option (apache#1993)

PR for apache#1864

* Refactor relationalJdbc in helm (apache#1996)

* Add regression test coverage for Spark Client with package conf (apache#1997)

* Remove unnecessary `InputStream.close` call (apache#1982)

apache#1942 changed the way that the bootstrap init script is handled, but it added an extra `InputStream.close` call that shouldn't be needed after the BufferedReader [here](https://github.com/apache/polaris/pull/1942/files#diff-de43b240b5b5e07aba7e89f5515a417cefd908845b85432f3fcc0819911f3e2eR89) is closed. This PR removes that extra call.

* Materialize Realm ID for Session Supplier in JDBC (apache#1988)

It was discovered that the Session Supplier maps used in the MetaStoreManagerFactory implementations were passing in RealmContext objects to the supplier directly and then using the RealmContext objects to create BasePersistence implementation objects within the supplier. This supplier is cached on a per-realm basis in most MetaStoreManagerFactory implementations. RealmContext objects are request-scoped beans.

As a result, if any work is being done outside the scope of the request, such as during a Task, any calls to getOrCreateSessionSupplier for creating a BasePersistence implementation will fail as the RealmContext object is no longer available.

This PR will ensure for the JdbcMetaStoreManagerFactory that the Realm ID is materialized from the RealmContext and used inside the supplier so that the potentially deactivated RealmContext object does not need to be used in creating the BasePersistence object. Given that we are caching on a per-realm basis, this should not introduce any unforeseen behavior for the JdbcMetaStoreManagerFactory as the Realm ID must match exactly for the same supplier to be returned from the Session Supplier map.

* rebase/changes

* minor refactoring

* Last merged commit 8fa6bf2

---------

Co-authored-by: Yun Zou <[email protected]>
Co-authored-by: Christopher Lambert <[email protected]>
Co-authored-by: Rulin Xing <[email protected]>
Co-authored-by: MonkeyCanCode <[email protected]>
Co-authored-by: Alexandre Dutra <[email protected]>
Co-authored-by: Andrew Guterman <[email protected]>
Co-authored-by: Eric Maynard <[email protected]>
Co-authored-by: Pooja Nilangekar <[email protected]>
Co-authored-by: Yufei Gu <[email protected]>
Co-authored-by: fabio-rizzo-01 <[email protected]>
Co-authored-by: Russell Spitzer <[email protected]>
Co-authored-by: Sushant Raikar <[email protected]>
Co-authored-by: Jiwon Park <[email protected]>
Co-authored-by: Dmitri Bourlatchkov <[email protected]>
Co-authored-by: JB Onofré <[email protected]>
Co-authored-by: Sandhya Sundaresan <[email protected]>
Co-authored-by: Pavan Lanka <[email protected]>
Co-authored-by: CG <[email protected]>
Co-authored-by: Adnan Hemani <[email protected]>
5 participants