
Conversation

@eric-maynard (Contributor) commented Feb 25, 2025

Currently, the number of locations allowed in a storage configuration is hard-coded per cloud provider. In practice, Polaris avoids building a policy that uses every location at once, so there is little need to cap this number: only if a single table spanned N locations would a request to (e.g.) STS actually include a policy with N locations.

This PR introduces a config to adjust that limit, and also raises the default.

PolarisConfiguration.<Integer>builder()
.key("STORAGE_CONFIGURATION_MAX_LOCATIONS")
.description("How many locations can be associated with a storage configuration")
.defaultValue(20)
Contributor

Do we need a limit at the catalog level if a table can only use the three locations specified by the properties location, write.data.path, and write.metadata.path?
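
For context, these are the three Iceberg table properties being referred to; a minimal, self-contained Java illustration with made-up paths (the property names are the standard Iceberg ones, the values are hypothetical):

import java.util.Map;

// Illustration only: the three location-related properties an Iceberg table can carry.
// Property names are standard Iceberg; all paths below are made up.
public class TableLocationProperties {
  public static void main(String[] args) {
    Map<String, String> props = Map.of(
        "location", "s3://bucket/warehouse/db/tbl",             // table root location
        "write.data.path", "s3://other-bucket/data/db/tbl",     // where data files are written
        "write.metadata.path", "s3://bucket/metadata/db/tbl");  // where metadata files are written
    props.forEach((k, v) -> System.out.println(k + " -> " + v));
  }
}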

Contributor Author

Currently we don't use this value for those locations; this is a limit on the allowed locations a catalog can have.

@flyrain (Contributor) Feb 26, 2025

Sorry, I didn't express it clearly originally. My question is why we need a limit on the allowed locations of a catalog at all. Per our offline discussion, IIUC, it's needed for the table-level policy used by credential vending, so that the policy text won't grow long enough to exceed a certain limit. Given that a table's locations are limited even when its catalog has a large number of allowed locations, this doesn't seem to be an issue, and there's no reason to impose a limit. Am I understanding correctly?
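
To make the policy-length concern concrete, here is a rough, self-contained Java sketch (not the actual Polaris credential-vending code) of why the number of locations matters: each allowed location contributes its own resource entries to the session policy sent to STS, and AWS caps the plaintext size of inline session policies at 2,048 characters.

import java.util.List;
import java.util.stream.Collectors;

// Illustration only -- not the Polaris implementation. Shows how the policy text grows
// linearly with the number of locations included in a scoped-down credential request.
public class PolicySizeSketch {

  static String buildSessionPolicy(List<String> allowedLocations) {
    // One Resource ARN per location; more locations => longer policy text.
    String resources = allowedLocations.stream()
        .map(loc -> "\"arn:aws:s3:::" + loc.replaceFirst("^s3://", "") + "/*\"")
        .collect(Collectors.joining(","));
    return "{\"Version\":\"2012-10-17\",\"Statement\":[{"
        + "\"Effect\":\"Allow\","
        + "\"Action\":[\"s3:GetObject\",\"s3:PutObject\"],"
        + "\"Resource\":[" + resources + "]}]}";
  }

  public static void main(String[] args) {
    String policy = buildSessionPolicy(
        List.of("s3://bucket/warehouse/db/tbl/data",
                "s3://bucket/warehouse/db/tbl/metadata"));
    System.out.println(policy.length() + " characters: " + policy);
  }
}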

Contributor Author

That's right. I would be okay with removing the limit entirely, but on the other hand @collado-mike has mentioned the possibility of having 3 limits. I think preserving at least one limit, so we keep the concept of a limit, may be helpful in the future, especially if we do push storage configs down to the table level.

Contributor

Even with storage configs pushed down to the table level, the chance of an unbounded number of table locations is quite small.

  1. For the majority of Iceberg use cases within Polaris, writers can only use the three locations specified by the properties location, write.data.path, and write.metadata.path.
  2. For the migration use case, admittedly there may be more than the 3 locations mentioned above. However, users should be aware of the number of locations during migration and add them to the table-level storage configs. At that point, we can enforce it by saying "that's too many locations, credential vending won't work." The limit seems better suited to the table level, as locations from different tables may not overlap.

In short, a limit at the catalog level doesn't seem necessary now, and may not be effective in the future. I'd consider removing it, but I'm open to being convinced by other use cases.

Contributor

I would go with no limit. Current users should be fine, since this only relaxes the limit; nothing should break. If users start to ask for a limit for whatever reason, we can think about adding a limit config at that time. WDYT?

Contributor Author

Updated the PR to simply have no config for now 👍

Contributor

As a matter of best practice, it is very nearly always a good idea to make behavior changes behind a config flag. It's very easy to add a config that allows replicating the exact behavior we see today and then to remove that config if users are happy with the proposed change. What's not easy to do is to bring that behavior back to existing deployments once the code is ripped out. Making small, incremental changes to ensure we understand the unintended side effects seems pretty uncontroversial to me. I think Yufei's suggestion to keep the config but mark it deprecated is reasonable and allows us the opportunity to make changes carefully. Why is that controversial?

Contributor Author

Okay, so is everyone okay with 1 config now? If so, I will go ahead with restoring the PR to its state before this commit.

I'm not sure what exactly it means to mark a config as "deprecated", but we cannot easily remove a config once we've added one.

Contributor

I'm OK to 'disagree and commit' to supporting one config, but I think we ought to at least have one config that defaults to the current behavior and allows users to increase the limit as they see fit.

@flyrain (Contributor) left a comment

+1. As a follow-up, we can provide an error message when the policy/rule exceeds the length limit; we may need to change the method getSubscopedCreds() a bit.
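
A hypothetical sketch of that follow-up: fail fast with a clear message when the generated session policy would exceed the provider limit, instead of letting the STS call fail opaquely. The helper name and constant below are illustrative and are not the actual getSubscopedCreds() signature.

// Illustration only: a pre-check that could run before requesting subscoped credentials.
final class PolicyLengthCheck {

  // AWS documents a 2,048-character plaintext limit for inline session policies.
  private static final int MAX_SESSION_POLICY_CHARS = 2048;

  static void validatePolicyLength(String sessionPolicyJson) {
    if (sessionPolicyJson.length() > MAX_SESSION_POLICY_CHARS) {
      throw new IllegalArgumentException(String.format(
          "Cannot vend credentials: the generated session policy is %d characters, "
              + "which exceeds the provider limit of %d; reduce the number of allowed "
              + "locations or shorten their paths",
          sessionPolicyJson.length(), MAX_SESSION_POLICY_CHARS));
    }
  }
}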

@flyrain (Contributor) commented Mar 4, 2025

Looks like the unit test testExceedMaxAllowedLocations failed. We need to update it.

@eric-maynard eric-maynard changed the title Adjustable limit on the number of locations per storage config Remove the limit on the number of locations per storage config Mar 4, 2025
@pavibhai (Contributor) commented Mar 5, 2025

@eric-maynard Thanks for the changes.

I have a question with respect to skipping the configuration of allowed locations and prefix validation. Is this in scope for this PR, and if so, doesn't it require a change in the prefix-validation logic?

If I am not mistaken, we need handling in InMemoryStorageIntegration:validateSubpathsOfAllowedLocations.
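
To illustrate the question (purely a sketch, not the actual InMemoryStorageIntegration code): if the allowed-locations list can now be empty or unbounded, the subpath check needs an explicit decision for that case.

import java.util.List;

// Illustration only of the prefix-validation concern; names and semantics are assumptions.
final class PrefixValidationSketch {

  static boolean isSubpathOfAllowedLocations(String path, List<String> allowedLocations) {
    if (allowedLocations == null || allowedLocations.isEmpty()) {
      // Open question from the review: should "no allowed locations configured"
      // mean "allow everything" or "deny everything"?
      return false;
    }
    return allowedLocations.stream()
        .anyMatch(allowed ->
            path.equals(allowed)
                || path.startsWith(allowed.endsWith("/") ? allowed : allowed + "/"));
  }
}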

@eric-maynard eric-maynard disabled auto-merge March 5, 2025 18:57
@eric-maynard (Contributor Author)

Holding this for a moment to potentially rebase onto #1124

@collado-mike (Contributor)

@eric-maynard are you ok to merge this one now, then rebase on #1124 after we have approval from the other reviewers?

@eric-maynard (Contributor Author)

Unfortunately a rebase onto #1124 would change the flag from a feature flag to a behavior flag

validateMaxAllowedLocations(MAX_ALLOWED_LOCATIONS);
validateMaxAllowedLocations(
PolarisConfigurationStore.getConfiguration(
PolarisConfiguration.STORAGE_CONFIGURATION_MAX_LOCATIONS));
Contributor

If validation fails, it will cause a runtime error, making this storage configuration unusable... This seems risky to me because the storage config is persisted, but the PolarisConfigurationStore.getConfiguration() value can change from run to run (it may even differ between servers in the same cluster). WDYT?

Contributor Author

Yeah, this concern was raised in the discussion above, where @flyrain advocated for not having a config at all but @collado-mike felt strongly that we needed one. If the flag is set to e.g. 100, someone creates a storage config with 99 paths, and the flag is then lowered to 10, then yes, runtime exceptions will start happening on the service as soon as the storage config is instantiated in memory.

This fragility is one reason why I'm trying (through #1124) to make this a behavior-change flag rather than a fully-supported feature flag.

Contributor

I believe this kind of enforcement is valuable on storage config creation (and changes), but not on load (regardless of how we present the config flag itself).

@eric-maynard (Contributor Author) Mar 12, 2025

I agree that the behavior here with a limit set is far from ideal.

I think this will be more clear once we close the discussion on #1124, but in essence this code should never be hit unless the user has touched a flag that's described there as:

... These flags control subtle behavior adjustments and bug fixes, not user-facing catalog settings. They are intended for internal use only, are inherently unstable, ...

Going forward with these flags, I think we should consider that sometimes the behavior you get with the flag set to a non-default value is a poor or even broken experience. The only reason the problematic code (this entire check, in this case) is left in place is some perceived risk or potential regression associated with removing it entirely. Eventually, we will want to remove it entirely. #1124 says of this:

Flags here are generally short-lived and should either be removed or promoted to stable feature flags before the next release.

In accordance with this guidance, this entire check should go away before the next release, at which point I think the concern you raise above would be addressed. In the interim, we will be able to optionally switch back to the exact semantics in place before this change (the intent of a behavior-change flag). WDYT?

@dimas-b (Contributor) Mar 12, 2025

In this context, my concern is not about how configuration works or how configuration is described. My concern is with runtime behaviour in case the new config settings make storage config validation fail.

Whether the admin user had enough warning about risks is irrelevant, IMHO, because the storage config in question is present in the database. It was stored some time ago under different rules. I believe Polaris has to honour its persisted config more than transient validation settings, because Polaris' primary function as a catalog is to manifest its persisted data to clients in order for clients to gain access to tables. In this case the validation flags are not necessary to ensure correctness of operation. They are rather arbitrary limits from the perspective of accessing tables.

I can foresee how the new flag can cause a Polaris Server to fail to provide access to tables that used to be accessible before setting the flag. However, I do not see how this code change can help Polaris Server in dealing with configurations that have too many locations (for example).

If the concern is that having too many locations is detrimental to performance, then the user will have to adjust the configuration somehow, but in that case, what is the value of refusing to make the unadjusted storage config available for processing? That trades hypothetical performance problems for hard failures... unless I'm missing something :)

Contributor

Feature flags on top of validation config/flags sound like overkill to me... but I will not object to that :)

Contributor Author

Yeah that is the idea -- separate out actual features from things like this where we want to be able to "revert" easily.

Anyway, if you still think this check is problematic, I don't totally disagree, and we can try to move this to the API layer so there's no issue with deserializing a config that was created under a higher limit.

Contributor

It looks to me that the issues I mentioned earlier are still present in the latest state of this PR (unless I'm missing something).

Advocating for the OSS user of Polaris, I believe it is preferable to move this validation to the API layer.

I will not block this PR because of that, though :)

Contributor Author

Sounds good! I moved the check into the builder in CatalogEntity, which I think is the best narrow waist for this
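
As a rough sketch of the shape of that check (assumed names, not the exact CatalogEntity code merged here):

import java.util.List;

// Illustration only: validate the allowed-location count when the catalog entity is built,
// so an over-long storage config is rejected at creation time rather than at load time.
final class CatalogBuilderCheckSketch {

  static void checkAllowedLocations(List<String> allowedLocations, int maxAllowedLocations) {
    // Assumption for this sketch: a negative limit means "unbounded".
    if (maxAllowedLocations >= 0 && allowedLocations.size() > maxAllowedLocations) {
      throw new IllegalArgumentException(
          "Storage configuration declares " + allowedLocations.size()
              + " allowed locations, but at most " + maxAllowedLocations + " are permitted");
    }
  }
}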

Contributor

The latest validation approach LGTM :) thx! Posting a couple of minor comments separately.

@dimas-b (Contributor) left a comment

I'm OK merging without addressing my minor comments if you disagree with them :)

@eric-maynard eric-maynard enabled auto-merge (squash) March 13, 2025 17:44
@eric-maynard eric-maynard merged commit 21efdf6 into apache:main Mar 13, 2025
5 checks passed
rakesh-das08 pushed a commit to rakesh-das08/polaris that referenced this pull request Mar 22, 2025
travis-bowen pushed a commit to travis-bowen/polaris that referenced this pull request Jun 20, 2025
…rage config" (apache#26)

* b93e97b

* Adjustable limit on the number of locations per storage config (apache#1068)

* initial commit

* autolint

* change the config store access

* autolint

* add support for 01

* autolint

* fix test

* autolint

* retest

* rebase

* autolint

* change the config store access

* autolint

* add support for 01

* autolint

* fix test

* autolint

* retest

* fix a test

* autolint

* fix another test

* autolint

* remove catalog config for now as it's not used

* changes

* autolint

* update test to reflect -1 default

* autolint

* autolint

* move the check

* changes per review

* ready

* autolint

* spotless

---------

Co-authored-by: Eric Maynard <[email protected]>