-
Notifications
You must be signed in to change notification settings - Fork 333
Support Namespace/Table level RBAC for external passthrough catalogs #2223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -526,7 +526,12 @@ private void authorizeGrantOnTableLikeOperationOrThrow( | |||||||||||||
| PolarisResolvedPathWrapper tableLikeWrapper = | ||||||||||||||
| resolutionManifest.getResolvedPath( | ||||||||||||||
| identifier, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ANY_SUBTYPE, true); | ||||||||||||||
| if (!subTypes.contains(tableLikeWrapper.getRawLeafEntity().getSubType())) { | ||||||||||||||
| boolean rbacForFederatedCatalogsEnabled = | ||||||||||||||
| getCurrentPolarisContext() | ||||||||||||||
| .getRealmConfig() | ||||||||||||||
| .getConfig(FeatureConfiguration.ENABLE_SUB_CATALOG_RBAC_FOR_FEDERATED_CATALOGS); | ||||||||||||||
| if (!(resolutionManifest.getIsPassthroughFacade() && rbacForFederatedCatalogsEnabled) | ||||||||||||||
| && !subTypes.contains(tableLikeWrapper.getRawLeafEntity().getSubType())) { | ||||||||||||||
| CatalogHandler.throwNotFoundExceptionForTableLikeEntity(identifier, subTypes); | ||||||||||||||
| } | ||||||||||||||
|
|
||||||||||||||
|
|
@@ -1686,14 +1691,33 @@ public boolean grantPrivilegeOnNamespaceToRole( | |||||||||||||
| PolarisAuthorizableOperation.ADD_NAMESPACE_GRANT_TO_CATALOG_ROLE; | ||||||||||||||
| authorizeGrantOnNamespaceOperationOrThrow(op, catalogName, namespace, catalogRoleName); | ||||||||||||||
|
|
||||||||||||||
| CatalogEntity catalogEntity = | ||||||||||||||
| findCatalogByName(catalogName) | ||||||||||||||
| .orElseThrow(() -> new NotFoundException("Parent catalog %s not found", catalogName)); | ||||||||||||||
| PolarisEntity catalogRoleEntity = | ||||||||||||||
| findCatalogRoleByName(catalogName, catalogRoleName) | ||||||||||||||
| .orElseThrow(() -> new NotFoundException("CatalogRole %s not found", catalogRoleName)); | ||||||||||||||
|
|
||||||||||||||
| PolarisResolvedPathWrapper resolvedPathWrapper = resolutionManifest.getResolvedPath(namespace); | ||||||||||||||
| if (resolvedPathWrapper == null | ||||||||||||||
| || !resolvedPathWrapper.isFullyResolvedNamespace(catalogName, namespace)) { | ||||||||||||||
| throw new NotFoundException("Namespace %s not found", namespace); | ||||||||||||||
| boolean rbacForFederatedCatalogsEnabled = | ||||||||||||||
| getCurrentPolarisContext() | ||||||||||||||
| .getRealmConfig() | ||||||||||||||
| .getConfig(FeatureConfiguration.ENABLE_SUB_CATALOG_RBAC_FOR_FEDERATED_CATALOGS); | ||||||||||||||
| if (resolutionManifest.getIsPassthroughFacade() && rbacForFederatedCatalogsEnabled) { | ||||||||||||||
| resolvedPathWrapper = | ||||||||||||||
| createSyntheticNamespaceEntities(catalogEntity, namespace, resolvedPathWrapper); | ||||||||||||||
| if (resolvedPathWrapper == null | ||||||||||||||
| || !resolvedPathWrapper.isFullyResolvedNamespace(catalogName, namespace)) { | ||||||||||||||
| throw new RuntimeException( | ||||||||||||||
| String.format( | ||||||||||||||
| "Failed to create synthetic namespace entities for namespace %s in catalog %s", | ||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A plain
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there another exception you'd prefer? If so, I can change it.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The most appropriate one in this case is probably UnprocessableEntityException
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since currently we just create synthetic entities without performing a passthrough, the expectation for At this point, I am leaning towards 500 as this should actually be an illegalState after the backfill. UnprocessableEntityException as 422 may imply that repeating request will fail with the same error, but seems we cannot guarantee that here. WDYT?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah you're right, maybe not UnprocessableEntityException, I misremembered the meaning. 409 is also potentially accurate, just that it happens to be a server-side conflict where the caller doesn't necessarily need to do anything different in the request to retry it successfully. So we could return But it's also true that as long as we don't yet drop facade entities, it's an unexpected illegal state. Then the "raw" RuntimeException correctly gets mapped to Maybe for now we could just add a TODO here to possibly update the exception thrown as we refine the possible retry scenarios?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That sounds good, we will have to update the spec anyway later since the new error code we send back is not documented in the response code list.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A for the status code, 500 seems reasonable to me given the current implementation. I guess the point I was trying to make is that this exception is thrown based on a plain (boolean) condition that something is not available. Ideally the error (and its message) should be based on the real underlying error.... but I do not insist on addressing this in current PR. |
||||||||||||||
| namespace.toString(), catalogName)); | ||||||||||||||
| } | ||||||||||||||
| } else { | ||||||||||||||
| throw new NotFoundException("Namespace %s not found", namespace); | ||||||||||||||
| } | ||||||||||||||
| } | ||||||||||||||
| List<PolarisEntity> catalogPath = resolvedPathWrapper.getRawParentPath(); | ||||||||||||||
| PolarisEntity namespaceEntity = resolvedPathWrapper.getRawLeafEntity(); | ||||||||||||||
|
|
@@ -1737,6 +1761,80 @@ public boolean revokePrivilegeOnNamespaceFromRole( | |||||||||||||
| .isSuccess(); | ||||||||||||||
| } | ||||||||||||||
|
|
||||||||||||||
| /** | ||||||||||||||
| * Creates and persists the missing synthetic namespace entities for external catalogs. | ||||||||||||||
| * | ||||||||||||||
| * @param catalogEntity the external passthrough facade catalog entity. | ||||||||||||||
| * @param namespace the expected fully resolved namespace to be created. | ||||||||||||||
| * @param existingPath the partially resolved path currently stored in the metastore. | ||||||||||||||
| * @return the fully resolved path wrapper. | ||||||||||||||
| */ | ||||||||||||||
| private PolarisResolvedPathWrapper createSyntheticNamespaceEntities( | ||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [non-blocker] We may consider extract this and the table creation one out from |
||||||||||||||
| CatalogEntity catalogEntity, Namespace namespace, PolarisResolvedPathWrapper existingPath) { | ||||||||||||||
|
|
||||||||||||||
| if (existingPath == null) { | ||||||||||||||
| throw new IllegalStateException( | ||||||||||||||
| String.format("Catalog entity %s does not exist.", catalogEntity.getName())); | ||||||||||||||
| } | ||||||||||||||
|
|
||||||||||||||
| List<PolarisEntity> completePath = new ArrayList<>(existingPath.getRawFullPath()); | ||||||||||||||
| PolarisEntity currentParent = existingPath.getRawLeafEntity(); | ||||||||||||||
|
|
||||||||||||||
| String[] allNamespaceLevels = namespace.levels(); | ||||||||||||||
| int numMatchingLevels = 0; | ||||||||||||||
| // Find parts of the complete path that match the namespace levels. | ||||||||||||||
| // We skip index 0 because it is the CatalogEntity. | ||||||||||||||
| for (PolarisEntity entity : completePath.subList(1, completePath.size())) { | ||||||||||||||
| if (!entity.getName().equals(allNamespaceLevels[numMatchingLevels])) { | ||||||||||||||
| break; | ||||||||||||||
| } | ||||||||||||||
| numMatchingLevels++; | ||||||||||||||
| } | ||||||||||||||
|
|
||||||||||||||
| for (int i = numMatchingLevels; i < allNamespaceLevels.length; i++) { | ||||||||||||||
| String[] namespacePart = Arrays.copyOfRange(allNamespaceLevels, 0, i + 1); | ||||||||||||||
| String leafNamespace = namespacePart[namespacePart.length - 1]; | ||||||||||||||
| Namespace currentNamespace = Namespace.of(namespacePart); | ||||||||||||||
|
|
||||||||||||||
| // TODO: Instead of creating synthetic entitties, rely on external catalog mediated backfill. | ||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is the purpose of this TODO comment? Could you clarify?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe this refers to the Admittedly the wording is a bit confusing here; it's not so much changing the flow to some totally different "backfill" mechanism, but just changing whether we contact the remote catalog to check the existence/state of the implied namespace entity before creating it. Probably we should update the comment to say something like
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The proposed message update certainly sounds more clear :) |
||||||||||||||
| PolarisEntity syntheticNamespace = | ||||||||||||||
| new NamespaceEntity.Builder(currentNamespace) | ||||||||||||||
| .setId(metaStoreManager.generateNewEntityId(getCurrentPolarisContext()).getId()) | ||||||||||||||
| .setCatalogId(catalogEntity.getId()) | ||||||||||||||
| .setParentId(currentParent.getId()) | ||||||||||||||
| .setCreateTimestamp(System.currentTimeMillis()) | ||||||||||||||
| .build(); | ||||||||||||||
|
Comment on lines
+1800
to
+1806
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How can this
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should probably sort out the nuances of where the "synthetic" term fits in vs "passthrough facade" (synthetic is more just descriptive of the process by which the entity is formulated, but ultimately the "type" we're interested in identifying later is that the entity represents a "passthrough facade", even if its origins in the future may or may not be considered "synthetic" in the same way). To answer your question - right now the code basically defines all entities within an ExternalCatalog to be passthrough facades if they exist, so the identity is inherited from its parent catalog. I suppose we could also set a flag of some sort that explicitly identifies it as being a passthrough facade, but we might want to do some more design discussion about that. Individually identifying entities would mostly be important in a couple potential scenarios:
The basic intent of a Catalog's ConnectionConfig was to be resolved hierarchically the same way the StorageConfig is technically resolved hierarchically, so that if it's present on a parent namespace or on the table itself, the "nearest" definition wins. So one possibility is that we set the ConnectionConfig directly on the table to indicate that it's a passthrough facade. |
||||||||||||||
|
|
||||||||||||||
| EntityResult result = | ||||||||||||||
| metaStoreManager.createEntityIfNotExists( | ||||||||||||||
| getCurrentPolarisContext(), | ||||||||||||||
| PolarisEntity.toCoreList(completePath), | ||||||||||||||
| syntheticNamespace); | ||||||||||||||
|
|
||||||||||||||
| if (result.isSuccess()) { | ||||||||||||||
| syntheticNamespace = PolarisEntity.of(result.getEntity()); | ||||||||||||||
| } else { | ||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it would be better if we check the result status is |
||||||||||||||
| Namespace partialNamespace = Namespace.of(Arrays.copyOf(allNamespaceLevels, i + 1)); | ||||||||||||||
| PolarisResolvedPathWrapper partialPath = | ||||||||||||||
| resolutionManifest.getResolvedPath(partialNamespace); | ||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In case of nested namespace, this will fail at Lines 181 to 186 in 06d5231
since we never added the We could just use the full namespace as the identifier and get a newer version of the partial path. If the synthetic namespace entity still not appear in the leaf, we throw the error. If the new partial path contains even more child entities than expected (e.g. Try to create |
||||||||||||||
| PolarisEntity partialLeafEntity = partialPath.getRawLeafEntity(); | ||||||||||||||
| if (partialLeafEntity == null | ||||||||||||||
| || !(partialLeafEntity.getName().equals(leafNamespace) | ||||||||||||||
| && partialLeafEntity.getType() == PolarisEntityType.NAMESPACE)) { | ||||||||||||||
| throw new RuntimeException( | ||||||||||||||
| String.format( | ||||||||||||||
| "Failed to create or find namespace entity '%s' in federated catalog '%s'", | ||||||||||||||
| leafNamespace, catalogEntity.getName())); | ||||||||||||||
| } | ||||||||||||||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @snazy If the creation of the synthetic namespace fails due to a concurrent request, the else condition subsequently reads the concurrently created namespace. (The same goes for tables/views.) Shouldn't this alleviate your concerns? Can you please elaborate on the potential race condition here? CC: @dennishuo
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The Overall, this should be an atomic change - all or nothing, including that "ephemeral table/view". But that's not possible. I'm not convinced that this works w/ e.g. serializable isolation and in all cases.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @poojanilangekar maybe a naive question: the documentation of
And the code is not retrying that operation. Shouldn't it attempt it multiple times instead of using such if/else/if/else blocks?
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The "create-if-not-already-exists" logic to return the conflicting row fails with the serializable isolation level. That's the nature of that isolation level. Further, the "resolution manifest" holds the "local" state, but is never updated with "concurrent" changes.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah good point about the resolution manifest. I believe it's a bug - line 1819 should be calling Re: serializable isolation, given the fact that the remote catalog itself is fundamentally non-atomic with anything we see on the local Polaris (in particular, there's no atomic multi-table+multi-namespace GET in Iceberg REST anyways), we're fundamentally shooting for more of an "eventual consistency" semantic in federated catalogs. While concurrent jit-creation might be semi-common due to different workloads trying to hit the same not-yet-JIT-created entity at the same time, it'd be more rare that a JIT-created entity also then gets dropped/unlinked locally within the race condition followed by the attempt to re-read the conflicting entity. So if I understand the race condition identified here correctly, it seems fine to throw some kind of UnknownStateException in this case where the re-read is
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks @dennishuo for the detailed explanation!
+1, IMHO, the overall order might be
We could try to see if it worth to add some option to make the batch creation api,
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can fix line 1819. However, @HonahX, what you're suggesting is not something Polaris currently supports, as we don't have any backfill functionality yet. I think for the backfill to work, we need to add Passthrough Facade support for the federated catalog.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi @poojanilangekar, for For this one, using
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah my preference would be to go ahead with simply throwing the exception for now, doing the basic fix to call On the terminology of backfill here, my understanding was that the "backfill triggered" @HonahX is mentioning isn't describing any fundamentally new/different code flow, but is just talking about the JIT-creation here itself as the backfill. So I think we're in alignment, just describing the current PR's flow in slightly different terms. |
||||||||||||||
| syntheticNamespace = partialLeafEntity; | ||||||||||||||
| } | ||||||||||||||
| completePath.add(syntheticNamespace); | ||||||||||||||
| currentParent = syntheticNamespace; | ||||||||||||||
| } | ||||||||||||||
| PolarisResolvedPathWrapper resolvedPathWrapper = resolutionManifest.getResolvedPath(namespace); | ||||||||||||||
| return resolvedPathWrapper; | ||||||||||||||
| } | ||||||||||||||
|
|
||||||||||||||
| public boolean grantPrivilegeOnTableToRole( | ||||||||||||||
| String catalogName, | ||||||||||||||
| String catalogRoleName, | ||||||||||||||
|
|
@@ -2014,9 +2112,9 @@ private boolean grantPrivilegeOnTableLikeToRole( | |||||||||||||
| TableIdentifier identifier, | ||||||||||||||
| List<PolarisEntitySubType> subTypes, | ||||||||||||||
| PolarisPrivilege privilege) { | ||||||||||||||
| if (findCatalogByName(catalogName).isEmpty()) { | ||||||||||||||
| throw new NotFoundException("Parent catalog %s not found", catalogName); | ||||||||||||||
| } | ||||||||||||||
| CatalogEntity catalogEntity = | ||||||||||||||
| findCatalogByName(catalogName) | ||||||||||||||
| .orElseThrow(() -> new NotFoundException("Parent catalog %s not found", catalogName)); | ||||||||||||||
| PolarisEntity catalogRoleEntity = | ||||||||||||||
| findCatalogRoleByName(catalogName, catalogRoleName) | ||||||||||||||
| .orElseThrow(() -> new NotFoundException("CatalogRole %s not found", catalogRoleName)); | ||||||||||||||
|
|
@@ -2026,7 +2124,24 @@ private boolean grantPrivilegeOnTableLikeToRole( | |||||||||||||
| identifier, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ANY_SUBTYPE); | ||||||||||||||
| if (resolvedPathWrapper == null | ||||||||||||||
| || !subTypes.contains(resolvedPathWrapper.getRawLeafEntity().getSubType())) { | ||||||||||||||
| CatalogHandler.throwNotFoundExceptionForTableLikeEntity(identifier, subTypes); | ||||||||||||||
| boolean rbacForFederatedCatalogsEnabled = | ||||||||||||||
| getCurrentPolarisContext() | ||||||||||||||
| .getRealmConfig() | ||||||||||||||
| .getConfig(FeatureConfiguration.ENABLE_SUB_CATALOG_RBAC_FOR_FEDERATED_CATALOGS); | ||||||||||||||
| if (resolutionManifest.getIsPassthroughFacade() && rbacForFederatedCatalogsEnabled) { | ||||||||||||||
| resolvedPathWrapper = | ||||||||||||||
| createSyntheticTableLikeEntities( | ||||||||||||||
| catalogEntity, identifier, subTypes, resolvedPathWrapper); | ||||||||||||||
| if (resolvedPathWrapper == null | ||||||||||||||
| || !subTypes.contains(resolvedPathWrapper.getRawLeafEntity().getSubType())) { | ||||||||||||||
| throw new RuntimeException( | ||||||||||||||
| String.format( | ||||||||||||||
| "Failed to create synthetic table-like entity for table %s in catalog %s", | ||||||||||||||
| identifier.name(), catalogEntity.getName())); | ||||||||||||||
| } | ||||||||||||||
| } else { | ||||||||||||||
| CatalogHandler.throwNotFoundExceptionForTableLikeEntity(identifier, subTypes); | ||||||||||||||
| } | ||||||||||||||
| } | ||||||||||||||
| List<PolarisEntity> catalogPath = resolvedPathWrapper.getRawParentPath(); | ||||||||||||||
| PolarisEntity tableLikeEntity = resolvedPathWrapper.getRawLeafEntity(); | ||||||||||||||
|
|
@@ -2041,6 +2156,77 @@ private boolean grantPrivilegeOnTableLikeToRole( | |||||||||||||
| .isSuccess(); | ||||||||||||||
| } | ||||||||||||||
|
|
||||||||||||||
| /** | ||||||||||||||
| * Creates and persists the missing synthetic table-like entity and its parent namespace entities | ||||||||||||||
| * for external catalogs. | ||||||||||||||
| * | ||||||||||||||
| * @param catalogEntity the external passthrough facade catalog entity. | ||||||||||||||
| * @param identifier the path of the table-like entity(including the namespace). | ||||||||||||||
| * @param subTypes the expected subtypes of the table-like entity | ||||||||||||||
| * @param existingPathWrapper the partially resolved path currently stored in the metastore. | ||||||||||||||
| * @return the resolved path wrapper | ||||||||||||||
| */ | ||||||||||||||
| private PolarisResolvedPathWrapper createSyntheticTableLikeEntities( | ||||||||||||||
| CatalogEntity catalogEntity, | ||||||||||||||
| TableIdentifier identifier, | ||||||||||||||
| List<PolarisEntitySubType> subTypes, | ||||||||||||||
| PolarisResolvedPathWrapper existingPathWrapper) { | ||||||||||||||
|
|
||||||||||||||
| Namespace namespace = identifier.namespace(); | ||||||||||||||
| PolarisResolvedPathWrapper resolvedNamespacePathWrapper = | ||||||||||||||
| !namespace.isEmpty() | ||||||||||||||
| ? createSyntheticNamespaceEntities(catalogEntity, namespace, existingPathWrapper) | ||||||||||||||
| : existingPathWrapper; | ||||||||||||||
|
|
||||||||||||||
| if (resolvedNamespacePathWrapper == null | ||||||||||||||
| || (!namespace.isEmpty() | ||||||||||||||
| && !resolvedNamespacePathWrapper.isFullyResolvedNamespace( | ||||||||||||||
| catalogEntity.getName(), namespace))) { | ||||||||||||||
| throw new RuntimeException( | ||||||||||||||
| String.format( | ||||||||||||||
| "Failed to create synthetic namespace entities for namespace %s in catalog %s", | ||||||||||||||
| namespace.toString(), catalogEntity.getName())); | ||||||||||||||
| } | ||||||||||||||
|
|
||||||||||||||
| PolarisEntity parentNamespaceEntity = resolvedNamespacePathWrapper.getRawLeafEntity(); | ||||||||||||||
|
|
||||||||||||||
| // TODO: Once we support GENERIC_TABLE federation, select the intended type depending on the | ||||||||||||||
| // callsite; if it is instantiated via an Iceberg RESTCatalog factory or a different factory | ||||||||||||||
| // for GenericCatalogs. | ||||||||||||||
| PolarisEntitySubType syntheticEntitySubType = selectEntitySubType(subTypes); | ||||||||||||||
|
|
||||||||||||||
| // TODO: Instead of creating a synthetic table-like entity, rely on external catalog mediated | ||||||||||||||
| // backfill and use the metadata location from the external catalog. | ||||||||||||||
| PolarisEntity syntheticTableEntity = | ||||||||||||||
| new IcebergTableLikeEntity.Builder(identifier, "") | ||||||||||||||
| .setId(metaStoreManager.generateNewEntityId(getCurrentPolarisContext()).getId()) | ||||||||||||||
| .setCatalogId(parentNamespaceEntity.getCatalogId()) | ||||||||||||||
| .setSubType(syntheticEntitySubType) | ||||||||||||||
| .setCreateTimestamp(System.currentTimeMillis()) | ||||||||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Need setParentId here too to make it resolvable later |
||||||||||||||
| .build(); | ||||||||||||||
|
|
||||||||||||||
| EntityResult result = | ||||||||||||||
| metaStoreManager.createEntityIfNotExists( | ||||||||||||||
| getCurrentPolarisContext(), | ||||||||||||||
| PolarisEntity.toCoreList(resolvedNamespacePathWrapper.getRawFullPath()), | ||||||||||||||
| syntheticTableEntity); | ||||||||||||||
|
|
||||||||||||||
| if (result.isSuccess()) { | ||||||||||||||
| syntheticTableEntity = PolarisEntity.of(result.getEntity()); | ||||||||||||||
| } else { | ||||||||||||||
| PolarisResolvedPathWrapper tablePathWrapper = resolutionManifest.getResolvedPath(identifier); | ||||||||||||||
| PolarisEntity leafEntity = | ||||||||||||||
| tablePathWrapper != null ? tablePathWrapper.getRawLeafEntity() : null; | ||||||||||||||
| if (leafEntity == null || !subTypes.contains(leafEntity.getSubType())) { | ||||||||||||||
| throw new RuntimeException( | ||||||||||||||
| String.format( | ||||||||||||||
| "Failed to create or find table entity '%s' in federated catalog '%s'", | ||||||||||||||
| identifier.name(), catalogEntity.getName())); | ||||||||||||||
| } | ||||||||||||||
| } | ||||||||||||||
| return resolutionManifest.getResolvedPath(identifier); | ||||||||||||||
| } | ||||||||||||||
|
|
||||||||||||||
| /** | ||||||||||||||
| * Removes a table-level or view-level grant on {@code identifier} from {@code catalogRoleName}. | ||||||||||||||
| */ | ||||||||||||||
|
|
@@ -2136,4 +2322,25 @@ private boolean revokePrivilegeOnPolicyEntityFromRole( | |||||||||||||
| privilege) | ||||||||||||||
| .isSuccess(); | ||||||||||||||
| } | ||||||||||||||
|
|
||||||||||||||
| /** | ||||||||||||||
| * Selects the appropriate entity subtype for synthetic entities in external catalogs. | ||||||||||||||
| * | ||||||||||||||
| * @param subTypes list of candidate subtypes | ||||||||||||||
| * @return the selected subtype for the synthetic entity | ||||||||||||||
| * @throws IllegalStateException if no supported subtype is found | ||||||||||||||
| */ | ||||||||||||||
| private static PolarisEntitySubType selectEntitySubType(List<PolarisEntitySubType> subTypes) { | ||||||||||||||
| if (subTypes.contains(PolarisEntitySubType.ICEBERG_TABLE)) { | ||||||||||||||
| return PolarisEntitySubType.ICEBERG_TABLE; | ||||||||||||||
| } else if (subTypes.contains(PolarisEntitySubType.ICEBERG_VIEW)) { | ||||||||||||||
| return PolarisEntitySubType.ICEBERG_VIEW; | ||||||||||||||
| } else { | ||||||||||||||
| throw new IllegalStateException( | ||||||||||||||
| String.format( | ||||||||||||||
| "No supported subtype found in %s. Only ICEBERG_TABLE and ICEBERG_VIEW are" | ||||||||||||||
| + " supported for synthetic entities in external catalogs.", | ||||||||||||||
| subTypes)); | ||||||||||||||
| } | ||||||||||||||
| } | ||||||||||||||
| } | ||||||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it might be preferable to have a separate API for filling in synthetic namespaces rather that to do that implicitly during grants.
Namespaces are first-class entities in Polaris. They may be used / seen / affect other call paths, not just grants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please elaborate on your suggestion? Does that mean we manage two different types of path resolution, one for regular namespaces and if the regular namespace fails, we look for synthetic namespaces? Or do you suggest we add some sort of metadata to the entity to indicate that it's synthetic?
Additionally, what happens when we backfill information from the source of truth catalog? Do we still want it to be synthetic or does it change the type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relating to my comment in the other file #2223 (comment) we probably don't need to necessarily care about the synthetic nature of the entity so much as its role as a "Passthrough Facade".
@dimas-b I agree we should extract common "backfill/JIT-creation" logic, possibly as a fast-follow or we can do it inline here if we really want as well, but it would be "in order to" use that common logic for the implicit backfill during grants. rather than precluding that backfill.
It would then also be shared for implicit backfill during loadTable (could be configured as an option, if the serving of the loadTable doesn't strictly require the entity to be persisted).
This design tradeoff of whether to JIT-create automatically vs requiring the separate API is discussed a bit here: https://docs.google.com/document/d/1Q6eEytxb0btpOPcL8RtkULskOlYUCo_3FLvFRnHkzBY/edit?tab=t.0#bookmark=id.cf85r4rp8tz -
Passthrough Facade with JIT-Creation instead of requiring upfront facade creationI suppose we could ultimately also make it a configurable option if we expose an actual REST API endpoint for materializing the facade entity, but historically the requirement of upfront facade creation has demonstrated to be a barrier to adoption.
I'd say these JIT-created facade entities are still first-class entities as intended, just that they provide the semantic of the Passthrough Facade rather than a source-of-truth entity. In general a Passthrough Facade would mean:
loadTablemay use the etag to efficiently determine that the local copy is already fresh, and thus serve the response directly from the local copyThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify my first comment: I meant having an API to create the synthetic entries that is separate from the grant API. Essentially from the client side the flow will have two stages:
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not have a strong objection against on demand creation of synthetic entities, but in that case I think the
rbacForFederatedCatalogsEnabledflag should be configurable per catalog (not per realm).... if it's not too hard to implement.From the perspective of a self-hosted Polaris, admin users may want to manage this behaviour separately in different catalogs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that definitely seems reasonable. We might want a way to convey allowing a service owner to force a realm-level setting (at least if the service owner wants to force not JIT-creating entities because they don't want to use up lots of database storage), but generally it seems convenient to be able to allow per-catalog settings.
That matches what I imagined as well, which would be that per-catalog owners may eventually choose to even update the passthrough or caching or preemptive-sync/migration settings over the lifetime of a catalog, maybe starting with keeping things as fully "implicit/passthrough" as much as possible then enabling JIT-creation once they decide they want more advanced caching/migration features.