Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -253,6 +253,15 @@ public static void enforceFeatureEnabledOrThrow(
.defaultValue(false)
.buildFeatureConfiguration();

public static final FeatureConfiguration<Boolean> ENABLE_SUB_CATALOG_RBAC_FOR_FEDERATED_CATALOGS =
PolarisConfiguration.<Boolean>builder()
.key("ENABLE_SUB_CATALOG_RBAC_FOR_FEDERATED_CATALOGS")
.description(
"When enabled, allows RBAC operations to create synthetic entities for"
+ " entities in federated catalogs that don't exist in the local metastore.")
.defaultValue(false)
.buildFeatureConfiguration();

public static final FeatureConfiguration<Boolean> ENABLE_POLICY_STORE =
PolarisConfiguration.<Boolean>builder()
.key("ENABLE_POLICY_STORE")
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -526,7 +526,12 @@ private void authorizeGrantOnTableLikeOperationOrThrow(
PolarisResolvedPathWrapper tableLikeWrapper =
resolutionManifest.getResolvedPath(
identifier, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ANY_SUBTYPE, true);
if (!subTypes.contains(tableLikeWrapper.getRawLeafEntity().getSubType())) {
boolean rbacForFederatedCatalogsEnabled =
getCurrentPolarisContext()
.getRealmConfig()
.getConfig(FeatureConfiguration.ENABLE_SUB_CATALOG_RBAC_FOR_FEDERATED_CATALOGS);
if (!(resolutionManifest.getIsPassthroughFacade() && rbacForFederatedCatalogsEnabled)
&& !subTypes.contains(tableLikeWrapper.getRawLeafEntity().getSubType())) {
CatalogHandler.throwNotFoundExceptionForTableLikeEntity(identifier, subTypes);
}

Expand Down Expand Up @@ -1686,14 +1691,33 @@ public boolean grantPrivilegeOnNamespaceToRole(
PolarisAuthorizableOperation.ADD_NAMESPACE_GRANT_TO_CATALOG_ROLE;
authorizeGrantOnNamespaceOperationOrThrow(op, catalogName, namespace, catalogRoleName);

CatalogEntity catalogEntity =
findCatalogByName(catalogName)
.orElseThrow(() -> new NotFoundException("Parent catalog %s not found", catalogName));
PolarisEntity catalogRoleEntity =
findCatalogRoleByName(catalogName, catalogRoleName)
.orElseThrow(() -> new NotFoundException("CatalogRole %s not found", catalogRoleName));

PolarisResolvedPathWrapper resolvedPathWrapper = resolutionManifest.getResolvedPath(namespace);
if (resolvedPathWrapper == null
|| !resolvedPathWrapper.isFullyResolvedNamespace(catalogName, namespace)) {
throw new NotFoundException("Namespace %s not found", namespace);
boolean rbacForFederatedCatalogsEnabled =
getCurrentPolarisContext()
.getRealmConfig()
.getConfig(FeatureConfiguration.ENABLE_SUB_CATALOG_RBAC_FOR_FEDERATED_CATALOGS);
if (resolutionManifest.getIsPassthroughFacade() && rbacForFederatedCatalogsEnabled) {
resolvedPathWrapper =
createSyntheticNamespaceEntities(catalogEntity, namespace, resolvedPathWrapper);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it might be preferable to have a separate API for filling in synthetic namespaces rather that to do that implicitly during grants.

Namespaces are first-class entities in Polaris. They may be used / seen / affect other call paths, not just grants.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please elaborate on your suggestion? Does that mean we manage two different types of path resolution, one for regular namespaces and if the regular namespace fails, we look for synthetic namespaces? Or do you suggest we add some sort of metadata to the entity to indicate that it's synthetic?

Additionally, what happens when we backfill information from the source of truth catalog? Do we still want it to be synthetic or does it change the type?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relating to my comment in the other file #2223 (comment) we probably don't need to necessarily care about the synthetic nature of the entity so much as its role as a "Passthrough Facade".

@dimas-b I agree we should extract common "backfill/JIT-creation" logic, possibly as a fast-follow or we can do it inline here if we really want as well, but it would be "in order to" use that common logic for the implicit backfill during grants. rather than precluding that backfill.

It would then also be shared for implicit backfill during loadTable (could be configured as an option, if the serving of the loadTable doesn't strictly require the entity to be persisted).

This design tradeoff of whether to JIT-create automatically vs requiring the separate API is discussed a bit here: https://docs.google.com/document/d/1Q6eEytxb0btpOPcL8RtkULskOlYUCo_3FLvFRnHkzBY/edit?tab=t.0#bookmark=id.cf85r4rp8tz - Passthrough Facade with JIT-Creation instead of requiring upfront facade creation

I suppose we could ultimately also make it a configurable option if we expose an actual REST API endpoint for materializing the facade entity, but historically the requirement of upfront facade creation has demonstrated to be a barrier to adoption.

I'd say these JIT-created facade entities are still first-class entities as intended, just that they provide the semantic of the Passthrough Facade rather than a source-of-truth entity. In general a Passthrough Facade would mean:

  1. Listing the parent doesn't get to short-circuit with passthrough facades -- "list" is strictly defined by the remote catalog
  2. GET on the entity represented by the facade still requires some readthrough to the remote catalog, but the contents of a facade entity will augment or optimize the response. For example:
    • Cached metadata representation - a facade may have a local copy of the full TableMetadata JSON along with the etag associated with it from the remote - the passthrough loadTable may use the etag to efficiently determine that the local copy is already fresh, and thus serve the response directly from the local copy
    • Grants semantics - as this PR adds, the facade entity serves as the relational entity within the Polaris RBAC grants system
    • Credential vending - In upcoming related work outlined in the addendum https://docs.google.com/document/d/19Vm0Uy-EyEYtgd2-bZpPICODOB3rYQJvHeqL1ODOOLc/edit?tab=t.0 the local TableMetadata of the facade entity allows us to safely reuse credential-subscoping logic which takes in the logical table data/metadata read/write paths as part of the subscoping policy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify my first comment: I meant having an API to create the synthetic entries that is separate from the grant API. Essentially from the client side the flow will have two stages:

  1. Decide what synthetic namespace are required and create them (new API)
  2. Grant (old API)

Copy link
Contributor

@dimas-b dimas-b Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not have a strong objection against on demand creation of synthetic entities, but in that case I think the rbacForFederatedCatalogsEnabled flag should be configurable per catalog (not per realm).... if it's not too hard to implement.

From the perspective of a self-hosted Polaris, admin users may want to manage this behaviour separately in different catalogs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that definitely seems reasonable. We might want a way to convey allowing a service owner to force a realm-level setting (at least if the service owner wants to force not JIT-creating entities because they don't want to use up lots of database storage), but generally it seems convenient to be able to allow per-catalog settings.

That matches what I imagined as well, which would be that per-catalog owners may eventually choose to even update the passthrough or caching or preemptive-sync/migration settings over the lifetime of a catalog, maybe starting with keeping things as fully "implicit/passthrough" as much as possible then enabling JIT-creation once they decide they want more advanced caching/migration features.

if (resolvedPathWrapper == null
|| !resolvedPathWrapper.isFullyResolvedNamespace(catalogName, namespace)) {
throw new RuntimeException(
String.format(
"Failed to create synthetic namespace entities for namespace %s in catalog %s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A plain RuntimeException will probably show up as a 500 response at the REST API level... Is that intended?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there another exception you'd prefer? If so, I can change it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most appropriate one in this case is probably UnprocessableEntityException

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since currently we just create synthetic entities without performing a passthrough, the expectation for createSyntheticNamespaceEntities is to create all synthetic entities or make sure they all exist. If something like CommitStateUnknown (https://github.com/apache/polaris/pull/2223/files#r2366658329) happens, it should be thrown inside the createSyntheticNamespaceEntities.

At this point, I am leaning towards 500 as this should actually be an illegalState after the backfill. UnprocessableEntityException as 422 may imply that repeating request will fail with the same error, but seems we cannot guarantee that here. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you're right, maybe not UnprocessableEntityException, I misremembered the meaning.

409 is also potentially accurate, just that it happens to be a server-side conflict where the caller doesn't necessarily need to do anything different in the request to retry it successfully. So we could return CommitFailedException.

But it's also true that as long as we don't yet drop facade entities, it's an unexpected illegal state. Then the "raw" RuntimeException correctly gets mapped to INTERNAL_SERVER_ERROR.

Maybe for now we could just add a TODO here to possibly update the exception thrown as we refine the possible retry scenarios?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good, we will have to update the spec anyway later since the new error code we send back is not documented in the response code list.

Copy link
Contributor

@dimas-b dimas-b Sep 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A for the status code, 500 seems reasonable to me given the current implementation.

I guess the point I was trying to make is that this exception is thrown based on a plain (boolean) condition that something is not available. Ideally the error (and its message) should be based on the real underlying error.... but I do not insist on addressing this in current PR.

namespace.toString(), catalogName));
}
} else {
throw new NotFoundException("Namespace %s not found", namespace);
}
}
List<PolarisEntity> catalogPath = resolvedPathWrapper.getRawParentPath();
PolarisEntity namespaceEntity = resolvedPathWrapper.getRawLeafEntity();
Expand Down Expand Up @@ -1737,6 +1761,80 @@ public boolean revokePrivilegeOnNamespaceFromRole(
.isSuccess();
}

/**
* Creates and persists the missing synthetic namespace entities for external catalogs.
*
* @param catalogEntity the external passthrough facade catalog entity.
* @param namespace the expected fully resolved namespace to be created.
* @param existingPath the partially resolved path currently stored in the metastore.
* @return the fully resolved path wrapper.
*/
private PolarisResolvedPathWrapper createSyntheticNamespaceEntities(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[non-blocker] We may consider extract this and the table creation one out from PolarisAdminService to a separate class, or within the ResolutionManifest, to make PolarisAdminService focusing on the admin logics. But we could do any big refactoring like this in follow-ups

CatalogEntity catalogEntity, Namespace namespace, PolarisResolvedPathWrapper existingPath) {

if (existingPath == null) {
throw new IllegalStateException(
String.format("Catalog entity %s does not exist.", catalogEntity.getName()));
}

List<PolarisEntity> completePath = new ArrayList<>(existingPath.getRawFullPath());
PolarisEntity currentParent = existingPath.getRawLeafEntity();

String[] allNamespaceLevels = namespace.levels();
int numMatchingLevels = 0;
// Find parts of the complete path that match the namespace levels.
// We skip index 0 because it is the CatalogEntity.
for (PolarisEntity entity : completePath.subList(1, completePath.size())) {
if (!entity.getName().equals(allNamespaceLevels[numMatchingLevels])) {
break;
}
numMatchingLevels++;
}

for (int i = numMatchingLevels; i < allNamespaceLevels.length; i++) {
String[] namespacePart = Arrays.copyOfRange(allNamespaceLevels, 0, i + 1);
String leafNamespace = namespacePart[namespacePart.length - 1];
Namespace currentNamespace = Namespace.of(namespacePart);

// TODO: Instead of creating synthetic entitties, rely on external catalog mediated backfill.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this TODO comment? Could you clarify?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this refers to the Require actually fetching a successful response from the remote catalog before producing the synthetic passthrough facade entity portion of https://docs.google.com/document/d/19Vm0Uy-EyEYtgd2-bZpPICODOB3rYQJvHeqL1ODOOLc/edit?tab=t.0#bookmark=id.7cguk2qz0p88

Admittedly the wording is a bit confusing here; it's not so much changing the flow to some totally different "backfill" mechanism, but just changing whether we contact the remote catalog to check the existence/state of the implied namespace entity before creating it.

Probably we should update the comment to say something like TODO: Instead of blindly creating the passthrough-facade entity, standardize onto some kind of "createOrUpdatePassthroughFacadeEntityBasedOnRemoteCatalog" helper.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed message update certainly sounds more clear :)

PolarisEntity syntheticNamespace =
new NamespaceEntity.Builder(currentNamespace)
.setId(metaStoreManager.generateNewEntityId(getCurrentPolarisContext()).getId())
.setCatalogId(catalogEntity.getId())
.setParentId(currentParent.getId())
.setCreateTimestamp(System.currentTimeMillis())
.build();
Comment on lines +1800 to +1806
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can this Namespace be identified as a "synthetic" namespace later?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably sort out the nuances of where the "synthetic" term fits in vs "passthrough facade" (synthetic is more just descriptive of the process by which the entity is formulated, but ultimately the "type" we're interested in identifying later is that the entity represents a "passthrough facade", even if its origins in the future may or may not be considered "synthetic" in the same way).

To answer your question - right now the code basically defines all entities within an ExternalCatalog to be passthrough facades if they exist, so the identity is inherited from its parent catalog. I suppose we could also set a flag of some sort that explicitly identifies it as being a passthrough facade, but we might want to do some more design discussion about that. Individually identifying entities would mostly be important in a couple potential scenarios:

  1. If we support "mixed-mode" catalogs where some entities are natively owned in Polaris while others are basically "symlinks" to remote tables. Then we presumably need to know if any given table is local or remote. By default, we probably don't want to support this use case unless we have a really compelling reason.
  2. A variation of (1) would be if we support gradual "migration" scenarios. In such a case, we want the ability to incrementally transfer individual tables into having the local Polaris server be the source of truth, so it's also similar to a "mixed-mode" catalog, except the key difference is that we'd have much more constrained intermediate states, and all the tables "originally" came from the same remote catalog.

The basic intent of a Catalog's ConnectionConfig was to be resolved hierarchically the same way the StorageConfig is technically resolved hierarchically, so that if it's present on a parent namespace or on the table itself, the "nearest" definition wins. So one possibility is that we set the ConnectionConfig directly on the table to indicate that it's a passthrough facade.


EntityResult result =
metaStoreManager.createEntityIfNotExists(
getCurrentPolarisContext(),
PolarisEntity.toCoreList(completePath),
syntheticNamespace);

if (result.isSuccess()) {
syntheticNamespace = PolarisEntity.of(result.getEntity());
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better if we check the result status is ENTITY_ALREADY_EXISTS before entering the getPassthroughResolvedPath below. Though ENTITY_ALREADY_EXISTS is the only error status now, there could be more in the future and we also have a general UNEXPECTED_ERROR status, where we do not want to proceed but just throw an exception.

Namespace partialNamespace = Namespace.of(Arrays.copyOf(allNamespaceLevels, i + 1));
PolarisResolvedPathWrapper partialPath =
resolutionManifest.getResolvedPath(partialNamespace);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of nested namespace, this will fail at

diagnostics.check(
passthroughPaths.containsKey(key),
"invalid_key_for_passthrough_resolved_path",
"key={} passthroughPaths={}",
key,
passthroughPaths);

since we never added the partialNamespace key to the (passthrough) paths. Only the full namespace in the path.

We could just use the full namespace as the identifier and get a newer version of the partial path. If the synthetic namespace entity still not appear in the leaf, we throw the error. If the new partial path contains even more child entities than expected (e.g. Try to create ns1.ns1a.ns1aa, expect ns1.ns1a but get ns1.ns1a.ns1aa), we could either validate and reconcile or throw an error.

PolarisEntity partialLeafEntity = partialPath.getRawLeafEntity();
if (partialLeafEntity == null
|| !(partialLeafEntity.getName().equals(leafNamespace)
&& partialLeafEntity.getType() == PolarisEntityType.NAMESPACE)) {
throw new RuntimeException(
String.format(
"Failed to create or find namespace entity '%s' in federated catalog '%s'",
leafNamespace, catalogEntity.getName()));
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snazy If the creation of the synthetic namespace fails due to a concurrent request, the else condition subsequently reads the concurrently created namespace. (The same goes for tables/views.) Shouldn't this alleviate your concerns? Can you please elaborate on the potential race condition here?

CC: @dennishuo

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The resolutionManifest doesn't know that, no?

Overall, this should be an atomic change - all or nothing, including that "ephemeral table/view". But that's not possible.

I'm not convinced that this works w/ e.g. serializable isolation and in all cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@poojanilangekar maybe a naive question: the documentation of PolarisMetaStoreManager::createEntityIfNotExists says:

Returns the newly created entity. If this entity was already created, we will simply return the already created entity. We will return null if a different entity with the same name exists or if the catalogPath couldn't be resolved. If null is returned, the client app should retry this operation.

And the code is not retrying that operation. Shouldn't it attempt it multiple times instead of using such if/else/if/else blocks?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "create-if-not-already-exists" logic to return the conflicting row fails with the serializable isolation level. That's the nature of that isolation level.

Further, the "resolution manifest" holds the "local" state, but is never updated with "concurrent" changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good point about the resolution manifest. I believe it's a bug - line 1819 should be calling getPassthroughResolvedPath instead of getResolvedPath.

Re: serializable isolation, given the fact that the remote catalog itself is fundamentally non-atomic with anything we see on the local Polaris (in particular, there's no atomic multi-table+multi-namespace GET in Iceberg REST anyways), we're fundamentally shooting for more of an "eventual consistency" semantic in federated catalogs.

While concurrent jit-creation might be semi-common due to different workloads trying to hit the same not-yet-JIT-created entity at the same time, it'd be more rare that a JIT-created entity also then gets dropped/unlinked locally within the race condition followed by the attempt to re-read the conflicting entity.

So if I understand the race condition identified here correctly, it seems fine to throw some kind of UnknownStateException in this case where the re-read is null and have the caller figure out if they just want to retry or figure out the state manually first.

Copy link
Contributor

@HonahX HonahX Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dennishuo for the detailed explanation!

I believe it's a bug - line 1819 should be calling getPassthroughResolvedPath

+1, IMHO, the overall order might be

  • resolutionManifest tries to resolve the path
  • some or all JIT entities not found, backfill triggered.
  • resolutionManifest getPassthroughPath to resolve the path again
  • Proceed if the whole path resolved, fail with UnkownStateException/ConcurrentModification... if not

We could try to see if it worth to add some option to make the batch creation api, createEntitiesIfNotExist not fail on EntityAlreadyExistException but simply return the entity if it is already there, this way we could push down the actual creation into the persistence layer instead of in the for-loop. But this may need broader discussion as this is persistence API change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can fix line 1819.

However, @HonahX, what you're suggesting is not something Polaris currently supports, as we don't have any backfill functionality yet. I think for the backfill to work, we need to add Passthrough Facade support for the federated catalog.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @poojanilangekar, for backfill I mean creating JIT dummy entities like what the current PR do. But yes, the remaining persistence layer changes are not yet supported in Polaris and also does not need to be in this PR.

For this one, using getPassthroughResolvedPath to get the fresh entity should be enough : )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah my preference would be to go ahead with simply throwing the exception for now, doing the basic fix to call getPassthroughResolvedPath, merging and then iterate on improvements afterwards.

On the terminology of backfill here, my understanding was that the "backfill triggered" @HonahX is mentioning isn't describing any fundamentally new/different code flow, but is just talking about the JIT-creation here itself as the backfill. So I think we're in alignment, just describing the current PR's flow in slightly different terms.

syntheticNamespace = partialLeafEntity;
}
completePath.add(syntheticNamespace);
currentParent = syntheticNamespace;
}
PolarisResolvedPathWrapper resolvedPathWrapper = resolutionManifest.getResolvedPath(namespace);
return resolvedPathWrapper;
}

public boolean grantPrivilegeOnTableToRole(
String catalogName,
String catalogRoleName,
Expand Down Expand Up @@ -2014,9 +2112,9 @@ private boolean grantPrivilegeOnTableLikeToRole(
TableIdentifier identifier,
List<PolarisEntitySubType> subTypes,
PolarisPrivilege privilege) {
if (findCatalogByName(catalogName).isEmpty()) {
throw new NotFoundException("Parent catalog %s not found", catalogName);
}
CatalogEntity catalogEntity =
findCatalogByName(catalogName)
.orElseThrow(() -> new NotFoundException("Parent catalog %s not found", catalogName));
PolarisEntity catalogRoleEntity =
findCatalogRoleByName(catalogName, catalogRoleName)
.orElseThrow(() -> new NotFoundException("CatalogRole %s not found", catalogRoleName));
Expand All @@ -2026,7 +2124,24 @@ private boolean grantPrivilegeOnTableLikeToRole(
identifier, PolarisEntityType.TABLE_LIKE, PolarisEntitySubType.ANY_SUBTYPE);
if (resolvedPathWrapper == null
|| !subTypes.contains(resolvedPathWrapper.getRawLeafEntity().getSubType())) {
CatalogHandler.throwNotFoundExceptionForTableLikeEntity(identifier, subTypes);
boolean rbacForFederatedCatalogsEnabled =
getCurrentPolarisContext()
.getRealmConfig()
.getConfig(FeatureConfiguration.ENABLE_SUB_CATALOG_RBAC_FOR_FEDERATED_CATALOGS);
if (resolutionManifest.getIsPassthroughFacade() && rbacForFederatedCatalogsEnabled) {
resolvedPathWrapper =
createSyntheticTableLikeEntities(
catalogEntity, identifier, subTypes, resolvedPathWrapper);
if (resolvedPathWrapper == null
|| !subTypes.contains(resolvedPathWrapper.getRawLeafEntity().getSubType())) {
throw new RuntimeException(
String.format(
"Failed to create synthetic table-like entity for table %s in catalog %s",
identifier.name(), catalogEntity.getName()));
}
} else {
CatalogHandler.throwNotFoundExceptionForTableLikeEntity(identifier, subTypes);
}
}
List<PolarisEntity> catalogPath = resolvedPathWrapper.getRawParentPath();
PolarisEntity tableLikeEntity = resolvedPathWrapper.getRawLeafEntity();
Expand All @@ -2041,6 +2156,77 @@ private boolean grantPrivilegeOnTableLikeToRole(
.isSuccess();
}

/**
* Creates and persists the missing synthetic table-like entity and its parent namespace entities
* for external catalogs.
*
* @param catalogEntity the external passthrough facade catalog entity.
* @param identifier the path of the table-like entity(including the namespace).
* @param subTypes the expected subtypes of the table-like entity
* @param existingPathWrapper the partially resolved path currently stored in the metastore.
* @return the resolved path wrapper
*/
private PolarisResolvedPathWrapper createSyntheticTableLikeEntities(
CatalogEntity catalogEntity,
TableIdentifier identifier,
List<PolarisEntitySubType> subTypes,
PolarisResolvedPathWrapper existingPathWrapper) {

Namespace namespace = identifier.namespace();
PolarisResolvedPathWrapper resolvedNamespacePathWrapper =
!namespace.isEmpty()
? createSyntheticNamespaceEntities(catalogEntity, namespace, existingPathWrapper)
: existingPathWrapper;

if (resolvedNamespacePathWrapper == null
|| (!namespace.isEmpty()
&& !resolvedNamespacePathWrapper.isFullyResolvedNamespace(
catalogEntity.getName(), namespace))) {
throw new RuntimeException(
String.format(
"Failed to create synthetic namespace entities for namespace %s in catalog %s",
namespace.toString(), catalogEntity.getName()));
}

PolarisEntity parentNamespaceEntity = resolvedNamespacePathWrapper.getRawLeafEntity();

// TODO: Once we support GENERIC_TABLE federation, select the intended type depending on the
// callsite; if it is instantiated via an Iceberg RESTCatalog factory or a different factory
// for GenericCatalogs.
PolarisEntitySubType syntheticEntitySubType = selectEntitySubType(subTypes);

// TODO: Instead of creating a synthetic table-like entity, rely on external catalog mediated
// backfill and use the metadata location from the external catalog.
PolarisEntity syntheticTableEntity =
new IcebergTableLikeEntity.Builder(identifier, "")
.setId(metaStoreManager.generateNewEntityId(getCurrentPolarisContext()).getId())
.setCatalogId(parentNamespaceEntity.getCatalogId())
.setSubType(syntheticEntitySubType)
.setCreateTimestamp(System.currentTimeMillis())
Copy link
Contributor

@HonahX HonahX Sep 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.setCreateTimestamp(System.currentTimeMillis())
.setCreateTimestamp(System.currentTimeMillis())
.setParentId(parentNamespaceEntity.getId())

Need setParentId here too to make it resolvable later

.build();

EntityResult result =
metaStoreManager.createEntityIfNotExists(
getCurrentPolarisContext(),
PolarisEntity.toCoreList(resolvedNamespacePathWrapper.getRawFullPath()),
syntheticTableEntity);

if (result.isSuccess()) {
syntheticTableEntity = PolarisEntity.of(result.getEntity());
} else {
PolarisResolvedPathWrapper tablePathWrapper = resolutionManifest.getResolvedPath(identifier);
PolarisEntity leafEntity =
tablePathWrapper != null ? tablePathWrapper.getRawLeafEntity() : null;
if (leafEntity == null || !subTypes.contains(leafEntity.getSubType())) {
throw new RuntimeException(
String.format(
"Failed to create or find table entity '%s' in federated catalog '%s'",
identifier.name(), catalogEntity.getName()));
}
}
return resolutionManifest.getResolvedPath(identifier);
}

/**
* Removes a table-level or view-level grant on {@code identifier} from {@code catalogRoleName}.
*/
Expand Down Expand Up @@ -2136,4 +2322,25 @@ private boolean revokePrivilegeOnPolicyEntityFromRole(
privilege)
.isSuccess();
}

/**
* Selects the appropriate entity subtype for synthetic entities in external catalogs.
*
* @param subTypes list of candidate subtypes
* @return the selected subtype for the synthetic entity
* @throws IllegalStateException if no supported subtype is found
*/
private static PolarisEntitySubType selectEntitySubType(List<PolarisEntitySubType> subTypes) {
if (subTypes.contains(PolarisEntitySubType.ICEBERG_TABLE)) {
return PolarisEntitySubType.ICEBERG_TABLE;
} else if (subTypes.contains(PolarisEntitySubType.ICEBERG_VIEW)) {
return PolarisEntitySubType.ICEBERG_VIEW;
} else {
throw new IllegalStateException(
String.format(
"No supported subtype found in %s. Only ICEBERG_TABLE and ICEBERG_VIEW are"
+ " supported for synthetic entities in external catalogs.",
subTypes));
}
}
}
Loading