Member

@XJDKC XJDKC commented Sep 8, 2025

Milestones

This is Part 3 of the [Splitting] Initial SigV4 Auth Support for Catalog Federation. Upcoming parts will build on this system:

Introduction

This PR introduces service identity management for SigV4 Auth Support for Catalog Federation. Unlike user-supplied parameters, the service identity represents the identity of the Polaris service itself and should be managed by Polaris.

We introduce a new ServiceIdentityProvider interface responsible for three core operations:

1. Allocating a service identity to a catalog entity

  • Invoked during external catalog creation with SigV4 authentication
  • A vendor may use the same service identity across all entities in an account, or assign different identities per catalog
  • The associated DPO stores only a reference (identifier) to the service identity, not the credentials themselves

2. Retrieving identity information for API responses

  • When generating responses for APIs like getCatalog, Polaris uses getServiceIdentityInfo() to return identity metadata (e.g., AWS IAM ARN) without exposing credentials
  • This method returns only display information, never sensitive credentials

3. Retrieving credentials for authentication

  • When accessing a remote catalog, Polaris uses getServiceIdentityCredential() to obtain the full credential with AWS access keys
  • After retrieving its IAM credentials, Polaris assumes the customer-provided IAM role to obtain temporary AWS credentials, then uses those to access the remote catalog via SigV4
  • Credentials are resolved lazily only when actually needed
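The three operations above can be sketched as a small Java interface. This is only an illustration under stated assumptions: the method names come from this description, while the record types, their fields, the parameter and return types, and the single-identity `StaticIdentityProvider` are hypothetical stand-ins rather than the classes added by this PR.

```java
import java.util.Optional;

// Hypothetical, simplified stand-ins for the PR's types (not the real classes).
record IdentityRef(String urn) {}                            // reference stored on the catalog entity
record IdentityInfo(String identityType, String iamArn) {}   // display-only metadata
record IdentityCredential(IdentityInfo info, String accessKeyId, String secretAccessKey) {}

// Method names follow the PR description; parameter and return types are assumptions.
interface ServiceIdentityProviderSketch {
  // 1. Called at catalog creation: hand out a reference, never the secret itself.
  Optional<IdentityRef> allocateServiceIdentity(String authenticationType);

  // 2. Called when building API responses: metadata only.
  Optional<IdentityInfo> getServiceIdentityInfo(IdentityRef ref);

  // 3. Called when connecting to the remote catalog: full credential.
  Optional<IdentityCredential> getServiceIdentityCredential(IdentityRef ref);
}

// A provider that knows exactly one statically configured identity.
final class StaticIdentityProvider implements ServiceIdentityProviderSketch {
  private final IdentityRef ref;
  private final IdentityCredential credential;

  StaticIdentityProvider(IdentityRef ref, IdentityCredential credential) {
    this.ref = ref;
    this.credential = credential;
  }

  @Override
  public Optional<IdentityRef> allocateServiceIdentity(String authenticationType) {
    // Only SigV4-authenticated catalogs receive a service identity in this sketch.
    return "SIGV4".equals(authenticationType) ? Optional.of(ref) : Optional.empty();
  }

  @Override
  public Optional<IdentityInfo> getServiceIdentityInfo(IdentityRef requested) {
    return ref.equals(requested) ? Optional.of(credential.info()) : Optional.empty();
  }

  @Override
  public Optional<IdentityCredential> getServiceIdentityCredential(IdentityRef requested) {
    return ref.equals(requested) ? Optional.of(credential) : Optional.empty();
  }
}
```

Keeping the metadata and credential lookups as separate methods means a caller that only builds an API response has no code path that returns the access keys.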

This PR also introduces a DefaultServiceIdentityProvider, which supports static configuration via Quarkus application properties. While it does not support dynamic runtime resolution (all identities must be specified before server startup), it provides:

  • A simple, practical mechanism for testing SigV4 authentication
  • A reference implementation that vendors can build upon

In the future, vendors may implement custom providers that integrate with their internal infrastructure for on-demand identity management or dynamic credential rotation.

Design Overview

Each ConnectionConfigInfoDpo (used for remote catalog federation) now contains a ServiceIdentityInfoDpo, which holds a SecretReference that serves as a unique identifier for the service identity instance. This design allows:

  • Polaris to store only a reference (identifier) to its service identity in the catalog entity
  • The actual credentials to be looked up from configuration at runtime
  • Separation of metadata (for API responses) from credentials (for authentication)
  • Role assumption using SigV4AuthenticationParametersDpo (supplied by the user)
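As a rough picture of this layout, the nesting could look like the records below. All type names carry a `Sketch` suffix because they are hypothetical simplifications; the field names (`urn`, `payload`, `roleArn`, `signingRegion`, `remoteUri`) are assumptions for illustration, and the real DPO classes may differ.

```java
import java.util.Map;

// Hypothetical simplified shapes; the real DPO classes may carry different fields.

// Identifier for a service identity instance. It points at the identity; it holds no secret.
record SecretReferenceSketch(String urn, Map<String, String> payload) {}

// Managed by Polaris: identity type plus the reference used for later lookups.
record ServiceIdentityInfoSketch(String identityType, SecretReferenceSketch reference) {}

// Supplied by the user: which role Polaris should assume (field names are assumptions).
record SigV4ParametersSketch(String roleArn, String signingRegion) {}

// Persisted with the catalog entity: user parameters plus the identity reference.
record ConnectionConfigSketch(
    String remoteUri,
    SigV4ParametersSketch authenticationParameters,
    ServiceIdentityInfoSketch serviceIdentity) {}
```

Nothing in the persisted shape contains access keys, so a dump of the catalog entity shows which identity is in use without revealing its credentials.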

Key Components

  • ServiceIdentityProvider: The central provider interface with three methods:
    • allocateServiceIdentity(ConnectionConfigInfo) - Assigns a service identity reference to a catalog during creation based on the authentication type
    • getServiceIdentityInfo(ServiceIdentityInfoDpo) - Returns identity metadata (e.g., IAM ARN) without credentials for API responses
    • getServiceIdentityCredential(ServiceIdentityInfoDpo) - Returns the full credential with secrets, resolved lazily only when authentication is needed
  • DefaultServiceIdentityProvider: The default implementation that:
    • Loads service identity configurations from polaris.service-identity.* properties at startup
    • Supports both single-tenant (default configuration) and multi-tenant (per-realm configuration) deployments
    • Uses lazy credential resolution - credentials are only created when getServiceIdentityCredential() is called
    • Validates that references belong to the current realm before resolving
    • Vendors may extend this with dynamic backends (e.g., integrating with secret managers, credential vaults, or internal identity systems).
  • ServiceIdentityCredential: Base class representing a credential with:
    • Identity type (e.g., AWS_IAM)
    • A SecretReference serving as a unique identifier for lookups
    • Implementation-specific credentials (e.g.

Configuration

Configuration examples are provided in the JavaDoc of ServiceIdentityConfiguration for developers. Since this feature is experimental and part of a multi-part series, configuration examples are not included in application.properties to avoid confusion for end-users.

Key Implementation Details

Lazy Credential Resolution:

  • Metadata retrieval (getServiceIdentityInfo) never creates credential providers
  • Full credentials (getServiceIdentityCredential) are only resolved when actually needed for authentication
  • Improves performance and security: metadata-only code paths never create credentials, so secrets are materialized only where authentication requires them
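One way to picture the lazy behavior is to hold a `Supplier` for the secret and invoke it only on the credential path. The class `LazyIdentity` and its methods are hypothetical and are not the PR's implementation; caching of the resolved secret is deliberately left out to keep the sketch short.

```java
import java.util.function.Supplier;

// Hypothetical sketch: metadata is available eagerly, the secret is loaded on demand.
final class LazyIdentity {
  private final String iamArn;                 // safe to show in API responses
  private final Supplier<String> secretLoader; // e.g. reads configuration when invoked

  LazyIdentity(String iamArn, Supplier<String> secretLoader) {
    this.iamArn = iamArn;
    this.secretLoader = secretLoader;
  }

  // Metadata path: never touches the loader.
  String info() {
    return iamArn;
  }

  // Credential path: the loader runs only when authentication actually needs the secret.
  String credential() {
    return secretLoader.get();
  }
}
```

A caller that only serves a metadata request uses `info()`, so the loader is never invoked for it.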

Reference-Based Lookups:

  • The SecretReference serves as a unique identifier for each service identity instance
  • Stream-based filtering matches references without precomputed maps
  • Supports multi-tenant scenarios where different realms have different identities
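The lookup described above could be sketched as a stream filter over the configured identities, with the realm check applied first. `ConfiguredIdentity`, `ReferenceLookup`, and their fields are hypothetical names chosen for illustration; the reference is reduced to a plain URN string here.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical entry as it might be loaded from static configuration at startup.
record ConfiguredIdentity(String realm, String referenceUrn, String iamArn) {}

final class ReferenceLookup {
  private final List<ConfiguredIdentity> configured;

  ReferenceLookup(List<ConfiguredIdentity> configured) {
    this.configured = List.copyOf(configured);
  }

  // Scans the configured identities on each call; no precomputed map is kept.
  Optional<ConfiguredIdentity> resolve(String currentRealm, String referenceUrn) {
    return configured.stream()
        .filter(c -> c.realm().equals(currentRealm))        // reference must belong to this realm
        .filter(c -> c.referenceUrn().equals(referenceUrn)) // match on the unique identifier
        .findFirst();
  }
}
```

A linear scan is a reasonable fit when the set of identities is small and fixed at startup; a reference that belongs to another realm simply resolves to an empty result.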

Flowchart

[Image: Catalog Federation - Creds Management]

Contributor

@dimas-b dimas-b left a comment

Thanks for continuing to improve service identity handling in Polaris, @XJDKC!

The approach LGTM in general, but I believe the CDI-related code can be simplified :)

Also, I'm not sure I understand how AWS credentials are going to be used in practice... so those comments are mostly for the sake of clarification.

Contributor

@HonahX HonahX left a comment

@XJDKC Thanks for the great work! Just for my understanding: is the "Part 1: transformation" no longer needed?

Contributor

@dimas-b dimas-b left a comment

Thanks for the update @XJDKC !

It is still not very clear to me how identity secrets are plugged into the runtime components that interact with remote systems... so some of my comments may look odd because of this 😅 but I hope I'm not completely off the point :)

@XJDKC XJDKC force-pushed the rxing-catalog-federation-sigv4-part-3 branch from cf9a113 to e386298 Compare September 24, 2025 18:06
@XJDKC XJDKC force-pushed the rxing-catalog-federation-sigv4-part-3 branch from 020fc1b to fe883ce Compare September 26, 2025 18:23
Contributor

@dimas-b dimas-b left a comment

LGTM overall 👍 Thanks, @XJDKC !

Call paths to ServiceIdentityProvider now distinguish user-facing (API) data from backend secrets / credentials.

I made some minor comments, but I do not think they are blockers. I hope you could address them (if you agree) while we're waiting for another review from @dennishuo .

Please consider all my previous comment threads "resolved" (but you may want to keep them open as context for other reviewers).

Contributor

@dimas-b dimas-b left a comment

LGTM 👍 Looking forward to review / approval from @dennishuo before merging.

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Oct 3, 2025
@dimas-b dimas-b requested review from HonahX and dennishuo October 3, 2025 17:34
Member Author

XJDKC commented Oct 3, 2025

LGTM 👍 Looking forward to review / approval from @dennishuo before merging.

Thanks @dimas-b for reviewing and approving the PR. The feedback was very valuable, and the PR is now in a good state.
cc: @dennishuo and @HonahX, could you please also take a look when you get a chance?

Contributor

@HonahX HonahX left a comment

LGTM!

Contributor

@dennishuo dennishuo left a comment

LGTM

@dennishuo dennishuo merged commit 793a118 into apache:main Oct 4, 2025
16 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Oct 4, 2025
Member Author

XJDKC commented Oct 4, 2025

Many thanks @dimas-b @dennishuo @HonahX for taking the time to review the PR! I will open a PR for part 4 soon!
