Contributor

@flyrain flyrain commented Mar 6, 2025

No description provided.

@flyrain flyrain changed the title Separate Polaris Entities POC Separate Polaris Entities Mar 8, 2025
@flyrain flyrain force-pushed the poc-separate-entity branch from abdc52b to 1db510d Compare March 9, 2025 02:40
import org.apache.polaris.core.persistence.dao.entity.ListEntitiesResult;
import org.apache.polaris.core.persistence.dao.entity.ResolvedEntityResult;

public class PostgresCatalogDaoImpl implements CatalogDao {
Contributor Author

All classes under this package are placeholders for jdbc impl.

Contributor

IMHO, this should then live outside the eclipselink package?

extension/persistence/eclipselink/src/main/java/org/apache/polaris/extension/persistence/impl/jdbc/PostgresCatalogDaoImpl.java

Maybe under extension/persistence?

  • extension
    • persistence
      • relational
        • jdbc
        • eclipselink

Contributor

Yes, ideally we should put them in a different module. I put them here mainly for demo purposes. Let me remove them in the next commit.


// TODO this should return a type-specific entity result, e.g., CatalogEntityResult
@Nonnull
EntityResult readEntityByName(@Nonnull PolarisCallContext callCtx, @Nonnull String name);
Contributor Author

As noted in the code comment and on the dev mailing list, the refactor to type-specific return entities will come later.
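
For illustration only, here is a minimal sketch of what a type-specific result could look like. Only the name `CatalogEntityResult` comes from the TODO above; the `CatalogEntity` record, the `Status` enum, the in-memory lookup, and the omission of the `PolarisCallContext` parameter are all assumptions made to keep the example self-contained, not the actual Polaris types.

```java
import java.util.Map;
import java.util.Optional;

public class TypedResultSketch {
  // Hypothetical stand-in for a catalog entity; not the real Polaris class.
  record CatalogEntity(String name, String baseLocation) {}

  // Hypothetical status codes; the real return-status type may differ.
  enum Status { SUCCESS, ENTITY_NOT_FOUND }

  // Type-specific result: callers receive a CatalogEntity without casting.
  record CatalogEntityResult(Status status, Optional<CatalogEntity> entity) {
    static CatalogEntityResult found(CatalogEntity entity) {
      return new CatalogEntityResult(Status.SUCCESS, Optional.of(entity));
    }

    static CatalogEntityResult notFound() {
      return new CatalogEntityResult(Status.ENTITY_NOT_FOUND, Optional.empty());
    }

    boolean isSuccess() {
      return status == Status.SUCCESS;
    }
  }

  // Hypothetical DAO whose read method returns the typed result.
  interface CatalogDao {
    CatalogEntityResult readEntityByName(String name);
  }

  // In-memory implementation, used only to make the sketch runnable.
  static CatalogDao inMemory(Map<String, CatalogEntity> store) {
    return name -> {
      CatalogEntity entity = store.get(name);
      return entity == null ? CatalogEntityResult.notFound() : CatalogEntityResult.found(entity);
    };
  }

  public static void main(String[] args) {
    CatalogDao dao = inMemory(Map.of("prod", new CatalogEntity("prod", "s3://bucket/prod")));
    System.out.println(dao.readEntityByName("prod").isSuccess());
    System.out.println(dao.readEntityByName("missing").isSuccess());
  }
}
```

With a result type like this, a caller can read `result.entity()` directly instead of downcasting a generic entity.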

Contributor

Instead of Fdb, I think we need multiple impls:

  • jdbc
  • eclipse-link
  • treemap
  • "delegating" to the metastoremanager, which is needed temporarily for backwards compatibility

Contributor Author

@flyrain flyrain Mar 11, 2025

Renamed the current DAO implementations to DelegatingXXXDaoImpl for backward compatibility. We can add a JDBC implementation in a follow-up PR.
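
A rough sketch of the delegating idea, under stated assumptions: the `DelegatingCatalogDaoImpl` name follows the `DelegatingXXXDaoImpl` pattern mentioned above, but the reduced `MetaStoreManager` interface, the method signatures, and the string-based entities are hypothetical and only illustrate how a type-specific DAO can forward to the existing generic manager.

```java
import java.util.HashMap;
import java.util.Map;

public class DelegatingDaoSketch {
  // Hypothetical, heavily reduced stand-in for the existing generic metastore manager.
  interface MetaStoreManager {
    // Returns the stored value, or null when no such entity exists.
    String readEntityByName(String entityType, String name);
  }

  // Hypothetical type-specific DAO interface.
  interface CatalogDao {
    String readCatalogByName(String name);
  }

  // Delegating implementation: holds no storage of its own and forwards every
  // call to the generic manager, so existing behavior is preserved.
  static final class DelegatingCatalogDaoImpl implements CatalogDao {
    private final MetaStoreManager delegate;

    DelegatingCatalogDaoImpl(MetaStoreManager delegate) {
      this.delegate = delegate;
    }

    @Override
    public String readCatalogByName(String name) {
      return delegate.readEntityByName("CATALOG", name);
    }
  }

  // In-memory manager keyed by "TYPE/name", used only to make the sketch runnable.
  static MetaStoreManager inMemoryManager(Map<String, String> rows) {
    return (entityType, name) -> rows.get(entityType + "/" + name);
  }

  public static void main(String[] args) {
    Map<String, String> rows = new HashMap<>();
    rows.put("CATALOG/prod", "prod-catalog");
    CatalogDao dao = new DelegatingCatalogDaoImpl(inMemoryManager(rows));
    System.out.println(dao.readCatalogByName("prod"));
  }
}
```

A JDBC or treemap-backed DAO could later implement the same `CatalogDao` interface without changing call sites.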

Contributor

Nice, it looks good now. One other note is that the package name transactional is a little confusing but that can wait until later refactors.

private final AuthenticatedPolarisPrincipal authenticatedPrincipal;
private final PolarisAuthorizer authorizer;
private final PolarisMetaStoreManager metaStoreManager;
private final PolarisDaoManager daoManager;
Contributor Author

I'm using this class to showcase how business logic invokes type-specific DAO objects. We can easily extend the same approach to other classes such as BasePolarisCatalog, PolarisEntityManager, and Resolver.
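
As a hedged sketch of that call pattern: the `DaoManager` interface below loosely mirrors the `PolarisDaoManager` field above, but its accessor methods, the two DAO interfaces, and the `AdminService` class are hypothetical names invented for this example and are not taken from the PR.

```java
import java.util.List;
import java.util.Set;

public class DaoManagerSketch {
  // Hypothetical type-specific DAOs.
  interface CatalogDao {
    List<String> listCatalogNames();
  }

  interface PrincipalDao {
    boolean exists(String principalName);
  }

  // Hypothetical manager that hands out one DAO per entity type.
  interface DaoManager {
    CatalogDao catalogDao();

    PrincipalDao principalDao();
  }

  // Business-logic class: it asks the manager for the DAO it needs instead of
  // calling one generic metastore manager for every entity type.
  static final class AdminService {
    private final DaoManager daoManager;
    private final String principalName;

    AdminService(DaoManager daoManager, String principalName) {
      this.daoManager = daoManager;
      this.principalName = principalName;
    }

    List<String> listCatalogs() {
      if (!daoManager.principalDao().exists(principalName)) {
        throw new IllegalStateException("unknown principal: " + principalName);
      }
      return daoManager.catalogDao().listCatalogNames();
    }
  }

  // In-memory manager, used only to make the sketch runnable.
  static DaoManager inMemory(List<String> catalogs, Set<String> principals) {
    CatalogDao cDao = () -> List.copyOf(catalogs);
    PrincipalDao pDao = principals::contains;
    return new DaoManager() {
      @Override
      public CatalogDao catalogDao() {
        return cDao;
      }

      @Override
      public PrincipalDao principalDao() {
        return pDao;
      }
    };
  }

  public static void main(String[] args) {
    DaoManager daos = inMemory(List.of("prod", "dev"), Set.of("alice"));
    System.out.println(new AdminService(daos, "alice").listCatalogs());
  }
}
```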

Contributor

@eric-maynard eric-maynard left a comment

LGTM now as a first step; we'll see the benefits of having the DAOs once we are able to provide metastore-specific DAO implementations and replace direct usage of PolarisMetaStoreManager with the DAOs.

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Mar 11, 2025
Member

@snazy snazy left a comment

I have strong concerns with the state of this PR.

Generally, I'm strongly +1 on having concern-oriented APIs and concrete implementations of those, for example an API for catalog management, one for catalog content, etc. However, IMHO those APIs should really be independent of the persistence implementation.

While this PR goes into the direction of concern-based APIs, it still exposes persistence internals and pushes those things up to the call sites, which should not be the case.

Another topic is the tight coupling of object-storage specific concerns with Polaris's backend database. That should really be decoupled, because credential vending has nothing to do with Polaris's backend database.

@github-project-automation github-project-automation bot moved this from Ready to merge to PRs In Progress in Basic Kanban Board Mar 11, 2025
Contributor Author

flyrain commented Mar 11, 2025

> While this PR goes into the direction of concern-based APIs, it still exposes persistence internals and pushes those things up to the call sites, which should not be the case.

Can you point out where in the code it "exposes persistence internals and pushes those things up to the call sites"? I'm also open to suggestions on how we can separate concerns better.

Member

snazy commented Mar 17, 2025

> Can you point out where in the code it "exposes persistence internals and pushes those things up to the call sites"? I'm also open to suggestions on how we can separate concerns better.

Sure. As mentioned elsewhere, there should eventually be distinct APIs for each concern. Those APIs do not have to, and therefore must not, deal with things like persistence-internal IDs or dictate how objects are stored; they should provide only the "concern-specific" operations.
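
One possible reading of this, sketched under stated assumptions: the `CatalogManagement` interface, the `CatalogInfo` record, and the in-memory implementation below are hypothetical names invented for illustration, not a proposal from this thread. The point shown is only that callers address catalogs by name while the numeric ID stays inside the implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicLong;

public class ConcernApiSketch {
  // Hypothetical value type visible to callers; it carries no persistence ID.
  record CatalogInfo(String name, String baseLocation) {}

  // Hypothetical concern-specific API: operations are expressed by name only.
  interface CatalogManagement {
    void createCatalog(CatalogInfo catalog);

    Optional<CatalogInfo> getCatalog(String name);

    boolean dropCatalog(String name);
  }

  // One possible backend. The numeric ID exists only inside this class.
  static final class InMemoryCatalogManagement implements CatalogManagement {
    private record Row(long id, CatalogInfo info) {}

    private final AtomicLong nextId = new AtomicLong(1);
    private final Map<String, Row> rowsByName = new HashMap<>();

    @Override
    public void createCatalog(CatalogInfo catalog) {
      if (rowsByName.containsKey(catalog.name())) {
        throw new IllegalArgumentException("catalog already exists: " + catalog.name());
      }
      rowsByName.put(catalog.name(), new Row(nextId.getAndIncrement(), catalog));
    }

    @Override
    public Optional<CatalogInfo> getCatalog(String name) {
      return Optional.ofNullable(rowsByName.get(name)).map(Row::info);
    }

    @Override
    public boolean dropCatalog(String name) {
      return rowsByName.remove(name) != null;
    }
  }

  public static void main(String[] args) {
    CatalogManagement api = new InMemoryCatalogManagement();
    api.createCatalog(new CatalogInfo("prod", "s3://bucket/prod"));
    System.out.println(api.getCatalog("prod").isPresent());
    System.out.println(api.dropCatalog("prod"));
  }
}
```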

As that's a more general approach, I really prefer to move this one into draft state and discuss the approach on the dev-ML.

Contributor

eric-maynard commented Mar 17, 2025 via email

Contributor Author

flyrain commented Mar 17, 2025

> As that's a more general approach, I really prefer to move this one into draft state and discuss the approach on the dev-ML.

@snazy, here is the dev mailing list thread: https://lists.apache.org/thread/ghtoydoyqfpd3m5qhscqpbgzqm92cyd9. As @eric-maynard mentioned, it has been there for a while.

@github-actions

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Apr 17, 2025
@github-actions github-actions bot closed this Apr 24, 2025
@github-project-automation github-project-automation bot moved this from PRs In Progress to Done in Basic Kanban Board Apr 24, 2025