Skip to content

Conversation

@kevinjqliu
Copy link
Contributor

@kevinjqliu kevinjqliu commented Jul 31, 2024

Description

Add a new getting-started directory, with a trino subdirectory which includes the following

  • docker-compose-trino.yml, docker-compose file to spin up Polaris and Trino services, as well as setup script to initialize a catalog in Polaris
  • create-polaris-catalog.sh, setup script to initialize a catalog in Polaris
  • trino-config/catalog/iceberg.properties file, which configures Trino to register Polaris as an Iceberg REST catalog

To run, use the command:

docker-compose -f getting-started/trino/docker-compose-trino.yml up 

Note

The setup script and Trino both use the credential principal:root;realm:default-realm. This is a workaround so users do not need to fetch credentials from Polaris' output.

To access the Trino CLI,

docker exec -it trino-trino-1 trino

Note, trino-trino-1 is the name docker container.

Run Trino commands via Trino CLI,

SHOW CATALOGS;
SHOW SCHEMAS FROM iceberg;
SHOW TABLES FROM iceberg.information_schema;
DESCRIBE iceberg.information_schema.tables;

CREATE SCHEMA iceberg.tpch;
CREATE TABLE iceberg.tpch.test_polaris AS SELECT 1 x;
SELECT * FROM iceberg.tpch.test_polaris;

Please include a summary of the changes and the related issue. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes #22

Reference

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Test A
  • Test B

Test Configuration:

  • Firmware version:
  • Hardware:
  • Toolchain:
  • SDK:

Checklist:

Please delete options that are not relevant.

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • If adding new functionality, I have discussed my implementation with the community using the linked GitHub issue
  • I have signed and submitted the ICLA and if needed, the CCLA. See Contributing for details.

@annafil annafil requested a review from a team August 1, 2024 18:20
@flyrain
Copy link
Contributor

flyrain commented Aug 3, 2024

Hi @kevinjqliu , is this PR ready for review or not? Can you mark it as draft if not? Thanks.

@kevinjqliu kevinjqliu marked this pull request as draft August 5, 2024 15:19
@kevinjqliu kevinjqliu force-pushed the kevinjqliu/trino-docker branch from 0e42f7e to c99064e Compare August 5, 2024 16:39
Copy link
Contributor Author

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might need to update docs, similar to #44

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whats the stance on using this credential? Since this is a demo, I think it's acceptable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't tell whether this works by design or due to a bug...

@kevinjqliu kevinjqliu marked this pull request as ready for review August 5, 2024 16:50
@kevinjqliu
Copy link
Contributor Author

r? @collado-mike / @RussellSpitzer
PTAL

@kevinjqliu
Copy link
Contributor Author

Can also use Polaris python CLI to create a catalog
https://github.com/polaris-catalog/polaris/blob/main/docs/quickstart.md#defining-a-catalog

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it also include iceberg.rest-catalog.oauth2.scope? It seems, non scoped requests fail with

org.apache.iceberg.exceptions.ForbiddenException: Forbidden: Principal 'XXX' with activated PrincipalRoles '[catalog]' and activated ids '[]' is not authorized for op XXX

Related trino PR - trinodb/trino#22961

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think trinodb/trino#22961 is required in case of iceberg.rest-catalog.oauth2.token. The scope is needed for iceberg.rest-catalog.oauth2.credential in my understanding.

RussellSpitzer
RussellSpitzer previously approved these changes Aug 26, 2024
Copy link
Member

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This worked for me, I'm not enough of a trino expert to understand if anything is off there but since this is just for demoing I think this is probably a good thing to merge. I would request @kevinjqliu we add the instructions to the readme though as well. I think the details you have in the main comment of this pullrequest is basically all we need.

@ebyhr Can you please check out the Trino specific parts and make sure that's all good to go?

Copy link
Contributor

@ebyhr ebyhr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good except for comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using dynamic catalog mode (CATALOG_MANAGEMENT=dynamic) provides more flexibility, but I think a static mode is fine in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't know about this option, cool. lets leave as is for now since dynamic catalog management requires the user to run an extra CREATE CATALOG statement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to not privilege one engine from other, I would propose to create an engine folder in the getting-started one. We can add trino, spark, flink, doris, dremio, etc there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure. is the engine folder necessary? How about just getting-started/trino?

getting-started/
    trino/
    spark/
    ...

@kevinjqliu
Copy link
Contributor Author

@RussellSpitzer @ebyhr @jbonofre Thanks for the review. I've updated the PR based on your comments. Please take another look.

flyrain
flyrain previously approved these changes Sep 12, 2024
Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Do you want to take another look? @ebyhr @jbonofre @RussellSpitzer

@kevinjqliu
Copy link
Contributor Author

Thanks @flyrain.
@jbonofre I've addressed your previous comments. Do you mind taking another look?

services:
polaris:
build:
context: .
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this context work in your environment? I had do change it to ../../ because Dockerfile exists in the project root directory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right, this should be ../../, it was accidentally removed when i copied over the new setup in 9c2be7f

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated! thanks

@kevinjqliu
Copy link
Contributor Author

Reran docker with clean setup, confirmed working.

@flyrain
Copy link
Contributor

flyrain commented Sep 13, 2024

Looks like all comments are resolved. I will merge it if there is no more comment today.

@flyrain flyrain enabled auto-merge (squash) September 13, 2024 20:55
@flyrain flyrain dismissed jbonofre’s stale review September 13, 2024 20:56

Comments have been resolved.

@flyrain flyrain merged commit 874d866 into apache:main Sep 13, 2024
@flyrain
Copy link
Contributor

flyrain commented Sep 13, 2024

Thanks @kevinjqliu for adding Trino docker compose. Thanks @jbonofre @RussellSpitzer @ebyhr @snazy @almazrevolut for the review.

@kevinjqliu kevinjqliu deleted the kevinjqliu/trino-docker branch September 13, 2024 21:39
travis-bowen pushed a commit to travis-bowen/polaris that referenced this pull request Jun 20, 2025
* add snyk policy

* add license
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE REQUEST] Trino getting started docker compose

7 participants