@@ -79,7 +79,6 @@ COMMENT ON COLUMN grant_records.grantee_catalog_id IS 'catalog id of the grantee
COMMENT ON COLUMN grant_records.grantee_id IS 'id of the grantee';
COMMENT ON COLUMN grant_records.privilege_code IS 'privilege code';


CREATE TABLE IF NOT EXISTS principal_authentication_data (
realm_id TEXT NOT NULL,
principal_id BIGINT NOT NULL,
@@ -32,7 +32,7 @@ services:
volumes:
# Bind local conf file to a convenient location in the container
- type: bind
- source: ./postgresql.conf
+ source: ../assets/postgres/postgresql.conf
target: /etc/postgresql/postgresql.conf
command:
- "postgres"
2 changes: 1 addition & 1 deletion getting-started/eclipselink/README.md
@@ -37,7 +37,7 @@ This example requires `jq` to be installed on your machine.
2. Start the docker compose group by running the following command from the root of the repository:

```shell
- docker compose -f getting-started/eclipselink/docker-compose-postgres.yml -f getting-started/eclipselink/docker-compose-bootstrap-db.yml -f getting-started/eclipselink/docker-compose.yml up
+ docker compose -f getting-started/eclipselink/docker-compose-bootstrap-db.yml -f getting-started/assets/postgres/docker-compose-postgres.yml -f getting-started/eclipselink/docker-compose.yml up
```

3. Using spark-sql: attach to the running spark-sql container:
2 changes: 1 addition & 1 deletion getting-started/eclipselink/docker-compose.yml
@@ -96,4 +96,4 @@ services:
ports:
- "8080:8080"
volumes:
- - ./trino-config/catalog:/etc/trino/catalog
+ - ../assets/trino-config/catalog:/etc/trino/catalog
92 changes: 92 additions & 0 deletions getting-started/jdbc/README.md
@@ -0,0 +1,92 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# Getting Started with Apache Polaris, Relational JDBC, Postgres and Spark SQL

This example requires `jq` to be installed on your machine.

1. If you have not already built it, build the Polaris image with support for JDBC persistence and
the Postgres JDBC driver:

```shell
./gradlew \
:polaris-quarkus-server:assemble \
:polaris-quarkus-server:quarkusAppPartsBuild --rerun \
:polaris-quarkus-admin:assemble \
:polaris-quarkus-admin:quarkusAppPartsBuild --rerun \
-Dquarkus.container-image.tag=postgres-latest \
-Dquarkus.container-image.build=true
```

2. Start the docker compose group by running the following command from the root of the repository:

```shell
docker compose -f getting-started/jdbc/docker-compose-bootstrap-db.yml -f getting-started/assets/postgres/docker-compose-postgres.yml -f getting-started/jdbc/docker-compose.yml up
```

3. Using spark-sql: attach to the running spark-sql container:

```shell
docker attach $(docker ps -q --filter name=spark-sql)
```

You may not see Spark's prompt immediately; press ENTER to make it appear. A few commands that you can try:

```sql
CREATE NAMESPACE polaris.ns1;
USE polaris.ns1;
CREATE TABLE table1 (id int, name string);
INSERT INTO table1 VALUES (1, 'a');
SELECT * FROM table1;
```

4. To access Polaris from the host machine, first request an access token:

```shell
export POLARIS_TOKEN=$(curl -s http://polaris:8181/api/catalog/v1/oauth/tokens \
--resolve polaris:8181:127.0.0.1 \
--user root:s3cr3t \
-d 'grant_type=client_credentials' \
-d 'scope=PRINCIPAL_ROLE:ALL' | jq -r .access_token)
```

5. Then, use the access token in the Authorization header when accessing Polaris:

```shell
curl -v http://127.0.0.1:8181/api/management/v1/principal-roles -H "Authorization: Bearer $POLARIS_TOKEN"
curl -v http://127.0.0.1:8181/api/management/v1/catalogs/quickstart_catalog -H "Authorization: Bearer $POLARIS_TOKEN"
```

6. Using Trino CLI: To access the Trino CLI, run this command:
```shell
docker exec -it jdbc-trino-1 trino
```
Note: `jdbc-trino-1` is the name of the Docker container.

Example Trino queries:
```
[Review comment, Contributor, on the opening fence above] Nit: -> ```sql

SHOW CATALOGS;
SHOW SCHEMAS FROM iceberg;
SHOW TABLES FROM iceberg.information_schema;
DESCRIBE iceberg.information_schema.tables;

CREATE SCHEMA iceberg.tpch;
CREATE TABLE iceberg.tpch.test_polaris AS SELECT 1 x;
SELECT * FROM iceberg.tpch.test_polaris;
```
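
Steps 4 and 5 assume that the token request succeeded. When the response has no `access_token` field, `jq -r .access_token` prints the literal string `null`, so the later `curl` calls would send `Bearer null`. A minimal sketch of a guard follows; the helper name `require_token` is hypothetical and is not part of Polaris or of this change:

```shell
# Hypothetical helper: succeed only if the argument looks like a real token.
# jq -r prints the literal string "null" when .access_token is missing,
# and the variable is empty when the curl request itself produced no output.
require_token() {
  case "$1" in
    ""|null)
      echo "no access token received" >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}
```

With `POLARIS_TOKEN` exported as in step 4, one possible use is `require_token "$POLARIS_TOKEN" && curl -v http://127.0.0.1:8181/api/management/v1/principal-roles -H "Authorization: Bearer $POLARIS_TOKEN"`, which skips the request when no token was obtained.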
38 changes: 38 additions & 0 deletions getting-started/jdbc/docker-compose-bootstrap-db.yml
@@ -0,0 +1,38 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

services:
polaris-bootstrap:
# IMPORTANT: the image MUST contain the Postgres JDBC driver and EclipseLink dependencies, see README for instructions
[Review comment, Contributor] These deps are included by default now, right?
[Reply, Contributor] I believe so, as per #1447 and #1411.
[Reply, Contributor @dimas-b, May 2, 2025] So what's the rationale behind this comment then? IMHO, it agitates the reader unnecessarily.

image: apache/polaris-admin-tool:postgres-latest
environment:
polaris.persistence.type: relational-jdbc
quarkus.datasource.db-kind: pgsql
quarkus.datasource.jdbc.url: jdbc:postgresql://postgres:5432/POLARIS
quarkus.datasource.username: postgres
quarkus.datasource.password: postgres
command:
- "bootstrap"
[Review comment, Contributor] Why not add this command to the "setup" section of the main docker-compose.yml file?
[Reply, Contributor] The main docker-compose.yml file is still used for the Getting Started experience with cloud DBs, which ideally are not bootstrapped more than once because they are not part of a Docker image. If we included the bootstrap step in that compose file, it would be impossible to reload other Docker services, like Spark or Trino, since we would get the message that a DB can only be bootstrapped once.
[Reply, Contributor] Good point, thanks!

- "--realm=POLARIS"
- "--credential=POLARIS,root,s3cr3t"

polaris:
depends_on:
polaris-bootstrap:
condition: service_completed_successfully
99 changes: 99 additions & 0 deletions getting-started/jdbc/docker-compose.yml
@@ -0,0 +1,99 @@
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

services:

polaris:
image: apache/polaris:postgres-latest
ports:
# API port
- "8181:8181"
# Management port (metrics and health checks)
- "8182:8182"
# Optional, allows attaching a debugger to the Polaris JVM
- "5005:5005"
environment:
JAVA_DEBUG: "true"
JAVA_DEBUG_PORT: "*:5005"
polaris.persistence.type: relational-jdbc
quarkus.datasource.db-kind: pgsql
quarkus.datasource.jdbc.url: jdbc:postgresql://postgres:5432/POLARIS
quarkus.datasource.username: postgres
quarkus.datasource.password: postgres
polaris.realm-context.realms: POLARIS
quarkus.otel.sdk.disabled: "true"
healthcheck:
test: ["CMD", "curl", "http://localhost:8182/q/health"]
interval: 2s
timeout: 10s
retries: 10
start_period: 10s

polaris-setup:
image: alpine/curl
depends_on:
polaris:
condition: service_healthy
environment:
- STORAGE_LOCATION=${STORAGE_LOCATION}
- AWS_ROLE_ARN=${AWS_ROLE_ARN}
- AZURE_TENANT_ID=${AZURE_TENANT_ID}
volumes:
- ../assets/polaris/:/polaris
entrypoint: '/bin/sh -c "chmod +x /polaris/create-catalog.sh && /polaris/create-catalog.sh"'

spark-sql:
image: apache/spark:3.5.5-java17-python3
depends_on:
polaris-setup:
condition: service_completed_successfully
stdin_open: true
tty: true
ports:
- "4040-4045:4040-4045"
healthcheck:
test: "curl localhost:4040"
interval: 5s
retries: 15
command: [
/opt/spark/bin/spark-sql,
--packages, "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.7.0,software.amazon.awssdk:bundle:2.28.17,software.amazon.awssdk:url-connection-client:2.28.17,org.apache.iceberg:iceberg-gcp-bundle:1.7.0,org.apache.iceberg:iceberg-azure-bundle:1.7.0",
--conf, "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
--conf, "spark.sql.catalog.polaris=org.apache.iceberg.spark.SparkCatalog",
--conf, "spark.sql.catalog.polaris.type=rest",
--conf, "spark.sql.catalog.polaris.warehouse=quickstart_catalog",
--conf, "spark.sql.catalog.polaris.uri=http://polaris:8181/api/catalog",
--conf, "spark.sql.catalog.polaris.credential=root:s3cr3t",
--conf, "spark.sql.catalog.polaris.scope=PRINCIPAL_ROLE:ALL",
--conf, "spark.sql.defaultCatalog=polaris",
--conf, "spark.sql.catalogImplementation=in-memory",
--conf, "spark.driver.extraJavaOptions=-Divy.cache.dir=/tmp -Divy.home=/tmp"
]

trino:
image: trinodb/trino:latest
depends_on:
polaris-setup:
condition: service_completed_successfully
stdin_open: true
tty: true
ports:
- "8080:8080"
volumes:
- ../assets/trino-config/catalog:/etc/trino/catalog
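
The `polaris` service above publishes its management port as `8182:8182` and its healthcheck polls `/q/health` with an interval and a retry limit. The same idea can be sketched on the host side as a small retry loop. The `retry` function below is a hypothetical helper written for illustration, not something provided by Docker or Polaris:

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# $1 = maximum number of attempts, $2 = seconds to wait between attempts,
# remaining arguments = the command to run.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Wait only if another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `retry 10 2 curl -sf http://127.0.0.1:8182/q/health` would mirror the 2s interval and 10 retries of the compose healthcheck. The `-f` flag makes `curl` exit non-zero on an HTTP error status, which the plain `curl` call in the healthcheck does not do.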