Describe the bug
Issue Summary
When using the Polaris Spark plugin to create generic tables in Delta format, I run into the issue described below.
I followed the docs here: https://polaris.apache.org/in-dev/unreleased/polaris-spark-client/ but noticed that the artifact org.apache.polaris:polaris-iceberg-1.8.1-spark-runtime-3.5_2.12:1.0.0 is missing.
So I had to build the latest Polaris Spark client jar myself. Note also that I am using S3 as my storage type and creating a Delta table, so I included some additional jars (org.apache.hadoop:hadoop-aws:3.3.4, com.amazonaws:aws-java-sdk-bundle:1.12.671). See the repro steps below.
To Reproduce
spark-sql \
--jars /Users/rahil/workplace/polaris/plugins/spark/v3.5/build/2.12/libs/polaris-iceberg-1.8.1-spark-runtime-3.5_2.12-0.10.0-beta-incubating-SNAPSHOT.jar \
--packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.9.0,org.apache.iceberg:iceberg-aws-bundle:1.9.0,io.delta:delta-spark_2.12:3.3.1,org.apache.hadoop:hadoop-aws:3.3.4,com.amazonaws:aws-java-sdk-bundle:1.12.671 \
--conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,io.delta.sql.DeltaSparkSessionExtension \
--conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog \
--conf spark.sql.catalog.quickstart_catalog.warehouse=quickstart_catalog \
--conf spark.sql.catalog.quickstart_catalog.header.X-Iceberg-Access-Delegation=vended-credentials \
--conf spark.sql.catalog.quickstart_catalog=org.apache.polaris.spark.SparkCatalog \
--conf spark.sql.catalog.quickstart_catalog.uri=http://localhost:8181/api/catalog \
--conf spark.sql.catalog.quickstart_catalog.credential=${USER_CLIENT_ID}:${USER_CLIENT_SECRET} \
--conf spark.sql.catalog.quickstart_catalog.scope='PRINCIPAL_ROLE:ALL' \
--conf spark.sql.catalog.quickstart_catalog.token-refresh-enabled=true \
--conf spark.sql.catalog.quickstart_catalog.client.region=us-east-1
use quickstart_catalog;
CREATE NAMESPACE IF NOT EXISTS quickstart_namespace;
USE NAMESPACE quickstart_namespace;
CREATE TABLE IF NOT EXISTS people2 (
id int, name string)
USING delta LOCATION 's3a://polaris-onehouse-bucket/people2' TBLPROPERTIES (
'enabled-read-table-formats' = 'ICEBERG');
Actual Behavior
When debugging the loadGenericTable Java code path, I saw that the additional property I had added, 'enabled-read-table-formats' = 'ICEBERG', was not set in the response. (Note: this property is for a feature I'm working on in Polaris for table conversion.)
Expected Behavior
Ideally, the Delta table properties should be present both at createGenericTable time and at loadGenericTable time when using the Polaris Spark client.
What's interesting is that when I run DESCRIBE TABLE EXTENDED, I can see the table properties at the end of the output, but I think this likely goes through some Spark catalog code path to return them.
# Detailed Table Information
Name delta.`s3a://polaris-onehouse-bucket/people2`
Type MANAGED
Location s3a://polaris-onehouse-bucket/people2
Provider delta
Table Properties [delta.minReaderVersion=1,delta.minWriterVersion=2,enabled-read-table-formats=ICEBERG]
Time taken: 5.166 seconds, Fetched 9 row(s)
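
To make the expected behavior concrete, here is a minimal sketch of the round-trip check I have in mind: every property passed at create time should come back in the load response. The helper name `missing_properties` is hypothetical, and the response shape (`{"table": {"properties": {...}}}`) is my assumption about what a generic table load returns, not something confirmed above. The two sample payloads are made up to illustrate the reported and the desired outcome.

```python
from typing import Any, Dict, Mapping


def missing_properties(
    expected: Mapping[str, str], load_response: Mapping[str, Any]
) -> Dict[str, str]:
    """Return the expected properties that are absent or different in the response.

    Assumes the load response looks like {"table": {"properties": {...}}};
    this shape is an assumption made for illustration.
    """
    table = load_response.get("table") or {}
    returned = table.get("properties") or {}
    return {k: v for k, v in expected.items() if returned.get(k) != v}


# Property set in TBLPROPERTIES in the repro steps.
expected = {"enabled-read-table-formats": "ICEBERG"}

# Hypothetical payload mirroring the reported behavior (property not returned).
observed = {"table": {"name": "people2", "format": "delta", "properties": {}}}

# Hypothetical payload mirroring the expected behavior (property round-trips).
desired = {
    "table": {
        "name": "people2",
        "format": "delta",
        "properties": {"enabled-read-table-formats": "ICEBERG"},
    }
}

print(missing_properties(expected, observed))  # {'enabled-read-table-formats': 'ICEBERG'}
print(missing_properties(expected, desired))   # {}
```

An empty result means the properties survived the round trip; anything else lists what was dropped between createGenericTable and loadGenericTable.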
