|
| 1 | +--- |
| 2 | +# |
| 3 | +# Licensed to the Apache Software Foundation (ASF) under one |
| 4 | +# or more contributor license agreements. See the NOTICE file |
| 5 | +# distributed with this work for additional information |
| 6 | +# regarding copyright ownership. The ASF licenses this file |
| 7 | +# to you under the Apache License, Version 2.0 (the |
| 8 | +# "License"); you may not use this file except in compliance |
| 9 | +# with the License. You may obtain a copy of the License at |
| 10 | +# |
| 11 | +# http://www.apache.org/licenses/LICENSE-2.0 |
| 12 | +# |
| 13 | +# Unless required by applicable law or agreed to in writing, |
| 14 | +# software distributed under the License is distributed on an |
| 15 | +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| 16 | +# KIND, either express or implied. See the License for the |
| 17 | +# specific language governing permissions and limitations |
| 18 | +# under the License. |
| 19 | +# |
| 20 | +title: Generic Table (Beta) |
| 21 | +type: docs |
| 22 | +weight: 435 |
| 23 | +--- |
| 24 | + |
| 25 | +The Generic Table in Apache Polaris is designed to support non-Iceberg tables in other table formats, such as Delta and CSV. It currently provides the following capabilities: |
| 26 | +- Create a generic table under a namespace |
| 27 | +- Load a generic table |
| 28 | +- Drop a generic table |
| 29 | +- List all generic tables under a namespace |
| 30 | + |
| 31 | +**NOTE** Generic Table support is currently in beta. Please use it with caution and report any issues you encounter. |
| 32 | + |
| 33 | +## What is a Generic Table? |
| 34 | + |
| 35 | +A generic table in Polaris is an entity that defines the following fields: |
| 36 | + |
| 37 | +- **name** (required): A unique identifier for the table within a namespace |
| 38 | +- **format** (required): The format for the generic table, e.g. "delta", "csv" |
| 39 | +- **base-location** (optional): Table base location in URI format. For example: s3://<my-bucket>/path/to/table |
| 40 | + - The table base location is a location that includes all files for the table |
| 41 | + - A table with multiple disjoint locations (i.e. containing files that are outside the configured base location) is not compliant with the current generic table support in Polaris. |
| 42 | + - If no location is provided, clients or users are responsible for managing the location. |
| 43 | +- **properties** (optional): Properties for the generic table passed on creation. |
| 44 | + - Currently, there is no reserved property key defined. |
| 45 | + - The definition and interpretation of properties are delegated to client or engine implementations. |
| 46 | +- **doc** (optional): Comment or description for the table |
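
The base-location rule above (every file of the table lives under one location) can be illustrated with a small client-side check. This is only a sketch: the function name `is_under_base_location` and the file name `part-0.parquet` are made up for illustration, and nothing here implies that Polaris itself performs such a check.

```shell
# is_under_base_location BASE FILE_URI
# Returns 0 when FILE_URI sits under BASE, using a plain string prefix check.
# Hypothetical helper for illustration; not part of Polaris or its clients.
is_under_base_location() {
  base="${1%/}/"   # strip one trailing slash if present, then add exactly one
  case "$2" in
    "$base"*) return 0 ;;   # FILE_URI starts with "BASE/"
    *)        return 1 ;;   # anything else is outside the base location
  esac
}

# Example: a file inside the table directory passes the check.
if is_under_base_location "s3://my-bucket/path/to/table" \
     "s3://my-bucket/path/to/table/part-0.parquet"; then
  echo "inside base location"
fi
```

Appending the trailing slash before comparing keeps a sibling directory such as `.../table2/` from being mistaken for a child of `.../table`.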
| 47 | + |
| 48 | +## Generic Table API Vs. Iceberg Table API |
| 49 | + |
| 50 | +The Generic Table API provides a separate set of endpoints that operate on generic table entities, while the Iceberg APIs operate on |
| 51 | +Iceberg table entities. |
| 52 | + |
| 53 | +| Operations | **Iceberg Table API** | **Generic Table API** | |
| 54 | +|--------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------| |
| 55 | +| Create Table | Create an Iceberg table | Create a generic table | |
| 56 | +| Load Table | Load an Iceberg table. If the table to load is a generic table, you need to call the Generic Table loadTable API; otherwise a TableNotFoundException will be thrown. | Load a generic table. Similarly, trying to load an Iceberg table through the Generic Table API will throw a TableNotFoundException. | |
| 57 | +| Drop Table | Drop an Iceberg table. As with load table, if the table to drop is a generic table, a TableNotFoundException will be thrown. | Drop a generic table. Dropping an Iceberg table through the Generic Table endpoint will throw a TableNotFoundException. | |
| 58 | +| List Table | List all Iceberg tables | List all generic tables | |
| 59 | + |
| 60 | +Note that generic tables share the same namespace as Iceberg tables, so a table name must be unique within a namespace regardless of table type. Furthermore, since |
| 61 | +there is currently no support for Update Generic Table, any update to an existing table requires a drop and re-create. |
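
As a rough illustration of the drop and re-create workflow, the sketch below wraps the two REST calls in one shell function. The function name `replace_generic_table` and the variables `POLARIS_URL` and `CURL` are assumptions made for this example, not Polaris settings, and the base URL simply mirrors the local address used in the curl examples on this page. Any authentication headers your deployment requires are left out.

```shell
# Illustrative variables (assumptions, not Polaris configuration).
POLARIS_URL="${POLARIS_URL:-http://localhost:8181/api/catalog}"
CURL="${CURL:-curl}"

# replace_generic_table CATALOG NAMESPACE TABLE JSON_BODY
# "Updates" a generic table by dropping it and creating it again.
replace_generic_table() {
  base="$POLARIS_URL/polaris/v1/$1/namespaces/$2/generic-tables"
  # Drop the existing definition; -f makes curl fail on HTTP errors,
  # so the function stops here if the drop does not succeed.
  $CURL -sS -f -X DELETE "$base/$3" || return 1
  # Re-create the table with the new definition.
  $CURL -sS -f -X POST "$base" \
    -H "Content-Type: application/json" \
    -d "$4"
}
```

The two calls are separate HTTP requests, so the replacement is not atomic: if the create step fails, the table stays dropped, and other clients may briefly see no table at all.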
| 62 | + |
| 63 | +## Working with Generic Table |
| 64 | + |
| 65 | +There are two ways to work with Polaris Generic Tables today: |
| 66 | +1) Communicate directly with Polaris through REST API calls using tools such as `curl`. Details are described in the sections below. |
| 67 | +2) Use the Spark client provided if you are working with Spark. Please refer to [Polaris Spark Client]({{% ref "polaris-spark-client" %}}) for detailed instructions. |
| 68 | + |
| 69 | +### Create a Generic Table |
| 70 | + |
| 71 | +To create a generic table, you need to provide the corresponding fields as described in [What is a Generic Table](#what-is-a-generic-table). |
| 72 | + |
| 73 | +The REST API for creating a generic table is `POST /polaris/v1/{prefix}/namespaces/{namespace}/generic-tables`, and the |
| 74 | +request body looks like the following: |
| 75 | + |
| 76 | +```json |
| 77 | +{ |
| 78 | + "name": "<table_name>", |
| 79 | + "format": "<table_format>", |
| 80 | + "base-location": "<table_base_location>", |
| 81 | + "doc": "<comment or description for table>", |
| 82 | + "properties": { |
| 83 | + "<property-key>": "<property-value>" |
| 84 | + } |
| 85 | +} |
| 86 | +``` |
| 87 | + |
| 88 | +Here is an example that creates a generic table named `delta_table` with format `delta` under the namespace `delta_ns` |
| 89 | +in the catalog `delta_catalog` using curl: |
| 90 | + |
| 91 | +```shell |
| 92 | +curl -X POST http://localhost:8181/api/catalog/polaris/v1/delta_catalog/namespaces/delta_ns/generic-tables \ |
| 93 | + -H "Content-Type: application/json" \ |
| 94 | + -d '{ |
| 95 | + "name": "delta_table", |
| 96 | + "format": "delta", |
| 97 | + "base-location": "s3://<my-bucket>/path/to/table", |
| 98 | + "doc": "delta table example", |
| 99 | + "properties": { |
| 100 | + "key1": "value1" |
| 101 | + } |
| 102 | + }' |
| 103 | +``` |
| 104 | + |
| 105 | +### Load a Generic Table |
| 106 | +The REST endpoint for loading a generic table is `GET /polaris/v1/{prefix}/namespaces/{namespace}/generic-tables/{generic-table}`. |
| 107 | + |
| 108 | +Here is an example to load the table `delta_table` using curl: |
| 109 | +```shell |
| 110 | +curl -X GET http://localhost:8181/api/catalog/polaris/v1/delta_catalog/namespaces/delta_ns/generic-tables/delta_table |
| 111 | +``` |
| 112 | +And the response looks like the following: |
| 113 | +```json |
| 114 | +{ |
| 115 | + "table": { |
| 116 | + "name": "delta_table", |
| 117 | + "format": "delta", |
| 118 | + "base-location": "s3://<my-bucket>/path/to/table", |
| 119 | + "doc": "delta table example", |
| 120 | + "properties": { |
| 121 | + "key1": "value1" |
| 122 | + } |
| 123 | + } |
| 124 | +} |
| 125 | +``` |
| 126 | + |
| 127 | +### List Generic Tables |
| 128 | +The REST endpoint for listing the generic tables under a given |
| 129 | +namespace is `GET /polaris/v1/{prefix}/namespaces/{namespace}/generic-tables/`. |
| 130 | + |
| 131 | +The following curl command lists all generic tables under the namespace `delta_ns`: |
| 132 | +```shell |
| 133 | +curl -X GET http://localhost:8181/api/catalog/polaris/v1/delta_catalog/namespaces/delta_ns/generic-tables/ |
| 134 | +``` |
| 135 | +Example Response: |
| 136 | +```json |
| 137 | +{ |
| 138 | + "identifiers": [ |
| 139 | + { |
| 140 | + "namespace": ["delta_ns"], |
| 141 | + "name": "delta_table" |
| 142 | + } |
| 143 | + ], |
| 144 | + "next-page-token": null |
| 145 | +} |
| 146 | +``` |
| 147 | + |
| 148 | +### Drop a Generic Table |
| 149 | +The drop generic table REST endpoint is `DELETE /polaris/v1/{prefix}/namespaces/{namespace}/generic-tables/{generic-table}` |
| 150 | + |
| 151 | +The following curl call drops the table `delta_table`: |
| 152 | +```shell |
| 153 | +curl -X DELETE http://localhost:8181/api/catalog/polaris/v1/delta_catalog/namespaces/delta_ns/generic-tables/delta_table |
| 154 | +``` |
| 155 | + |
| 156 | +### API Reference |
| 157 | + |
| 158 | +For the complete and up-to-date API specification, see the [Catalog API Spec](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/apache/polaris/refs/heads/main/spec/generated/bundled-polaris-catalog-service.yaml). |
| 159 | + |
| 160 | +## Limitations |
| 161 | + |
| 162 | +Current limitations of Generic Table support: |
| 163 | +1) Limited spec information. Currently, there is no spec for information like Schema, Partition etc. |
| 164 | +2) No commit coordination or update capability provided at the catalog service level. |
| 165 | + |
| 166 | +Therefore, the catalog itself knows nothing about the underlying table beyond this loosely defined metadata. |
| 167 | +It is the responsibility of the engine (and plugins used by the engine) to determine exactly what loading or committing data |
| 168 | +should look like based on the metadata. For example, with the Delta support, Delta log serialization, deserialization, |
| 169 | +and updates all happen on the client side. |