Duplicate tables with same name in a namespace #1076

@pingtimeout

Describe the bug

The persistence layer allows the creation of multiple tables with the same name within a single namespace, which should not be possible. Additionally, it appears to lose some writes.

To Reproduce

  • Check out this commit: 4cf08d4
  • Run the server using the getting-started docker compose file: docker compose -f getting-started/eclipselink/docker-compose.yml up
  • Export the client ID and secret as environment variables: export CLIENT_ID=root CLIENT_SECRET=s3cr3t
  • Run ./gradlew :polaris-benchmarks:gatlingRun

The simulation creates a catalog named C_0 and a namespace named NS_0, then sends 50 simultaneous table creation requests for a table named T_1000.

Actual Behavior

The Gatling output shows that all 50 table creation requests returned an HTTP 200 OK status.


========================================================================================================================
2025-02-27 09:47:34 UTC                                                                               3s elapsed
---- Requests -----------------------------------------------------------------------|---Total---|-----OK----|----KO----
> Global                                                                             |       103 |       103 |         0
> Authenticate                                                                       |        51 |        51 |         0
> Create Catalog                                                                     |         1 |         1 |         0
> Create Namespace                                                                   |         1 |         1 |         0
> Create Table                                                                       |        50 |        50 |         0

A curl command that lists the tables under namespace NS_0 shows that there are 23 tables with that name.

$ curl \
    -s \
    "http://localhost:8181/api/catalog/v1/C_0/namespaces/NS_0/tables" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $TOKEN" \
  | jq '.identifiers[].name' \
  | wc -l
23

This file is the complete output from GET /api/catalog/v1/C_0/namespaces/NS_0/tables
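
Counting lines only shows how many entries were returned. To confirm that the entries really share one name, the response can be grouped by table name. The following is a minimal sketch, assuming the response body carries an `identifiers` array whose items have a `name` field (as the jq filter above implies) and a `namespace` field. The function name `count_duplicates` and the sample payload are hypothetical and are not taken from the actual server output.

```python
import json
from collections import Counter


def count_duplicates(response_body: str) -> dict[str, int]:
    """Return the table names that appear more than once, with their counts."""
    identifiers = json.loads(response_body).get("identifiers", [])
    counts = Counter(item["name"] for item in identifiers)
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical sample shaped like the list-tables response (not real output).
sample = json.dumps({
    "identifiers": [
        {"namespace": ["NS_0"], "name": "T_1000"},
        {"namespace": ["NS_0"], "name": "T_1000"},
        {"namespace": ["NS_0"], "name": "T_2000"},
    ]
})

print(count_duplicates(sample))
```

Against the real response, the body saved from the curl command above could be passed to `count_duplicates` in place of `sample`.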

Expected Behavior

Only a single POST /api/catalog/v1/C_0/namespaces/NS_0/tables request should succeed. The remaining 49 requests should be rejected with an HTTP 409 (Conflict) status code.
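
The expected outcome can be illustrated with a toy model. This is only a sketch of the semantics, not Polaris code: `InMemoryCatalog`, `create_table` and `run_concurrent_creates` are hypothetical names, and the lock stands in for whatever atomic check-and-insert (for example a uniqueness constraint) the real persistence layer would rely on.

```python
import threading


class InMemoryCatalog:
    """Toy stand-in for a persistence layer; not the Polaris implementation."""

    def __init__(self) -> None:
        self._tables: dict[tuple[str, str], dict] = {}
        self._lock = threading.Lock()

    def create_table(self, namespace: str, name: str) -> int:
        """Return 200 if the table was created, 409 if the name is taken."""
        key = (namespace, name)
        with self._lock:  # existence check and insert form one atomic step
            if key in self._tables:
                return 409
            self._tables[key] = {}
            return 200


def run_concurrent_creates(n: int = 50) -> list[int]:
    """Fire n simultaneous create requests for the same table name."""
    catalog = InMemoryCatalog()
    barrier = threading.Barrier(n)  # release all threads at the same moment
    statuses = [0] * n

    def attempt(i: int) -> None:
        barrier.wait()
        statuses[i] = catalog.create_table("NS_0", "T_1000")

    threads = [threading.Thread(target=attempt, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return statuses


statuses = run_concurrent_creates()
print(statuses.count(200), statuses.count(409))  # prints "1 49"
```

If the existence check and the insert were not atomic, several threads could pass the check before any of them inserted, which matches the kind of behavior reported here.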

Additionally, all 50 POST /api/catalog/v1/C_0/namespaces/NS_0/tables requests succeeded, yet only 23 tables were actually created, which means the server lost 27 writes. In this particular case that is not a big deal, since the situation is invalid to begin with. But it raises the question of whether other writes can be lost under high concurrency.

Additional context

The issue is reproducible fairly consistently. To make it even easier to reproduce, increase log verbosity (e.g. -Dquarkus.log.level=DEBUG -Dquarkus.log.category.\"org.apache.polaris\".level=DEBUG -Dquarkus.log.category.\"org.apache.iceberg.rest\".level=DEBUG -Dquarkus.log.category.\"io.smallrye.config\".level=DEBUG).

System information

No response
