diff --git a/pages/spicedb/concepts/datastores.mdx b/pages/spicedb/concepts/datastores.mdx index 1580e56..e6904ea 100644 --- a/pages/spicedb/concepts/datastores.mdx +++ b/pages/spicedb/concepts/datastores.mdx @@ -6,18 +6,18 @@ In order to reduce operational complexity, SpiceDB leverages existing, popular s AuthZed has standardized our managed services on CockroachDB, but we give self-hosted customers the option to pick the datastore that best suits their operational requirements. -- [CockroachDB](#cockroachdb) - Recomended for self hosted deployments with high throughput and/or multi-region requirements -- [Cloud Spanner](#cloud-spanner) - Recommended for self-hosted Google Cloud deployments -- [PostgreSQL](#postgresql) - Recommended for self-hosted single-region deployments -- [MySQL](#mysql) - Not recommended; only use if you cannot use PostgreSQL -- [memdb](#memdb) - Recommended for local development and integration testing against applications +- [CockroachDB](/spicedb/concepts/datastores/cockroachdb) - Recommended for self-hosted deployments with high throughput and/or multi-region requirements +- [Cloud Spanner](/spicedb/concepts/datastores/cloud-spanner) - Recommended for self-hosted Google Cloud deployments +- [PostgreSQL](/spicedb/concepts/datastores/postgresql) - Recommended for self-hosted single-region deployments +- [MySQL](/spicedb/concepts/datastores/mysql) - Not recommended; only use if you cannot use PostgreSQL +- [memdb](/spicedb/concepts/datastores/memdb) - Recommended for local development and integration testing against applications ## Migrations Before a datastore can be used by SpiceDB or before running a new version of SpiceDB, you must execute all available migrations. - The only exception is the [memdb datastore](#memdb) because it does not persist any data. + The only exception is the [memdb datastore](/spicedb/concepts/datastores/memdb) because it does not persist any data. 
In order to migrate a datastore, run the following command with your desired values: @@ -29,419 +29,3 @@ spicedb migrate head \ ``` For more information, refer to the [Datastore Migration documentation](/spicedb/concepts/datastore-migrations). - -## CockroachDB - -### Usage Notes - - - SpiceDB's Watch API requires CockroachDB's [Experimental Changefeed] to be enabled. - - [Experimental Changefeed]: https://www.cockroachlabs.com/docs/v22.1/changefeed-for - - -- Recommended for multi-region deployments, with configurable region awareness -- Enables horizontal scalability by adding more SpiceDB and CockroachDB instances -- Resiliency to individual CockroachDB instance failures -- Query and data balanced across the CockroachDB -- Setup and operational complexity of running CockroachDB - -### Developer Notes - -- Code can be found [here][crdb-code] -- Documentation can be found [here][crdb-godoc] -- Implemented using [pgx][pgx] for a SQL driver and connection pooling -- Has a native changefeed -- Stores migration revisions using the same strategy as [Alembic][alembic] - -[crdb-code]: https://github.com/authzed/spicedb/tree/main/internal/datastore/crdb -[crdb-godoc]: https://pkg.go.dev/github.com/authzed/spicedb/internal/datastore/crdb -[pgx]: https://pkg.go.dev/gopkg.in/jackc/pgx.v3 -[alembic]: https://alembic.sqlalchemy.org/en/latest/ - -### Configuration - -#### Required Parameters - -| Parameter | Description | Example | -| -------------------- | ----------------------------------------- | ----------------------------------------------------------------------------------------- | -| `datastore-engine` | the datastore engine | `--datastore-engine=cockroachdb` | -| `datastore-conn-uri` | connection string used to connect to CRDB | `--datastore-conn-uri="postgres://user:password@localhost:26257/spicedb?sslmode=disable"` | - -#### Optional Parameters - -| Parameter | Description | Example | -| ---------------------------------------- | 
------------------------------------------------------------------------------------- | ----------------------------------------------- | -| `datastore-max-tx-retries` | Maximum number of times to retry a query before raising an error | `--datastore-max-tx-retries=50` | -| `datastore-tx-overlap-strategy` | The overlap strategy to prevent New Enemy on CRDB (see below) | `--datastore-tx-overlap-strategy=static` | -| `datastore-tx-overlap-key` | The key to use for the overlap strategy (see below) | `--datastore-tx-overlap-key="foo"` | -| `datastore-conn-pool-read-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-idletime=30m0s` | -| `datastore-conn-pool-read-max-lifetime` | Maximum amount of time a connection can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-lifetime=30m0s` | -| `datastore-conn-pool-read-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | `--datastore-conn-pool-read-max-lifetime-jitter=6m` | -| `datastore-conn-pool-read-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 20) | `--datastore-conn-pool-read-max-open=20` | -| `datastore-conn-pool-read-min-open` | Number of minimum concurrent connections open in a remote datastore's connection pool (default 20) | `--datastore-conn-pool-read-min-open=20` | -| `datastore-conn-pool-write-healthcheck-interval`| Amount of time between connection health checks in a remote datastore's connection pool (default 30s)| `--datastore-conn-pool-write-healthcheck-interval=30s` | -| `datastore-conn-pool-write-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-idletime=30m0s` | -| `datastore-conn-pool-write-max-lifetime` | Maximum amount of time a connection 
can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-lifetime=30m0s` | -| `datastore-conn-pool-write-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | `--datastore-conn-pool-write-max-lifetime-jitter=6m` | -| `datastore-conn-pool-write-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-max-open=10` | -| `datastore-conn-pool-write-min-open` | Number of minimum concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-min-open=10` | -| `datastore-query-split-size` | The (estimated) query size at which to split a query into multiple queries | `--datastore-query-split-size=5kb` | -| `datastore-gc-window` | Sets the window outside of which overwritten relationships are no longer accessible | `--datastore-gc-window=1s` | -| `datastore-revision-fuzzing-duration` | Sets a fuzzing window on all zookies/zedtokens | `--datastore-revision-fuzzing-duration=50ms` | -| `datastore-readonly` | Places the datastore into readonly mode | `--datastore-readonly=true` | -| `datastore-follower-read-delay-duration` | Amount of time to subtract from non-sync revision timestamps to ensure follower reads | `--datastore-follower-read-delay-duration=4.8s` | -| `datastore-relationship-integrity-enabled` | Enables relationship integrity checks, only supported on CRDB | `--datastore-relationship-integrity-enabled=false` | -| `datastore-relationship-integrity-current-key-id` | Current key id for relationship integrity checks | `--datastore-relationship-integrity-current-key-id="foo"` | -| `datastore-relationship-integrity-current-key-filename` | Current key filename for relationship integrity checks | `--datastore-relationship-integrity-current-key-filename="foo"` | -| `datastore-relationship-integrity-expired-keys` | Config for expired keys for 
relationship integrity checks | `--datastore-relationship-integrity-expired-keys="foo"` | - -#### Overlap Strategy - -In distributed systems, you can trade-off consistency for performance. - -CockroachDB datastore users that are willing to rely on more subtle guarantees to mitigate the [New Enemy Problem] can configure `--datastore-tx-overlap-strategy`. - -[New Enemy Problem]: /spicedb/concepts/zanzibar#new-enemy-problem - -The available strategies are: - -| Strategy | Description | -| --- | --- | -| `static` (default) | All writes overlap to guarantee safety at the cost of write throughput | -| `prefix` | Only writes that contain objects with same prefix overlap (e.g. `tenant1/user` and `tenant2/user` can be written in concurrently) | -| `request` | Only writes with the same `io.spicedb.requestoverlapkey` header overlap enabling applications to decide on-the-fly which writes have causual dependencies. Writes without any header act the same as `insecure`. | -| `insecure` | No writes overlap, providing the best write throughput, but possibly leaving you vulnerable to the [New Enemy Problem] | - -For more information, refer to the [CockroachDB datastore README][crdb-readme] or our blog post "[The One Crucial Difference Between Spanner and CockroachDB][crdb-blog]". - -[crdb-readme]: https://github.com/authzed/spicedb/blob/main/internal/datastore/crdb/README.md -[crdb-blog]: https://authzed.com/blog/prevent-newenemy-cockroachdb - -#### Garbage Collection Window - - - As of February 2023, the [default garbage collection window] has changed to `1.25 hours` for CockroachDB Serverless and `4 hours` for CockroachDB Dedicated. - - [default garbage collection window]: https://github.com/cockroachdb/cockroach/issues/89233 - - -SpiceDB warns if the garbage collection window as configured in CockroachDB is smaller than the SpiceDB configuration. 
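Before raising SpiceDB's `--datastore-gc-window`, it can help to confirm what window CockroachDB is currently configured with. A sketch using standard CockroachDB SQL (the relevant field is `gc.ttlseconds`):

```sql
-- Inspect the default replication zone; compare gc.ttlseconds against
-- the value passed to SpiceDB's --datastore-gc-window flag.
SHOW ZONE CONFIGURATION FOR RANGE default;
```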
- -If you need a longer time window for the Watch API or querying at exact snapshots, you can [adjust the value in CockroachDB][crdb-gc]: - -```sql -ALTER ZONE default CONFIGURE ZONE USING gc.ttlseconds = 90000; -``` - -[crdb-gc]: https://www.cockroachlabs.com/docs/stable/configure-replication-zones.html#replication-zone-variables - -#### Relationship Integrity - -Relationship Integrity is a new experimental feature in SpiceDB that ensures that data written into the supported backing datastores (currently: only CockroachDB) is validated as having been written by SpiceDB itself. - -- **What does relationship integrity ensure?** -Relationship integrity primarily ensures that all relationships written into the backing datastore were written via a trusted instance of SpiceDB or that the caller has access to the key(s) necessary to write those relationships. -It ensures that if someone gains access to the underlying datastore, they cannot simply write new relationships of their own invention. - -- **What does relationship integrity *not* ensure?** -Since the relationship integrity feature signs each individual relationship, it does not ensure that removal of relationships is by a trusted party. -Schema is also currently unverified, so an untrusted party could change it as well. -Support for schema changes will likely come in a future version. - -##### Setting up relationship integrity - -To run with relationship integrity, new flags must be given to SpiceDB: - -```zed -spicedb serve ...existing flags... ---datastore-relationship-integrity-enabled ---datastore-relationship-integrity-current-key-id="somekeyid" ---datastore-relationship-integrity-current-key-filename="some.key" -``` - -Place the generated key contents (which must support an HMAC key) in `some.key` - -##### Deployment Process - -1. Start with a **clean** datastore for SpiceDB. **At this time, migrating an existing SpiceDB installation is not supported.** -2. 
Run the standard `migrate` command but with relationship integrity flags included. -3. Run SpiceDB with the relationship integrity flags included. - -## Cloud Spanner - -### Usage Notes - -- Requires a Google Cloud Account with an active Cloud Spanner instance -- Take advantage of Google's TrueTime. - The Spanner driver assumes the database is linearizable and skips the transaction overlap strategy required by CockroachDB. - -### Developer Notes - -- Code can be found [here][spanner-code] -- Documentation can be found [here][spanner-godoc] -- Starts a background [GC worker][gc-process] to clean up old entries from the manually-generated changelog table - -[spanner-code]: https://github.com/authzed/spicedb/tree/main/internal/datastore/spanner -[spanner-godoc]: https://pkg.go.dev/github.com/authzed/spicedb/internal/datastore/spanner -[gc-process]: https://github.com/authzed/spicedb/blob/main/internal/datastore/common/gc.go - -### Configuration - -- The [Cloud Spanner docs][spanner-docs] outline how to set up an instance -- Authentication via service accounts: The service account that runs migrations must have `Cloud Spanner Database Admin`; SpiceDB (non-migrations) must have `Cloud Spanner Database User`. 
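The role assignments above can be granted with `gcloud`; a sketch, assuming hypothetical service-account and resource names (substitute your own project, instance, database, and accounts):

```shell
# Migration service account: needs Cloud Spanner Database Admin.
gcloud spanner databases add-iam-policy-binding database-id \
  --instance=instance-id \
  --member="serviceAccount:spicedb-migrate@project-id.iam.gserviceaccount.com" \
  --role="roles/spanner.databaseAdmin"

# Serving (non-migration) service account: Cloud Spanner Database User suffices.
gcloud spanner databases add-iam-policy-binding database-id \
  --instance=instance-id \
  --member="serviceAccount:spicedb@project-id.iam.gserviceaccount.com" \
  --role="roles/spanner.databaseUser"
```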
- -[spanner-docs]: https://cloud.google.com/spanner - -#### Required Parameters - -| Parameter | Description | Example | -| -------------------- | ------------------------------------- | ---------------------------------------------------------------------------------------- | -| `datastore-engine` | the datastore engine | `--datastore-engine=spanner` | -| `datastore-conn-uri` | the cloud spanner database identifier | `--datastore-conn-uri="projects/project-id/instances/instance-id/databases/database-id"` | - -#### Optional Parameters - -| Parameter | Description | Example | -| ---------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------ | -| `datastore-spanner-credentials` | JSON service account token (omit to use [application default credentials](https://cloud.google.com/docs/authentication/production)) | `--datastore-spanner-credentials=./spanner.json` | -| `datastore-gc-interval` | Amount of time to wait between garbage collection passes | `--datastore-gc-interval=3m` | -| `datastore-gc-window` | Sets the window outside of which overwritten relationships are no longer accessible | `--datastore-gc-window=1s` | -| `datastore-revision-fuzzing-duration` | Sets a fuzzing window on all zookies/zedtokens | `--datastore-revision-fuzzing-duration=50ms` | -| `datastore-readonly` | Places the datastore into readonly mode | `--datastore-readonly=true` | -| `datastore-follower-read-delay-duration` | Amount of time to subtract from non-sync revision timestamps to ensure stale reads | `--datastore-follower-read-delay-duration=4.8s` | - -## PostgreSQL - -### Usage Notes - -- Recommended for single-region deployments -- Postgres 15 or newer is required for optimal performance -- Resiliency to failures only when PostgreSQL is operating with a follower and proper failover -- Carries setup and operational complexity of 
running PostgreSQL -- Does not rely on any non-standard PostgreSQL extensions -- Compatible with managed PostgreSQL services (e.g. AWS RDS) -- Can be scaled out on read workloads using read replicas - - - SpiceDB's Watch API requires PostgreSQL's [Commit Timestamp tracking][commit-ts] to be enabled. - - This can be done by providing the `--track_commit_timestamp=on` flag, configuring `postgresql.conf`, or executing `ALTER SYSTEM SET track_commit_timestamp = on;` and restarting the instance. - - [commit-ts]: https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-TRACK-COMMIT-TIMESTAMP - - -### Developer Notes - -- Code can be found [here][pg-code] -- Documentation can be found [here][pg-godoc] -- Implemented using [pgx][pgx] for a SQL driver and connection pooling -- Stores migration revisions using the same strategy as [Alembic][alembic] -- Implements its own [MVCC][mvcc] model by storing its data with transaction IDs - -[pg-code]: https://github.com/authzed/spicedb/tree/main/internal/datastore/postgres -[pg-godoc]: https://pkg.go.dev/github.com/authzed/spicedb/internal/datastore/postgres -[mvcc]: https://en.wikipedia.org/wiki/Multiversion_concurrency_control - -### Read Replicas - -SpiceDB supports Postgres read replicas and does it while retaining consistency guarantees. -Typical use cases are: - -- scale read workloads/offload reads from the primary -- deploy SpiceDB in other regions with primarily read workloads - -Read replicas are typically configured with asynchronous replication, which involves replication lag. -That would be problematic to SpiceDB's ability to solve the new enemy problem but it addresses the challenge by checking if a revision has been replicated into the target replica. -If missing, it will fall back to the primary. - -All API consistency options will leverage replicas, but the ones that benefit the most are those that involve some level of staleness as it increases the odds a revision has replicated. 
-`minimize_latency`, `at_least_as_fresh`, and `at_exact_snapshot` consistency modes have the highest chance of being redirected to a replica. - -SpiceDB supports Postgres replicas behind a load-balancer, and/or individually listing replica hosts. -When multiple URIs are provided, they will be queried using a round-robin strategy. -Please note that the maximum number of replica URIs to list is 16. - -Read replicas are configured with the `--datastore-read-replica-*` family of flags. - -SpiceDB supports [PgBouncer](https://www.pgbouncer.org/) connection pooler and is part of the test suite. - -#### Transaction IDs and MVCC - -The Postgres implementation of SpiceDB's internal MVCC mechanism involves storing the internal transaction ID count associated with a given transaction -in the rows written in that transaction. -Because this counter is instance-specific, there are ways in which the data in the datastore can become desynced with that internal counter. -Two concrete examples are the use of `pg_dump` and `pg_restore` to transfer data between an old instance and a new instance and setting up -logical replication between a previously-existing instance and a newly-created instance. - -If you encounter this, SpiceDB can behave as though there is no schema written, because the data (including the schema) is associated with a future transaction ID and therefore isn't "visible" to Spicedb. 
-If you run into this issue, the fix is [documented here](https://authzed.com/docs/spicedb/concepts/datastores#transaction-ids-and-mvcc) - -### Configuration - -#### Required Parameters - -| Parameter | Description | Example | -| -------------------- | ----------------------------------------------- | -------------------------------------------------------------------------------------------- | -| `datastore-engine` | the datastore engine | `--datastore-engine=postgres` | -| `datastore-conn-uri` | connection string used to connect to PostgreSQL | `--datastore-conn-uri="postgres://postgres:password@localhost:5432/spicedb?sslmode=disable"` | - -#### Optional Parameters - -| Parameter | Description | Example | -| ------------------------------------- | ----------------------------------------------------------------------------------- | -------------------------------------------- | -| `datastore-conn-pool-read-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-idletime=30m0s` | -| `datastore-conn-pool-read-max-lifetime` | Maximum amount of time a connection can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-lifetime=30m0s` | -| `datastore-conn-pool-read-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | `--datastore-conn-pool-read-max-lifetime-jitter=6m` | -| `datastore-conn-pool-read-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 20) | `--datastore-conn-pool-read-max-open=20` | -| `datastore-conn-pool-read-min-open` | Number of minimum concurrent connections open in a remote datastore's connection pool (default 20) | `--datastore-conn-pool-read-min-open=20` | -| `datastore-conn-pool-write-healthcheck-interval`| Amount of time between connection health checks in a remote datastore's connection pool 
(default 30s)| `--datastore-conn-pool-write-healthcheck-interval=30s` | -| `datastore-conn-pool-write-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-idletime=30m0s` | -| `datastore-conn-pool-write-max-lifetime` | Maximum amount of time a connection can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-lifetime=30m0s` | -| `datastore-conn-pool-write-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | `--datastore-conn-pool-write-max-lifetime-jitter=6m` | -| `datastore-conn-pool-write-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-max-open=10` | -| `datastore-conn-pool-write-min-open` | Number of minimum concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-min-open=10` | -| `datastore-query-split-size` | The (estimated) query size at which to split a query into multiple queries | `--datastore-query-split-size=5kb` | -| `datastore-gc-window` | Sets the window outside of which overwritten relationships are no longer accessible | `--datastore-gc-window=1s` | -| `datastore-revision-fuzzing-duration` | Sets a fuzzing window on all zookies/zedtokens | `--datastore-revision-fuzzing-duration=50ms` | -| `datastore-readonly` | Places the datastore into readonly mode | `--datastore-readonly=true` | -| `datastore-read-replica-conn-uri` | Connection string used by datastores for read replicas; only supported for postgres and MySQL | `--datastore-read-replica-conn-uri="postgres://postgres:password@localhost:5432/spicedb\"` | -| `datastore-read-replica-credentials-provider-name` | Retrieve datastore credentials dynamically using aws-iam | | -| `datastore-read-replica-conn-pool-read-healthcheck-interval` | amount of time 
between connection health checks in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-healthcheck-interval=30s` | -| `datastore-read-replica-conn-pool-read-max-idletime` | maximum amount of time a connection can idle in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-idletime=30m` | -| `datastore-read-replica-conn-pool-read-max-lifetime` | maximum amount of time a connection can live in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-lifetime=30m` | -| `datastore-read-replica-conn-pool-read-max-lifetime-jitter` | waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection to a read replica(default: 20% of max lifetime) | `--datastore-read-replica-conn-pool-read-max-lifetime-jitter=6m` | -| `datastore-read-replica-conn-pool-read-max-open` | number of concurrent connections open in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-open=20` | -| `datastore-read-replica-conn-pool-read-min-open` | number of minimum concurrent connections open in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-min-open=20` | - -## MySQL - -### Usage Notes - -- Recommended for single-region deployments -- Setup and operational complexity of running MySQL -- Does not rely on any non-standard MySQL extensions -- Compatible with managed MySQL services -- Can be scaled out on read workloads using read replicas - -### Developer Notes - -- Code can be found [here][mysql-code] -- Documentation can be found [here][mysql-godoc] -- Implemented using [Go-MySQL-Driver][go-mysql-driver] for a SQL driver -- Query optimizations are documented [here][mysql-executor] -- Implements its own [MVCC][mysql-mvcc] model by storing its data with transaction IDs - -[mysql-code]: https://github.com/authzed/spicedb/tree/main/internal/datastore/mysql 
-[mysql-godoc]: https://pkg.go.dev/github.com/authzed/spicedb/internal/datastore/mysql -[go-mysql-driver]: https://github.com/go-sql-driver/mysql -[mysql-executor]: https://github.com/authzed/spicedb/blob/main/internal/datastore/mysql/datastore.go#L317 -[mysql-mvcc]: https://en.wikipedia.org/wiki/Multiversion_concurrency_control - -### Read Replicas - - -Do not use a load balancer between SpiceDB and MySQL replicas because SpiceDB will not be able to maintain consistency guarantees. - - -SpiceDB supports MySQL read replicas and does it while retaining consistency guarantees. -Typical use cases are: - -- scale read workloads/offload reads from the primary -- deploy SpiceDB in other regions with primarily read workloads - -Read replicas are typically configured with asynchronous replication, which involves replication lag. -That would be problematic to SpiceDB's ability to solve the new enemy problem but it addresses the challenge by checking if the revision has been replicated into the target replica. -If missing, it will fall back to the primary. -Compared to the Postgres implementation, MySQL support requires two roundtrips instead of one, which means it adds some extra latency overhead. - -All API consistency options will leverage replicas, but the ones that benefit the most are those that involve some level of staleness as it increases the odds a revision has replicated. -`minimize_latency`, `at_least_as_fresh`, and `at_exact_snapshot` consistency modes have the highest chance of being redirected to a replica. - -SpiceDB does not support MySQL replicas behind a load-balancer: you may only list replica hosts individually. -Failing to adhere to this would compromise consistency guarantees. -When multiple host URIs are provided, they will be queried using round-robin. -Please note that the maximum number of MySQL replica host URIs to list is 16. - -Read replicas are configured with the `--datastore-read-replica-*` family of flags. 
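Putting the notes above together, a sketch of serving SpiceDB against a MySQL primary with two individually listed replicas. Hostnames, credentials, and the preshared key are placeholders, and this assumes the read-replica flag may be repeated once per replica host:

```shell
spicedb serve \
  --grpc-preshared-key "your-preshared-key" \
  --datastore-engine=mysql \
  --datastore-conn-uri="user:password@(primary:3306)/spicedb?parseTime=True" \
  --datastore-read-replica-conn-uri="user:password@(replica1:3306)/spicedb?parseTime=True" \
  --datastore-read-replica-conn-uri="user:password@(replica2:3306)/spicedb?parseTime=True"
```

Note that each replica is listed directly, with no load balancer in front, per the warning above.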
- -### Configuration - -#### Required Parameters - -| Parameter | Description | Example | -| -------------------- | ------------------------------------------ | --------------------------------------------------------------------------------------------- | -| `datastore-engine` | the datastore engine | `--datastore-engine=mysql` | -| `datastore-conn-uri` | connection string used to connect to MySQL | `--datastore-conn-uri="user:password@(localhost:3306)/spicedb?parseTime=True"` | - - - SpiceDB requires `--datastore-conn-uri` to contain the query parameter `parseTime=True`. - - -#### Optional Parameters - -| Parameter | Description | Example | -| ------------------------------------- | ----------------------------------------------------------------------------------- | -------------------------------------------- | -| `datastore-conn-pool-read-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-idletime=30m0s` | -| `datastore-conn-pool-read-max-lifetime` | Maximum amount of time a connection can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-lifetime=30m0s` | -| `datastore-conn-pool-read-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | `--datastore-conn-pool-read-max-lifetime-jitter=6m` | -| `datastore-conn-pool-read-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 20) | `--datastore-conn-pool-read-max-open=20` | -| `datastore-conn-pool-read-min-open` | Number of minimum concurrent connections open in a remote datastore's connection pool (default 20) | `--datastore-conn-pool-read-min-open=20` | -| `datastore-conn-pool-write-healthcheck-interval`| Amount of time between connection health checks in a remote datastore's connection pool (default 30s)| 
`--datastore-conn-pool-write-healthcheck-interval=30s` | -| `datastore-conn-pool-write-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-idletime=30m0s` | -| `datastore-conn-pool-write-max-lifetime` | Maximum amount of time a connection can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-lifetime=30m0s` | -| `datastore-conn-pool-write-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | `--datastore-conn-pool-write-max-lifetime-jitter=6m` | -| `datastore-conn-pool-write-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-max-open=10` | -| `datastore-conn-pool-write-min-open` | Number of minimum concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-min-open=10` | -| `datastore-query-split-size` | The (estimated) query size at which to split a query into multiple queries | `--datastore-query-split-size=5kb` | -| `datastore-gc-window` | Sets the window outside of which overwritten relationships are no longer accessible | `--datastore-gc-window=1s` | -| `datastore-revision-fuzzing-duration` | Sets a fuzzing window on all zookies/zedtokens | `--datastore-revision-fuzzing-duration=50ms` | -| `datastore-mysql-table-prefix string` | Prefix to add to the name of all SpiceDB database tables | `--datastore-mysql-table-prefix=spicedb` | -| `datastore-readonly` | Places the datastore into readonly mode | `--datastore-readonly=true` | -| `datastore-read-replica-conn-uri` | Connection string used by datastores for read replicas; only supported for postgres and MySQL | `--datastore-read-replica-conn-uri="postgres://postgres:password@localhost:5432/spicedb\"` | -| `—datastore-read-replica-credentials-provider-name` | Retrieve 
datastore credentials dynamically using aws-iam | | -| `datastore-read-replica-conn-pool-read-healthcheck-interval` | amount of time between connection health checks in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-healthcheck-interval=30s` | -| `datastore-read-replica-conn-pool-read-max-idletime` | maximum amount of time a connection can idle in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-idletime=30m` | -| `datastore-read-replica-conn-pool-read-max-lifetime` | maximum amount of time a connection can live in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-lifetime=30m` | -| `datastore-read-replica-conn-pool-read-max-lifetime-jitter` | waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection to a read replica(default: 20% of max lifetime) | `--datastore-read-replica-conn-pool-read-max-lifetime-jitter=6m` | -| `datastore-read-replica-conn-pool-read-max-open` | number of concurrent connections open in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-open=20` | -| `datastore-read-replica-conn-pool-read-min-open` | number of minimum concurrent connections open in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-min-open=20` | - -## memdb - -### Usage Notes - -- Fully ephemeral; *all* data is lost when the process is terminated -- Intended for usage with SpiceDB itself and testing application integrations -- Cannot be ran highly-available as multiple instances will not share the same in-memory data - - - If you need an ephemeral datastore designed for validation or testing, see the test server system in [Validating and Testing] - - -[validating and testing]: /spicedb/modeling/validation-testing-debugging - -### Developer Notes - -- Code can be found [here][memdb-code] -- Documentation can be found 
[here][memdb-godoc] -- Implements its own [MVCC][mvcc] model by storing its data with transaction IDs - -[memdb-code]: https://github.com/authzed/spicedb/tree/main/internal/datastore/memdb -[memdb-godoc]: https://pkg.go.dev/github.com/authzed/spicedb/internal/datastore/memdb - -### Configuration - -#### Required Parameters - -| Parameter | Description | Example | -| ------------------ | -------------------- | --------------------------- | -| `datastore-engine` | the datastore engine | `--datastore-engine memory` | - -#### Optional Parameters - -| Parameter | Description | Example | -| ------------------------------------- | ----------------------------------------------------------------------------------- | -------------------------------------------- | -| `datastore-revision-fuzzing-duration` | Sets a fuzzing window on all zookies/zedtokens | `--datastore-revision-fuzzing-duration=50ms` | -| `datastore-gc-window` | Sets the window outside of which overwritten relationships are no longer accessible | `--datastore-gc-window=1s` | -| `datastore-readonly` | Places the datastore into readonly mode | `--datastore-readonly=true` | diff --git a/pages/spicedb/concepts/datastores/cloud-spanner.mdx b/pages/spicedb/concepts/datastores/cloud-spanner.mdx new file mode 100644 index 0000000..12abcec --- /dev/null +++ b/pages/spicedb/concepts/datastores/cloud-spanner.mdx @@ -0,0 +1,44 @@ +import { Callout } from 'nextra/components' + +# Cloud Spanner + +## Usage Notes + +- Requires a Google Cloud Account with an active Cloud Spanner instance +- Take advantage of Google's TrueTime. + The Spanner driver assumes the database is linearizable and skips the transaction overlap strategy required by CockroachDB. 
+ +## Developer Notes + +- Code can be found [here][spanner-code] +- Documentation can be found [here][spanner-godoc] +- Starts a background [GC worker][gc-process] to clean up old entries from the manually-generated changelog table + +[spanner-code]: https://github.com/authzed/spicedb/tree/main/internal/datastore/spanner +[spanner-godoc]: https://pkg.go.dev/github.com/authzed/spicedb/internal/datastore/spanner +[gc-process]: https://github.com/authzed/spicedb/blob/main/internal/datastore/common/gc.go + +## Configuration + +- The [Cloud Spanner docs][spanner-docs] outline how to set up an instance +- Authentication via service accounts: The service account that runs migrations must have `Cloud Spanner Database Admin`; SpiceDB (non-migrations) must have `Cloud Spanner Database User`. + +[spanner-docs]: https://cloud.google.com/spanner + +### Required Parameters + +| Parameter | Description | Example | +| -------------------- | ------------------------------------- | ---------------------------------------------------------------------------------------- | +| `datastore-engine` | the datastore engine | `--datastore-engine=spanner` | +| `datastore-conn-uri` | the cloud spanner database identifier | `--datastore-conn-uri="projects/project-id/instances/instance-id/databases/database-id"` | + +### Optional Parameters + +| Parameter | Description | Example | +| ---------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------ | +| `datastore-spanner-credentials` | JSON service account token (omit to use [application default credentials](https://cloud.google.com/docs/authentication/production)) | `--datastore-spanner-credentials=./spanner.json` | +| `datastore-gc-interval` | Amount of time to wait between garbage collection passes | `--datastore-gc-interval=3m` | +| `datastore-gc-window` | Sets the window outside of 
which overwritten relationships are no longer accessible | `--datastore-gc-window=1s` |
+| `datastore-revision-fuzzing-duration` | Sets a fuzzing window on all zookies/zedtokens | `--datastore-revision-fuzzing-duration=50ms` |
+| `datastore-readonly` | Places the datastore into readonly mode | `--datastore-readonly=true` |
+| `datastore-follower-read-delay-duration` | Amount of time to subtract from non-sync revision timestamps to ensure stale reads | `--datastore-follower-read-delay-duration=4.8s` |
diff --git a/pages/spicedb/concepts/datastores/cockroachdb.mdx b/pages/spicedb/concepts/datastores/cockroachdb.mdx
new file mode 100644
index 0000000..5ca35ab
--- /dev/null
+++ b/pages/spicedb/concepts/datastores/cockroachdb.mdx
@@ -0,0 +1,276 @@
+import { Callout } from 'nextra/components'
+
+# CockroachDB
+
+## Usage Notes
+
+
+  SpiceDB's Watch API requires CockroachDB's [Experimental Changefeed] to be enabled.
+
+  [Experimental Changefeed]: https://www.cockroachlabs.com/docs/v22.1/changefeed-for
+
+
+- Recommended for multi-region deployments, with configurable region awareness
+- Enables horizontal scalability by adding more SpiceDB and CockroachDB instances
+- Resiliency to individual CockroachDB instance failures
+- Queries and data balanced across the CockroachDB cluster
+- Setup and operational complexity of running CockroachDB
+
+## Developer Notes
+
+- Code can be found [here][crdb-code]
+- Documentation can be found [here][crdb-godoc]
+- Implemented using [pgx][pgx] for a SQL driver and connection pooling
+- Has a native changefeed
+- Stores migration revisions using the same strategy as [Alembic][alembic]
+
+[crdb-code]: https://github.com/authzed/spicedb/tree/main/internal/datastore/crdb
+[crdb-godoc]: https://pkg.go.dev/github.com/authzed/spicedb/internal/datastore/crdb
+[pgx]: https://pkg.go.dev/gopkg.in/jackc/pgx.v3
+[alembic]: https://alembic.sqlalchemy.org/en/latest/
+
+## Configuration
+
+### Required Parameters
+
+| Parameter | Description | Example |
+| 
-------------------- | ----------------------------------------- | ----------------------------------------------------------------------------------------- | +| `datastore-engine` | the datastore engine | `--datastore-engine=cockroachdb` | +| `datastore-conn-uri` | connection string used to connect to CRDB | `--datastore-conn-uri="postgres://user:password@localhost:26257/spicedb?sslmode=disable"` | + +### Optional Parameters + +| Parameter | Description | Example | +| ---------------------------------------- | ------------------------------------------------------------------------------------- | ----------------------------------------------- | +| `datastore-max-tx-retries` | Maximum number of times to retry a query before raising an error | `--datastore-max-tx-retries=50` | +| `datastore-tx-overlap-strategy` | The overlap strategy to prevent New Enemy on CRDB (see below) | `--datastore-tx-overlap-strategy=static` | +| `datastore-tx-overlap-key` | The key to use for the overlap strategy (see below) | `--datastore-tx-overlap-key="foo"` | +| `datastore-conn-pool-read-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-idletime=30m0s` | +| `datastore-conn-pool-read-max-lifetime` | Maximum amount of time a connection can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-lifetime=30m0s` | +| `datastore-conn-pool-read-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | `--datastore-conn-pool-read-max-lifetime-jitter=6m` | +| `datastore-conn-pool-read-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 20) | `--datastore-conn-pool-read-max-open=20` | +| `datastore-conn-pool-read-min-open` | Number of minimum concurrent connections open in a remote datastore's connection pool (default 20) | 
`--datastore-conn-pool-read-min-open=20` | +| `datastore-conn-pool-write-healthcheck-interval`| Amount of time between connection health checks in a remote datastore's connection pool (default 30s)| `--datastore-conn-pool-write-healthcheck-interval=30s` | +| `datastore-conn-pool-write-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-idletime=30m0s` | +| `datastore-conn-pool-write-max-lifetime` | Maximum amount of time a connection can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-lifetime=30m0s` | +| `datastore-conn-pool-write-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | `--datastore-conn-pool-write-max-lifetime-jitter=6m` | +| `datastore-conn-pool-write-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-max-open=10` | +| `datastore-conn-pool-write-min-open` | Number of minimum concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-min-open=10` | +| `datastore-query-split-size` | The (estimated) query size at which to split a query into multiple queries | `--datastore-query-split-size=5kb` | +| `datastore-gc-window` | Sets the window outside of which overwritten relationships are no longer accessible | `--datastore-gc-window=1s` | +| `datastore-revision-fuzzing-duration` | Sets a fuzzing window on all zookies/zedtokens | `--datastore-revision-fuzzing-duration=50ms` | +| `datastore-readonly` | Places the datastore into readonly mode | `--datastore-readonly=true` | +| `datastore-follower-read-delay-duration` | Amount of time to subtract from non-sync revision timestamps to ensure follower reads | `--datastore-follower-read-delay-duration=4.8s` | +| `datastore-relationship-integrity-enabled` | Enables 
relationship integrity checks, only supported on CRDB | `--datastore-relationship-integrity-enabled=false` |
+| `datastore-relationship-integrity-current-key-id` | Current key id for relationship integrity checks | `--datastore-relationship-integrity-current-key-id="foo"` |
+| `datastore-relationship-integrity-current-key-filename` | Current key filename for relationship integrity checks | `--datastore-relationship-integrity-current-key-filename="foo"` |
+| `datastore-relationship-integrity-expired-keys` | Config for expired keys for relationship integrity checks | `--datastore-relationship-integrity-expired-keys="foo"` |
+
+## Understanding the New Enemy Problem with CockroachDB
+
+CockroachDB is a Spanner-like datastore supporting global, immediate consistency, with the mantra "no stale reads."
+The CockroachDB implementation should be used when your SpiceDB service runs in multiple geographic regions and Google's Cloud Spanner is unavailable (e.g. AWS, Azure, bare metal).
+
+In order to prevent the new-enemy problem, we need to make related transactions overlap.
+We do this by choosing a common database key and writing to that key with all relationships that may overlap.
+This tradeoff is cataloged in our blog post "[The One Crucial Difference Between Spanner and CockroachDB][crdb-blog]", and means we are trading off write throughput for consistency.
+
+[crdb-blog]: https://authzed.com/blog/prevent-newenemy-cockroachdb
+[New Enemy Problem]: /spicedb/concepts/zanzibar#new-enemy-problem
+
+### Overlap Strategies
+
+CockroachDB datastore users who are willing to rely on more subtle guarantees to mitigate the [New Enemy Problem] can configure the overlap strategy with the flag `--datastore-tx-overlap-strategy`.
+
+The available strategies are:
+
+| Strategy | Description |
+| --- | --- |
+| `static` (default) | All writes overlap to protect against the [New Enemy Problem] at the cost of write throughput |
+| `prefix` | Only writes that contain objects with the same prefix overlap (e.g. `tenant1/user` and `tenant2/user` can be written concurrently) |
+| `request` | Only writes with the same `io.spicedb.requestoverlapkey` gRPC request header overlap, enabling applications to decide on-the-fly which writes have causal dependencies. Writes without any header act the same as `insecure`. |
+| `insecure` | No writes overlap, providing the best write throughput, but not protecting against the [New Enemy Problem] |
+
+Depending on your application, `insecure` may be acceptable, and it avoids the performance cost associated with the `static` and `prefix` options.
+If the [New Enemy Problem] is not a concern for your application, consider using the `insecure` strategy.
+
+### When is `insecure` overlap a problem?
+
+Using the `insecure` overlap strategy for SpiceDB with CockroachDB means that it is _possible_ that timestamps for two subsequent writes will be out of order.
+When this happens, it's _possible_ for the [New Enemy Problem] to occur.
+
+Let's look at how likely this is, and what the impact might actually be for your workload.
+
+### When can timestamps be reversed?
+
+Before we look at how this can impact an application, let's first understand when and how timestamps can be reversed in the first place.
+
+- When two writes are made in short succession against CockroachDB
+- And those two writes hit two different gateway nodes
+- And the CRDB gateway node clocks have a delta `D`
+- And the writes touch disjoint sets of relationships
+- And those two writes are sent within the time delta `D` between the gateway nodes
+- And the writes land in ranges whose followers are disjoint sets of nodes
+- And other independent CockroachDB processes (heartbeats, etc.) haven't coincidentally synced the gateway node clocks during the writes.
+
+Then it's possible that the second write will be assigned a timestamp earlier than the first write.
+In the next section we'll look at whether that matters for your application, but for now let's look at what makes the above conditions more or less likely:
+
+- **Clock skew**: A larger clock skew gives a bigger window in which timestamps can be reversed - note that CRDB enforces a max offset between clocks, and getting within some fraction of that max offset will kick the node from the cluster.
+- **Network congestion**: or anything that interferes with node heartbeating - this increases the length of time that clocks can be desynchronized before Cockroach notices and syncs them back up.
+- **Cluster size**: When there are many nodes, it is more likely that a write to one range will not have follower nodes that overlap with the followers of a write to another range, and it also makes it more likely that the two writes will have different gateway nodes.
+  On the other hand, a 3-node cluster with `replicas: 3` means that all writes will sync clocks on all nodes.
+- **Write rate**: If the write rate is high, it's more likely that two writes will hit the conditions to have reversed timestamps.
+  If writes only happen once every max offset period for the cluster, it's impossible for their timestamps to be reversed.
+
+The likelihood of a timestamp reversal depends on the CockroachDB cluster and the application's usage patterns.
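The interaction of the conditions above reduces to two gateway nodes stamping writes with skewed clocks. This toy model (purely illustrative; not SpiceDB or CockroachDB code, and all names are hypothetical) shows how the second write can receive the earlier timestamp:

```python
# Toy model of gateway-node clock skew; illustrative only, not SpiceDB code.
# Each gateway node stamps a write with its own clock, which may be offset
# from true time. If the second write lands on a node whose clock lags by
# more than the gap between the writes, the timestamps come out reversed.

def assigned_timestamp(true_time, node_clock_offset):
    """Timestamp a gateway node assigns: its own (possibly skewed) clock."""
    return true_time + node_clock_offset

# Write A arrives at true time 1.000s on a node with a perfectly synced
# clock; write B arrives 100ms later on a node whose clock lags by 250ms.
ts_a = assigned_timestamp(1.000, 0.000)
ts_b = assigned_timestamp(1.100, -0.250)

reversed_order = ts_b < ts_a  # True: B happened second but is stamped earlier
```

Note how the reversal requires the clock delta between the two gateway nodes (250ms here) to exceed the real-time gap between the writes (100ms), matching the "sent within the time delta `D`" condition above.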
+
+### When does a timestamp reversal matter?
+
+Now we know when timestamps _could_ be reversed.
+But when does that matter to your application?
+
+The TL;DR is: only when you care about the New Enemy Problem.
+
+Let's take a look at a couple of examples of how reversed timestamps may be an issue for an application storing permissions in SpiceDB.
+
+### Neglecting ACL Update Order
+
+Two separate `WriteRelationships` calls come in:
+
+- `A`: Alice removes Bob from the `shared` folder
+- `B`: Alice adds a new document `not-for-bob.txt` to the `shared` folder
+
+The normal case is that the timestamp for `A` < the timestamp for `B`.
+
+But if those two writes hit the conditions for a timestamp reversal, then `B < A`.
+
+From Alice's perspective, there should be no time at which Bob can ever see `not-for-bob.txt`.
+She performed the first write, got a response, and then performed the second write.
+
+But this isn't true when using `MinimizeLatency` or `AtLeastAsFresh` consistency.
+If Bob later performs a `Check` request for the `not-for-bob.txt` document, it's possible that SpiceDB will pick an evaluation timestamp `T` such that `B < T < A`, so that the document is in the folder _and_ Bob is allowed to see the contents of the folder.
+
+Note that this is only possible if `A - T < quantization window`: the check has to happen soon enough after the write for `A` that it's possible that SpiceDB picks a timestamp in between them.
+The default quantization window is `5s`.
+
+### Application Mitigations for ACL Update Order
+
+This could be mitigated in your application by:
+
+- Not caring about the problem
+- Not allowing the write from `B` within the max_offset time of the CRDB cluster (or the quantization window).
+- Not allowing a Check on a resource within max_offset of its ACL modification (or the quantization window).
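The last mitigation can be sketched on the application side. This is a hypothetical helper (the class and method names are not part of SpiceDB) that refuses a check on a resource until the danger window after its last ACL write has elapsed:

```python
import time

# Hypothetical application-side gate, not part of SpiceDB: disallow a Check
# on a resource until `window` has elapsed since its last ACL modification.
# `window` should cover SpiceDB's quantization window (default 5s) or the
# CRDB cluster's max clock offset, depending on which mitigation you chose.

class CheckGate:
    def __init__(self, window_seconds=5.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock  # injectable for testing
        self._last_acl_write = {}  # resource -> time of its last ACL change

    def record_acl_write(self, resource):
        self._last_acl_write[resource] = self.clock()

    def check_allowed(self, resource):
        """True once the danger window after the last ACL write has passed."""
        written = self._last_acl_write.get(resource)
        return written is None or self.clock() - written >= self.window
```

An application could either reject checks while the gate is closed or sleep until it opens; resources with no recorded ACL write are always allowed through.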
+
+### Mis-apply Old ACLs to New Content
+
+Two separate API calls come in:
+
+- `A`: Alice removes Bob as a viewer of document `secret`
+- `B`: Alice does a `FullyConsistent` `Check` request to get a ZedToken
+- `C`: Alice stores that ZedToken (timestamp `B`) with the document `secret` when she updates it to say `Bob is a fool`.
+
+Same as before, the normal case is that the timestamp for `A` < the timestamp for `B`, but if the two writes hit the conditions for a timestamp reversal, then `B < A`.
+
+Bob later tries to read the document.
+The application performs an `AtLeastAsFresh` `Check` for Bob to access the document `secret` using the stored ZedToken (which is timestamp `B`).
+
+It's possible that SpiceDB will pick an evaluation timestamp `T` such that `B < T < A`, so that Bob is allowed to read the newest contents of the document, and discover that Alice thinks he is a fool.
+
+Same as before, this is only possible if `A - T < quantization window`: Bob's check has to happen soon enough after the write for `A` that it's possible that SpiceDB picks a timestamp in between `A` and `B`, and the default quantization window is `5s`.
+
+### Application Mitigations for Misapplying Old ACLs
+
+This could be mitigated in your application by:
+
+- Not caring about the problem
+- Waiting for max_offset (or the quantization window) before doing the fully-consistent check.
+
+### When does a timestamp reversal _not_ matter?
+
+There are also some cases when there is no New Enemy Problem even if there are reversed timestamps.
+
+#### Non-sensitive domain
+
+Not all authorization problems have a version of the New Enemy Problem, which relies on there being some meaningful
+ +If the worst thing that happens from out-of-order ACL updates is that some users briefly see some non-sensitive data, +or that a user retains access to something that they already had access to for a few extra seconds, then even though +there could still effectively be a "New Enemy Problem," it's not a meaningful problem to worry about. + +#### Disjoint SpiceDB Graphs + +The examples of the New Enemy Problem above rely on out-of-order ACLs to be part of the same permission graph. +But not all ACLs are part of the same graph, for example: + +```haskell +definition user {} + +definition blog { + relation author: user + permission edit = author +} + +defintion video { + relation editor: user + permission change_tags = editor +} +``` + +`A`: Alice is added as an `author` of the Blog entry `new-enemy` +`B`: Bob is removed from the `editor`s of the `spicedb.mp4` video + +If these writes are given reversed timestamps, it is possible that the ACLs will be applied out-or-order and this would normally be a New Enemy Problem. +But the ACLs themselves aren't shared between any permission computations, and so there is no actual consequence to reversed timestamps. + +## Garbage Collection Window + + + As of February 2023, the [default garbage collection window] has changed to `1.25 hours` for CockroachDB Serverless and `4 hours` for CockroachDB Dedicated. + + [default garbage collection window]: https://github.com/cockroachdb/cockroach/issues/89233 + + +SpiceDB warns if the garbage collection window as configured in CockroachDB is smaller than the SpiceDB configuration. 
+
+If you need a longer time window for the Watch API or querying at exact snapshots, you can [adjust the value in CockroachDB][crdb-gc]:
+
+```sql
+ALTER ZONE default CONFIGURE ZONE USING gc.ttlseconds = 90000;
+```
+
+[crdb-gc]: https://www.cockroachlabs.com/docs/stable/configure-replication-zones.html#replication-zone-variables
+
+## Relationship Integrity
+
+Relationship integrity is an experimental SpiceDB feature that validates that data written into the supported backing datastores (currently only CockroachDB) was written by SpiceDB itself.
+
+### What does relationship integrity ensure?
+
+Relationship integrity primarily ensures that all relationships written into the backing datastore were written via a trusted instance of SpiceDB or that the caller has access to the key(s) necessary to write those relationships.
+It ensures that if someone gains access to the underlying datastore, they cannot simply write new relationships of their own invention.
+
+### What does relationship integrity _not_ ensure?
+
+Since the relationship integrity feature signs each individual relationship, it does not ensure that the removal of relationships was performed by a trusted party.
+Schema is also currently unverified, so an untrusted party could change it as well.
+Support for schema changes will likely come in a future version.
+
+### Setting up relationship integrity
+
+To run with relationship integrity, new flags must be given to SpiceDB:
+
+```sh
+spicedb serve ...existing flags... \
+  --datastore-relationship-integrity-enabled \
+  --datastore-relationship-integrity-current-key-id="somekeyid" \
+  --datastore-relationship-integrity-current-key-filename="some.key"
+```
+
+Place the generated key contents (which must support an HMAC key) in `some.key`.
+
+### Deployment Process
+
+1. Start with a **clean** datastore for SpiceDB. **At this time, migrating an existing SpiceDB installation is not supported.**
+2. 
Run the standard `migrate` command but with relationship integrity flags included.
+3. Run SpiceDB with the relationship integrity flags included.
diff --git a/pages/spicedb/concepts/datastores/memdb.mdx b/pages/spicedb/concepts/datastores/memdb.mdx
new file mode 100644
index 0000000..be0873c
--- /dev/null
+++ b/pages/spicedb/concepts/datastores/memdb.mdx
@@ -0,0 +1,41 @@
+import { Callout } from 'nextra/components'
+
+# memdb
+
+## Usage Notes
+
+- Fully ephemeral; *all* data is lost when the process is terminated
+- Intended for local development and testing application integrations
+- Cannot be run highly available, as multiple instances will not share the same in-memory data
+
+
+  If you need an ephemeral datastore designed for validation or testing, see the test server system in [Validating and Testing]
+
+
+[validating and testing]: /spicedb/modeling/validation-testing-debugging
+
+## Developer Notes
+
+- Code can be found [here][memdb-code]
+- Documentation can be found [here][memdb-godoc]
+- Implements its own [MVCC][mvcc] model by storing its data with transaction IDs
+
+[memdb-code]: https://github.com/authzed/spicedb/tree/main/internal/datastore/memdb
+[memdb-godoc]: https://pkg.go.dev/github.com/authzed/spicedb/internal/datastore/memdb
+[mvcc]: https://en.wikipedia.org/wiki/Multiversion_concurrency_control
+
+## Configuration
+
+### Required Parameters
+
+| Parameter | Description | Example |
+| ------------------ | -------------------- | --------------------------- |
+| `datastore-engine` | the datastore engine | `--datastore-engine memory` |
+
+### Optional Parameters
+
+| Parameter | Description | Example |
+| ------------------------------------- | ----------------------------------------------------------------------------------- | -------------------------------------------- |
+| `datastore-revision-fuzzing-duration` | Sets a fuzzing window on all zookies/zedtokens | `--datastore-revision-fuzzing-duration=50ms` |
+| `datastore-gc-window` | 
Sets the window outside of which overwritten relationships are no longer accessible | `--datastore-gc-window=1s` |
+| `datastore-readonly` | Places the datastore into readonly mode | `--datastore-readonly=true` |
diff --git a/pages/spicedb/concepts/datastores/mysql.mdx b/pages/spicedb/concepts/datastores/mysql.mdx
new file mode 100644
index 0000000..90d549d
--- /dev/null
+++ b/pages/spicedb/concepts/datastores/mysql.mdx
@@ -0,0 +1,94 @@
+import { Callout } from 'nextra/components'
+
+# MySQL
+
+## Usage Notes
+
+- Recommended for single-region deployments
+- Setup and operational complexity of running MySQL
+- Does not rely on any non-standard MySQL extensions
+- Compatible with managed MySQL services
+- Can be scaled out on read workloads using read replicas
+
+## Developer Notes
+
+- Code can be found [here][mysql-code]
+- Documentation can be found [here][mysql-godoc]
+- Implemented using [Go-MySQL-Driver][go-mysql-driver] for a SQL driver
+- Query optimizations are documented [here][mysql-executor]
+- Implements its own [MVCC][mysql-mvcc] model by storing its data with transaction IDs
+
+[mysql-code]: https://github.com/authzed/spicedb/tree/main/internal/datastore/mysql
+[mysql-godoc]: https://pkg.go.dev/github.com/authzed/spicedb/internal/datastore/mysql
+[go-mysql-driver]: https://github.com/go-sql-driver/mysql
+[mysql-executor]: https://github.com/authzed/spicedb/blob/main/internal/datastore/mysql/datastore.go#L317
+[mysql-mvcc]: https://en.wikipedia.org/wiki/Multiversion_concurrency_control
+
+## Read Replicas
+
+
+Do not use a load balancer between SpiceDB and MySQL replicas because SpiceDB will not be able to maintain consistency guarantees.
+
+
+SpiceDB supports MySQL read replicas while retaining its consistency guarantees.
+
+Typical use cases are:
+
+- scale read workloads/offload reads from the primary
+- deploy SpiceDB in other regions with primarily read workloads
+
+Read replicas are typically configured with asynchronous replication, which involves replication lag.
+Replication lag would normally compromise SpiceDB's ability to prevent the new enemy problem, so SpiceDB addresses this by checking whether the revision has been replicated to the target replica.
+If it is missing, SpiceDB falls back to the primary.
+Compared to the Postgres implementation, MySQL support requires two roundtrips instead of one, which adds some extra latency.
+
+All API consistency options will leverage replicas, but the ones that benefit the most are those that involve some level of staleness, since staleness increases the odds that a revision has already replicated.
+`minimize_latency`, `at_least_as_fresh`, and `at_exact_snapshot` consistency modes have the highest chance of being redirected to a replica.
+
+SpiceDB does not support MySQL replicas behind a load-balancer: you may only list replica hosts individually.
+Failing to adhere to this would compromise consistency guarantees.
+When multiple host URIs are provided, they will be queried using round-robin.
+Note that at most 16 MySQL replica host URIs may be listed.
+
+Read replicas are configured with the `--datastore-read-replica-*` family of flags.
+
+## Configuration
+
+### Required Parameters
+
+| Parameter | Description | Example |
+| -------------------- | ------------------------------------------ | --------------------------------------------------------------------------------------------- |
+| `datastore-engine` | the datastore engine | `--datastore-engine=mysql` |
+| `datastore-conn-uri` | connection string used to connect to MySQL | `--datastore-conn-uri="user:password@(localhost:3306)/spicedb?parseTime=True"` |
+
+
+  SpiceDB requires `--datastore-conn-uri` to contain the query parameter `parseTime=True`. 
+ + +### Optional Parameters + +| Parameter | Description | Example | +| ------------------------------------- | ----------------------------------------------------------------------------------- | -------------------------------------------- | +| `datastore-conn-pool-read-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-idletime=30m0s` | +| `datastore-conn-pool-read-max-lifetime` | Maximum amount of time a connection can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-lifetime=30m0s` | +| `datastore-conn-pool-read-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | `--datastore-conn-pool-read-max-lifetime-jitter=6m` | +| `datastore-conn-pool-read-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 20) | `--datastore-conn-pool-read-max-open=20` | +| `datastore-conn-pool-read-min-open` | Number of minimum concurrent connections open in a remote datastore's connection pool (default 20) | `--datastore-conn-pool-read-min-open=20` | +| `datastore-conn-pool-write-healthcheck-interval`| Amount of time between connection health checks in a remote datastore's connection pool (default 30s)| `--datastore-conn-pool-write-healthcheck-interval=30s` | +| `datastore-conn-pool-write-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-idletime=30m0s` | +| `datastore-conn-pool-write-max-lifetime` | Maximum amount of time a connection can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-lifetime=30m0s` | +| `datastore-conn-pool-write-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | 
`--datastore-conn-pool-write-max-lifetime-jitter=6m` |
+| `datastore-conn-pool-write-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-max-open=10` |
+| `datastore-conn-pool-write-min-open` | Number of minimum concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-min-open=10` |
+| `datastore-query-split-size` | The (estimated) query size at which to split a query into multiple queries | `--datastore-query-split-size=5kb` |
+| `datastore-gc-window` | Sets the window outside of which overwritten relationships are no longer accessible | `--datastore-gc-window=1s` |
+| `datastore-revision-fuzzing-duration` | Sets a fuzzing window on all zookies/zedtokens | `--datastore-revision-fuzzing-duration=50ms` |
+| `datastore-mysql-table-prefix` | Prefix to add to the name of all SpiceDB database tables | `--datastore-mysql-table-prefix=spicedb` |
+| `datastore-readonly` | Places the datastore into readonly mode | `--datastore-readonly=true` |
+| `datastore-read-replica-conn-uri` | Connection string used by datastores for read replicas; only supported for Postgres and MySQL | `--datastore-read-replica-conn-uri="postgres://postgres:password@localhost:5432/spicedb"` |
+| `datastore-read-replica-credentials-provider-name` | Retrieve datastore credentials dynamically using aws-iam | |
+| `datastore-read-replica-conn-pool-read-healthcheck-interval` | amount of time between connection health checks in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-healthcheck-interval=30s` |
+| `datastore-read-replica-conn-pool-read-max-idletime` | maximum amount of time a connection can idle in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-idletime=30m` |
+| `datastore-read-replica-conn-pool-read-max-lifetime` | maximum amount of time a connection can live in a 
read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-lifetime=30m` | +| `datastore-read-replica-conn-pool-read-max-lifetime-jitter` | waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection to a read replica(default: 20% of max lifetime) | `--datastore-read-replica-conn-pool-read-max-lifetime-jitter=6m` | +| `datastore-read-replica-conn-pool-read-max-open` | number of concurrent connections open in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-open=20` | +| `datastore-read-replica-conn-pool-read-min-open` | number of minimum concurrent connections open in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-min-open=20` | diff --git a/pages/spicedb/concepts/datastores/postgresql.mdx b/pages/spicedb/concepts/datastores/postgresql.mdx new file mode 100644 index 0000000..61a204b --- /dev/null +++ b/pages/spicedb/concepts/datastores/postgresql.mdx @@ -0,0 +1,106 @@ +import { Callout } from 'nextra/components' + +# PostgreSQL + +## Usage Notes + +- Recommended for single-region deployments +- Postgres 15 or newer is required for optimal performance +- Resiliency to failures only when PostgreSQL is operating with a follower and proper failover +- Carries setup and operational complexity of running PostgreSQL +- Does not rely on any non-standard PostgreSQL extensions +- Compatible with managed PostgreSQL services (e.g. AWS RDS) +- Can be scaled out on read workloads using read replicas + + + SpiceDB's Watch API requires PostgreSQL's [Commit Timestamp tracking][commit-ts] to be enabled. + + This can be done by providing the `--track_commit_timestamp=on` flag, configuring `postgresql.conf`, or executing `ALTER SYSTEM SET track_commit_timestamp = on;` and restarting the instance. 
+
+  [commit-ts]: https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-TRACK-COMMIT-TIMESTAMP
+
+
+## Developer Notes
+
+- Code can be found [here][pg-code]
+- Documentation can be found [here][pg-godoc]
+- Implemented using [pgx][pgx] for a SQL driver and connection pooling
+- Stores migration revisions using the same strategy as [Alembic][alembic]
+- Implements its own [MVCC][mvcc] model by storing its data with transaction IDs
+
+[pg-code]: https://github.com/authzed/spicedb/tree/main/internal/datastore/postgres
+[pg-godoc]: https://pkg.go.dev/github.com/authzed/spicedb/internal/datastore/postgres
+[pgx]: https://pkg.go.dev/gopkg.in/jackc/pgx.v3
+[alembic]: https://alembic.sqlalchemy.org/en/latest/
+[mvcc]: https://en.wikipedia.org/wiki/Multiversion_concurrency_control
+
+## Read Replicas
+
+SpiceDB supports Postgres read replicas while retaining its consistency guarantees.
+Typical use cases are:
+
+- scaling read workloads and offloading reads from the primary
+- deploying SpiceDB in other regions that serve primarily read workloads
+
+Read replicas are typically configured with asynchronous replication, which introduces replication lag.
+Replication lag would normally undermine SpiceDB's ability to prevent the New Enemy problem, so SpiceDB checks whether a requested revision has already been replicated to the target replica.
+If it has not, SpiceDB falls back to the primary.
+
+All API consistency options can leverage replicas, but the options that tolerate some staleness benefit the most, since staleness increases the odds that a revision has already replicated.
+The `minimize_latency`, `at_least_as_fresh`, and `at_exact_snapshot` consistency modes have the highest chance of being served by a replica.
+
+SpiceDB supports Postgres replicas behind a load balancer, individually listed replica hosts, or both.
+When multiple URIs are provided, they are queried using a round-robin strategy.
+At most 16 replica URIs may be listed.
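+
+For example, a primary with two individually listed replicas might be started as follows (a sketch: hostnames, credentials, and the preshared key are placeholders, and it assumes the read-replica flag may be repeated once per replica URI):
+
+```shell
+spicedb serve \
+  --grpc-preshared-key "sufficiently-random-key" \
+  --datastore-engine postgres \
+  --datastore-conn-uri "postgres://postgres:password@primary:5432/spicedb" \
+  --datastore-read-replica-conn-uri "postgres://postgres:password@replica-1:5432/spicedb" \
+  --datastore-read-replica-conn-uri "postgres://postgres:password@replica-2:5432/spicedb"
+```
+
+Reads at sufficiently replicated revisions are then balanced across `replica-1` and `replica-2`, while writes and not-yet-replicated reads go to the primary.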
+
+Read replicas are configured with the `--datastore-read-replica-*` family of flags.
+
+SpiceDB also supports the [PgBouncer](https://www.pgbouncer.org/) connection pooler, which is part of SpiceDB's test suite.
+
+### Transaction IDs and MVCC
+
+The Postgres implementation of SpiceDB's internal MVCC mechanism stores the internal transaction ID associated with a given transaction
+in the rows written by that transaction.
+Because this counter is instance-specific, the data in the datastore can become desynced from that internal counter.
+Two concrete examples are using `pg_dump` and `pg_restore` to transfer data between an old instance and a new instance, and setting up
+logical replication between a previously-existing instance and a newly-created instance.
+
+When this happens, SpiceDB can behave as though no schema has been written, because the data (including the schema) is associated with a future transaction ID and therefore isn't "visible" to SpiceDB.
+If you run into this issue, the fix is [documented here](https://authzed.com/docs/spicedb/concepts/datastores#transaction-ids-and-mvcc).
+
+## Configuration
+
+### Required Parameters
+
+| Parameter            | Description                                     | Example                                                                                      |
+| -------------------- | ----------------------------------------------- | -------------------------------------------------------------------------------------------- |
+| `datastore-engine`   | the datastore engine                            | `--datastore-engine=postgres`                                                                 |
+| `datastore-conn-uri` | connection string used to connect to PostgreSQL | `--datastore-conn-uri="postgres://postgres:password@localhost:5432/spicedb?sslmode=disable"` |
+
+### Optional Parameters
+
+| Parameter                             | Description                                                                         | Example                                      |
+| ------------------------------------- | ----------------------------------------------------------------------------------- | -------------------------------------------- |
+| `datastore-conn-pool-read-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-idletime=30m0s` |
+| `datastore-conn-pool-read-max-lifetime` | Maximum amount of time a connection can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-read-max-lifetime=30m0s` |
+| `datastore-conn-pool-read-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | `--datastore-conn-pool-read-max-lifetime-jitter=6m` |
+| `datastore-conn-pool-read-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 20) | `--datastore-conn-pool-read-max-open=20` |
+| `datastore-conn-pool-read-min-open` | Minimum number of concurrent connections open in a remote datastore's connection pool (default 20) | `--datastore-conn-pool-read-min-open=20` |
+| `datastore-conn-pool-write-healthcheck-interval` | Amount of time between connection health checks in a remote datastore's connection pool (default 30s) | `--datastore-conn-pool-write-healthcheck-interval=30s` |
+| `datastore-conn-pool-write-max-idletime` | Maximum amount of time a connection can idle in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-idletime=30m0s` |
+| `datastore-conn-pool-write-max-lifetime` | Maximum amount of time a connection can live in a remote datastore's connection pool (default 30m0s) | `--datastore-conn-pool-write-max-lifetime=30m0s` |
+| `datastore-conn-pool-write-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection | `--datastore-conn-pool-write-max-lifetime-jitter=6m` |
+| `datastore-conn-pool-write-max-open` | Number of concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-max-open=10` |
+| `datastore-conn-pool-write-min-open` | Minimum number of concurrent connections open in a remote datastore's connection pool (default 10) | `--datastore-conn-pool-write-min-open=10` |
+| `datastore-query-split-size` | The (estimated) query size at which to split a query into multiple queries | `--datastore-query-split-size=5kb` |
+| `datastore-gc-window` | Sets the window outside of which overwritten relationships are no longer accessible | `--datastore-gc-window=1s` |
+| `datastore-revision-fuzzing-duration` | Sets a fuzzing window on all zookies/zedtokens | `--datastore-revision-fuzzing-duration=50ms` |
+| `datastore-readonly` | Places the datastore into readonly mode | `--datastore-readonly=true` |
+| `datastore-read-replica-conn-uri` | Connection string used by datastores for read replicas; only supported for PostgreSQL and MySQL | `--datastore-read-replica-conn-uri="postgres://postgres:password@localhost:5432/spicedb"` |
+| `datastore-read-replica-credentials-provider-name` | Retrieve datastore credentials dynamically using aws-iam | |
+| `datastore-read-replica-conn-pool-read-healthcheck-interval` | Amount of time between connection health checks in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-healthcheck-interval=30s` |
+| `datastore-read-replica-conn-pool-read-max-idletime` | Maximum amount of time a connection can idle in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-idletime=30m` |
+| `datastore-read-replica-conn-pool-read-max-lifetime` | Maximum amount of time a connection can live in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-lifetime=30m` |
+| `datastore-read-replica-conn-pool-read-max-lifetime-jitter` | Waits rand(0, jitter) after a connection is open for max lifetime to actually close the connection to a read replica (default: 20% of max lifetime) | `--datastore-read-replica-conn-pool-read-max-lifetime-jitter=6m` |
+| `datastore-read-replica-conn-pool-read-max-open` | Number of concurrent connections open in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-max-open=20` |
+| `datastore-read-replica-conn-pool-read-min-open` | Minimum number of concurrent connections open in a read-only replica datastore's connection pool | `--datastore-read-replica-conn-pool-read-min-open=20` |
diff --git a/pages/spicedb/ops/load-testing.mdx b/pages/spicedb/ops/load-testing.mdx
index 886a85c..8805b03 100644
--- a/pages/spicedb/ops/load-testing.mdx
+++ b/pages/spicedb/ops/load-testing.mdx
@@ -161,7 +161,7 @@ Writes impact datastore performance far more than they impact SpiceDB node perfo
 
   **Warning:** CockroachDB Datastore users have [special considerations][crdb] to understand.
 
-  [crdb]: ../concepts/datastores#overlap-strategy
+  [crdb]: ../concepts/datastores/cockroachdb#overlap-strategies
diff --git a/pages/spicedb/ops/observability.mdx b/pages/spicedb/ops/observability.mdx
index 06420df..5898ee0 100644
--- a/pages/spicedb/ops/observability.mdx
+++ b/pages/spicedb/ops/observability.mdx
@@ -51,7 +51,7 @@ Alternatively, you can upload profiles to [pprof.me][pprofme] to share with othe
 
 SpiceDB uses [OpenTelemetry][otel] for [tracing] the lifetime of requests.
 
-You can configure the tracing in SpiceDB via [global flags], prefixed with `otel`.
+You can configure tracing in SpiceDB via [CLI flags] prefixed with `otel`.
 
 Here's a video walking through SpiceDB traces using [Jaeger]:
 
@@ -59,14 +59,14 @@ Here's a video walking through SpiceDB traces using [Jaeger]:
 
 [otel]: https://opentelemetry.io
 [tracing]: https://opentelemetry.io/docs/concepts/signals/traces/
-[global flags]: /spicedb/concepts/commands#global-flags
+[CLI flags]: /spicedb/concepts/commands
 [Jaeger]: https://www.jaegertracing.io
 
 ## Structured Logging
 
 SpiceDB emits logs to [standard streams][stdinouterr] using [zerolog].
 
-Logs come in two formats (`console`, `JSON`) and can be configured with the `--log-format` [global flag].
+Logs come in two formats (`console`, `JSON`) and can be configured with the `--log-format` [CLI flag].
 
 If a output device is non-interactive (i.e. not a terminal) these logs are emitted in [NDJSON] by default.
 
@@ -123,7 +123,7 @@ Here's a comparison of the logs starting up a single SpiceDB v1.25 instance:
 
 [zerolog]: https://github.com/rs/zerolog
 
-[global flag]: /spicedb/concepts/commands#global-flags
+[CLI flag]: /spicedb/concepts/commands
 [stdinouterr]: https://en.wikipedia.org/wiki/Standard_streams
 [NDJSON]: https://github.com/ndjson/ndjson-spec