Schema spec revision #400
base: main
| - Any document that adheres to the rules specified by schema A also adheres to | ||
| rules specified by schema B. | ||
| - Any processing rules defined for schema A also apply for schema B. | ||
| - Any processing rules defined for schema B, that are not defined for schema | ||
| A, do not conflict with the processing rules for schema A. |
I know this isn't part of the change in this PR, but catching it now. This seems to be at least somewhat conflicting with what the compat attribute docs state:
This specification makes no statement as to which parts of the data are examined for compatibility (e.g. xRegistry metadata, domain-specific document, etc.). This SHOULD be defined by the compatibility values. The exact meaning of what each compatibility value means might vary based on the data model of the Resource, therefore this specification only defines a very high-level abstract meaning for each to ensure some degree of consistency.
I think we need to remove this paragraph, because we've made no statement in the Core about what the individual compatibility modes mean, and therefore, summarizing the rules as such might not be accurate.
Just to make sure I understand what you're saying... you think the compat definitions need to be related to each schema format, right?
To your question, yes, but also, we have not defined in this spec what each compatibility mode means, so either way it should be removed.
schema/spec.md
Outdated
|
|
||
| ## 1. Overview | ||
|
|
||
| A schema registry provides a respository for managing serialization and |
| A schema registry provides a respository for managing serialization and | |
| A schema registry provides a respository for managing serialization, |
| A schema registry provides a respository for managing serialization and | ||
| validation and data type definitions schemas as they are commonly used in | ||
| distributed systems. Common schema formats include JSON Schema, JSON | ||
| Structure, Apache Avro Schema, Google Protobuf Schema, and XML Schema. |
However, this specification does not mandate, or limit, which schema formats are used.
I think we need to make it clear we support more than just the "common" ones
schema/spec.md
Outdated
| are not. | ||
|
|
||
| Serialization generally occurs based on a specific schema version that the data | ||
| publisher uses. Multiple versions of publishers may exist in the same system, |
| publisher uses. Multiple versions of publishers may exist in the same system, | |
| publisher uses. Multiple versions of publishers might exist in the same system, |
schema/spec.md
Outdated
| publisher uses. Multiple versions of publishers may exist in the same system, | ||
| using different schema versions, which is a common occurrence in systems that | ||
| perform live updates. Once data has been published, data serialized based on | ||
| several different versions may exist in a system, in queues, in databases, or in |
| several different versions may exist in a system, in queues, in databases, or in | |
| several different versions might exist in a system, in queues, in databases, or in |
| files. | ||
|
|
||
| The schema registry therefore allows managing multiple versions of schemas, | ||
| declare their lineage, and state the compatibility policy. The compatibility |
| declare their lineage, and state the compatibility policy. The compatibility | |
| declares their lineage, and state their compatibility policy. The compatibility |
| The schema registry therefore allows managing multiple versions of schemas, | ||
| declare their lineage, and state the compatibility policy. The compatibility | ||
| policy is used to determine whether a schema change is compatible with existing | ||
| data, and MAY be enforced by implementations of the schema registry. For this, |
"data": while true, it feels a bit indirect. Would it be better to talk about how it enforces compat between versions of the schema instead? "existing data" might not exist, so it's not really the thing we're focused on.
| policy is used to determine whether a schema change is compatible with existing | ||
| data, and MAY be enforced by implementations of the schema registry. For this, | ||
| this specification leans on the [xRegistry Core][xRegistry Core] specification | ||
| that already defines these versioning and compatibility mechanisms for any kind of resource. |
the core spec doesn't actually fully define the compat stuff, just what the terms mean at a high level. It feels like we should say something about how impls will need to be more explicit for each schema format and compat mode, no?
|
|
||
| ### 1.4. Document Store | ||
|
|
||
| The schema registry is a document store and therefore also has the |
| The schema registry is a document store and therefore also has the | |
| The schema registry is a document store and therefore has the |
| the stored content-type when a client issues a GET request to the | ||
| [`self`][xRegistry self] URL of a schema Version. The associated metadata is | ||
| returned in the HTTP headers. The [default version][xRegistry default-version] | ||
| of the schema Version is returned when the client issues a GET request to the |
I think this would read better if we removed the extra "Version" from "schema Version"? I don't think we've introduced that term yet and it sounds a bit off to talk about a "version of a version". W/o it we just talk about the "default version of a Resource", which is easier to grok.
| of the schema Version is returned when the client issues a GET request to the | ||
| [`self`][xRegistry self] URL of the `schema` Resource. | ||
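For illustration of the quoted passage (document in the body, metadata in the HTTP headers), here is a minimal sketch of how a client might separate the metadata from an already received response. The `xRegistry-` header prefix, the function name, and the sample header values are assumptions made for this example and are not taken from the quoted text.

```python
def xregistry_metadata(headers: dict[str, str],
                       prefix: str = "xRegistry-") -> dict[str, str]:
    """Collect metadata attributes carried as prefixed HTTP headers.

    Assumption: metadata headers share a common prefix (here "xRegistry-").
    """
    meta = {}
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lower case.
        if name.lower().startswith(prefix.lower()):
            meta[name[len(prefix):].lower()] = value
    return meta


# Hypothetical response headers for a GET on a schema Version's `self` URL.
sample_headers = {
    "Content-Type": "application/json",
    "xRegistry-versionid": "1",
    "xRegistry-format": "JsonSchema/draft-07",
}
metadata = xregistry_metadata(sample_headers)
```

The response body would be the schema document itself, served with the stored content type; only the prefixed headers are treated as metadata here.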
|
|
||
| This allows to provide external parties with a link that they can use without |
| This allows to provide external parties with a link that they can use without | |
| This enables the ability to provide external parties with a link that they can use without |
| xRegistry specifics, by simply using a POST against [`self`][xRegistry self] URL | ||
| of the `schema` Resource in the simplest case. | ||
|
|
||
| To access the metadata of the `schema` or the schema version as a JSON document, |
| To access the metadata of the `schema` or the schema version as a JSON document, | |
| To access the metadata of the `schema`, or the schema version as a JSON document, |
| of the `schema` Resource in the simplest case. | ||
|
|
||
| To access the metadata of the `schema` or the schema version as a JSON document, | ||
| the client can append a `$details`suffix to the URL, like |
| the client can append a `$details`suffix to the URL, like | |
| the client can append a `$details` suffix to the URL, like |
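As a rough sketch of the `$details` suffix described in the quoted text, the helper below appends the suffix to the path of a given URL and leaves any query string in place. The function name and the example host and IDs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def details_url(self_url: str) -> str:
    """Return the metadata URL for a schema or schema version URL."""
    parts = urlsplit(self_url)
    path = parts.path
    # Append the suffix to the path only, and only once.
    if not path.endswith("$details"):
        path += "$details"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


# Hypothetical `self` URL of a `schema` Resource.
example = details_url(
    "https://registry.example.com/schemagroups/g1/schemas/s1"
)
```

A GET on the resulting URL would then be expected to return the metadata as a JSON document rather than the schema document itself.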
schema/spec.md
Outdated
|
|
||
| In terms of versioning, you can think of a **schema** as a collection of | ||
| versions that are compatible according to the selected `compatibility` mode. | ||
| MUST be created, to indicate the breaking change. The [`deprecated`](xRegistry deprecated) |
I think the first part of this line ("MUST be created, to indicate the breaking change") is old and not needed given the previous paragraph.
schema/spec.md
Outdated
| In terms of versioning, you can think of a **schema** as a collection of | ||
| versions that are compatible according to the selected `compatibility` mode. | ||
| MUST be created, to indicate the breaking change. The [`deprecated`](xRegistry deprecated) | ||
| attribute may be used to indicate the appropriate new schema to use following a breaking change. |
| attribute may be used to indicate the appropriate new schema to use following a breaking change. | |
| attribute MAY be used to indicate the appropriate new schema to use following a breaking change. |
| versions that are compatible according to the selected `compatibility` mode. | ||
| MUST be created, to indicate the breaking change. The [`deprecated`](xRegistry deprecated) | ||
| attribute may be used to indicate the appropriate new schema to use following a breaking change. | ||
| MUST be created, to indicate the breaking change. |
Left over from a previous edit/version, I think.
| Implementations of this specification MAY include additional extension | ||
| attributes, including the `*` attribute of type `any`. | ||
|
|
||
| Since the Schema Registry is an application of the xRegistry specification, all |
let's link to the core spec here - from the "xRegistry specification" text
| some application-defined way. A schema Group does not impose any restrictions | ||
| on the contained schemas, meaning that a schema Group MAY contain schemas of | ||
| different formats. Every schema MUST reside inside a schema Group. | ||
| Every schema MUST reside inside a Schema Group. |
It might be me, but trying to read this as a newbie, someone might think that a schema can be part of a Group directly. So, perhaps a tiny bit of clarity:
Every schema (i.e. schema Resource) MUST reside inside a Schema Group
| `"com.example.event.2024-02"`, so that incompatible, but historically related | ||
| schemas can be more easily identified by users and developers. The schema | ||
| `versionid` then functions as the semantic minor version identifier. | ||
| which is a `versionid` of the Version that this Version is based on. The `ancestor` |
something is missing before this line.
| `versionid` then functions as the semantic minor version identifier. | ||
| which is a `versionid` of the Version that this Version is based on. The `ancestor` | ||
| attribute permits multiple Versions to reference the same ancestor, and allows for | ||
| implementations to determine the Version's ancestor. See the |
s/See/Since/ ?
But even then the sentence doesn't parse correctly. I think some tweaking is needed.
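To make the intent of the quoted `ancestor` text concrete, here is a minimal sketch of how an implementation might walk a Version's lineage. It assumes each Version records exactly one `ancestor` `versionid`, and that a root Version names its own `versionid` as its ancestor; that root convention, the function name, and the sample data are assumptions of this example.

```python
def lineage(versions: dict[str, str], versionid: str) -> list[str]:
    """Return versionids from `versionid` back to its root Version.

    `versions` maps each versionid to its `ancestor` versionid.
    Assumption: a root Version lists itself as its own ancestor.
    """
    chain = [versionid]
    seen = {versionid}
    current = versionid
    while True:
        parent = versions[current]
        if parent == current:
            return chain  # reached the root
        if parent in seen:
            raise ValueError(f"ancestor cycle at {parent!r}")
        chain.append(parent)
        seen.add(parent)
        current = parent


# Versions "2" and "3" share ancestor "1"; "4" is based on "3".
versions = {"1": "1", "2": "1", "3": "1", "4": "3"}
chain = lineage(versions, "4")
```

Because several Versions can point at the same ancestor, the data forms a tree rather than a single line, which is why the walk goes upward from one Version instead of sorting all `versionid` values.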
| is appended separated with a colon, for instance | ||
| `.../com.example.telemetrydata:TelemetryEvent`. | ||
|
|
||
| Like the [xRegistry Core][xRegistry Core] specification, this specification does not |
remove? Dup of next para
schema/spec.md
Outdated
|
|
||
| It is expected that any implementation of this specification will use | ||
| authentication and authorization mechanisms that are appropriate for the | ||
| application domain and the deployment environment. This may include, but is not |
| application domain and the deployment environment. This may include, but is not | |
| application domain and the deployment environment. This MAY include, but is not |
rebase is needed
Signed-off-by: Clemens Vasters <[email protected]>
38fe6ee to 651b729
| attribute holds a URI and is specifically meant to reference a schema document | ||
| residing in a registry. For example, a CloudEvent with a `dataschema` attribute | ||
| pointing to a schema version in a schema registry might look like this, using | ||
| the schema version's [`self`][xRegistry self] URL as the value of the `dataschema` attribute: |
can you have your editor wrap lines like this at 80?
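The quoted passage introduces an example whose body is not shown in this thread. As a stand-in, the following sketch builds a CloudEvents 1.0 JSON event whose `dataschema` attribute carries a schema version's `self` URL. The registry host, Group ID, event type, and payload are hypothetical; only the attribute names come from the CloudEvents specification.

```python
import json


def make_cloudevent(event_id: str, source: str, event_type: str,
                    data: dict, dataschema: str) -> dict:
    """Build a structured-mode CloudEvent as a plain dict."""
    return {
        "specversion": "1.0",
        "id": event_id,
        "source": source,
        "type": event_type,
        "datacontenttype": "application/json",
        # URI of the schema that `data` adheres to.
        "dataschema": dataschema,
        "data": data,
    }


# Hypothetical `self` URL of a schema version in a registry.
schema_version_url = (
    "https://registry.example.com/schemagroups/com.example.telemetry"
    "/schemas/com.example.telemetrydata/versions/1"
)
event = make_cloudevent(
    event_id="A234-1234-1234",
    source="/sensors/device-17",
    event_type="com.example.telemetry.reading",
    data={"temperature": 21.5},
    dataschema=schema_version_url,
)
print(json.dumps(event, indent=2))
```

Pointing at a specific version rather than at the `schema` Resource keeps the reference stable when the default version changes later.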
| publisher uses. Multiple versions of publishers MAY exist in the same system, | ||
| using different schema versions, which is a common occurrence in systems that | ||
| perform live updates. Once data has been published, data serialized based on | ||
| several different versions MAY exist in a system, in queues, in databases, or in |
| several different versions MAY exist in a system, in queues, in databases, or in | |
| several different versions might exist in a system, in queues, in databases, or in |
| This allows to provide external parties with a link that they can use without | ||
| needing to know any details about xRegistry. | ||
|
|
||
| Storing a new schema is similarly straightforward for clients that do not know |
| Storing a new schema is similarly straightforward for clients that do not know | |
| Storing a new Version of a schema is similarly straightforward for clients that do not know |
| ### 4.1. Schema Groups | ||
|
|
||
| ### Terminology | ||
| The Group (`<GROUP>`) name for the Schema Registry is `schemagroups`. The |
this is switched up: GROUP->schemagroup, GROUPS->schemagroups.
You may want to explicitly mention "singular" vs "plural", not just the plural name itself
| ### Schema Resources | ||
| ### 4.2. Schema Resources | ||
|
|
||
| The Resources (`<RESOURCE>`) collection inside of Schema Groups is named |
Same here: talk about singular vs plural, the current text is a bit mixed up
|
|
||
| The Resource plural name (`<RESOURCES>`) is `schemas`, and the Resource | ||
| singular name (`<RESOURCE>`) is `schema`. | ||
| All Versions of a Schema MUST adhere to the semantic rules of the schema's |
| All Versions of a Schema MUST adhere to the semantic rules of the schema's | |
| All Versions of a single Schema Resource MUST adhere to the semantic rules of the schema's |
Signed-off-by: Clemens Vasters [email protected]
New Intro