@clemensv clemensv commented Aug 5, 2025

Signed-off-by: Clemens Vasters [email protected]

New Intro

Comment on lines 353 to 360
- Any document that adheres to the rules specified by schema A also adheres to
rules specified by schema B.
- Any processing rules defined for schema A also apply for schema B.
- Any processing rules defined for schema B, that are not defined for schema
A, do not conflict with the processing rules for schema A.

I know this isn't part of the change in this PR, but catching it now. This seems to be at least somewhat conflicting with what the compat attribute docs state:

This specification makes no statement as to which parts of the data are examined for compatibility (e.g. xRegistry metadata, domain-specific document, etc.). This SHOULD be defined by the compatibility values. The exact meaning of what each compatibility value means might vary based on the data model of the Resource, therefore this specification only defines a very high-level abstract meaning for each to ensure some degree of consistency.

I think we need to remove this paragraph, because we've made no statement in the Core about what the individual compatibility modes mean, and therefore, summarizing the rules as such might not be accurate.

Just to make sure I understand what you're saying.... you think the compat definitions needs to be related to each schema format, right?

To your question, yes, but also, we have not defined in this spec what each compatibility mode means, so either way it should be removed.

schema/spec.md Outdated

## 1. Overview

A schema registry provides a respository for managing serialization and

Suggested change
A schema registry provides a respository for managing serialization and
A schema registry provides a respository for managing serialization,

A schema registry provides a respository for managing serialization and
validation and data type definitions schemas as they are commonly used in
distributed systems. Common schema formats include JSON Schema, JSON
Structure, Apache Avro Schema, Google Protobuf Schema, and XML Schema.

@duglin duglin Aug 11, 2025

However, this specification does not mandate, or limit, which schema formats are used.

I think we need to make it clear we support more than just the "common" ones

schema/spec.md Outdated
are not.

Serialization generally occurs based on a specific schema version that the data
publisher uses. Multiple versions of publishers may exist in the same system,

Suggested change
publisher uses. Multiple versions of publishers may exist in the same system,
publisher uses. Multiple versions of publishers might exist in the same system,

schema/spec.md Outdated
publisher uses. Multiple versions of publishers may exist in the same system,
using different schema versions, which is a common occurrence in systems that
perform live updates. Once data has been published, data serialized based on
several different versions may exist in a system, in queues, in databases, or in

Suggested change
several different versions may exist in a system, in queues, in databases, or in
several different versions might exist in a system, in queues, in databases, or in

files.

The schema registry therefore allows managing multiple versions of schemas,
declare their lineage, and state the compatibility policy. The compatibility

@duglin duglin Aug 11, 2025

Suggested change
declare their lineage, and state the compatibility policy. The compatibility
declares their lineage, and state their compatibility policy. The compatibility

The schema registry therefore allows managing multiple versions of schemas,
declare their lineage, and state the compatibility policy. The compatibility
policy is used to determine whether a schema change is compatible with existing
data, and MAY be enforced by implementations of the schema registry. For this,

"data": while true, it feels a bit indirect. Would it be better to talk about how it enforces compat between versions of the schema instead? "existing data" might not exist, so it's not really the thing we're focused on.

policy is used to determine whether a schema change is compatible with existing
data, and MAY be enforced by implementations of the schema registry. For this,
this specification leans on the [xRegistry Core][xRegistry Core] specification
that already defines these versioning and compatibility mechanisms for any kind of resource.

@duglin duglin Aug 11, 2025

The core spec doesn't actually fully define the compat stuff, just what the terms mean at a high level. It feels like we should say something about how impls will need to be more explicit for each schema format and compat mode, no?


### 1.4. Document Store

The schema registry is a document store and therefore also has the

Suggested change
The schema registry is a document store and therefore also has the
The schema registry is a document store and therefore has the

the stored content-type when a client issues a GET request to the
[`self`][xRegistry self] URL of a schema Version. The associated metadata is
returned in the HTTP headers. The [default version][xRegistry default-version]
of the schema Version is returned when the client issues a GET request to the

I think this would read better if we removed the extra "Version" from "schema Version"? I don't think we've introduced that term yet and it sounds a bit off to talk about a "version of a version". W/o it we just talk about the "default version of a Resource", which is easier to grok.

of the schema Version is returned when the client issues a GET request to the
[`self`][xRegistry self] URL of the `schema` Resource.

This allows to provide external parties with a link that they can use without

Suggested change
This allows to provide external parties with a link that they can use without
This enables the ability to provide external parties with a link that they can use without

xRegistry specifics, by simply using a POST against [`self`][xRegistry self] URL
of the `schema` Resource in the simplest case.

To access the metadata of the `schema` or the schema version as a JSON document,

Suggested change
To access the metadata of the `schema` or the schema version as a JSON document,
To access the metadata of the `schema`, or the schema version as a JSON document,

of the `schema` Resource in the simplest case.

To access the metadata of the `schema` or the schema version as a JSON document,
the client can append a `$details`suffix to the URL, like

Suggested change
the client can append a `$details`suffix to the URL, like
the client can append a `$details` suffix to the URL, like
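
As an illustration of the `$details` convention in the quoted text, here is a minimal Python sketch. The helper name `details_url` and the example host are hypothetical, and placing the suffix before any query string is an assumption rather than something this thread states.

```python
def details_url(self_url: str) -> str:
    """Return the metadata URL for a schema or schema version `self` URL.

    Assumption: `$details` is appended to the path portion of the URL,
    ahead of any query string.
    """
    base, sep, query = self_url.partition("?")
    if base.endswith("$details"):
        # Already a metadata URL; leave it unchanged.
        return self_url
    return f"{base}$details{sep}{query}"


# Hypothetical self URL of a `schema` Resource.
doc_url = "https://registry.example.com/schemagroups/g1/schemas/s1"
meta_url = details_url(doc_url)
```

A GET on `doc_url` would return the stored schema document, while a GET on `meta_url` would return the metadata as a JSON document, as the quoted paragraph describes.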

schema/spec.md Outdated

In terms of versioning, you can think of a **schema** as a collection of
versions that are compatible according to the selected `compatibility` mode.
MUST be created, to indicate the breaking change. The [`deprecated`](xRegistry deprecated)

I think the first part of this line ("MUST be created, to indicate the breaking change") is old and not needed given the previous paragraph.

schema/spec.md Outdated
In terms of versioning, you can think of a **schema** as a collection of
versions that are compatible according to the selected `compatibility` mode.
MUST be created, to indicate the breaking change. The [`deprecated`](xRegistry deprecated)
attribute may be used to indicate the appropriate new schema to use following a breaking change.

Suggested change
attribute may be used to indicate the appropriate new schema to use following a breaking change.
attribute MAY be used to indicate the appropriate new schema to use following a breaking change.

versions that are compatible according to the selected `compatibility` mode.
MUST be created, to indicate the breaking change. The [`deprecated`](xRegistry deprecated)
attribute may be used to indicate the appropriate new schema to use following a breaking change.
MUST be created, to indicate the breaking change.

Left over from a previous edit/version, I think.

Implementations of this specification MAY include additional extension
attributes, including the `*` attribute of type `any`.

Since the Schema Registry is an application of the xRegistry specification, all

let's link to the core spec here - from the "xRegistry specification" text

some application-defined way. A schema Group does not impose any restrictions
on the contained schemas, meaning that a schema Group MAY contain schemas of
different formats. Every schema MUST reside inside a schema Group.
Every schema MUST reside inside a Schema Group.

@duglin duglin Aug 11, 2025

It might be me, but trying to read this as a newbie, someone may think that a schema can be part of a Group directly. So, perhaps a tiny bit of clarity:
Every schema (i.e. schema Resource) MUST reside inside a Schema Group

`"com.example.event.2024-02"`, so that incompatible, but historically related
schemas can be more easily identified by users and developers. The schema
`versionid` then functions as the semantic minor version identifier.
which is a `versionid` of the Version that this Version is based on. The `ancestor`

something is missing before this line.

`versionid` then functions as the semantic minor version identifier.
which is a `versionid` of the Version that this Version is based on. The `ancestor`
attribute permits multiple Versions to reference the same ancestor, and allows for
implementations to determine the Version's ancestor. See the

s/See/Since/ ?
But even then the sentence doesn't parse correctly. I think some tweaking is needed.
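
To make the `ancestor` wording in the quoted lines concrete, the following is a rough Python sketch of walking a Version's lineage. It assumes a plain mapping from each `versionid` to its `ancestor`, and that a root Version names itself as its own ancestor. The mapping shape, the sample ids, and the function name `lineage` are illustrative assumptions, not part of the PR.

```python
def lineage(ancestors: dict[str, str], version_id: str) -> list[str]:
    """Return the chain of versionids from `version_id` back to its root.

    `ancestors` maps each versionid to the versionid it is based on.
    Assumption: a root Version lists itself as its own ancestor.
    """
    chain = [version_id]
    seen = {version_id}
    while True:
        parent = ancestors[chain[-1]]
        if parent == chain[-1]:
            # Reached a root Version.
            return chain
        if parent in seen:
            raise ValueError(f"ancestor cycle detected at {parent!r}")
        chain.append(parent)
        seen.add(parent)


# Versions "2" and "3" both reference "1" as their ancestor,
# which the quoted text says is permitted.
sample = {"1": "1", "2": "1", "3": "1", "4": "3"}
chain_for_4 = lineage(sample, "4")
```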

is appended separated with a colon, for instance
`.../com.example.telemetrydata:TelemetryEvent`.

Like the [xRegistry Core][xRegistry Core] specification, this specification does not

remove? Dup of next para

schema/spec.md Outdated

It is expected that any implementation of this specification will use
authentication and authorization mechanisms that are appropriate for the
application domain and the deployment environment. This may include, but is not

Suggested change
application domain and the deployment environment. This may include, but is not
application domain and the deployment environment. This MAY include, but is not

duglin commented Aug 11, 2025

rebase is needed

Signed-off-by: Clemens Vasters <[email protected]>
Signed-off-by: Clemens Vasters <[email protected]>
Signed-off-by: Clemens Vasters <[email protected]>
Signed-off-by: Clemens Vasters <[email protected]>
Signed-off-by: Clemens Vasters <[email protected]>
attribute holds a URI and is specifically meant to reference a schema document
residing in a registry. For example, a CloudEvent with a `dataschema` attribute
pointing to a schema version in a schema registry might look like this, using
the schema version's [`self`][xRegistry self] URL as the value of the `dataschema` attribute:

can you have your editor wrap lines like this at 80?
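
For context on the quoted paragraph, a CloudEvent that carries a schema version's `self` URL in `dataschema` could be assembled roughly as below. This is only a sketch: the function name `make_event`, the registry host, the event `id`, `source`, `type`, and payload are all made-up values. Only the attribute names come from CloudEvents.

```python
import json


def make_event(schema_version_url: str, payload: dict) -> dict:
    """Build a CloudEvent (structured JSON form) that references a schema version."""
    return {
        "specversion": "1.0",
        "id": "a1b2c3",                       # hypothetical event id
        "source": "/sensors/device-42",       # hypothetical source
        "type": "com.example.telemetry",      # hypothetical event type
        "datacontenttype": "application/json",
        "dataschema": schema_version_url,     # the schema version's `self` URL
        "data": payload,
    }


# Hypothetical `self` URL of one schema version in a registry.
version_url = (
    "https://registry.example.com/schemagroups/com.example.telemetry"
    "/schemas/TelemetryEvent/versions/1"
)
event = make_event(version_url, {"temperature": 21.5})
encoded = json.dumps(event)
```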

publisher uses. Multiple versions of publishers MAY exist in the same system,
using different schema versions, which is a common occurrence in systems that
perform live updates. Once data has been published, data serialized based on
several different versions MAY exist in a system, in queues, in databases, or in

Suggested change
several different versions MAY exist in a system, in queues, in databases, or in
several different versions might exist in a system, in queues, in databases, or in

This allows to provide external parties with a link that they can use without
needing to know any details about xRegistry.

Storing a new schema is similarly straightforward for clients that do not know

Suggested change
Storing a new schema is similarly straightforward for clients that do not know
Storing a new Version of a schema is similarly straightforward for clients that do not know

### 4.1. Schema Groups

### Terminology
The Group (`<GROUP>`) name for the Schema Registry is `schemagroups`. The

this is switched up: GROUP->schemagroup, GROUPS->schemagroups.
You may want to explicitly mention one "singular" vs "plural", not just the plural name itself

### Schema Resources
### 4.2. Schema Resources

The Resources (`<RESOURCE>`) collection inside of Schema Groups is named

Same here - talk about singular vs plural - the current text is a bit mixed up


The Resource plural name (`<RESOURCES>`) is `schemas`, and the Resource
singular name (`<RESOURCE>`) is `schema`.
All Versions of a Schema MUST adhere to the semantic rules of the schema's

Suggested change
All Versions of a Schema MUST adhere to the semantic rules of the schema's
All Versions of a single Schema Resource MUST adhere to the semantic rules of the schema's
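
To show how the plural names discussed above might surface in URLs, here is a small Python sketch that builds registry paths from `schemagroups` and `schemas`. The `versions` segment and the overall path layout are assumptions about how the registry exposes these collections, and `schema_path` plus the sample ids are hypothetical.

```python
from typing import Optional
from urllib.parse import quote

GROUPS = "schemagroups"   # Group plural name from the quoted text
RESOURCES = "schemas"     # Resource plural name from the quoted text


def schema_path(group_id: str, schema_id: str, version_id: Optional[str] = None) -> str:
    """Build the path of a schema Resource, or of one of its Versions."""
    path = f"/{GROUPS}/{quote(group_id, safe='')}/{RESOURCES}/{quote(schema_id, safe='')}"
    if version_id is not None:
        # Assumption: Versions live in a `versions` collection under the Resource.
        path += f"/versions/{quote(version_id, safe='')}"
    return path


resource_path = schema_path("com.example.telemetry", "TelemetryEvent")
version_path = schema_path("com.example.telemetry", "TelemetryEvent", "1")
```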

3 participants