0.4.0
This package provides tools for static JSON schema analysis.
One of these is its eponymous function schemaDescribesSubset which tries to determine whether all data values that satisfy one JSON schema also satisfy another one (which would mean that the first schema described a subset of the set of data values that satisfy the second schema).
Other functions that might be useful include:
- schemaDescribesEmptySet, which tries to determine whether a schema does not accept any values at all
- toDNF, which transforms a schema to a disjunctive normal form
- schemasAreEquivalent, which tries to determine whether two schemas both accept the exact same data values
- schemaDescribesUniverse, which tries to determine whether a schema will accept any arbitrary JSON value
All of these functions work out of the box with standard JSON Schema, but can also take custom keywords into account by means of plugins.
npm install json-schema-describes-subset
The functions schemaDescribesSubset, schemaDescribesEmptySet, schemasAreEquivalent and schemaDescribesUniverse, which return boolean | null values, are referred to as discriminative functions. (As opposed to toDNF, which doesn't discriminate anything but rather transforms the provided schema.)
The reasons why a discriminative function would
return true
are also referred to as contradictions, since they are
determined in schemaDescribesEmptySet and a schema's
internal contradiction would be a reason why the schema doesn't accept any value
and therefore describes the empty set.
It might appear natural to refer to a schema that describes the subset of the set described by another schema as "subschema". This project however sticks to the terminology of the JSON Schema specification, where "subschema" refers to a schema that is contained in a surrounding parent schema. Instead "subset schema" or "superset schema" might be used to express the relation between the sets of data values that satisfy the respective schemas.
schemaDescribesSubset(potentialSubsetSchema, potentialSupersetSchema, options?): null | boolean
Defined in: schema-describes-subset/schema-describes-subset.ts:99
Tries to determine whether the first argument JSON schema (potentialSubsetSchema) describes a subset of the set of data values described by the second argument JSON schema (potentialSupersetSchema).
Parameter | Type |
---|---|
potentialSubsetSchema | JSONSchema |
potentialSupersetSchema | JSONSchema |
options? | Options |
null | boolean
Returns true if it does find a reason to do so.
If such a reason cannot be found, usually null is returned to indicate the possibility of false negatives. (Not having found any reason to return true doesn't mean that there aren't any.)
This behavior is sufficient for many use cases and has been the focus so far.
The ability to determine true positive true results is fairly powerful and will work in many complex cases. (See the following examples and Limitations.) The true negative false return value is currently only returned if an example data value that satisfies potentialSubsetSchema but not potentialSupersetSchema can be trivially found. See Limitations for more details.
If a few of the following examples that return true
seem unintuitive at first
glance, try to find a data value that satisfies the first schema but not the
second one. Failing to find such a data value might help to understand why
true
is returned. (If, contrary to expectations, you actually are able to find
such a data value, please do report a
bug).
import { schemaDescribesSubset } from 'json-schema-describes-subset'
console.log(
schemaDescribesSubset(
{
type: 'number',
},
true,
),
) // logs: `true`
console.log(
schemaDescribesSubset(false, {
type: 'number',
}),
) // logs: `true`
console.log(
schemaDescribesSubset(
{
type: ['number', 'boolean', 'string', 'null'],
},
{ type: ['number', 'null'] },
),
) // logs: `false`
console.log(
schemaDescribesSubset(
{ type: 'integer' },
{ type: ['number', 'string', 'boolean'] },
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{
minimum: 5.5,
},
{
exclusiveMinimum: 5.5,
},
),
) // logs: `false`
console.log(
schemaDescribesSubset(
{
minimum: 5.6,
},
{
exclusiveMinimum: 5.5,
},
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{ minimum: 10, maximum: 30, multipleOf: 5 },
{ anyOf: [{ multipleOf: 3 }, { multipleOf: 20 }, { enum: [10, 25] }] },
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{ type: 'string', maxLength: 5, minLength: 10 },
{ type: 'null' },
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{
prefixItems: [{ type: 'string' }, { type: 'boolean' }],
items: { type: 'object' },
},
{
prefixItems: [
{ type: ['string', 'number'] },
{ type: 'boolean' },
{ type: 'object' },
],
},
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{ contains: { type: 'number' }, minContains: 5 },
{ minItems: 5 },
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{
prefixItems: [{ type: 'number' }, { type: 'boolean' }],
items: { type: 'string' },
maxItems: 3,
},
{ uniqueItems: true },
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{ required: ['a'], maxProperties: 2 },
{
anyOf: [
{ properties: { b: { type: 'string' } } },
{ properties: { c: { type: 'string' } } },
],
},
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{ maxProperties: 2, required: ['abc', 'def'] },
{ propertyNames: { minLength: 2 } },
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{ maxProperties: 1 },
{
anyOf: [
{ properties: { x: { type: 'string' } } },
{ patternProperties: { '^a$': { type: 'string' } } },
],
},
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{
additionalProperties: { type: 'number' },
properties: { a: { type: 'string' } },
},
{
additionalProperties: { type: 'number' },
properties: {
a: { type: 'string' },
b: { type: ['boolean', 'number'] },
},
},
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{
allOf: [
{
properties: {
aa: { type: 'string' },
aaa: { type: 'string' },
aaaa: { type: 'string' },
},
patternProperties: {
'^b+$': { type: 'string' },
},
},
{
additionalProperties: { type: 'number' },
patternProperties: {
'^a+$': { type: 'string' },
'^b+$': true,
},
},
{
propertyNames: { not: { pattern: '^b+$' } },
},
],
},
{
additionalProperties: { type: 'number' },
patternProperties: {
'^a+$': { type: 'string' },
},
},
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{
patternProperties: {
'^a+$': { type: 'string' },
'^b+$': { type: 'boolean' },
},
propertyNames: { pattern: '^a+$' },
},
{
additionalProperties: false,
patternProperties: { '^a+$': { type: 'string' } },
},
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{ required: ['a', 'b', 'c'] },
{ dependentRequired: { a: ['b', 'c'] } },
),
) // logs: `true`
console.log(
schemaDescribesSubset(
{
properties: {
b: { type: 'number' },
},
additionalProperties: false,
},
{
properties: {
b: { type: ['string', 'number'] },
},
dependentSchemas: {
a: {
properties: {
b: {
type: 'string',
},
},
},
},
},
),
) // logs: `true`
This function is useful whenever you want to ensure that different data interfaces are compatible with each other.
For example, it can be used to check whether a new API version is backwards compatible with the old one.
Several other good use cases where a function like schemaDescribesSubset might come in handy are described in the introduction of the paper Type Safety with JSON Subschema, which pursues the same goal as this function using a slightly different approach.
The implementation utilizes schemaDescribesEmptySet and the fact that A ⊆ B if and only if A ∩ ¬B = ∅. (That relation becomes obvious when illustrated in a Venn diagram.)
It basically looks similar to this:
function schemaDescribesSubset(
potentialSubsetSchema: JSONSchema,
potentialSupersetSchema: JSONSchema,
): boolean | null {
return schemaDescribesEmptySet({
allOf: [potentialSubsetSchema, { not: potentialSupersetSchema }],
})
}
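The set identity used above can be checked on small finite sets. The following is only an illustrative sketch and not part of this package: the helper names `isSubset` and `intersectWithComplement` and the sample sets are hypothetical, and finite sets merely stand in for the (usually infinite) sets of JSON values that satisfy a schema.

```typescript
// Illustration only: finite sets stand in for the sets of values
// that satisfy a schema. None of these helpers exist in the package.
function isSubset<T>(a: Set<T>, b: Set<T>): boolean {
  return Array.from(a).every((value) => b.has(value))
}

// A ∩ ¬B: all members of `a` that are not members of `b`.
function intersectWithComplement<T>(a: Set<T>, b: Set<T>): Set<T> {
  return new Set(Array.from(a).filter((value) => !b.has(value)))
}

const multiplesOfFour = new Set([4, 8, 12])
const evenNumbers = new Set([2, 4, 6, 8, 10, 12])

// A ⊆ B holds exactly when A ∩ ¬B is empty.
console.log(isSubset(multiplesOfFour, evenNumbers)) // true
console.log(intersectWithComplement(multiplesOfFour, evenNumbers).size === 0) // true

// In the other direction both checks fail together.
console.log(isSubset(evenNumbers, multiplesOfFour)) // false
console.log(intersectWithComplement(evenNumbers, multiplesOfFour).size === 0) // false
```

For schemas, the complement corresponds to the `not` keyword and the intersection to `allOf`, which is why the emptiness check can be delegated to schemaDescribesEmptySet.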
schemaDescribesSubset uses Ajv to validate consts, among other things. It can be configured using ValidationPlugins. If you ever need a routine that validates a value a against a schema B and that is equally configured, an alternative to importing and configuring Ajv would be to use:
schemaDescribesSubset({ const: a }, B)
This is one of the cases where a definite boolean is always returned and never null.
However, since this is not optimized for performance, configuring and using a validator might often be the better choice.
JSONSchema = JSONSchemaObject | boolean
Defined in: json-schema/json-schema.ts:56
A schema compatible with the JSON Schema Draft 2020-12 specification. If you would like to use one of the functions provided by this project with an older JSON Schema draft, you could try to use something like alterschema.
In the functions that accept more than one schema (schemaDescribesSubset and schemasAreEquivalent), it is assumed that when a schema resource's $id appears in more than one of the root schemas, the respective schemas are identical.
Since Ajv is currently used under the hood, the nullable keyword is supported out of the box, despite not being a standard JSON Schema keyword.
Custom keywords can be supported and the behavior of standard keywords can be customized using Plugins.
In order to be permissive towards custom keywords, the type is equivalent to
Record<string, unknown> | boolean
but it still provides code completion and tool tip documentation for standard keywords.
There are only limited checks whether the provided schemas are actually valid. Providing invalid schemas will cause undefined behavior.
Referenced schema resources ($ref) are not retrieved via their URL. If a referenced resource is not part of the schema itself, it needs to be provided in Options.definitions.
Some of the standard keywords of JSON Schema Draft 2020-12 are not supported yet at all ($dynamicRef, $dynamicAnchor, unevaluatedItems and unevaluatedProperties). JSON schemas passed as arguments to toDNF that contain any of them might cause an exception to be thrown. If such schemas are passed to any of the discriminative functions (like schemaDescribesSubset or schemaDescribesEmptySet), a false negative null value might be returned.
Options = object
Defined in: options/options.ts:35
optional baseURI: string | (string | null | undefined)[]
Defined in: options/options.ts:48
If a schema does not have an $id or the $id is a relative URI, a baseURI can be provided in the Options object. For example, this could be the schema's retrieval URI.
Providing a non-relative baseURI (either as part of the Options object or $id) is important if the schema contains relative $refs.
In functions that accept more than one schema as arguments (like
schemaDescribesSubset or
schemasAreEquivalent) baseURI
can be an array of
strings which correspond to each schema.
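Why a non-relative base matters can be seen from how URI references are resolved in general. The following sketch uses the standard `URL` constructor; it illustrates generic URI resolution and is not taken from this package's internals. The helper name `resolveRef` and the example.com URIs are made up for illustration.

```typescript
// Resolves a `$ref` value against a base URI using the standard URL API.
// `resolveRef` is a hypothetical helper, not part of the package.
function resolveRef(ref: string, baseURI: string): string {
  return new URL(ref, baseURI).href
}

const base = 'https://example.com/schemas/person.json'

console.log(resolveRef('./address.json', base))
// https://example.com/schemas/address.json
console.log(resolveRef('#/$defs/name', base))
// https://example.com/schemas/person.json#/$defs/name

// With a relative base there is nothing to resolve against,
// so the URL constructor throws.
let resolutionFailed = false
try {
  resolveRef('./address.json', 'schemas/person.json')
} catch {
  resolutionFailed = true
}
console.log(resolutionFailed) // true
```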
optional definitions: Exclude<JSONSchema, boolean>[]
Defined in: options/options.ts:57
Referenced schema resources ($ref) are not retrieved via their URL. If a referenced resource is not part of the schema itself, it needs to be provided here.
TODO: make this also accept an object with retrieval URLs as keys. This would also support references to boolean schemas better.
optional plugins: Plugin[]
Defined in: options/options.ts:65
Support non-standard custom keywords by adding plugins. There is one predefined custom plugin: formatPlugin.
So far, the focus of this project for
discriminative functions like
schemaDescribesSubset or
schemaDescribesEmptySet has been to find reasons why
true
would be the correct result. They do so fairly powerfully and will find
such reasons in many complex schemas. These reasons are also referred to as
contradictions because they are determined by
schemaDescribesEmptySet and a contradiction would be
a reason why a schema would not accept any value.
However, there are also cases where such reasons for a true result cannot be found (see the examples below). When reasons for a true result couldn't be found, usually null is returned, meaning either there are no reasons to return true and actually false would be the correct result (true negative) or there are reasons to return true, but they couldn't be determined (false negative). Currently only some trivial cases actually return false.
In many use cases, where false and "possibly false" results would be treated equally, this behavior would be completely sufficient. For example, if changes to an API are checked for backwards compatibility using schemaDescribesSubset, you would only want to know whether the result is true or not.
All falsy return values could therefore be regarded as "false with possible false negatives".
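The three possible return values can be handled explicitly in calling code. The following is a minimal sketch; the type alias `DiscriminativeResult` and the helpers `describeResult` and `isProven` are hypothetical names that are not exported by this package.

```typescript
// The return type shared by the discriminative functions.
type DiscriminativeResult = boolean | null

// Spells out what each value means according to the description above.
function describeResult(
  result: DiscriminativeResult,
): 'proven' | 'disproven' | 'undetermined' {
  if (result === true) return 'proven'
  if (result === false) return 'disproven'
  return 'undetermined' // null: no reason for `true` was found
}

// For checks such as backwards compatibility, `false` and `null`
// are treated alike: only a proven `true` passes.
function isProven(result: DiscriminativeResult): boolean {
  return result === true
}

console.log(describeResult(true)) // proven
console.log(describeResult(false)) // disproven
console.log(describeResult(null)) // undetermined
console.log(isProven(null)) // false
```

Comparing with `=== true` rather than relying on truthiness keeps the intent visible to readers who are not aware that null is a possible result.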
🚧TODO🚧: comprehensive description of how each keyword is evaluated, so that the reader gets an idea of what to expect exactly. Maybe as doc of each built-in plugin?
The following are examples of keywords which may impose currently undetected
contradictions and therefore might cause false negative null
results.
When comparing string patterns, they are checked for equality, but their internal logic is not analyzed any further.
schemaDescribesSubset(
// potentialSubsetSchema:
{ pattern: '^[abc]{3}$' },
// potentialSupersetSchema:
{ pattern: '^[abc]{2,3}$' },
) // returns `null`
This returns null even though the schema { pattern: '^[abc]{3}$' } does in fact describe a subset of the set of values that satisfy { pattern: '^[abc]{2,3}$' }. schemaDescribesSubset does not determine this, since unequal patterns aren't analyzed any further.
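That the first pattern really describes a subset of the second can be made plausible by brute force over short strings. This is only a bounded sanity check under stated assumptions (a four-letter alphabet and strings of at most four characters), not a proof and not something this package does. The helper `allStrings` is hypothetical.

```typescript
// Enumerates every string over `alphabet` with length 0..maxLength.
function allStrings(alphabet: string[], maxLength: number): string[] {
  const result: string[] = ['']
  let previous: string[] = ['']
  for (let length = 1; length <= maxLength; length++) {
    const next: string[] = []
    for (const prefix of previous) {
      for (const char of alphabet) {
        next.push(prefix + char)
      }
    }
    result.push(...next)
    previous = next
  }
  return result
}

const narrowPattern = /^[abc]{3}$/
const widePattern = /^[abc]{2,3}$/

// 'd' is included so that non-matching strings are sampled as well.
const samples = allStrings(['a', 'b', 'c', 'd'], 4)

// Strings accepted by the narrow pattern but rejected by the wide one.
const counterexamples = samples.filter(
  (s) => narrowPattern.test(s) && !widePattern.test(s),
)
console.log(counterexamples.length) // 0

// The reverse direction does have counterexamples ('aa', 'ab', ...).
const onlyInWide = samples.filter(
  (s) => widePattern.test(s) && !narrowPattern.test(s),
)
console.log(onlyInWide.length) // 9
```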
In some cases it is possible to receive an unambiguous result by creating the schemas in a way where equal patterns appear in both schemas:
schemaDescribesSubset(
// potentialSubsetSchema:
{ pattern: '^[abc]{3}$' },
// potentialSupersetSchema:
{ anyOf: [{ pattern: '^[abc]{2}$' }, { pattern: '^[abc]{3}$' }] },
) // returns `true`
This potentialSupersetSchema is equivalent to the one in the previous example, but shares a pattern with the potentialSubsetSchema and therefore true can be determined as the result.
Also, constant values might be tested against patterns, so that the following returns true:
schemaDescribesSubset(
// potentialSubsetSchema:
{ required: ['a', 'aa'], maxProperties: 2 },
// potentialSupersetSchema:
{ propertyNames: { pattern: '^a+$' } },
) // returns `true`
$refs are currently only compared for whether they reference the same resource. Future improvements could involve inlining referenced resources, which would produce fewer false negative results.
🚧TODO🚧: add more examples, so that the reader gets an idea of what to expect exactly
Some keywords are not supported yet at all ($dynamicRef, $dynamicAnchor, unevaluatedItems and unevaluatedProperties). Using schemas that contain any of them might cause errors to be thrown or possibly false negatives (null) to be returned. See JSONSchema for details.
schemaDescribesEmptySet(schema, options?): null | boolean
Defined in: dnf/dnf.ts:607
Tries to determine whether the provided JSON Schema is unsatisfiable and therefore describes the empty set. In that case, the schema would be equivalent to the false schema.
Parameter | Type |
---|---|
schema | JSONSchema |
options? | Options |
null | boolean
Returns true if it does find a reason why the schema will not accept any value.
If such a reason cannot be found, usually null is returned to indicate the possibility of false negatives.
The true negative false return value is currently only returned if an example data value that satisfies the schema can be trivially found. See Limitations for more details.
import { schemaDescribesEmptySet } from 'json-schema-describes-subset'
console.log(schemaDescribesEmptySet(false)) // logs: `true`
console.log(
schemaDescribesEmptySet(
// this schema will accept anything that is not a number
{ minimum: 2, maximum: 1 },
),
) // logs: `false`
console.log(
schemaDescribesEmptySet({
type: 'number',
minimum: 2,
maximum: 1,
}),
) // logs: `true`
The provided schema is first transformed to a
disjunctive normal form
similar to the one returned by toDNF. Then each disjunct is checked
for contradictions which would make it unsatisfiable. If a contradiction is
found for each disjunct, the complete schema is unsatisfiable and true
is
returned.
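The idea of checking every disjunct for a contradiction can be sketched with a toy model. This is not this package's implementation: the `NumberDisjunct` type, the helper names, and the restriction to numeric bounds are assumptions made purely for illustration.

```typescript
// Toy disjunct: a conjunction of at most two numeric bounds.
interface NumberDisjunct {
  minimum?: number
  maximum?: number
}

// The only contradiction this toy knows: the lower bound exceeds the upper one.
function hasContradiction(disjunct: NumberDisjunct): boolean {
  return (
    disjunct.minimum !== undefined &&
    disjunct.maximum !== undefined &&
    disjunct.minimum > disjunct.maximum
  )
}

// Returns true if every disjunct is contradictory. Otherwise null,
// because this toy never tries to prove that a disjunct is satisfiable.
function describesEmptySetToy(disjuncts: NumberDisjunct[]): true | null {
  return disjuncts.every(hasContradiction) ? true : null
}

console.log(
  describesEmptySetToy([
    { minimum: 2, maximum: 1 },
    { minimum: 10, maximum: 5 },
  ]),
) // true
console.log(describesEmptySetToy([{ minimum: 2, maximum: 1 }, { minimum: 0 }])) // null
console.log(describesEmptySetToy([])) // true (an empty disjunction accepts nothing)
```

A single disjunct without a detected contradiction is enough to prevent a true result, which is why undetected contradictions lead to null rather than to a wrong answer.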
toDNF<Options_>(schema, options?): DNFFromOptions<Options_>
Defined in: dnf/dnf.ts:446
Transforms the given schema to a disjunctive normal form similar to the one utilized by schemaDescribesEmptySet.
Type Parameter | Default type |
---|---|
Options_ extends undefined \| Options | undefined |
Parameter | Type |
---|---|
schema | JSONSchema |
options? | Options_ |
DNFFromOptions<Options_>
The resulting dnf schema will be equivalent to the provided schema (meaning that it will accept the same data values) but all boolean combinations will be restructured.
Subschemas that represent property values of a JSON object or elements of a JSON array do not represent boolean combinations. They are currently considered atomic for that purpose.
The resulting dnf schema will be simplified so that disjuncts that were determined to be unsatisfiable are already eliminated. If each disjunct was determined to be unsatisfiable the return value is false.
The return type's most general form (without specified plugin types, for example returned by toDNF<Options>(...)) is equivalent to:
type GeneralDNFSpelledOut =
| boolean
| {
anyOf: (
| { const: unknown }
| {
[mergeableKeyword: string]: unknown
type: 'string' | 'number' | 'object' | 'array'
allOf?: JSONSchema[]
const?: never
anyOf?: never
not?: never
}
)[]
}
If the provided options' type does not contain any custom plugins, the default return type (for example returned by toDNF(schema) (without options) or by toDNF<{ plugins: [] }>(...)) is equivalent to:
type DefaultDNFSpelledOut =
| boolean
| {
anyOf: (
| { const: unknown }
| {
type: 'number'
maximum?: number
minimum?: number
multipleOf?: number
allOf?: (
| { not: { const: number } }
| { not: { multipleOf: number } }
| { $ref: string }
| { not: { $ref: string } }
)[]
const?: never
anyOf?: never
not?: never
}
| {
type: 'string'
maxLength?: number
minLength?: number
allOf?: (
| { not: { const: string } }
| { pattern: string }
| { not: { pattern: string } }
| { $ref: string }
| { not: { $ref: string } }
)[]
const?: never
anyOf?: never
not?: never
}
| {
type: 'object'
maxProperties?: number
minProperties?: number
patternProperties?: Record<string, JSONSchema>
properties?: Record<string, JSONSchema>
propertyNames?: JSONSchema
required?: string[]
allOf?: (
| { not: { const: Record<string, unknown> } }
| {
additionalProperties: JSONSchema
properties?: Record<string, true>
patternProperties?: Record<string, true>
}
| { not: { patternProperties: Record<string, JSONSchema> } }
| {
not: {
additionalProperties: JSONSchema
properties?: Record<string, true>
patternProperties?: Record<string, true>
}
}
| { not: { propertyNames: JSONSchema } }
| { $ref: string }
| { not: { $ref: string } }
)[]
const?: never
anyOf?: never
not?: never
}
| {
type: 'array'
items?: JSONSchema
maxItems?: number
minItems?: number
prefixItems?: JSONSchema[]
uniqueItems?: boolean
allOf?: (
| { not: { const: unknown[] } }
| {
contains: JSONSchema
minContains?: number
maxContains?: number
}
| {
not: { uniqueItems?: boolean }
}
| {
not: {
prefixItems?: true[]
items?: JSONSchema
}
}
| { $ref: string }
| { not: { $ref: string } }
)[]
const?: never
anyOf?: never
not?: never
}
)[]
}
The return type will adjust according to the (explicit or inferred) type of the property plugins of the provided options.
import { toDNF } from 'json-schema-describes-subset'
console.log(
toDNF({
anyOf: [{ minimum: 2 }, { exclusiveMinimum: 1 }],
}),
)
logs:
{
"anyOf": [
{ "const": null },
{ "const": true },
{ "const": false },
{ "type": "number", "minimum": 1, "allOf": [{ "not": { "const": 1 } }] },
{ "type": "string" },
{ "type": "array" },
{ "type": "object" }
]
}
import { toDNF } from 'json-schema-describes-subset'
console.log(
toDNF({
anyOf: [{ multipleOf: 2 }, { multipleOf: 3 }, { multipleOf: 4 }],
}),
)
logs:
{
"anyOf": [
{ "const": null },
{ "const": true },
{ "const": false },
{ "type": "number", "multipleOf": 2 },
{ "type": "number", "multipleOf": 3 },
{ "type": "string" },
{ "type": "array" },
{ "type": "object" }
]
}
This function was created mainly for demonstration purposes, but might also have some real world use cases. For example, when creating a data mocking tool that generates example data for a given schema, it might be easier to generate that data for one of the logically flat disjuncts instead of a complex schema which is logically deeply nested.
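To give an idea of why a flat disjunct is convenient for mocking, here is a rough sketch that picks one candidate value for a numeric disjunct. It is a simplified assumption-laden toy: `FlatNumberDisjunct` and `mockNumber` are hypothetical names, `multipleOf` is assumed to be positive, floating point rounding is ignored, and `allOf` refinements such as excluded constants are not handled at all.

```typescript
// Simplified shape of a numeric disjunct (without its `allOf` part).
interface FlatNumberDisjunct {
  type: 'number'
  minimum?: number
  maximum?: number
  multipleOf?: number
}

// Picks a candidate near the lower bound (or the upper bound if there is
// no lower one). Returns undefined if the candidate exceeds the maximum.
function mockNumber(disjunct: FlatNumberDisjunct): number | undefined {
  const step = disjunct.multipleOf
  let candidate: number
  if (disjunct.minimum !== undefined) {
    candidate =
      step === undefined
        ? disjunct.minimum
        : Math.ceil(disjunct.minimum / step) * step
  } else if (disjunct.maximum !== undefined) {
    candidate =
      step === undefined
        ? disjunct.maximum
        : Math.floor(disjunct.maximum / step) * step
  } else {
    candidate = 0
  }
  if (disjunct.maximum !== undefined && candidate > disjunct.maximum) {
    return undefined
  }
  return candidate
}

console.log(mockNumber({ type: 'number', minimum: 7, maximum: 20, multipleOf: 5 })) // 10
console.log(mockNumber({ type: 'number', minimum: 11, maximum: 14, multipleOf: 5 })) // undefined
console.log(mockNumber({ type: 'number', maximum: -3, multipleOf: 2 })) // -4
```

Because all constraints of a disjunct apply at once and no nested `anyOf` or `not` has to be untangled first, a generator like this only has to look at a handful of sibling keywords.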
schemasAreEquivalent(schemaA, schemaB, options?): null | boolean
Defined in: derived/derived.ts:60
Tries to determine whether the provided schemas accept the exact same set of data values.
Parameter | Type |
---|---|
schemaA | JSONSchema |
schemaB | JSONSchema |
options? | Options |
null | boolean
The limitations concerning false negative null results apply here.
🚧TODO🚧
One possible use case could be: If you are creating a tool that transforms a JSON Schema to another representation (like toDNF), this function could be useful to help create tests.
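Two schemas accept the exact same values precisely when each describes a subset of the other. The sketch below shows one way an equivalence check could be built on top of any three-valued subset check. Whether this package derives schemasAreEquivalent this way is an assumption. `SchemaLike`, `SubsetCheck`, `equivalentVia` and the stand-in `maximumOnlyCheck` are hypothetical names; in real use the package's schemaDescribesSubset would be passed instead of the stand-in.

```typescript
type SchemaLike = boolean | Record<string, unknown>
type SubsetCheck = (subset: SchemaLike, superset: SchemaLike) => boolean | null

// Combines two subset checks into a three-valued equivalence result.
function equivalentVia(
  describesSubset: SubsetCheck,
  schemaA: SchemaLike,
  schemaB: SchemaLike,
): boolean | null {
  const forward = describesSubset(schemaA, schemaB)
  const backward = describesSubset(schemaB, schemaA)
  if (forward === false || backward === false) return false
  if (forward === true && backward === true) return true
  return null // at least one direction could not be determined
}

// Stand-in check so the sketch runs on its own. It only understands
// schemas that consist of nothing but a numeric `maximum`.
const maximumOnlyCheck: SubsetCheck = (subset, superset) => {
  if (typeof subset === 'boolean' || typeof superset === 'boolean') return null
  if (Object.keys(subset).length !== 1 || Object.keys(superset).length !== 1) {
    return null
  }
  const subsetMaximum = subset['maximum']
  const supersetMaximum = superset['maximum']
  if (typeof subsetMaximum !== 'number' || typeof supersetMaximum !== 'number') {
    return null
  }
  return subsetMaximum <= supersetMaximum
}

console.log(equivalentVia(maximumOnlyCheck, { maximum: 3 }, { maximum: 3 })) // true
console.log(equivalentVia(maximumOnlyCheck, { maximum: 3 }, { maximum: 5 })) // false
console.log(equivalentVia(maximumOnlyCheck, { maximum: 3 }, { pattern: 'x' })) // null
```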
schemaDescribesUniverse(schema, options?): null | boolean
Defined in: derived/derived.ts:30
Tries to determine whether the provided schema accepts any JSON value. In that case, the schema would be equivalent to the true or {} schema.
Parameter | Type |
---|---|
schema | JSONSchema |
options? | Options |
null | boolean
The limitations concerning false negative null results apply here.
🚧TODO🚧
Can't think of any 🤷♂️. This function was created only because it was so easy to do so.
This project is under active development. The following tries to deliver an idea of what future changes might (or might not) include.
The following does not fall within this project's scope:
- Create a JSON Schema validation tool
  There already are good validation solutions. For this project Ajv is used internally for validation. This is taken into account by all of this project's functions. For example, if schemaDescribesEmptySet returns true, there isn't any value that would satisfy the schema according to Ajv.
  (Technically it would actually be fairly easy to switch to another validation solution.)
- Support of older JSON Schema drafts
  This project tries to always support the latest JSON Schema draft (currently 2020-12). Schemas that are built according to an older draft could be converted with a tool like alterschema before they are passed to any of this project's functions.
The main focus of this project is its eponymous function schemaDescribesSubset. A major goal is to minimize false negative (null) results while simultaneously making sure that a boolean result is always true positive/true negative. One way to get closer to that goal is to add or optimize support for standard keywords.
Additional predefined custom plugins might be added to support more non-standard keywords, if they are very common.
Another goal is to increase the number of cases where a boolean result is returned.
Any kind of feedback and code contribution is highly appreciated. Make sure to always adhere to this project's code of conduct. See CONTRIBUTING.md for details.
- Johannes Bohner