Commit e1b93f6

Add case_validate
The first tests to confirm `case_validate` functionality are reproductions of the CASE and UCO ontology repositories' `pyshacl` test results.

References:
* [AC-210] Add validation command to CASE-Utilities-Python

Signed-off-by: Alex Nelson <[email protected]>
1 parent 978612e commit e1b93f6

10 files changed, with 423 additions and 4 deletions.
README.md

Lines changed: 27 additions & 0 deletions
@@ -20,6 +20,33 @@ Installation is demonstrated in the `.venv.done.log` target of the [`tests/`](te
 ## Usage
 
 
+### `case_validate`
+
+This repository provides `case_validate` as an adaptation of the `pyshacl` command from [RDFLib's pySHACL](https://github.com/RDFLib/pySHACL). The command-line interface is adapted to run as though `pyshacl` were provided the full CASE ontology (and the full UCO ontology that CASE adopts) as both a shapes graph and an ontology graph. "Compiled" (or "aggregated") CASE ontologies are in the [`case_utils/ontology/`](case_utils/ontology/) directory and are installed with `pip`, so data validation can occur without networking once this repository is installed.
+
+To see a human-readable validation report of an instance-data file:
+
+```bash
+case_validate input.json
+```
+
+If `input.json` is not conformant, a report will be emitted, and `case_validate` will exit with status `1`. (This is a `pyshacl` behavior, where `0` and `1` report the validation result: `0` for conformant data and `1` for non-conformant data. A status greater than `1` indicates some other error.)
+
+To produce the validation report as machine-readable graph output, use the `--format` flag to select the output format:
+
+```bash
+case_validate --format turtle input.json > result.ttl
+```
+
+To supplement the selected CASE version with one or more additional ontology files, use the `--ontology-graph` flag, repeating it as needed:
+
+```bash
+case_validate --ontology-graph internal_ontology.ttl --ontology-graph experimental_shapes.ttl input.json
+```
+
+Other flags are reviewable with `case_validate --help`.
+
+
 ### `case_file`
 
 To characterize a file, including hashes:
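The exit-status convention described in the README text above can be consumed from a calling script. The following is a minimal sketch, assuming `case_validate` is installed and on `PATH`; the helper names `describe_exit_status` and `run_case_validate` are hypothetical and are not part of this commit.

```python
import subprocess
from typing import List, Tuple


def describe_exit_status(status: int) -> str:
    """Map an exit status to the meaning given in the README text."""
    if status == 0:
        return "conformant"
    if status == 1:
        return "non-conformant"
    # Anything else (including a negative code from a signal) is treated as an error.
    return "error"


def run_case_validate(argv: List[str]) -> Tuple[str, str]:
    """Run case_validate (assumed to be on PATH); return (meaning, report text)."""
    completed = subprocess.run(
        ["case_validate", *argv],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return describe_exit_status(completed.returncode), completed.stdout


if __name__ == "__main__":
    # Hypothetical invocation; input.json is the placeholder name from the README.
    meaning, report = run_case_validate(["--format", "turtle", "input.json"])
    print(meaning)
```

One caveat for callers: a Python program that ends with an uncaught exception also exits with status `1`, so a script that needs to tell "non-conformant" apart from "crashed" may want to check that a report was actually written to standard output.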
Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
+#!/usr/bin/env python3
+
+# This software was developed at the National Institute of Standards
+# and Technology by employees of the Federal Government in the course
+# of their official duties. Pursuant to title 17 Section 105 of the
+# United States Code this software is not subject to copyright
+# protection and is in the public domain. NIST assumes no
+# responsibility whatsoever for its use by other parties, and makes
+# no guarantees, expressed or implied, about its quality,
+# reliability, or any other characteristic.
+#
+# We would appreciate acknowledgement if the software is used.
+
+__version__ = "0.1.0"
+
+import argparse
+import importlib.resources
+import logging
+import os
+import pathlib
+import sys
+import typing
+
+import rdflib # type: ignore
+import pyshacl # type: ignore
+
+import case_utils.ontology
+
+from case_utils.ontology.version_info import *
+
+_logger = logging.getLogger(os.path.basename(__file__))
+
+def main() -> None:
+    parser = argparse.ArgumentParser(description="CASE wrapper to PySHACL command line tool.")
+
+    # Configure debug logging before running parse_args, because there could be an error raised before the construction of the argument parser.
+    logging.basicConfig(level=logging.DEBUG if ("--debug" in sys.argv or "-d" in sys.argv) else logging.INFO)
+
+    case_version_choices_list = ["none", "case-" + CURRENT_CASE_VERSION]
+
+    # Add arguments specific to case_validate.
+    parser.add_argument(
+      '-d',
+      '--debug',
+      action='store_true',
+      help='Output additional runtime messages.'
+    )
+    parser.add_argument(
+      "--built-version",
+      choices=tuple(case_version_choices_list),
+      default="case-"+CURRENT_CASE_VERSION,
+      help="Monolithic aggregation of CASE ontology files at certain versions. Does not require networking to use. Default is most recent CASE release."
+    )
+    parser.add_argument(
+      "--ontology-graph",
+      action="append",
+      help="Combined ontology (i.e. subclass hierarchy) and shapes (SHACL) file, in any format accepted by rdflib recognized by file extension (e.g. .ttl). Will supplement ontology selected by --built-version. Can be given multiple times."
+    )
+
+    # Inherit arguments from pyshacl.
+    parser.add_argument(
+      '--abort',
+      action='store_true',
+      help='(As with pyshacl CLI) Abort on first invalid data.'
+    )
+    parser.add_argument(
+      '-w',
+      '--allow-warnings',
+      action='store_true',
+      help='(As with pyshacl CLI) Shapes marked with severity of Warning or Info will not cause result to be invalid.',
+    )
+    parser.add_argument(
+      "-f",
+      "--format",
+      choices=('human', 'turtle', 'xml', 'json-ld', 'nt', 'n3'),
+      default='human',
+      help="(ALMOST as with pyshacl CLI) Choose an output format. Default is \"human\". Difference: 'table' not provided."
+    )
+    parser.add_argument(
+      '-im',
+      '--imports',
+      action='store_true',
+      help='(As with pyshacl CLI) Allow import of sub-graphs defined in statements with owl:imports.',
+    )
+    parser.add_argument(
+      '-i',
+      '--inference',
+      choices=('none', 'rdfs', 'owlrl', 'both'),
+      default='none',
+      help="(As with pyshacl CLI) Choose a type of inferencing to run against the Data Graph before validating. Default is \"none\".",
+    )
+
+    parser.add_argument("in_graph")
+
+    args = parser.parse_args()
+
+    data_graph = rdflib.Graph()
+    data_graph.parse(args.in_graph)
+
+    ontology_graph = rdflib.Graph()
+    if args.built_version != "none":
+        ttl_filename = args.built_version + ".ttl"
+        _logger.debug("ttl_filename = %r.", ttl_filename)
+        ttl_data = importlib.resources.read_text(case_utils.ontology, ttl_filename)
+        ontology_graph.parse(data=ttl_data, format="turtle")
+    if args.ontology_graph:
+        for arg_ontology_graph in args.ontology_graph:
+            _logger.debug("arg_ontology_graph = %r.", arg_ontology_graph)
+            ontology_graph.parse(arg_ontology_graph)
+
+    validate_result : typing.Tuple[
+      bool,
+      typing.Union[Exception, bytes, str, rdflib.Graph],
+      str
+    ]
+    validate_result = pyshacl.validate(
+      data_graph,
+      shacl_graph=ontology_graph,
+      ont_graph=ontology_graph,
+      inference=args.inference,
+      abort_on_first=args.abort,
+      allow_warnings=True if args.allow_warnings else False,
+      debug=True if args.debug else False,
+      do_owl_imports=True if args.imports else False
+    )
+
+    # Relieve RAM of the data graph after validation has run.
+    del data_graph
+
+    conforms = validate_result[0]
+    validation_graph = validate_result[1]
+    validation_text = validate_result[2]
+
+    if args.format == "human":
+        sys.stdout.write(validation_text)
+    else:
+        if isinstance(validation_graph, rdflib.Graph):
+            validation_graph_str = validation_graph.serialize(format=args.format)
+            sys.stdout.write(validation_graph_str)
+            del validation_graph_str
+        elif isinstance(validation_graph, bytes):
+            sys.stdout.write(validation_graph.decode("utf-8"))
+        elif isinstance(validation_graph, str):
+            sys.stdout.write(validation_graph)
+        else:
+            raise NotImplementedError("Unexpected result type returned from validate: %r." % type(validation_graph))
+
+    sys.exit(0 if conforms else 1)
+
+if __name__ == "__main__":
+    main()
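The `--ontology-graph` flag in the file above is declared with `action="append"`, which is why `main()` guards the loop with `if args.ontology_graph:`. When the flag is never given, `argparse` leaves the attribute as `None` rather than an empty list. The following reduced sketch shows that behavior in isolation; `build_parser` and `supplementary_graphs` are hypothetical names, and only the two arguments relevant to the point are reproduced.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Reduced parser: only the repeatable flag and the positional input."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--ontology-graph", action="append")
    parser.add_argument("in_graph")
    return parser


def supplementary_graphs(argv: List[str]) -> List[str]:
    """Return the supplementary ontology paths, normalizing None to []."""
    args = build_parser().parse_args(argv)
    # With action="append" and no default, the attribute is None when the
    # flag is absent, so it is normalized before iterating.
    return list(args.ontology_graph or [])


if __name__ == "__main__":
    print(supplementary_graphs(["input.json"]))
    print(supplementary_graphs(
        ["--ontology-graph", "a.ttl", "--ontology-graph", "b.ttl", "input.json"]
    ))
```

Normalizing to a list is a design alternative to the truthiness guard used in the commit; both avoid iterating over `None`.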

setup.cfg

Lines changed: 2 additions & 0 deletions
@@ -18,6 +18,7 @@ classifiers =
 include_package_data = true
 install_requires =
     pandas
+    pyshacl
     rdflib >= 6.0.2
     requests
     tabulate
@@ -29,6 +30,7 @@ console_scripts =
     case_file = case_utils.case_file:main
     case_sparql_construct = case_utils.case_sparql_construct:main
     case_sparql_select = case_utils.case_sparql_select:main
+    case_validate = case_utils.case_validate:main
 
 [options.package_data]
 case_utils = py.typed
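The `console_scripts` line added above uses the `name = module:attribute` form, which tells the installer to generate a `case_validate` command that imports `case_utils.case_validate` and calls its `main`. The following is a simplified sketch of how such a specification resolves to a callable. It is not how `pip` or `setuptools` implement it; `split_entry_point` and `load_entry_point` are hypothetical helpers, and the sketch ignores extras and dotted attribute paths.

```python
import importlib
from typing import Any, Tuple


def split_entry_point(spec: str) -> Tuple[str, str, str]:
    """Split 'name = module:attr' into (name, module, attr)."""
    name, _, target = spec.partition("=")
    module_name, _, attr = target.strip().partition(":")
    return name.strip(), module_name.strip(), attr.strip()


def load_entry_point(spec: str) -> Any:
    """Import the module named in the spec and return the named attribute."""
    _, module_name, attr = split_entry_point(spec)
    module = importlib.import_module(module_name)
    return getattr(module, attr)


if __name__ == "__main__":
    print(split_entry_point("case_validate = case_utils.case_validate:main"))
    # A standard-library target stands in for case_utils, which may not be installed.
    dumps = load_entry_point("demo = json:dumps")
    print(dumps([1, 2]))
```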

tests/case_utils/Makefile

Lines changed: 22 additions & 4 deletions
@@ -20,15 +20,18 @@ tests_srcdir := $(top_srcdir)/tests
 all: \
   all-case_file \
   all-case_sparql_construct \
-  all-case_sparql_select
+  all-case_sparql_select \
+  all-case_validate
 
 .PHONY: \
   all-case_file \
   all-case_sparql_construct \
   all-case_sparql_select \
+  all-case_validate \
   check-case_file \
   check-case_sparql_construct \
-  check-case_sparql_select
+  check-case_sparql_select \
+  check-case_validate
 
 all-case_file: \
   $(tests_srcdir)/.venv.done.log
@@ -45,15 +48,21 @@ all-case_sparql_select: \
 	$(MAKE) \
 	  --directory case_sparql_select
 
+all-case_validate: \
+  $(tests_srcdir)/.venv.done.log
+	$(MAKE) \
+	  --directory case_validate
+
 check: \
   check-case_file \
   check-case_sparql_construct \
-  check-case_sparql_select
+  check-case_sparql_select \
+  check-case_validate
 	source $(tests_srcdir)/venv/bin/activate \
 	  && pytest \
 	    --ignore case_file \
 	    --ignore case_sparql_construct \
-	    --ignore case_sparql_select \
+	    --ignore case_validate \
 	    --log-level=DEBUG
 
 check-case_file: \
@@ -74,7 +83,16 @@ check-case_sparql_select: \
 	  --directory case_sparql_select \
 	  check
 
+check-case_validate: \
+  $(tests_srcdir)/.venv.done.log
+	$(MAKE) \
+	  --directory case_validate \
+	  check
+
 clean:
+	@$(MAKE) \
+	  --directory case_validate \
+	  clean
 	@$(MAKE) \
 	  --directory case_sparql_select \
 	  clean
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+#!/usr/bin/make -f
+
+# This software was developed at the National Institute of Standards
+# and Technology by employees of the Federal Government in the course
+# of their official duties. Pursuant to title 17 Section 105 of the
+# United States Code this software is not subject to copyright
+# protection and is in the public domain. NIST assumes no
+# responsibility whatsoever for its use by other parties, and makes
+# no guarantees, expressed or implied, about its quality,
+# reliability, or any other characteristic.
+#
+# We would appreciate acknowledgement if the software is used.
+
+SHELL := /bin/bash
+
+top_srcdir := $(shell cd ../../.. ; pwd)
+
+tests_srcdir := $(top_srcdir)/tests
+
+all: \
+  all-case_test_examples \
+  all-uco_test_examples
+
+.PHONY: \
+  all-case_test_examples \
+  all-uco_test_examples \
+  check-case_test_examples \
+  check-uco_test_examples
+
+all-case_test_examples:
+	$(MAKE) \
+	  --directory case_test_examples
+
+all-uco_test_examples:
+	$(MAKE) \
+	  --directory uco_test_examples
+
+check: \
+  check-case_test_examples \
+  check-uco_test_examples
+
+check-case_test_examples:
+	$(MAKE) \
+	  --directory case_test_examples \
+	  check
+
+check-uco_test_examples:
+	$(MAKE) \
+	  --directory uco_test_examples \
+	  check
+
+clean:
+	@$(MAKE) \
+	  --directory case_test_examples \
+	  clean
+	@$(MAKE) \
+	  --directory uco_test_examples \
+	  clean
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+This directory runs the examples-based tests in the CASE and UCO ontology repositories, using `case_validate` in place of `pyshacl`.
