Merge LinkML-generated SHACL/OWL with manually maintained SHACL shapes / OWL ontologies that use advanced SHACL/OWL features
Given the nature of LinkML as a simplified frontend to both OWL and SHACL, it will always inherently be less expressive than OWL or SHACL itself, no matter how much functionality we add via #276. Think of, e.g., conditional cardinalities in SHACL.
We need a way of storing, in our Single Point of Truth, both LinkML schema definitions and manually maintained files. These files will reside next to each other, i.e., in the same directories, and will usually be edited in the same merge requests. The CI will have to merge the SHACL/OWL generated from LinkML and the SHACL/OWL maintained manually.
Semantically, this is trivial. Assume two RDF graphs (e.g., stored in two files), with triples `s p1 o1` and `s p2 o2`; merging them automatically results in a graph like:
flowchart LR
s -->|p1| o1
s -->|p2| o2
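
To make the "trivial" claim concrete: an RDF graph can be thought of as a set of triples, so merging amounts to a set union. The following is only a minimal sketch that models triples as plain Python tuples instead of using an RDF library; the names `merge_graphs`, `graph_a` and `graph_b` are made up for this illustration, and blank nodes (which need to be kept apart per input graph) are deliberately left out.

```python
# A triple is modelled as a (subject, predicate, object) tuple of strings.
# This is an illustration only, not how an RDF library stores graphs.
Triple = tuple[str, str, str]


def merge_graphs(*graphs: set[Triple]) -> set[Triple]:
    """Return the union of all given graphs; duplicate triples collapse."""
    merged: set[Triple] = set()
    for graph in graphs:
        merged |= graph
    return merged


graph_a = {("s", "p1", "o1")}  # hypothetical content of the first file
graph_b = {("s", "p2", "o2")}  # hypothetical content of the second file

merged = merge_graphs(graph_a, graph_b)
print(sorted(merged))  # [('s', 'p1', 'o1'), ('s', 'p2', 'o2')]
```

Because the result is a set, the order of the inputs does not matter, and a triple that occurs in both inputs appears only once in the output.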
Here are two technical ways of doing the remaining syntactic job, assuming that `output.ttl` (SHACL or OWL) should be created from `linkml-generated.ttl` and `manually-maintained.ttl`:
- on the command line:
  - convert, using an off-the-shelf tool, `linkml-generated.ttl` and `manually-maintained.ttl` from Turtle to N-Triples, i.e., a text file with one subject-predicate-object RDF triple per line, resulting in two files `linkml-generated.nt` and `manually-maintained.nt`
  - concatenate `linkml-generated.nt` and `manually-maintained.nt` to result in `output.nt` (this assumes that blank node labels do not collide between the two files; otherwise, distinct blank nodes would be treated as one)
  - convert, using an off-the-shelf tool, `output.nt` to `output.ttl` (or JSON-LD or RDF/XML or any other serialization we like). As the result is meant to be consumed by machines, pretty-printing (e.g., of namespace prefixes) is optional.
- in program code (e.g., Python):
  - create a new, empty RDF graph (that's what Python's rdflib calls it; other libraries, e.g., Apache Jena for Java, call it a "model")
  - add the triples from each of `linkml-generated.ttl` and `manually-maintained.ttl` to the graph (e.g., using `rdflib.Graph.parse`)
  - serialize the resulting graph to `output.ttl`
It does not matter whether we do this at the granularity of small modules (e.g., one artifact per class) or for the ontology overall (two big SHACL/OWL files).
There is no reason to be concerned about inconsistencies introduced by this process. Because the LinkML schemas and the manually maintained SHACL/OWL files reside next to each other, they will usually be maintained by the same experts. Moreover, inconsistencies can just as well be introduced within a single LinkML, SHACL, or OWL file. The final result will have to be validated by other means anyway; e.g., the OWL ontology will have to be validated using Protégé and a reasoner before any release.