onto2py
What it is
A utility module that parses an RDF Turtle (.ttl) ontology (OWL/RDFS + optional SHACL) and generates Pydantic-based Python models that can emit RDF graphs. It can also write the generated module to disk and optionally scaffold per-class “action” stubs.
Public API
Data structures
@dataclass PropertyInfo- Holds extracted property metadata:
name,property_type("data"or"object")range_classes: Dict[str, Optional[int]](object property targets + cardinality;None= unspecified,>1= list)datatype(data property Python type string),required,description,default_value
- Holds extracted property metadata:
@dataclass ClassInfo- Holds extracted class metadata:
name,uri,parent_classes,propertiesdescription,labelproperty_uris: Dict[str, str](property name → predicate URI)
- Holds extracted class metadata:
Primary function
onto2py(ttl_file: str | io.TextIOBase, overwrite: bool = False) -> str- Parses TTL content into an
rdflib.Graph, extracts:- OWL/RDFS classes
- inheritance (
rdfs:subClassOf) - OWL restrictions (
owl:Restriction) underrdfs:subClassOf(as object properties) - OWL object/datatype properties, domains/ranges
- SHACL constraints (
sh:NodeShape,sh:property,sh:minCount,sh:maxCount)
- Adds standard metadata properties to all classes:
rdfs:label(required,str)dcterms:created(required,datetime.datetime, defaultdatetime.datetime.now())dcterms:creator(required, defaultos.environ.get('USER'); see caveats about its typing)
- Generates a single Python code string and, when given a file path:
- writes a sibling
.pyfile next to the.ttl - creates per-class stub files under
.../ontologies/classes/...(controlled byoverwrite)
- writes a sibling
- Parses TTL content into an
Name extraction helpers
extract_class_name_from_label(label: str) -> Optional[str]- Converts a label into a PascalCase Python identifier.
extract_class_name(uri, g: Optional[rdflib.Graph] = None) -> Optional[str]- Prefer
rdfs:label(if graph provided), else URI fragment/path; returns a Python identifier.
- Prefer
extract_property_name_from_label(label: str) -> Optional[str]- Converts a label into a
snake_casePython identifier.
- Converts a label into a
extract_property_name(uri, g: Optional[rdflib.Graph] = None) -> Optional[str]- Prefer
rdfs:label(if graph provided), else URI fragment/path; returns a Python identifier.
- Prefer
RDF/OWL parsing helpers (used internally but callable)
extract_classes_from_union(g, union_node, classes) -> List[str]- Extracts class names from
owl:unionOflists.
- Extracts class names from
extract_classes_from_intersection(g, intersection_node, classes, OWL) -> List[str]- Extracts class names from
owl:intersectionOflists (supports nested restrictions).
- Extracts class names from
extract_cardinality_from_restriction(g, restriction, OWL) -> Optional[int]- Reads
owl:cardinality/owl:minCardinality/owl:maxCardinality.
- Reads
extract_classes_with_cardinality_from_intersection(g, intersection_node, classes, OWL) -> Dict[str, Optional[int]]- Extracts class names plus cardinalities from
owl:intersectionOfnested restrictions.
- Extracts class names plus cardinalities from
extract_restriction_properties(g, restriction, class_uri, class_info, classes, OWL)- Converts
owl:Restrictionnodes underrdfs:subClassOfintoPropertyInfoentries.
- Converts
Graph accessors
get_label(g, resource) -> Optional[str](readsrdfs:label)get_description(g, resource) -> Optional[str](readsrdfs:comment, elserdfs:label)get_property_description(g, prop) -> Optional[str](readsskos:definition)get_property_range(g, prop, classes) -> Dict[str, Optional[int]](readsrdfs:rangefor object properties)get_datatype_range(g, prop) -> Optional[str](mapsrdfs:rangeXSD datatype → Python type string)
SHACL support
extract_shacl_constraints(g, classes, properties, SHACL)- Applies SHACL
minCount/maxCountto properties for target classes.
- Applies SHACL
process_property_shape(g, prop_shape, class_info, properties, SHACL)- Implements the per-property constraint extraction.
Code generation / file generation
inherit_parent_properties(classes)- Copies parent properties into children (preserving
requiredflag as stored on the parent properties).
- Copies parent properties into children (preserving
add_metadata_properties(g, classes)- Adds
label/created/creatorproperties to all classes if missing.
- Adds
create_class_files(ttl_file_path, classes, py_file, overwrite=False)- Writes per-class Python files that subclass the generated model class.
apply_linting(code: str) -> str- Attempts to run
ruff formatandruff check --fixon generated code (falls back to original code on errors).
- Attempts to run
generate_python_code(classes, properties) -> str- Creates a complete Python module including an
RDFEntitybase class and all ontology classes.
- Creates a complete Python module including an
topological_sort_classes(classes) -> List[ClassInfo]- Sorts classes to emit parent classes before children (best-effort; handles cycles).
generate_class_code(class_info, has_any_import=False) -> List[str]- Emits one class definition (deduplicates properties by name).
generate_property_code(prop, has_any_import=False) -> str- Emits one Pydantic field line using
typing.AnnotatedandField(...).
- Emits one Pydantic field line using
Configuration/Dependencies
- Runtime dependencies:
rdflib(parsing TTL, RDF terms, RDF lists viardflib.collection.Collection)pydantic(generated code usesBaseModel,Field)
- Optional external tool:
ruff(used byapply_linting; if missing/fails, code is returned unformatted)
- File system behavior:
- If
ttl_fileis a path string, a sibling.pyfile is written. - Per-class stubs are written under a derived
ontologies/classes/...directory layout.
- If
Usage
Generate code from a TTL file path
from naas_abi_core.utils.onto2py.onto2py import onto2py
code = onto2py("path/to/ontology.ttl")
print(code[:500])
Generate code from an in-memory TTL string
import io
from naas_abi_core.utils.onto2py.onto2py import onto2py
ttl = """
@prefix ex: <http://example.com/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
ex:Person a owl:Class ; rdfs:label "Person" .
"""
code = onto2py(io.StringIO(ttl))
print(code)
Caveats
- Writing to disk happens automatically when
ttl_fileis a string path (it writesPath(ttl_file).with_suffix(".py")and prints status). dcterms:creatormetadata property is added withproperty_type="data"but usesrange_classes={"str": 1}(notdatatype="str"), which influences how it is typed/emitted.- Optional properties default to sentinel values in generated models:
- object properties:
'http://ontology.naas.ai/abi/unknown'(or a list containing it) - data properties:
'unknown'
- object properties:
- SHACL
maxCountis applied to all range classes of a property;minCount > 0marks the property as required. - OWL restriction-derived properties are treated as object properties and explicitly marked
required=False(restrictions do not automatically imply required in this generator).