graph.py
What it is
Utilities for parsing Turtle (TTL) ontologies with rdflib, extracting:
- OWL classes (owl:Class)
- Object-property-style relationships expressed via rdfs:subClassOf + owl:Restriction
- Human-readable labels (rdfs:label), optionally sourced from imported ontologies
Public API
- URI_TO_GROUP: dict[str, str]
  - Static mapping from specific BFO class URIs to high-level groups (e.g., WHAT, WHO).
- get_imported_graph(graph: rdflib.Graph) -> rdflib.Graph
  - Loads ontologies referenced via owl:imports into a separate graph (intended for label lookup only).
  - Tries multiple RDF formats (xml, turtle, rdf, owl) per import.
- get_short_name(uri: rdflib.URIRef) -> str
  - Returns the fragment after # or the last path segment after /.
- get_rdfs_label(uri: rdflib.URIRef, graph: rdflib.Graph, imported_graph: Optional[rdflib.Graph] = None) -> str
  - Resolves rdfs:label for uri, checking:
    - graph
    - imported_graph (if provided)
    - a merged view (graph + imported_graph)
  - Falls back to get_short_name(uri) if no label is found.
  - If the label string contains "@", keeps only the part before "@".
- extract_classes_from_union(graph: rdflib.Graph, union_node: rdflib.BNode) -> set[rdflib.URIRef]
  - Extracts named classes (URIRef) from an owl:unionOf RDF list.
  - Handles nested unions and has a manual RDF-list traversal fallback.
- extract_restriction_targets(graph: rdflib.Graph, restriction: rdflib.BNode) -> set[rdflib.URIRef]
  - From an owl:Restriction, extracts named target classes from:
    - owl:allValuesFrom
    - owl:someValuesFrom
  - If the value is a union/list, extracts all member named classes.
- extract_relationships(graph: rdflib.Graph, class_uri: rdflib.URIRef) -> list[tuple[rdflib.URIRef, rdflib.URIRef, rdflib.URIRef]]
  - Reads rdfs:subClassOf blank nodes that are owl:Restriction and returns triples:
    - (source_class, target_class, property_uri)
  - Only considers targets found via extract_restriction_targets.
- get_class_id_prefix(uri: rdflib.URIRef, graph: rdflib.Graph) -> str
  - Attempts to match uri to a namespace prefix registered in graph.namespaces().
  - Fallback heuristics:
    - returns "bfo" if URI contains bfo/BFO
    - returns "abi" if URI contains abi
    - else returns "class"
- get_inverse_property(property_uri: rdflib.URIRef, graph: rdflib.Graph) -> Optional[rdflib.URIRef]
  - Finds an inverse property via owl:inverseOf in either direction.
- get_group_from_class_hierarchy(class_uri: rdflib.URIRef, graph: rdflib.Graph, visited: Optional[set[rdflib.URIRef]] = None) -> Optional[str]
  - Walks up rdfs:subClassOf links (URI parents only) to find the first class whose URI appears in URI_TO_GROUP.
  - Uses visited to prevent cycles.
- parse_turtle_ontology(turtle_path: str, imported_ontologies: Optional[list[str]] = None) -> tuple[rdflib.Graph, rdflib.Graph, set[rdflib.URIRef], set[tuple[rdflib.URIRef, rdflib.URIRef, rdflib.URIRef]]]
  - Parses a TTL file into a main graph.
  - Loads owl:imports into a separate imported graph (and optionally additional imports passed in).
  - Collects:
    - explicitly declared classes (rdf:type owl:Class)
    - additional named classes referenced by restrictions in the main graph
  - Extracts restriction-based relationships and de-duplicates “inverse” pairs by treating (source, target) and (target, source) as duplicates (property URI is not considered in this inverse check).
Configuration/Dependencies
- Dependencies
  - rdflib (Graph, URIRef, BNode, RDF, RDFS, OWL, Collection)
  - naas_abi_core.logger for logging
- Network access
  - get_imported_graph() may fetch owl:imports URLs over the network if present.
Usage
from naas_abi_marketplace.domains.ontology_engineer.utils.graph import (
parse_turtle_ontology,
get_rdfs_label,
)
main_g, imported_g, classes, rels = parse_turtle_ontology("ontology.ttl")
print(f"Classes: {len(classes)}")
print(f"Relationships: {len(rels)}")
# Print a few labeled relationships
for s, t, p in list(rels)[:5]:
    s_lbl = get_rdfs_label(s, main_g, imported_g)
    p_lbl = get_rdfs_label(p, main_g, imported_g)
    t_lbl = get_rdfs_label(t, main_g, imported_g)
    print(f"{s_lbl} --{p_lbl}--> {t_lbl}")
Caveats
- parse_turtle_ontology() returns 4 items (graph, imported_graph, classes, unique_relationships); the docstring mentions additional items that are not returned.
- Relationship de-duplication treats (source, target) and (target, source) as inverses without checking owl:inverseOf and without considering the property URI; this can drop distinct predicates between the same two classes in opposite directions.
- Only restrictions expressed as rdfs:subClassOf blank nodes typed owl:Restriction are processed; direct rdfs:subClassOf URI parent links are not emitted as relationships.
- get_rdfs_label() strips anything after "@" in the label string; this is a simplistic way of handling language tags.
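The de-duplication caveat is easiest to see on a toy input. The sketch below is a hypothetical reconstruction of a direction-insensitive check, using plain strings in place of URIRef values. The function name and the sample triples are illustrative, and the module's actual iteration order (and therefore which triple of a pair survives) may differ.

```python
from typing import Iterable

Triple = tuple[str, str, str]  # (source, target, property)


def dedupe_ignoring_direction(relationships: Iterable[Triple]) -> list[Triple]:
    """Drop a triple when the reversed (target, source) pair was already kept."""
    seen_pairs: set[tuple[str, str]] = set()
    kept: list[Triple] = []
    for source, target, prop in relationships:
        # The property is not part of the check, as described in the caveat.
        if (target, source) in seen_pairs:
            continue
        seen_pairs.add((source, target))
        kept.append((source, target, prop))
    return kept


rels = [
    ("ex:Person", "ex:Organization", "ex:worksFor"),
    # Not an inverse of ex:worksFor, yet it is dropped as one.
    ("ex:Organization", "ex:Person", "ex:funds"),
]
kept = dedupe_ignoring_direction(rels)
print(kept)
```

Only the ex:worksFor triple survives, even though ex:funds is an unrelated predicate. Including the property in the key, or consulting owl:inverseOf first, would be one way to avoid the loss.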