AIAgentOntologyGenerationPipeline
What it is
A pipeline that:
- Loads the latest Artificial Analysis LLM JSON dataset from a datastore folder
- Groups models into AI-agent “modules” (e.g., chatgpt, claude, llama) using name/slug/provider heuristics
- Generates TTL ontology files per agent (BFO-structured content as strings)
- Writes outputs into a timestamped datastore directory and also deploys a “current” TTL into module ontologies folders
- Inserts a small summary rdflib.Graph (file count + timestamp) into a configured triple store
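The grouping step can be pictured with a minimal sketch. The keyword table, the field names (name, slug, provider), and both function names are assumptions made for illustration; the pipeline's actual heuristics are not documented here.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical keyword table: agent module key -> substrings to look for.
AGENT_KEYWORDS = {
    "chatgpt": ("gpt", "openai"),
    "claude": ("claude", "anthropic"),
    "llama": ("llama", "meta"),
}


def assign_agent(model: dict) -> Optional[str]:
    """Return the first agent key whose keywords appear in name, slug, or provider."""
    haystack = " ".join(
        str(model.get(field, "")) for field in ("name", "slug", "provider")
    ).lower()
    for agent, keywords in AGENT_KEYWORDS.items():
        if any(keyword in haystack for keyword in keywords):
            return agent
    return None  # model does not belong to any known agent module


def group_models(models: List[dict]) -> Dict[str, List[dict]]:
    """Bucket models by agent module key, dropping unmatched models."""
    groups = defaultdict(list)
    for model in models:
        agent = assign_agent(model)
        if agent is not None:
            groups[agent].append(model)
    return dict(groups)
```

Substring matching of this kind is simple but order-sensitive: a model that matches several keyword sets lands in whichever agent is listed first.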
Public API
Classes
- AIAgentOntologyGenerationConfiguration(PipelineConfiguration)
  - Configuration for the pipeline.
  - Fields:
    - triple_store: ITripleStoreService (required): target triple store service, used to insert(Graph)
    - datastore_path: str: output root for generated ontologies (timestamped subfolders)
    - source_datastore_path: str: input folder containing *_llms_data.json files
    - max_models_per_agent: int: cap per agent for performance
- AIAgentOntologyGenerationParameters(PipelineParameters)
  - Execution parameters.
  - Fields:
    - force_regenerate: bool: defined but not used in the current implementation
    - agent_filter: Optional[List[str]]: restricts generation to specific agent modules (keys like ["chatgpt", "claude"])
- AIAgentOntologyGenerationPipeline(Pipeline)
  - Main pipeline implementation.
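As a rough illustration of how agent_filter and max_models_per_agent could interact, the sketch below applies both to an already grouped mapping. The function name select_models is hypothetical, and keeping the first N models in input order is an assumption; the real pipeline may rank models before truncating.

```python
from typing import Dict, List, Optional


def select_models(
    groups: Dict[str, list],
    agent_filter: Optional[List[str]] = None,
    max_models_per_agent: int = 50,
) -> Dict[str, list]:
    """Keep only the requested agents and cap each agent's model list."""
    selected = {}
    for agent, models in groups.items():
        # None means "no filter"; any list restricts output to the listed keys.
        if agent_filter is not None and agent not in agent_filter:
            continue
        # Assumption: truncate in input order (no ranking applied here).
        selected[agent] = models[:max_models_per_agent]
    return selected
```

Note that under this sketch an empty list filters out every agent, while None keeps all of them.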
Methods (intended for external use)
- AIAgentOntologyGenerationPipeline.run(parameters: PipelineParameters) -> rdflib.Graph
  - Runs the pipeline end-to-end.
  - Validates the parameter type (AIAgentOntologyGenerationParameters).
  - Loads the latest Artificial Analysis (AA) dataset, generates/deploys TTL files, writes a JSON summary file, inserts a summary graph into the triple store, and returns that graph.
- AIAgentOntologyGenerationPipeline.as_tools() -> list[BaseTool]
  - Exposes the pipeline as a LangChain StructuredTool named ai_agent_ontology_generation.
  - Tool calls run(AIAgentOntologyGenerationParameters(**kwargs)).
- AIAgentOntologyGenerationPipeline.as_api(...) -> None
  - Present but currently does nothing (returns None).
- AIAgentOntologyGenerationPipeline.get_configuration() -> AIAgentOntologyGenerationConfiguration
  - Returns the pipeline configuration instance.
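The relationship between the tool entry point and run() can be sketched with stand-in classes. Everything below is hypothetical: BaseParameters and GenerationParameters merely mimic PipelineParameters and AIAgentOntologyGenerationParameters, the field defaults are assumed, and the exception type raised on a wrong parameter type is a guess.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BaseParameters:
    """Stand-in for PipelineParameters (hypothetical)."""


@dataclass
class GenerationParameters(BaseParameters):
    """Stand-in for AIAgentOntologyGenerationParameters (defaults assumed)."""

    force_regenerate: bool = False
    agent_filter: Optional[List[str]] = None


def run(parameters: BaseParameters) -> str:
    # run() accepts the base type but insists on the concrete subclass.
    if not isinstance(parameters, GenerationParameters):
        raise ValueError("parameters must be GenerationParameters")
    return f"ran with agent_filter={parameters.agent_filter}"


def tool_entrypoint(**kwargs) -> str:
    # Mirrors the documented tool behaviour: kwargs become the parameters object.
    return run(GenerationParameters(**kwargs))
```

The point of the pattern is that a tool caller only supplies plain keyword arguments, while the type check in run() protects direct callers who pass the wrong parameters class.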
Configuration/Dependencies
Required dependencies
- rdflib (Graph, Literal, Namespace): graph returned/inserted, though TTL is generated as plain text files.
- naas_abi_core.pipeline: base Pipeline, PipelineConfiguration, PipelineParameters.
- naas_abi_core.services.triple_store.TripleStorePorts.ITripleStoreService
  - Must provide an insert(graph: Graph) method.
- langchain_core.tools: for as_tools() (BaseTool, StructuredTool).
Filesystem inputs/outputs
- Input: latest file matching *_llms_data.json in source_datastore_path.
- Output: under datastore_path/<UTC_TIMESTAMP>/
  - <AgentTitle>Ontology.ttl (current)
  - <UTC_TIMESTAMP>_<AgentTitle>Ontology.ttl (audit copy)
  - generation_summary_<UTC_TIMESTAMP>.json
- Deployment output: also writes <AgentTitle>Ontology.ttl into:
  - Path(__file__).parent.parent.parent / <agent_module> / "ontologies" / <AgentTitle>Ontology.ttl
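The input discovery and output layout above can be sketched with pathlib. The helper names are hypothetical, and two details are assumptions: that "latest" means the lexicographically last filename (which works when filenames start with a sortable timestamp), and that the timestamp format is %Y%m%dT%H%M%S.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict


def find_latest_dataset(source_dir: Path) -> Path:
    """Return the last *_llms_data.json by filename, or raise ValueError."""
    candidates = (
        sorted(source_dir.glob("*_llms_data.json")) if source_dir.is_dir() else []
    )
    if not candidates:
        raise ValueError(f"No *_llms_data.json file found in {source_dir}")
    return candidates[-1]


def utc_timestamp() -> str:
    """Assumed timestamp format; the pipeline's actual format may differ."""
    return datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")


def output_paths(datastore_path: Path, agent_title: str, timestamp: str) -> Dict[str, Path]:
    """Build the three documented output paths for one agent."""
    run_dir = datastore_path / timestamp
    return {
        "current": run_dir / f"{agent_title}Ontology.ttl",
        "audit": run_dir / f"{timestamp}_{agent_title}Ontology.ttl",
        "summary": run_dir / f"generation_summary_{timestamp}.json",
    }
```

If the real selection is based on file modification time rather than filename, sorting by p.stat().st_mtime would be the corresponding change.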
Usage
from naas_abi.pipelines.AIAgentOntologyGenerationPipeline import (
    AIAgentOntologyGenerationPipeline,
    AIAgentOntologyGenerationConfiguration,
    AIAgentOntologyGenerationParameters,
)

# Minimal triple store stub for demonstration
class TripleStoreStub:
    def insert(self, graph):
        pass

pipeline = AIAgentOntologyGenerationPipeline(
    AIAgentOntologyGenerationConfiguration(
        triple_store=TripleStoreStub(),
        source_datastore_path="storage/datastore/core/modules/abi/ArtificialAnalysisWorkflow",
        datastore_path="storage/datastore/core/modules/abi/AIAgentOntologyGenerationPipeline",
        max_models_per_agent=50,
    )
)

graph = pipeline.run(
    AIAgentOntologyGenerationParameters(agent_filter=["chatgpt", "claude"])
)
print(len(graph))  # summary triples count

Caveats
- The force_regenerate parameter is currently unused.
- The LangChain tool description states “datastore only, no module deployment”, but run() does deploy TTL files into module folders.
- The module deployment path is derived from __file__ by going up 3 directories; this assumes a specific repo layout and may write files into unexpected locations depending on installation.
- Ontology TTL content is written as text and is not parsed/validated before writing.
- If no *_llms_data.json exists (or the source directory is missing), run() raises ValueError.
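To make the deployment-path caveat concrete, the sketch below reproduces the documented Path(__file__).parent.parent.parent expression for two hypothetical file locations. Both example paths and the function name deployment_path are invented for illustration; only the path expression itself comes from the description above.

```python
from pathlib import Path


def deployment_path(pipeline_file: str, agent_module: str, agent_title: str) -> Path:
    """Resolve the deployment target the way the documented expression does."""
    # Three levels up from the pipeline source file.
    modules_root = Path(pipeline_file).parent.parent.parent
    return modules_root / agent_module / "ontologies" / f"{agent_title}Ontology.ttl"


# Hypothetical repo checkout: three levels up is the modules folder.
in_repo = deployment_path(
    "/repo/src/core/modules/abi/pipelines/AIAgentOntologyGenerationPipeline.py",
    "claude",
    "Claude",
)

# Hypothetical pip install: three levels up is site-packages itself.
installed = deployment_path(
    "/venv/lib/python3.11/site-packages/naas_abi/pipelines/AIAgentOntologyGenerationPipeline.py",
    "claude",
    "Claude",
)
```

In the first case the file lands in /repo/src/core/modules/claude/ontologies/, while in the second it lands in /venv/lib/python3.11/site-packages/claude/ontologies/, which is the kind of unexpected location the caveat warns about.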