Use generative AI with superior provenance, built on established, Government-approved standards
GraphRAG is promoted by Microsoft and Neo4j as a way of extending Retrieval-Augmented Generation (RAG) with graph search over a property graph.
While that approach offers advantages in flexibility and performance, we believe a semantic graph built on standard ontologies can offer superior audit capabilities.
Government agencies and highly regulated enterprises can be audited in response to public enquiries. Government agencies in democratic countries operate under an extensive duty of care to the public and must account for individual actions when requested. Highly regulated enterprises, while privately owned, can also face extensive scrutiny. Both must be able to prove that a Large Language Model was given all necessary context information, and that the information itself was accurately sourced.
Generative AI offers remarkable capabilities with the potential to massively improve productivity. At the same time, Large Language Models are prone to hallucinations. That problem is usually addressed with Retrieval-Augmented Generation (RAG), which retrieves information judged relevant by its semantic closeness to the prompt and supplies it to the model. RAG can be further augmented with related information retrieved by traversing a knowledge graph in a property graph database such as Neo4j, Amazon Neptune, or Azure Cosmos DB.
Property graph databases are fast, scalable and flexible, but the way they represent data is decided ad hoc by data engineers. Semantic databases, also known as SPARQL databases, such as Ontotext GraphDB, Stardog and AllegroGraph, are by contrast usually implemented against internationally recognised ontologies defined in the Web Ontology Language (OWL). Most governments have also developed ontologies for their data.
As a result, implementing GraphRAG over Linked Data can:
Make it easier to identify relevant data, and
Make it easier to prove that all relevant data was provided.
For example, combining the Provenance Ontology (PROV-O) with the W3C Organization Ontology (ORG) allows sourcing of information from different organisations to be expressed. Adding the eProcurement Ontology (ePO) provides a standard representation of procurement operations.
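A minimal sketch of how these combine; the ontology terms are real (epo:Contract, the prov: and org: properties are defined in ePO, PROV-O and ORG respectively), while all ex: identifiers are illustrative:

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix org:  <http://www.w3.org/ns/org#> .
@prefix epo:  <http://data.europa.eu/a4g/ontology#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .

# A contract, typed with ePO, sourced from a published notice (PROV-O)
ex:contract-123 a epo:Contract ;
    prov:wasDerivedFrom  ex:award-notice-456 ;
    prov:generatedAtTime "2025-03-01T00:00:00Z"^^xsd:dateTime ;
    prov:wasAttributedTo ex:procurement-branch .

# The responsible unit sits inside a formal organisation (ORG)
ex:procurement-branch a org:OrganizationalUnit ;
    org:unitOf ex:agency .

ex:agency a org:FormalOrganization .
```

Because every statement is expressed in published, Government-recognised vocabularies, an auditor can interpret the data without reverse-engineering an ad hoc schema.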
Over 10 days, we can demonstrate the capabilities of GraphRAG over Linked Data. We can deliver directly in Australia, and via an accredited intermediary in the US, UK and EU. The outputs for each day are listed below.
Day 1 Outputs:
Thin‑slice charter (use case, users, impact if wrong, metrics).
Risk screen (Attachment B) from the DTA policy, with a go/no‑go decision on whether to treat the pilot as “higher‑risk”.
Draft AI transparency statement outline (what the agency would publish); see digital.gov.au.
Day 2 Outputs:
Mapping spec: AusTender → ePO; ABN → RegOrg (IDs, legal names).
Initial KG loaded in a triple store; SHACL shapes for Contract, Organization, SanctionsListing; see docs.ted.europa.eu and W3C.
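As an illustration of the shapes deliverable: a SHACL shape for registered organisations might require a legal name and a well-formed ABN. rov:legalName is the real RegOrg property; ex:abn is a placeholder for whichever identifier property the mapping spec settles on.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rov: <http://www.w3.org/ns/regorg#> .
@prefix ex:  <http://example.org/shapes/> .

ex:OrganizationShape a sh:NodeShape ;
    sh:targetClass rov:RegisteredOrganization ;
    # Every organisation must carry at least one legal name
    sh:property [
        sh:path rov:legalName ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
    ] ;
    # ABN: placeholder property; an ABN is exactly 11 digits
    sh:property [
        sh:path ex:abn ;
        sh:minCount 1 ;
        sh:pattern "^[0-9]{11}$" ;
    ] .
```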
Day 3 Outputs:
SPARQL queries for 0/1/2‑hop exposure (Contract → Supplier → Parent/UBO → SanctionsListing).
Provenance model: dataset‑level + statement‑level (RDF‑star) with PROV‑O (prov:wasDerivedFrom, prov:generatedAtTime, prov:wasAttributedTo); sketched below this list.
Sample provenance bundle serialized as JSON‑LD; see W3C.
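A minimal sketch of the statement-level pattern, in Turtle's RDF-star annotation syntax (all ex: identifiers are illustrative):

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <http://example.org/> .

# The annotation block asserts the triple and attaches provenance to it,
# so each individual statement can be traced to its source on audit.
ex:contract-123 ex:awardedTo ex:supplier-9 {|
    prov:wasDerivedFrom  ex:austender-notice-456 ;
    prov:generatedAtTime "2025-03-01T00:00:00Z"^^xsd:dateTime ;
    prov:wasAttributedTo ex:ingest-pipeline
|} .
```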
Day 4 Outputs:
Baseline retrieval over documents (award notices, sanctions entries, registry extracts) with citations.
Eval harness: 20–30 questions; metrics for task success, faithfulness, latency, cost.
Baseline report (numbers + failure taxonomy).
Day 5 Outputs:
Graph‑guided candidate selection (subgraph traversal → targeted passages); sketched below.
A/B eval (GraphRAG vs. baseline) and delta table; decision memo: keep/iterate/kill GraphRAG for this use case.
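The Day 3 exposure queries double as the traversal step: a bounded subgraph around each contract selects which passages to retrieve. A minimal sketch, where ex:awardedTo, ex:SanctionsListing and ex:lists stand in for whatever terms the Day 2 mapping spec settles on (org:subOrganizationOf is the real ORG property):

```sparql
PREFIX epo: <http://data.europa.eu/a4g/ontology#>
PREFIX org: <http://www.w3.org/ns/org#>
PREFIX ex:  <http://example.org/>

# 0/1/2-hop exposure: Contract -> Supplier -> Parent/UBO -> SanctionsListing.
# The * property path walks zero or more ownership hops, so the supplier
# itself (0 hops) and all of its parents (1+ hops) are checked.
SELECT ?contract ?supplier ?entity ?listing
WHERE {
  ?contract a epo:Contract ;
            ex:awardedTo ?supplier .
  ?supplier org:subOrganizationOf* ?entity .
  ?listing  a ex:SanctionsListing ;
            ex:lists ?entity .
}
```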
Day 6 Outputs:
Assurance plan mapped to the National AI assurance framework; attack/defend drill (prompt injection, over‑long context, output handling).
Security checklist mapped to ACSC ISM controls and ACSC’s AI guidance; see Department of Finance and cyber.gov.au.
Day 7 Outputs:
Lite PIA (per OAIC guide/tool): data flows, risks, mitigations, APPs mapping.
Recordkeeping plan per NAA guidance (what is a record, retention, export formats); confirm PROV‑O + logs are captured as records; see OAIC and NAA.
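One way to confirm that the PROV‑O records are complete before they are captured as records, assuming an RDF‑star‑capable store (GraphDB and Stardog both support quoted-triple patterns in SPARQL):

```sparql
PREFIX prov: <http://www.w3.org/ns/prov#>

# Returns true if any annotated statement lacks a recorded source,
# i.e. the provenance record is incomplete and must be remediated.
ASK {
  << ?s ?p ?o >> ?annotation ?value .
  FILTER NOT EXISTS { << ?s ?p ?o >> prov:wasDerivedFrom ?source . }
}
```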
Day 8 Outputs:
Draft contract annex using DTA AI model clauses (tailored to this use case), with evaluation/red‑team deliverables, data ownership, transparency, and assurance language consistent with CPRs; see buyict.gov.au and ags.gov.au.
Day 9 Outputs:
Final AI transparency statement content (what to publish), entry template for your internal AI register, and runbook for ongoing monitoring.
Traceability pack: ontology slice, SHACL shapes, SPARQL queries, provenance profile, eval set.
Day 10 Outputs:
Live demo; acceptance against metrics (e.g., ≥X pp task‑success uplift vs. baseline; zero critical faithfulness fails).
Go/iterate/kill decision; backlog for productionization (data quality, coverage, performance).