terraphim_rolegraph
Knowledge graph implementation for semantic document search.
Overview
terraphim_rolegraph provides a role-specific knowledge graph that connects concepts, relationships, and documents for graph-based semantic search. Results are ranked by traversing relationships between matched concepts.
Features
- 📊 Graph-Based Search: Navigate concept relationships for smarter results
- 🔍 Multi-Pattern Matching: Fast Aho-Corasick text scanning
- 🎯 Semantic Ranking: Sum node + edge + document ranks
- 🔗 Path Connectivity: Check if matched terms connect via graph paths
- ⚡ High Performance: O(n) matching, efficient graph traversal
- 🎭 Role-Specific: Separate graphs for different user personas
Installation
[dependencies]
terraphim_rolegraph = "1.0.0"
Quick Start
Creating and Querying a Graph
use terraphim_rolegraph::RoleGraph;
use terraphim_types::{RoleName, Thesaurus, NormalizedTermValue, NormalizedTerm, Document};

#[tokio::main]
async fn main() -> Result<(), terraphim_rolegraph::Error> {
    // Create thesaurus
    let mut thesaurus = Thesaurus::new("engineering".to_string());
    thesaurus.insert(
        NormalizedTermValue::from("rust"),
        NormalizedTerm {
            id: 1,
            value: NormalizedTermValue::from("rust programming"),
            url: Some("https://rust-lang.org".to_string()),
        },
    );
    thesaurus.insert(
        NormalizedTermValue::from("async"),
        NormalizedTerm {
            id: 2,
            value: NormalizedTermValue::from("asynchronous programming"),
            url: Some("https://rust-lang.github.io/async-book/".to_string()),
        },
    );

    // Create role graph
    let mut graph = RoleGraph::new(
        RoleName::new("engineer"),
        thesaurus,
    ).await?;

    // Index documents
    let doc = Document {
        id: "rust-async-guide".to_string(),
        title: "Async Rust Programming".to_string(),
        body: "Learn rust and async programming with tokio".to_string(),
        url: "https://example.com/rust-async".to_string(),
        description: Some("Comprehensive async Rust guide".to_string()),
        summarization: None,
        stub: None,
        tags: Some(vec!["rust".to_string(), "async".to_string()]),
        rank: None,
        source_haystack: None,
    };
    let doc_id = doc.id.clone();
    graph.insert_document(&doc_id, doc);

    // Query the graph
    let results = graph.query_graph("rust async", None, Some(10))?;
    for (id, indexed_doc) in results {
        println!("Document: {} (rank: {})", id, indexed_doc.rank);
    }

    Ok(())
}
Path Connectivity Checking
use terraphim_rolegraph::RoleGraph;
use terraphim_types::{RoleName, Thesaurus};

#[tokio::main]
async fn main() -> Result<(), terraphim_rolegraph::Error> {
    let thesaurus = Thesaurus::new("engineering".to_string());
    let graph = RoleGraph::new(RoleName::new("engineer"), thesaurus).await?;

    // Check if matched terms are connected by a graph path
    let text = "rust async tokio programming";
    let connected = graph.is_all_terms_connected_by_path(text);
    if connected {
        println!("All terms are connected - they form a coherent topic!");
    } else {
        println!("Terms are disconnected - possibly unrelated concepts");
    }

    Ok(())
}
Multi-term Queries with Operators
use terraphim_rolegraph::RoleGraph;
use terraphim_types::{RoleName, Thesaurus, LogicalOperator};

#[tokio::main]
async fn main() -> Result<(), terraphim_rolegraph::Error> {
    let thesaurus = Thesaurus::new("engineering".to_string());
    let mut graph = RoleGraph::new(RoleName::new("engineer"), thesaurus).await?;

    // AND query - documents must contain ALL terms
    let results = graph.query_graph_with_operators(
        &["rust", "async", "tokio"],
        &LogicalOperator::And,
        None,
        Some(10),
    )?;
    println!("AND query: {} results", results.len());

    // OR query - documents may contain ANY term
    let results = graph.query_graph_with_operators(
        &["rust", "python", "go"],
        &LogicalOperator::Or,
        None,
        Some(10),
    )?;
    println!("OR query: {} results", results.len());

    Ok(())
}
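The AND/OR semantics described in the comments above can be pictured as set operations over the documents found for each term. The following is a minimal std-only sketch, not the crate's implementation: the `Operator` enum, the `combine` function, and the `sample_index` data are hypothetical names introduced for illustration.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for the crate's logical operator type.
#[derive(Debug, Clone, Copy)]
enum Operator {
    And,
    Or,
}

/// Combine the per-term document sets: intersection for AND, union for OR.
/// A term that is missing from the index contributes an empty set.
fn combine(
    index: &HashMap<String, BTreeSet<String>>,
    terms: &[&str],
    op: Operator,
) -> BTreeSet<String> {
    let empty = BTreeSet::new();
    let mut sets = terms.iter().map(|t| index.get(*t).unwrap_or(&empty));
    let first = match sets.next() {
        Some(set) => set.clone(),
        None => return BTreeSet::new(),
    };
    sets.fold(first, |acc, set| match op {
        Operator::And => acc.intersection(set).cloned().collect(),
        Operator::Or => acc.union(set).cloned().collect(),
    })
}

/// Made-up term -> document-ID index used only for this illustration.
fn sample_index() -> HashMap<String, BTreeSet<String>> {
    let raw = [
        ("rust", vec!["doc1", "doc2"]),
        ("async", vec!["doc1", "doc3"]),
        ("tokio", vec!["doc1"]),
    ];
    raw.into_iter()
        .map(|(term, docs)| {
            let docs = docs.into_iter().map(String::from).collect();
            (term.to_string(), docs)
        })
        .collect()
}

fn main() {
    let index = sample_index();
    let and_hits = combine(&index, &["rust", "async", "tokio"], Operator::And);
    let or_hits = combine(&index, &["rust", "python", "go"], Operator::Or);
    println!("AND query: {:?}", and_hits);
    println!("OR query: {:?}", or_hits);
}
```

In this sketch an AND query returns nothing as soon as one term is unknown, while an OR query simply ignores unknown terms. Whether the crate treats unknown terms the same way is not stated here.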
Architecture
Graph Structure
The knowledge graph uses a three-layer structure:
- Nodes (Concepts)
  - Represent terms from the thesaurus
  - Track frequency/importance (rank)
  - Connect to related concepts via edges
- Edges (Relationships)
  - Connect concepts that co-occur in documents
  - Weighted by co-occurrence strength (rank)
  - Associate documents via concept pairs
- Documents (Content)
  - Indexed by concepts they contain
  - Linked via edges between their concepts
  - Ranked by node + edge + document scores
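To make the three layers concrete, here is a small std-only sketch of one way such a structure could be laid out. It is an illustration under stated assumptions, not the crate's actual types: `SketchGraph`, `Node`, `Edge`, and `index` are hypothetical names, and the rule that consecutive matched concepts form an edge is an assumption about what "co-occur" means.

```rust
use std::collections::{HashMap, HashSet};

/// A concept from the thesaurus (hypothetical layout).
#[derive(Debug, Default)]
struct Node {
    rank: u64,                          // how often the concept was seen
    connected_with: HashSet<(u64, u64)>, // keys of edges touching this node
}

/// A relationship between two concepts (hypothetical layout).
#[derive(Debug, Default)]
struct Edge {
    rank: u64,                // co-occurrence strength
    doc_ids: HashSet<String>, // documents in which the pair co-occurs
}

#[derive(Debug, Default)]
struct SketchGraph {
    nodes: HashMap<u64, Node>,
    edges: HashMap<(u64, u64), Edge>,
}

impl SketchGraph {
    /// Index a document given the concept IDs matched in its text, in order.
    /// Assumption: each pair of consecutive matches becomes (or strengthens) an edge.
    fn index(&mut self, doc_id: &str, concept_ids: &[u64]) {
        for pair in concept_ids.windows(2) {
            let key = (pair[0], pair[1]);
            let edge = self.edges.entry(key).or_default();
            edge.rank += 1;
            edge.doc_ids.insert(doc_id.to_string());
            for id in [pair[0], pair[1]] {
                let node = self.nodes.entry(id).or_default();
                node.rank += 1;
                node.connected_with.insert(key);
            }
        }
    }
}

fn main() {
    let mut graph = SketchGraph::default();
    graph.index("doc-a", &[1, 2, 3]); // creates edges (1,2) and (2,3)
    graph.index("doc-b", &[1, 2]);    // strengthens edge (1,2)
    println!("edge (1,2): {:?}", graph.edges[&(1, 2)]);
    println!("node 2 rank: {}", graph.nodes[&2].rank);
}
```

Documents are reachable only through the edges that reference them, which matches the description of documents being "linked via edges between their concepts".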
Ranking Algorithm
Search results are ranked by summing:
total_rank = node_rank + edge_rank + document_rank
- node_rank: How important/frequent the concept is
- edge_rank: How strong the concept relationship is
- document_rank: Document-specific relevance
Higher total rank = more relevant result.
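As a worked example of the sum above, the sketch below computes `total_rank` for a few made-up hits and sorts them so the highest total comes first. The `Hit` struct, `sort_by_total_rank`, and the rank values are hypothetical and exist only to show the arithmetic.

```rust
/// One candidate result with its three rank components (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    doc_id: String,
    node_rank: u64,
    edge_rank: u64,
    document_rank: u64,
}

impl Hit {
    fn new(doc_id: &str, node_rank: u64, edge_rank: u64, document_rank: u64) -> Self {
        Hit { doc_id: doc_id.to_string(), node_rank, edge_rank, document_rank }
    }

    /// total_rank = node_rank + edge_rank + document_rank
    fn total_rank(&self) -> u64 {
        self.node_rank + self.edge_rank + self.document_rank
    }
}

/// Sort hits so that the highest total rank comes first.
fn sort_by_total_rank(mut hits: Vec<Hit>) -> Vec<Hit> {
    hits.sort_by(|a, b| b.total_rank().cmp(&a.total_rank()));
    hits
}

fn main() {
    let hits = vec![
        Hit::new("intro", 2, 1, 1),            // total 4
        Hit::new("rust-async-guide", 5, 3, 2), // total 10
        Hit::new("faq", 4, 0, 1),              // total 5
    ];
    for hit in sort_by_total_rank(hits) {
        println!("{}: {}", hit.doc_id, hit.total_rank());
    }
}
```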
Performance Characteristics
- Text Matching: O(n) in the length of the input text, using Aho-Corasick multi-pattern matching
- Graph Query: O(k × e × d) where:
  - k = number of matched terms
  - e = average edges per node
  - d = average documents per edge
- Memory: ~100 bytes/node + ~200 bytes/edge
- Connectivity Check: DFS with backtracking (exponential worst case, fast for k≤8)
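The connectivity check can be sketched as a backtracking depth-first search. The version below is an illustration only and makes several assumptions that the text above does not confirm: the graph is treated as undirected, "connected by a path" is read as "one simple path visits every matched node", and zero or one matched node counts as connected. The names `build_adjacency`, `dfs`, and `all_connected_by_path` are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

type Adjacency = HashMap<u64, Vec<u64>>;

/// Build an undirected adjacency list from (node, node) edges.
fn build_adjacency(edges: &[(u64, u64)]) -> Adjacency {
    let mut adj = Adjacency::new();
    for &(a, b) in edges {
        adj.entry(a).or_default().push(b);
        adj.entry(b).or_default().push(a);
    }
    adj
}

/// Extend the current simple path from `current`; succeed once no targets remain.
fn dfs(
    adj: &Adjacency,
    current: u64,
    remaining: &mut HashSet<u64>,
    visited: &mut HashSet<u64>,
) -> bool {
    if remaining.is_empty() {
        return true;
    }
    let neighbours = match adj.get(&current) {
        Some(list) => list,
        None => return false,
    };
    for &next in neighbours {
        if !visited.insert(next) {
            continue; // already on the current path
        }
        let was_target = remaining.remove(&next);
        if dfs(adj, next, remaining, visited) {
            return true;
        }
        // Backtrack: undo the step before trying the next neighbour.
        if was_target {
            remaining.insert(next);
        }
        visited.remove(&next);
    }
    false
}

/// True if some simple path visits every target node (assumed semantics).
fn all_connected_by_path(adj: &Adjacency, targets: &[u64]) -> bool {
    let unique: HashSet<u64> = targets.iter().copied().collect();
    if unique.len() <= 1 {
        return true; // assumption: nothing to connect
    }
    unique.iter().any(|&start| {
        let mut remaining = unique.clone();
        remaining.remove(&start);
        let mut visited = HashSet::from([start]);
        dfs(adj, start, &mut remaining, &mut visited)
    })
}

fn main() {
    let chain = build_adjacency(&[(1, 2), (2, 3)]);
    println!("chain 1..3: {}", all_connected_by_path(&chain, &[1, 3]));
    let star = build_adjacency(&[(0, 1), (0, 2), (0, 3)]);
    println!("star leaves: {}", all_connected_by_path(&star, &[1, 2, 3]));
}
```

Because every start node and every neighbour ordering may be explored, the search is exponential in the worst case, which is why it is only practical for a small number of matched terms. Under the single-path reading, the three leaves of a star are reported as not connected even though the graph itself is connected.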
API Overview
Core Methods
- RoleGraph::new() - Create graph from thesaurus
- insert_document() - Index a document
- query_graph() - Simple text query
- query_graph_with_operators() - Multi-term query with AND/OR
- is_all_terms_connected_by_path() - Check path connectivity
- find_matching_node_ids() - Get matched concept IDs
Graph Inspection
- get_graph_stats() - Statistics (node/edge/document counts)
- get_node_count() / get_edge_count() / get_document_count()
- is_graph_populated() - Check if graph has content
- validate_documents() - Find orphaned documents
- find_document_ids_for_term() - Reverse lookup
Async Support
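RoleGraphSync wraps the graph in a tokio::sync::Mutex for thread-safe async operations:
use terraphim_rolegraph::RoleGraphSync;
// Fragment: runs inside an async fn that returns a Result; `graph` is an existing RoleGraph.
let sync_graph = RoleGraphSync::new(graph);
// Hold the lock only for the duration of the query.
let locked = sync_graph.lock().await;
let results = locked.query_graph("search term", None, Some(10))?;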
Utility Functions
Text Processing
- split_paragraphs() - Split text into paragraphs
Node ID Pairing
- magic_pair(x, y) - Create unique edge ID from two node IDs
- magic_unpair(z) - Extract node IDs from edge ID
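A reversible pairing of two node IDs into one edge ID can be built from a pairing function. The sketch below uses Szudzik's "elegant pairing" as an assumed scheme. The formula is not confirmed by the text above, so treat these bodies as an illustration of the idea, not as the crate's `magic_pair` and `magic_unpair`. The `isqrt` helper is hypothetical.

```rust
/// Integer square root for small inputs (IDs well below 2^32 are assumed).
fn isqrt(z: u64) -> u64 {
    let mut q = (z as f64).sqrt() as u64;
    // Correct any floating-point rounding error.
    while q * q > z {
        q -= 1;
    }
    while (q + 1) * (q + 1) <= z {
        q += 1;
    }
    q
}

/// Map an ordered pair (x, y) to a single unique number (assumed formula).
fn magic_pair(x: u64, y: u64) -> u64 {
    if x >= y {
        x * x + x + y
    } else {
        y * y + x
    }
}

/// Recover the ordered pair (x, y) from a value produced by `magic_pair`.
fn magic_unpair(z: u64) -> (u64, u64) {
    let q = isqrt(z);
    let l = z - q * q;
    if l < q {
        (l, q)
    } else {
        (q, l - q)
    }
}

fn main() {
    let edge_id = magic_pair(3, 5);
    println!("magic_pair(3, 5) = {}", edge_id); // 28
    println!("magic_unpair({}) = {:?}", edge_id, magic_unpair(edge_id)); // (3, 5)
    println!("magic_pair(5, 3) = {}", magic_pair(5, 3)); // 33
}
```

With this scheme the pair is ordered: `magic_pair(3, 5)` and `magic_pair(5, 3)` produce different IDs, so an edge from 3 to 5 is distinct from an edge from 5 to 3.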
Examples
See the examples/ directory for:
- Building graphs from markdown files
- Multi-role graph management
- Custom ranking strategies
- Path analysis and connectivity
Minimum Supported Rust Version (MSRV)
This crate requires Rust 1.70 or later.
License
Licensed under Apache-2.0. See LICENSE for details.
Related Crates
- terraphim_types: Core type definitions
- terraphim_automata: Text matching and autocomplete
- terraphim_service: Main service layer with search
Support
- Discord: https://discord.gg/VPJXB6BGuY
- Discourse: https://terraphim.discourse.group
- Issues: https://github.com/terraphim/terraphim-ai/issues