6 releases (3 breaking)
| Version | Released |
|---|---|
| 0.4.1 | Mar 3, 2026 |
| 0.4.0 | Mar 3, 2026 |
| 0.3.1 | Jan 23, 2026 |
| 0.2.0 | Jan 18, 2026 |
| 0.1.0 | Jan 18, 2026 |
flowscope-core
Core SQL lineage analysis engine for FlowScope.
Overview
flowscope-core is a Rust library that performs static analysis on SQL queries to extract table- and column-level lineage information. It serves as the foundation for the FlowScope ecosystem, powering the WebAssembly bindings and JavaScript packages.
Features
- Multi-Dialect Parsing: Built on sqlparser-rs, supporting PostgreSQL, Snowflake, BigQuery, DuckDB, Redshift, MySQL, SQLite, Databricks, ClickHouse, and Generic ANSI SQL.
- Deep Lineage Extraction:
  - Table-level dependencies (SELECT, INSERT, UPDATE, MERGE, COPY, UNLOAD, etc.)
  - Column-level data flow (including transformations)
  - Cross-statement lineage tracking (CREATE TABLE AS, INSERT INTO ... SELECT)
- dbt/Jinja Templating: Preprocess SQL with Jinja or dbt-style templates before analysis, with built-in stubs for ref(), source(), config(), var(), and is_incremental().
- Complex SQL Support: Handles CTEs (Common Table Expressions), Subqueries, Joins, Unions, Window Functions, and lateral column aliases.
- Schema Awareness: Utilize provided schema metadata to validate column references and resolve wildcards (SELECT *).
- Type Inference: Infer expression types with dialect-aware type compatibility checking.
- SQL Linting: 72 lint rules across 9 families (AL, AM, CP, CV, JJ, LT, RF, ST, TQ) with AST-driven semantic checks and token-aware formatting checks. Rules include autofix metadata with safe/unsafe classification.
- Diagnostics: Returns structured issues (errors, warnings) with source spans for precise highlighting.
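To make the Schema Awareness bullet concrete, here is a minimal, standalone sketch of how wildcard expansion and column validation against schema metadata can work in principle. It does not use flowscope-core's own types: the `Schema` alias and the functions `expand_wildcard`, `column_exists`, and `example_schema` are hypothetical names introduced only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for schema metadata: table name -> ordered column names.
/// This is NOT the flowscope-core schema type.
type Schema = HashMap<String, Vec<String>>;

/// Expand `table.*` into qualified column names, preserving column order.
/// Returns `None` when the table is not present in the schema.
fn expand_wildcard(schema: &Schema, table: &str) -> Option<Vec<String>> {
    schema
        .get(table)
        .map(|cols| cols.iter().map(|c| format!("{}.{}", table, c)).collect())
}

/// Check whether `table.column` refers to a column known to the schema.
fn column_exists(schema: &Schema, table: &str, column: &str) -> bool {
    schema
        .get(table)
        .map_or(false, |cols| cols.iter().any(|c| c == column))
}

/// Build a tiny example schema with a single `users` table.
fn example_schema() -> Schema {
    let mut schema = Schema::new();
    schema.insert(
        "users".to_string(),
        vec!["id".to_string(), "name".to_string()],
    );
    schema
}
```

With this schema, `expand_wildcard(&example_schema(), "users")` yields the two qualified columns of `users`, while an unknown table yields `None`, which an analyzer could surface as a diagnostic instead of guessing.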
Structure
src/
├── analyzer.rs # Main analysis orchestration
├── analyzer/
│ ├── context.rs # Per-statement state and scope management
│ ├── schema_registry.rs # Schema metadata and name resolution
│ ├── visitor.rs # AST visitor for lineage extraction
│ ├── query.rs # Query analysis (SELECT, subqueries)
│ ├── expression.rs # Expression and column lineage
│ ├── select_analyzer.rs # SELECT clause analysis
│ ├── statements.rs # Statement-level analysis
│ ├── ddl.rs # DDL statement handling (CREATE, ALTER)
│ ├── cross_statement.rs # Cross-statement lineage tracking
│ ├── diagnostics.rs # Issue reporting
│ ├── input.rs # Input merging and deduplication
│ └── helpers/ # Utility functions
├── linter/ # SQL lint engine
│ ├── mod.rs # Linter orchestration
│ ├── config.rs # Rule configuration
│ ├── document.rs # Document model (shared tokens)
│ ├── rule.rs # Rule trait and context
│ ├── visit.rs # AST visitor for rules
│ └── rules/ # 72 rule implementations
├── parser/ # SQL dialect handling
├── types/ # Request/response types
└── lineage/ # Lineage graph construction
Usage
use flowscope_core::{analyze, AnalyzeRequest, Dialect};

fn main() {
    let request = AnalyzeRequest {
        sql: "SELECT u.name, o.id FROM users u JOIN orders o ON u.id = o.user_id".to_string(),
        dialect: Dialect::Postgres,
        schema: None, // Optional schema metadata
        file_path: None,
    };

    let result = analyze(&request);

    // Access table lineage
    for statement in result.statements {
        println!("Tables: {:?}", statement.nodes);
        println!("Edges: {:?}", statement.edges);
    }
}
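Once edges have been collected across statements, a common follow-up is to ask which tables a given table ultimately depends on. The sketch below shows one way to do that with a simple graph walk. It assumes edges have already been reduced to plain `(source, target)` name pairs; that representation and the function name `upstream_tables` are illustrative assumptions, not the shape of the types returned by `analyze`.

```rust
use std::collections::{BTreeSet, HashMap};

/// Collect every table that `target` depends on, directly or transitively.
/// Edges are hypothetical (source, target) name pairs, not flowscope-core types.
fn upstream_tables(edges: &[(&str, &str)], target: &str) -> BTreeSet<String> {
    // Index edges by target so each lookup is a single map access.
    let mut sources_of: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(source, tgt) in edges {
        sources_of.entry(tgt).or_default().push(source);
    }

    let mut seen = BTreeSet::new();
    let mut stack = vec![target];
    while let Some(current) = stack.pop() {
        if let Some(sources) = sources_of.get(current) {
            for &source in sources {
                // `insert` returns false for tables already visited,
                // which also keeps the walk from looping on cycles.
                if seen.insert(source.to_string()) {
                    stack.push(source);
                }
            }
        }
    }
    seen
}
```

For example, if one statement creates `active_users` from `users` and a later one inserts into `user_orders` from `orders` and `active_users`, the walk from `user_orders` reaches `users` through `active_users` even though no single statement mentions both.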
Linting
use flowscope_core::linter::{Linter, LintConfig, LintDocument};

// `sql` (the SQL text) and `dialect` (the dialect to lint against)
// are assumed to be defined by the caller.
let config = LintConfig::default();
let linter = Linter::new(config);
let document = LintDocument::new(sql, dialect);
let issues = linter.check_document(&document);
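Issues carry source spans, and editors usually need line and column positions to highlight them. The helper below is a small standalone sketch of that conversion. It assumes a span boundary is a byte offset into the original SQL string, which this README does not state; `line_col` is a hypothetical helper, not part of flowscope-core.

```rust
/// Convert a byte offset into a 1-based (line, column) pair.
/// Assumption: `offset` is a byte index into `sql`. Columns count characters, not bytes.
/// Returns `None` if the offset is past the end or not on a character boundary.
fn line_col(sql: &str, offset: usize) -> Option<(usize, usize)> {
    // `is_char_boundary` is false for offsets beyond the end of the string.
    if !sql.is_char_boundary(offset) {
        return None;
    }
    let before = &sql[..offset];
    // Each newline before the offset starts a new line.
    let line = before.matches('\n').count() + 1;
    // The current line starts just after the last newline, or at 0 if there is none.
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let column = before[line_start..].chars().count() + 1;
    Some((line, column))
}
```

Returning `None` for out-of-range offsets keeps a stale or mismatched span from causing a panic when slicing the source text.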
Testing
cargo test
License
Apache 2.0