Version 0.23.0 (Apr 13, 2025) · 1 unstable release · Rust 2024 edition · #4 in #coding-style · 180KB · 4K SLoC
Mago Type Syntax
A fast, memory-efficient Rust crate for parsing PHP docblock type strings (e.g., from `@var`, `@param`, `@return` tags) into a structured Abstract Syntax Tree (AST).
Originally developed as part of the Mago static analysis toolset, this crate provides the specialized lexer, parser, and AST definitions needed to work with PHP's docblock type syntax, including many Psalm and PHPStan extensions.
Features
- Dedicated Lexer & Parser: Includes a performant lexer (`lexer::TypeLexer`) and recursive descent parser (`parser::construct` internally, exposed via `parse_str`) specifically designed for type strings.
- Structured AST: Produces a detailed Abstract Syntax Tree (`ast::Type`) representing the type's structure, moving beyond simple string manipulation.
- Accurate Spans: Preserves accurate source location (`mago_span::Span`) information for all AST nodes, relative to the original source file (requires providing the correct initial `Span` when parsing).
- Performance: Designed with performance and memory efficiency in mind.
- Error Reporting: Provides structured error types (`error::ParseError`) with span information on failure.
- Core Utilities: Relies on `mago_syntax_core` for shared low-level lexing infrastructure like the `Input` buffer and utility functions/macros.
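To illustrate the lexer-plus-recursive-descent approach named in the feature list, here is a minimal standalone sketch. It is not the crate's implementation: the `Ty` enum, the `Parser` struct, and the `parse` function are hypothetical names, and the grammar covers only identifiers, `?` nullables, `|` unions, and `<...>` generics, with no span tracking.

```rust
/// Hypothetical, heavily simplified AST; the crate's real `ast::Type` is much richer.
#[derive(Debug, PartialEq)]
enum Ty {
    Named(String, Vec<Ty>), // identifier with optional generic arguments
    Nullable(Box<Ty>),      // `?T`
    Union(Vec<Ty>),         // `A|B|C`
}

struct Parser<'a> {
    src: &'a [u8],
    pos: usize,
}

impl<'a> Parser<'a> {
    fn skip_ws(&mut self) {
        while self.pos < self.src.len() && self.src[self.pos].is_ascii_whitespace() {
            self.pos += 1;
        }
    }

    /// Consume `byte` if it is the next non-whitespace byte.
    fn eat(&mut self, byte: u8) -> bool {
        self.skip_ws();
        if self.src.get(self.pos) == Some(&byte) {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    /// union := primary ('|' primary)*
    fn union(&mut self) -> Result<Ty, String> {
        let mut members = vec![self.primary()?];
        while self.eat(b'|') {
            members.push(self.primary()?);
        }
        Ok(if members.len() == 1 { members.remove(0) } else { Ty::Union(members) })
    }

    /// primary := '?' primary | identifier ('<' union (',' union)* '>')?
    fn primary(&mut self) -> Result<Ty, String> {
        if self.eat(b'?') {
            return Ok(Ty::Nullable(Box::new(self.primary()?)));
        }
        self.skip_ws();
        let start = self.pos;
        while self.pos < self.src.len()
            && (self.src[self.pos].is_ascii_alphanumeric() || b"_-\\".contains(&self.src[self.pos]))
        {
            self.pos += 1;
        }
        if start == self.pos {
            return Err(format!("expected a type at byte {}", start));
        }
        let name = String::from_utf8_lossy(&self.src[start..self.pos]).into_owned();
        let mut args = Vec::new();
        if self.eat(b'<') {
            loop {
                args.push(self.union()?);
                if !self.eat(b',') {
                    break;
                }
            }
            if !self.eat(b'>') {
                return Err(format!("expected '>' at byte {}", self.pos));
            }
        }
        Ok(Ty::Named(name, args))
    }
}

/// Parse a whole type string, rejecting trailing input.
fn parse(input: &str) -> Result<Ty, String> {
    let mut parser = Parser { src: input.as_bytes(), pos: 0 };
    let ty = parser.union()?;
    parser.skip_ws();
    if parser.pos != input.len() {
        return Err(format!("unexpected input at byte {}", parser.pos));
    }
    Ok(ty)
}
```

Each grammar rule maps to one method, which is what makes recursive descent easy to extend: supporting a new construct usually means adding a branch to `primary` or a new rule above `union`.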
Supported Syntax (Examples)
This parser covers a wide range of standard PHPDoc, PHPStan, and Psalm type syntaxes:
- Keywords: `int`, `string`, `bool`, `float`, `mixed`, `null`, `void`, `never`, `object`, `resource`, `true`, `false`, `scalar`, `numeric`, `array-key`, `list`, `non-empty-list`, `non-empty-string`, `class-string`, `iterable`, `callable`, `pure-callable`, `pure-closure`, `stringable-object`, `lowercase-string`, `positive-int`, `negative-int`, `closed-resource`, `open-resource`, `numeric-string`, `truthy-string`, etc.
- Literals:
  - Strings: `'string-literal'`, `"another one"`
  - Integers: `123`, `-45`, `0x1A`, `0o77`, `0b10`, `123_456`
  - Floats: `1.23`, `-0.5`, `.5`, `1.2e3`, `7E-10`
- Unspecified Literals: `literal-int`, `literal-string`, `non-empty-literal-string`
- Operators: `|` (Union), `&` (Intersection), `?` (Nullable)
- Structure:
  - Parentheses: `(int|string)`
  - Nullables: `?int`, `?array<string>`
  - Unions: `int|string|null`
  - Intersections: `Countable&Traversable`
  - Member References: `MyClass::CONST`, `MyClass::class`
- Generics:
  - `array<KeyType, ValueType>`, `array<ValueType>`
  - `list<ValueType>`, `non-empty-list<ValueType>`
  - `iterable<KeyType, ValueType>`, `iterable<ValueType>`
  - `class-string<ClassName>`, `interface-string<InterfaceName>`, etc.
  - User types: `My\Collection<ItemType>`
  - `self`, `static`, `parent` (parsed as `Type::Reference`, which can have generics)
- Array Shapes:
  - `array{key: Type, 'other-key': Type}`
  - `list{Type, Type}`
  - Optional keys: `array{name: string, age?: int}`
  - Unsealed shapes: `array{name: string, ...}`, `list{int, ...<int|string>}`
  - (Note: Supports any parsed `Type` as a key, per design choice)
- Callables:
  - `callable`, `Closure`, `pure-callable`, `pure-Closure`
  - `callable(ParamType1, ParamType2): ReturnType`
  - `Closure(): void`
  - Optional params: `callable(int=)`
  - Variadic params: `callable(string...)`
- Variables: `$var`
- Conditionals:
  - `$var is string ? int : bool`
  - `T is not null ? T : mixed`
- KeyOf / ValueOf: `key-of<T>`, `value-of<T>`
- Indexed Access: `T[K]`
- Int Ranges: `int<0, 100>`, `int<min, 0>`, `int<1, max>`
- Properties Of: `properties-of<T>`, `public-properties-of<T>`, `protected-properties-of<T>`, `private-properties-of<T>`
- Unary `+`/`-` Types: `+1`, `-2.0` (parsed as `Type::Posited`, `Type::Negated`)
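The integer literal forms listed under Literals (decimal, `0x`, `0o`, `0b`, `_` separators, and an optional leading minus) can be decoded with a few lines of standard-library code. The sketch below is a hypothetical helper, not part of the crate's API, and it says nothing about how the crate itself stores literal values.

```rust
/// Hypothetical helper: decode an integer literal as written in a docblock type.
/// Returns `None` for malformed input or magnitudes that do not fit in an `i64`.
fn parse_int_literal(text: &str) -> Option<i64> {
    // Split off an optional leading minus sign.
    let (negative, digits) = match text.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, text),
    };

    // Pick the radix from the prefix; no prefix means decimal.
    let (radix, body) = if let Some(rest) = digits.strip_prefix("0x") {
        (16, rest)
    } else if let Some(rest) = digits.strip_prefix("0o") {
        (8, rest)
    } else if let Some(rest) = digits.strip_prefix("0b") {
        (2, rest)
    } else {
        (10, digits)
    };

    // Drop `_` separators, then reject anything that is not a digit of the radix
    // (this also rejects the extra sign that `from_str_radix` would accept).
    let cleaned: String = body.chars().filter(|c| *c != '_').collect();
    if cleaned.is_empty() || !cleaned.chars().all(|c| c.is_digit(radix)) {
        return None;
    }

    let value = i64::from_str_radix(&cleaned, radix).ok()?;
    Some(if negative { -value } else { value })
}
```

With this helper, `parse_int_literal("0x1A")` yields `Some(26)` and `parse_int_literal("123_456")` yields `Some(123456)`.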
Unsupported Syntax (Currently)
This crate does not yet support parsing the following syntax:
- `int-mask<T>`, `int-mask-of<T>`
Usage
- Add Dependencies: Add `mago_type_syntax` to your `Cargo.toml`. You will also likely need `mago_span` and `mago_source` to create the necessary inputs.

  ```toml
  [dependencies]
  mago_type_syntax = "..."
  mago_span = "..."
  mago_source = "..."
  ```

- Parse a Type String: Use the main entry point `mago_type_syntax::parse_str`. You need the type string itself and the `Span` indicating its position within the original source file.

  ```rust
  use mago_type_syntax::{parse_str, ast::Type};
  use mago_span::{Position, Span};
  use mago_span::HasSpan;
  use mago_source::SourceIdentifier;

  fn main() {
      let type_string = "array<int, string>|null";
      let source_id = SourceIdentifier::dummy(); // Use your actual source identifier

      // Calculate the span of the type string within its original file.
      // Example: the 23-byte string starts at byte 100, so it ends at byte 123.
      let start_pos = Position::new(source_id, 100);
      let end_pos = Position::new(source_id, 100 + type_string.len());
      let type_span = Span::new(start_pos, end_pos);

      // Parse the string
      match parse_str(type_span, type_string) {
          Ok(parsed_ast) => {
              println!("Successfully parsed AST: {:#?}", parsed_ast);

              // You can now traverse or analyze the parsed_ast (Type enum)
              match parsed_ast {
                  Type::Union(union_type) => {
                      // ... process union ...
                      println!("Parsed a union type!");
                  }
                  Type::Array(array_type) => {
                      // This won't be hit for the example above
                      println!("Parsed an array type!");
                  }
                  // ... handle other Type variants ...
                  _ => {
                      println!("Parsed other type variant");
                  }
              }
          }
          Err(parse_error) => {
              eprintln!("Failed to parse type string: {:?}", parse_error);
              // Access span via parse_error.span() if needed from HasSpan trait
              eprintln!("Error occurred at span: {:?}", parse_error.span());
          }
      }
  }
  ```
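The span passed to `parse_str` has to describe where the type string sits in the original file, so callers usually need to compute byte offsets first. A minimal sketch of that calculation, using only the standard library, follows. The function name `find_tag_type` is hypothetical, and it assumes the type contains no whitespace; it returns plain byte offsets that you would then wrap in `Position` and `Span` values as in the usage example.

```rust
/// Hypothetical helper: locate the type that follows `tag` (e.g. "@var") inside a
/// docblock. `docblock_offset` is the byte offset of the docblock within its file.
/// Returns the type text plus its start and end byte offsets in the file.
/// Simplification: the type is assumed to end at the first whitespace character,
/// so a type such as `array<int, string>` would need a smarter scanner.
fn find_tag_type<'a>(
    docblock: &'a str,
    docblock_offset: usize,
    tag: &str,
) -> Option<(&'a str, usize, usize)> {
    let tag_start = docblock.find(tag)?;
    let after_tag = &docblock[tag_start + tag.len()..];

    // Skip the whitespace between the tag and the type. `trimmed` is a suffix of
    // `docblock`, so its start index is the difference of the two lengths.
    let trimmed = after_tag.trim_start();
    let start = docblock.len() - trimmed.len();

    // The type runs until the next whitespace character or the end of the docblock.
    let len = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
    if len == 0 {
        return None;
    }

    let text = &docblock[start..start + len];
    Some((text, docblock_offset + start, docblock_offset + start + len))
}
```

For a docblock `/** @var int|string $x */` located at byte 100 of its file, this returns the text `int|string` with offsets 109 and 119, which are the values to feed into the start and end positions.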
Dependencies: ~4–9.5MB, ~87K SLoC