location_info

Library for keeping track of the original source location of things

2 releases

0.1.0-alpha-1 May 25, 2023

Apache-2.0 OR MIT

14KB
176 lines

A library to keep track of where in the input source code things come from.

See https://docs.rs/location_info for documentation


lib.rs:

A library defining a type that wraps values together with their corresponding source code location, along with helpers for conveniently creating and transforming these values without losing track of the location.

Setup

The crate is generic over your compiler's source code location representation, so it is usually convenient to create a type alias for your particular location type. For example, if you represent locations as a file ID along with a byte range, you would write

use location_info::Location;

// Define the location type
#[derive(Clone)]
pub struct Span(usize, std::ops::Range<u64>);

impl Location for Span {
    fn nowhere() -> Self {
        // Here users of our compiler should never encounter things without valid
        // source mapping, so we'll use a dummy value.
        Span(0, 0..0)
    }

    fn between(Self(start_file, start_range): &Self, Self(end_file, end_range): &Self) -> Self {
        assert!(start_file == end_file); // A location spanning multiple files is weird
        Span(*start_file, start_range.start.min(end_range.start) .. start_range.end.max(end_range.end))
    }
}

// Avoid having to specify the other generic parameter in your compiler
type Loc<T> = location_info::Loc<T, Span>;
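
To make the behaviour of between concrete, here is a minimal sketch that runs without the crate. The Location trait below is a local stand-in containing only the two methods implemented above; the real trait in location_info may differ, so treat this as an illustration rather than the crate's definition.

```rust
use std::ops::Range;

// Local stand-in for the crate's `Location` trait (assumption: reduced to
// the two methods shown in the Setup example).
trait Location {
    fn nowhere() -> Self;
    fn between(start: &Self, end: &Self) -> Self;
}

#[derive(Clone, Debug, PartialEq)]
struct Span(usize, Range<u64>);

impl Location for Span {
    fn nowhere() -> Self {
        // Dummy value for things without a valid source mapping
        Span(0, 0..0)
    }

    fn between(Self(start_file, start_range): &Self, Self(end_file, end_range): &Self) -> Self {
        // Both ends must live in the same file
        assert!(start_file == end_file);
        // Cover everything from the earliest start to the latest end
        Span(
            *start_file,
            start_range.start.min(end_range.start)..start_range.end.max(end_range.end),
        )
    }
}

fn main() {
    // Hypothetical positions of "(" and ")" in file 1
    let open_paren = Span(1, 10..11);
    let close_paren = Span(1, 24..25);
    let args = Span::between(&open_paren, &close_paren);
    println!("{:?}", args);
}
```

Taking the minimum start and maximum end means the result does not depend on the order in which the two spans are passed.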

It can also be helpful to implement From<T> for your Location type for any type T that you will often convert into locations. For example, tokens in your parser:


enum TokenKind {
    // ...
}
struct Token {
    kind: TokenKind,
    file: usize,
    offset: std::ops::Range<u64>
}

impl From<&Token> for Span {
    fn from(tok: &Token) -> Span {
        Span(tok.file, tok.offset.clone())
    }
}

Usage

To use the crate, wrap each value whose source location you want to track in Loc<T>. For example, an AST node for function heads might look like

struct Identifier(String);
struct FnHead {
    fn_keyword: Loc<()>,
    name: Loc<Identifier>,
    args: Loc<Vec<Loc<Identifier>>>
}

Attaching location info is done using the methods in [WithLocation].

use location_info::WithLocation;
enum TokenKind {
    Fn,
    Identifier(String),
    OpenParen,
    CloseParen,
}

fn parse_fn_head(tokens: &mut impl Iterator<Item=Token>) -> Result<Loc<FnHead>> {
    let fn_keyword = parse_fn_keyword(tokens)?;
    let name = parse_ident(tokens)?;

    let open_paren = parse_open_paren(tokens)?;
    let mut args_raw = vec![];
    while let ident @ Loc{inner: Identifier(_), ..} = parse_ident(tokens)? {
        args_raw.push(ident);
    }
    let close_paren = parse_close_paren(tokens)?;
    // The span of the argument list is the span between open_paren and close_paren
    let args = args_raw.between(&open_paren, &close_paren);

    Ok(
        FnHead {
            // We don't want the token, just a Loc<()>, those can be created
            // with ().at(...)
            fn_keyword: ().at(&fn_keyword),
            name,
            args
        }.between(&fn_keyword, &close_paren)
    )
}
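
The .at(...) and .between(...) calls above are available on arbitrary values, which suggests an extension trait with a blanket implementation. The following is a rough, self-contained sketch of how such a trait could be built. Everything in it (HasSpan, Span::merge, the span field, and this local WithLocation) is a hypothetical stand-in written for illustration and is not taken from the crate's source; only the inner field name comes from the example above.

```rust
use std::ops::Range;

// Hypothetical location type: a file ID plus a byte range.
#[derive(Clone, Debug, PartialEq)]
struct Span(usize, Range<u64>);

impl Span {
    // Smallest span covering both inputs (assumes they share a file).
    fn merge(a: &Span, b: &Span) -> Span {
        assert_eq!(a.0, b.0);
        Span(a.0, a.1.start.min(b.1.start)..a.1.end.max(b.1.end))
    }
}

// Stand-in for the crate's Loc type: a value plus where it came from.
#[derive(Clone, Debug, PartialEq)]
struct Loc<T> {
    inner: T,
    span: Span,
}

// Hypothetical helper: anything we can read a span from.
trait HasSpan {
    fn span(&self) -> Span;
}
impl HasSpan for Span {
    fn span(&self) -> Span {
        self.clone()
    }
}
impl<T> HasSpan for Loc<T> {
    fn span(&self) -> Span {
        self.span.clone()
    }
}

// Extension trait: the blanket impl below gives every sized type
// the `at` and `between` methods.
trait WithLocation: Sized {
    // Wrap `self` at the location of `place`.
    fn at(self, place: &impl HasSpan) -> Loc<Self> {
        Loc { inner: self, span: place.span() }
    }
    // Wrap `self` in the span reaching from `start` to `end`.
    fn between(self, start: &impl HasSpan, end: &impl HasSpan) -> Loc<Self> {
        Loc { inner: self, span: Span::merge(&start.span(), &end.span()) }
    }
}
impl<T> WithLocation for T {}

fn main() {
    let fn_keyword = ().at(&Span(0, 0..2));
    let close_paren = ().at(&Span(0, 12..13));
    let head = "head".between(&fn_keyword, &close_paren);
    println!("{:?} at {:?}", head.inner, head.span);
}
```

The blanket implementation is what lets ().at(...) and a struct literal followed by .between(...) share the same two methods without any per-type boilerplate.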

Mapping

After location info has been created, it is often useful to transform the wrapped value while keeping its location, for example when lowering from one IR to another. The [Loc] struct provides several methods for mapping over the contained value.
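
As an illustration of what mapping over the contained value means, here is a minimal standalone sketch. The Loc type, its span field, and the map and map_ref methods below are assumptions written for this example; consult the crate documentation for the names and signatures that [Loc] actually provides.

```rust
use std::ops::Range;

// Stand-in wrapper: a value plus the byte range it came from.
#[derive(Clone, Debug, PartialEq)]
struct Loc<T> {
    inner: T,
    span: Range<u64>,
}

impl<T> Loc<T> {
    // Transform the wrapped value by consuming it; the span is carried over.
    fn map<U>(self, f: impl FnOnce(T) -> U) -> Loc<U> {
        Loc { inner: f(self.inner), span: self.span }
    }

    // Transform through a reference, leaving the original wrapper usable.
    fn map_ref<U>(&self, f: impl FnOnce(&T) -> U) -> Loc<U> {
        Loc { inner: f(&self.inner), span: self.span.clone() }
    }
}

// AST-level node
struct Identifier(String);

// Hypothetical lower-level IR node
#[derive(Debug, PartialEq)]
struct Symbol {
    name: String,
}

fn main() {
    let ident = Loc { inner: Identifier("main".to_string()), span: 3..7 };
    // Lower the AST node to the IR node without touching the location
    let symbol = ident.map(|Identifier(name)| Symbol { name });
    // Derive extra information while keeping `symbol` intact
    let name_len = symbol.map_ref(|s| s.name.len());
    println!("{:?} / {:?}", symbol, name_len);
}
```

Because the closure only ever sees the inner value, a lowering pass written this way cannot accidentally drop or alter the location.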

Dependencies

~0.4–1MB
~23K SLoC