#serialization #edn #data #deserialize

rsedn

A Rust library for reading and writing EDN (Extensible Data Notation) data

2 unstable releases

0.2.0 Aug 12, 2024
0.1.0 Aug 10, 2024

#904 in Parser implementations

86 downloads per month

MIT license

44KB
1K SLoC

rsedn

rsedn is a crate that currently implements a subset of Extensible Data Notation (EDN).

Supported Syntax

  • ( ) lists
  • [ ] vectors
  • { } maps
  • #{ } sets
  • symbols (the full edn symbol specification)
  • :keywords
  • #user/tags
  • #_discard (parsed, but the following form is not yet discarded)
  • Integers (arbitrary-precision integers are not supported)
  • Floats
  • Boolean
  • nil
  • Strings (\uNNNN unicode escape sequences are not supported)
  • Characters
  • Built-in tagged elements (#inst and #uuid)
  • Comments

Usage

Using rsedn is split into 4 steps:

  1. Build a Source from a &str (use rsedn::source_from_str)
  2. Lex Source to produce a Vec<Lexeme> (use rsedn::lex_source)
  3. Parse each Lexeme to produce a Token (use rsedn::parse_lexeme)
  4. Create a TokenStream (use LinkedList::iter) and consume it to produce a Form (use rsedn::consume_token_stream)

Concepts

Source

A wrapper around the source code, which is always referred to as &'source str. It can later be used to get the span (the actual text) of a Lexeme.

Lexeme

Stores the coordinates of a meaningful piece of source code (just the coordinates, not the text). It doesn't classify the text; it only records that the given piece of text may have a meaning.

For instance, (println) has 3 lexemes: (, println and ). Likewise, (def var 5) has 5 lexemes: (, def, var, 5 and ).
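The lexeme counts above can be reproduced with a small standalone sketch. This is not rsedn's lexer: `Span`, `lex`, `text` and `lexeme_texts` are hypothetical names invented for illustration, and the rules are deliberately simplified (only brackets, whitespace and commas split the input). It only shows the idea of storing coordinates and recovering the text later from the source.

```rust
/// Hypothetical stand-in for a lexeme: byte offsets only, no text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize, // exclusive
}

/// Split `source` into spans. Each bracket is its own lexeme, and every run
/// of other non-whitespace characters forms one lexeme. Nothing is classified.
fn lex(source: &str) -> Vec<Span> {
    let mut spans = Vec::new();
    let mut run_start: Option<usize> = None;
    for (i, c) in source.char_indices() {
        let is_delim = matches!(c, '(' | ')' | '[' | ']' | '{' | '}');
        // EDN treats commas as whitespace.
        if is_delim || c.is_whitespace() || c == ',' {
            // Close the run that was in progress, if any.
            if let Some(start) = run_start.take() {
                spans.push(Span { start, end: i });
            }
            if is_delim {
                spans.push(Span { start: i, end: i + c.len_utf8() });
            }
        } else if run_start.is_none() {
            run_start = Some(i);
        }
    }
    // A run may extend to the end of the input.
    if let Some(start) = run_start {
        spans.push(Span { start, end: source.len() });
    }
    spans
}

/// Recover the text of a span from the original source.
fn text<'source>(source: &'source str, span: Span) -> &'source str {
    &source[span.start..span.end]
}

/// Convenience helper: the text of every lexeme, in order.
fn lexeme_texts(source: &str) -> Vec<&str> {
    lex(source).into_iter().map(|s| text(source, s)).collect()
}

fn main() {
    println!("{:?}", lexeme_texts("(println)"));
    println!("{:?}", lexeme_texts("(def var 5)"));
}
```

Keeping only offsets means a lexeme stays small and copyable, and the text is borrowed from the source only when it is actually needed.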

Token

A wrapper around a lexeme that stores the span and what kind of token it is. It classifies the lexeme by reading the span and checking the syntax of the corresponding piece of source code.

Producing a Token may produce a TokenizationError when the lexeme isn't syntactically valid.
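As a rough illustration of what classifying a lexeme's text might involve, here is a self-contained sketch. `Kind`, `ClassifyError` and `classify` are hypothetical names, not rsedn's types, and the rules cover only a handful of token kinds; rsedn's own syntax checks and its TokenizationError are certainly more detailed.

```rust
/// Hypothetical token kinds for a tiny subset of EDN.
#[derive(Debug, PartialEq)]
enum Kind {
    OpenParen,
    CloseParen,
    Nil,
    Boolean,
    Integer,
    Keyword,
    Symbol,
}

/// Hypothetical error carrying the text that failed the syntax check.
#[derive(Debug, PartialEq)]
struct ClassifyError {
    text: String,
}

/// Decide what kind of token a piece of source text is, or reject it.
fn classify(text: &str) -> Result<Kind, ClassifyError> {
    let invalid = || ClassifyError { text: text.to_string() };
    match text {
        "" => Err(invalid()),
        "(" => Ok(Kind::OpenParen),
        ")" => Ok(Kind::CloseParen),
        "nil" => Ok(Kind::Nil),
        "true" | "false" => Ok(Kind::Boolean),
        _ if text.parse::<i64>().is_ok() => Ok(Kind::Integer),
        // A keyword needs at least one character after the colon.
        _ if text.starts_with(':') => {
            if text.len() > 1 {
                Ok(Kind::Keyword)
            } else {
                Err(invalid())
            }
        }
        // Text that starts like a number but is not one is rejected.
        _ if text.starts_with(|c: char| c.is_ascii_digit()) => Err(invalid()),
        _ => Ok(Kind::Symbol),
    }
}

fn main() {
    for text in ["(", "def", "var", "5", ")", ":name", "12abc"] {
        println!("{:?} -> {:?}", text, classify(text));
    }
}
```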

Form

The final step. A form is built from one or more tokens and represents an edn form: a list, a vector, a symbol, etc.

Almost no manipulation is done with the source text, except the parsing of text into values like: i64, f64, bool and String for the corresponding edn forms.

Forms are what you will typically consume from this library.

Producing forms may produce a ParsingError when the tokens in the token stream aren't the expected ones.
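The sketch below illustrates the general idea of consuming a token stream obtained from LinkedList::iter to build a nested form. It is a toy under stated assumptions: `Token`, `Form`, `ParseError` and `consume` are hypothetical and handle only lists, integers and symbols. It does not reproduce the signature or behaviour of rsedn::consume_token_stream.

```rust
use std::collections::linked_list::Iter;
use std::collections::LinkedList;

/// Hypothetical, already-classified tokens.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Open,
    Close,
    Integer(i64),
    Symbol(String),
}

/// Hypothetical forms: a list may contain other forms.
#[derive(Debug, PartialEq)]
enum Form {
    List(Vec<Form>),
    Integer(i64),
    Symbol(String),
}

/// Hypothetical errors raised when the stream holds unexpected tokens.
#[derive(Debug, PartialEq)]
enum ParseError {
    UnexpectedEnd,
    UnexpectedClose,
}

/// Consume exactly one form from the stream, recursing into lists.
fn consume(stream: &mut Iter<'_, Token>) -> Result<Form, ParseError> {
    match stream.next() {
        None => Err(ParseError::UnexpectedEnd),
        Some(Token::Close) => Err(ParseError::UnexpectedClose),
        Some(Token::Integer(n)) => Ok(Form::Integer(*n)),
        Some(Token::Symbol(s)) => Ok(Form::Symbol(s.clone())),
        Some(Token::Open) => {
            let mut items = Vec::new();
            loop {
                // Peek at the next token by cloning the (cheap) iterator.
                match stream.clone().next() {
                    None => return Err(ParseError::UnexpectedEnd),
                    Some(Token::Close) => {
                        stream.next(); // consume the closing paren
                        return Ok(Form::List(items));
                    }
                    Some(_) => items.push(consume(stream)?),
                }
            }
        }
    }
}

fn main() {
    // Tokens for the source text (def var 5).
    let tokens: LinkedList<Token> = vec![
        Token::Open,
        Token::Symbol("def".to_string()),
        Token::Symbol("var".to_string()),
        Token::Integer(5),
        Token::Close,
    ]
    .into_iter()
    .collect();

    let mut stream = tokens.iter();
    println!("{:?}", consume(&mut stream));
}
```

Because the stream is passed as a mutable iterator, a nested list simply picks up where its parent left off, and an unbalanced input surfaces as an error instead of a partial form.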

Dependencies

~1.5MB
~22K SLoC