#recursion #egg #interface #e-graphs #language #ast #parameters

egg_recursive

A recursive interface for egg: e-graphs good without S-expresion!

1 unstable release

new 0.1.0 Feb 7, 2025

#920 in Rust patterns

Download history 164/week @ 2025-02-04

164 downloads per month

MIT/Apache

23KB
374 lines

egg_recursive - Recursive interface for egg

This crate provides a recursive interface to egg. egg alrady comes with S-expression-based interface, but it has the following drawbacks:

  • It uses FromStr and Display to parse/format textual representation of ASTs.
    • These CAN be used for other purposes and this can cause a conflict with other applications.
  • Parser favours the first clause for terminal variants with the same parameter.
    • This can result in unexpected behaviour under the existence of ambiguity.
  • ALL textual representation of ASTs fed to egg::rewrite is checked at RUNTIME.
    • This means we can't see syntax error until compilation;
    • This further complicates the debugging process.
  • S-expressions get harder and harder to be parsed by human eyes

This crate alleviates these problems by introducing a recursive interface to egg.


lib.rs:

This crate provides a recursive interface to [egg].

[egg] alrady comes with S-expression-based interface, but it has the following drawbacks:

  • It uses std::str::FromStr and std::fmt::Display to parse/format textual representation of ASTs.
    • These CAN be used for other purposes and this can cause a conflict with other applications.
  • Parser favours the first clause for terminal variants with the same parameter.
    • This can result in unexpected behaviour under the existence of ambiguity.
  • ALL textual representation of ASTs fed to egg::rewrite is checked at RUNTIME.
    • This means we can't see syntax error until compilation;
    • This further complicates the debugging process.
  • S-expressions get harder and harder to be parsed by human eyes

This crate alleviates these problems by introducing a recursive interface to [egg].

Abstraction over Language

You can define a recursive language as a ordinary Self-referencing enum, containing [Box]es, [[Self; N]], or [Vec]s or other terminal other types. Such language AST must implement Recursive trait and have a Signature type synonym. This also provides Pat<L> for pattern matching, which can be converted to/from Pattern<AsLanguage<L>>. We are providing a Language derive macro for automatically derive such types and impls.

use egg::*;
use egg_recursive::*;

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Language)]
pub enum Arith {
    Num(i32),
    Neg(Box<Self>),
    Add([Box<Self>; 2]),
    Mul([Box<Self>; 2]),
}

This generates the following entities:

  1. Recursive impl for Arith;
  2. EggArith type alias for Signature<Arith, Id>, which implements egg::Language;
  3. ArithPat type for recursive pattern ASTs.
  4. ArithCons trait, which provides smart constructors for Arith and ArithPats.
    • ArithCons has trait methods like num(i64), add([Self; 2]), mul([Self; 2]), neg(Self), etc. It has just snake_cased variant of the original enum variants. If the snake_case version is reserved keyword, suffixed with _ (e.g. If becomes if_()).

An enum passed to self::Language should satisfy the following conditions:

  1. It MUST NOT have any generic parameters.
  2. Recursive variants MUST have only one field, which is one of the following (in any case, the type itself MUST be referenced by Self, not a enum name itself):
    • Box<Self>,
    • [Box<Self>; N],
    • Vec<Self>,
    • T<Box<Self>> for some IntoLanguageChildren type T (described in the next section).
  3. Arguments to the non-recursive variants MUST implement at least [Eq], [Ord], Hash, and Clone.

Helper Types for egg::LanguageChildren

Sometimes, the more the LanguageChildren has a components, the harder to get memorise the order/semantics of components. To alleviates this situation, we introduce IntoLanguageChildren trait, which abstracts over view types of egg::LanguageChildren.

You don't have to write it manually and it can generally be derived by self::LanguageChildren derive macro.

Suppose we want to add boolean operations and if-expressions:

use egg::*;
use egg_recursive::*;

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, LanguageChildren)]
pub struct IfThenElse<T> {
    pub cond: T,
    pub then_branch: T,
    pub else_branch: T,
}

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash, Language)]
pub enum Arith {
    Num(i32),
    Bool(bool),
    Neg(Box<Self>),
    Add([Box<Self>; 2]),
    Mul([Box<Self>; 2]),
    IsZero(Box<Self>),
    If(IfThenElse<Box<Self>>),
}

This generates RawIfThenElse and IntoLanguageChildren instance to convert it to LanguageChildren. You can call either IfThenElse::view or RawIfThenElse::unview to convert it to/from RawIfThenElse.

egg::LanguageChildren macro accepts types with the following conditions:

  1. It MUST be a struct with exactly one generic parameter.
  2. Every field must be one of the following (where T its only parameter):
  3. AT MOST ONE Vec<T> is allowed in its field.

Writing Rewrite Rules

egg_recursive provides a redefined rewrite! macro to write rewrite rules. This is almost the same as the original egg::rewrite! macro, but it allows you to write conditions in a more readable way. Due to the ambiguity issue, it doesn't support bidirectional rules (<=>) for the time being.

The following illustrates the usage of the macro:

pub fn make_rules() -> Vec<Rewrite<EggArith, ConstantFold>> {
    let v = |s: &str| ArithPat::pat_var(s);
    let x = || v("x");
    let y = || v("y");
    let x_var: Var = "?x".parse().unwrap();
    vec![
        rewrite!("add-comm"; x() + y() => y() + x()),
        rewrite!("mul-comm"; x() * y() => y() * x()),
        rewrite!("add-zero"; x() + ArithPat::float(0.0.into()) => x()),
        rewrite!("add-zero"; x() => x() + ArithPat::float(0.0.into())),
        rewrite!("mul-inv";
            x() * x().inv() => ArithPat::float(1.0.into())
        ;   if is_non_zero(x_var.clone())
        ),
    ]
}

In the above, + and * comes from std::ops::Add and std::ops::Mul instances defined manually, which is possible because ArithPat is a newtype, but not an alias.

Dependencies

~3.5–6MB
~108K SLoC