### 10 releases

new 0.2.3 | Sep 26, 2023 |
---|---|

0.2.2 | Sep 26, 2023 |

0.1.5 | Jul 10, 2023 |

0.1.4 | Jun 6, 2023 |

0.1.2 | Aug 26, 2022 |

#**126** in Algorithms

**123** downloads per month

**MIT**license

175KB

4.5K
SLoC

# Eqlog

An extension of datalog with function symbols and equality. Combines datalog with congruence closure. Compiles to specialized rust code that easily integrates into rust projects.

## Example: Semilattices

Semilattices are partial orders (i.e. sets equipped with reflexive, transitive and antisymmetric relations) with binary meets. They can be formalized in the following eqlog theory:

`//` semilattice.eqlog
`//` The carrier set.
`type` `El``;`
`//` The less-equal relation.
pred le`:` El `*` El`;`
`//` The less-equal relation is reflexive.
rule reflexivity `{`
`if` x`:` El`;`
then `le``(`x`,` x`)``;`
`}`
`//` The less-equal relation is transitive.
rule transitivity `{`
`if` `le``(`x`,` y`)``;`
`if` `le``(`y`,` z`)``;`
then `le``(`x`,` z`)``;`
`}`
`//` The less-equal relation is antisymmetric.
rule antisymmetry `{`
`if` `le``(`x`,` y`)``;`
`if` `le``(`y`,` x`)``;`
then x `=` y`;`
`}`
`//` A function assigning binary meets, i.e. binary greatest lower bound.
func `meet``(`El`,` El`)` `->` El`;`
`//` The meet function is total, i.e. defined for all elements x, y.
rule totality `{`
`if` x`:` El`;`
`if` y`:` El`;`
then `meet``(`x`,` y`)``!``;`
`}`
`//` The meet is a lower bound.
rule lower_bound `{`
`if` m `=` `meet``(`x`,` y`)``;`
then `le``(`m`,` x`)``;`
then `le``(`m`,` y`)``;`
`}`
`//` The meet is the greatest lower bound.
rule greatest_lower_bound `{`
`if` `le``(`z`,` x`)``;`
`if` `le``(`z`,` y`)``;`
`if` m `=` `meet``(`x`,` y`)``;`
then `le``(`z`,` m`)``;`
`}`

Eqlog translates this eqlog file to a rust module that allows computations on models of the semilattice theory, i.e. semilattices. For example, we can verify that the meet function in a semilattice is associative:

`//` main.rs
`use` `eqlog_runtime``::`eqlog_mod`;`
`eqlog_mod!``(`semilattice`)``;`
`use` `crate``::``semilattice``::``*``;`
`fn` `main``(``)`` ``{`
`//` Create an empty semilattice structure and add three elements to it.
`let` `mut` sl `=` `Semilattice``::`new`(``)``;`
`let` x `=` sl`.``new_el``(``)``;`
`let` y `=` sl`.``new_el``(``)``;`
`let` z `=` sl`.``new_el``(``)``;`
`//` Close the semilattice structure by matching atoms in if statements of rules
`//` and adding then atoms until we reach a fixed point.
sl`.``close``(``)``;`
`//` sl satisfies all semilattice rules now.
`//` Evaluate the left-associated meet xy_z = (x /\ y) /\ z.
`let` xy `=` sl`.``meet``(`x`,` y`)``.``unwrap``(``)``;`
`let` xy_z `=` sl`.``meet``(`xy`,` z`)``.``unwrap``(``)``;`
`//` Evaluate the right-associated meet x_yz = x /\ (y /\ z).
`let` yz `=` sl`.``meet``(`y`,` z`)``.``unwrap``(``)``;`
`let` x_yz `=` sl`.``meet``(`x`,` yz`)``.``unwrap``(``)``;`
`//` Check that the two elements are equal.
`if` sl`.``are_equal_el``(`xy_z`,` x_yz`)` `{`
`println!``(``"`Meet is associative.`"``)``;`
`}` `else` `{`
`println!``(``"`Meet is not associative.`"``)``;`
`}`
`}`

The eqlog-test/src contains more examples:

## Integration into rust projects

A full sample project can be found in examples/semilattice.
Eqlog consists of the compiler crate

, which is only needed during build, and the runtime crate `eqlog`

.
We can specify them as dependencies in `eqlog-runtime`

by adding the following lines:`Cargo .toml`

`[``dependencies``]`
`eqlog-runtime ``=` `"`0`"`
`[``build-dependencies``]`
`eqlog ``=` `"`0`"`

In order for our rust code to integrate with the code generated by eqlog, we need to run eqlog before building the crate itself.
This can be accomplished by adding the following

file next to `build .rs`

`Cargo``.`toml

in the package root directory:`fn` `main``(``)`` ``{`
`eqlog``::`process_root`(``)``;`
`}`

Cargo automatically executes the

file before building.
`build .rs`

`eqlog``::`process_root`(``)`

searches for files under the `src`

directory and generates a rust module for each `.`eqlog

file.
To declare the rust module generated from an eqlog file, we use the `eqlog_mod!`

macro:`use` `eqlog_runtime``::`eqlog_mod`;`
`eqlog_mod!``(``<`filename without extension`>``)``;`

Note that a special invocation that specifies the full path is needed for eqlog files in nested subdirectories of

.`src`

## Language

Each eqlog file consists of a sequence of type, predicate, function and rule declarations. Mathematically, eqlog is a way to specify essentially algebraic theories.

### Types

Types represent the different carrier sets of models of our theory. They are declared as follows:

`type` `<`TypeName`>``;`

### Predicates

Predicates are declared as follows:

`pred ``<`PredName`>``(``<`Type_1`>``,` `...``,` `<`Type_n`>``)``;`

Each predicate has an arity, which is the list of types appearing in the parentheses.

### Functions

Functions are declared as follows:

`func ``<`FuncName`>``(``<`ArgType_1`>``,` `...``,` `<`ArgType_n`>``)` `->` `<`ResultType`>``;`

Each function has a domain, which is the list of types appearing before the arrow, and a codomain type after the arrow.

A function with empty domain is a constant. Constants are declared as follows:

`func ``<`ConstantName`>` `:` `<`Type`>``;`

To eqlog, functions are synonymous to *partial* functions; they need not be total.

### Rules

Rules are of the form

`rule ``<`name`>` `{`
`if` `<`Premise_1`>``;`
`if` `<`Premise_2`>``;`
`...`
`if` `<`Premise_m`>``;`
then `<`Conclusion_1`>``;`
then `<`Conclusion_2`>``;`
`...`
then `<`Conclusion_n`>``;`
`}`

where the

and `<`Premise_i`>`

are `<`Conclusion_j`>`*atoms*.

Most atoms contain *terms*.
A term is either a variable or an application of a function symbol to terms.
Variables that are used only once should be replaced with a wildcard

.`_`

Eqlog supports the following atoms:

- Atoms of the form

. Such atoms assert that`<`PredName`>``(``<`arg_1`>``,``...``,``<`arg_n`>``)`

must satisfy the`<`arg_1`>``,``...``,``<`arg_n`>`

predicate.`<`PredName`>` - Atoms of the form

. Note the exclamation mark. Such atoms assert that`<`term`>``!`

is defined, i.e. that the involved functions are defined on their arguments. In a`<``term``>`

clause,`then`

can be given a name like so:`<``term``>`

.`<`var_name`>``:``=``<`term`>``!` - Atoms of the form

. Such atoms assert that the terms`<`tm_1`>``=``<`tm_2`>`

and`<`tm_1`>`

are both defined and equal.`<`tm_2`>` - Atoms of the form

. Such atoms assert that`<`var_name`>``:``<`TypeName`>`

is a variable of type`<`var_name`>`

. They can only appear in`<`TypeName`>`

clauses.`if`

Except for the

form of the exclamation mark operator, no additional variables may be introduced in then statements.
Every term appearing in a `<`var`>` `:``=` `...`

statement of a rule must be equal to a term appearing in an earlier `then`

statement.
The only way to circumvent this restriction is to add a statement of the form `if`

.
Later `then <tm>!;`

`then`

statements atoms can then freely use `<``tm``>`

.For example, consider the following invalid and valid rules for the semilattice theory above:

`//` Invalid: Cannot infer type of x.
rule `{`
`if` x `=` x`;`
then x `=` x`;`
`}`
`//` Valid (but nonsensical):
rule `{`
`if` x`:` El`;`
then x `=` x`;`
`}`
`//` Invalid: x and y are not introduced in an `if` statement.
rule `{`
then `meet``(`x`,` y`)``!``;`
`}`
`//` Valid:
rule `{`
`if` x`:` El`;`
`if` y`:` El`;`
then `meet``(`x`,` y`)``!``;`
`}`
`//` Invalid: meet(x, y) is not equal to a term occurring earlier.
rule `{`
`if` `le``(`z`,` x`)``;`
`if` `le``(`z`,` y`)``;`
then `le``(`z`,` `meet``(`x`,` y`)``)``;`
`}`
`//` Valid: Assert that meet(x, y) exists in a then statement.
rule `{`
`if` `le``(`z`,` x`)``;`
`if` `le``(`z`,` y`)``;`
then `meet``(`x`,` y`)``!`
then `le``(`z`,` `meet``(`x`,` y`)``)``;`
`}`
`//` Valid: Suppose that meet(x, y) exists in if statement.
rule `{`
`if` `le``(`z`,` x`)``;`
`if` `le``(`z`,` y`)``;`
`if` `meet``(`x`,` y`)``!``;`
then `le``(`z`,` `meet``(`x`,` y`)``)``;`
`}`
`//` Invalid: Both Meet(x, y) and Meet(y, x) do not occur earlier.
rule `{`
`if` x`:` El`;`
`if` y`:` El`;`
then `meet``(`x`,` y`)``;`
then `meet``(`y`,` x`)``;`
`}`
`//` Valid: the term on the left-hand side of the equation is introduced
`//` in an if statement.
rule `{`
`if` `meet``(`x`,` y`)``!``;`
then `meet``(`x`,` y`)``;`
then `meet``(`y`,` x`)``;`
`}`
`//` Valid: the term on the right-hand side of the equation is introduced
`//` earlier in a then statement.
rule `{`
`if` x`:` El`;`
`if` y`:` El`;`
then `meet``(`x`,` y`)``!``;`
then `meet``(`y`,` x`)` `=` `meet``(`x`,` y`)``;`
`}`
`//` Invalid: meet(x, y) is not equal to a term that occurs earlier.
rule `{`
`if` u `=` `meet``(`x`,` `meet``(`y`,` z`)``)``!``;`
then `meet``(``meet``(`x`,` y`)``,` z`)` `=` u`;`
`}`
`//` Valid: All of u, Meet(x, y) and z are introduced in an if statement.
rule `{`
`if` u `=` `meet``(`x`,` `meet``(`y`,` z`)``)``!``;`
`if` `meet``(`x`,` y`)``!``;`
then u `=` `meet``(``meet``(`x`,` y`)``,` z`)``;`
`}`

## Generated rust interfaces

Eqlog translates each

to an `.`eqlog

file.
The rust file must be declared inside a module of the `.`rs

directory (e.g. `src`

or `lib .rs`

`main``.`rs

) using the `eqlog_runtime``::`eqlog_mod`!`

macro.Eqlog generates documented rust code. To build and view this documentation, run

`cargo`` doc`` --`document-private-items` --`open

The public API of the generated rust module consists of the following symbols:

- For each Eqlog type, a Rust type

of element ids for this type.`<`TypeName`>` - The model structure.
Its name is derived by converting the file name to upper camel case.
For example, for

, this would be`semi_group``.`eqlog

.`SemiGroup`

The model structure has the following member functions:

`fn``new``(``)``->``Self`

Creates an empty model structure.`fn``close``(``&``mut``self``)`

Close the model under all rules.`pub``fn``close_until``(``&``mut``self`,`condition``:`impl Fn`(`&`Self``)`->`bool``)``->``bool`

Close the model under all rules until a condition is satisfied. Returns false if the model could be closed under all rules but the condition still does not hold.- For each type:
`fn``new_``<`sort_name`>``(``&``mut``self``)``->``<`TypeName`>`

Adjoins a new element to the model structure.`fn``equate_``<`sort_name`>``(``&``mut``self`,`lhs``:`,`<`TypeName`>``rhs``:``<`TypeName`>``)`

Enforces the equality

in the model structure.`lhs``=`rhs`fn``are_equal_``<`sort_name`>``(``&``self`,`lhs``:`,`<`TypeName`>``rhs``:``<`TypeName`>``)``->``bool`

Returns true if

and`lhs`

represent the same element.`rhs``fn``root_``<`sort_name`>``(``&``self`,`el``:``<`TypeName`>``)``->``<`TypeName`>`

Returns the canonical/root element of the equivalence class of an element.

- For each predicate:
`fn``<`pred_name`>``(``&``self``,`arg_1`:``<`Type_1`>``,``...``,`arg_n`:``<`Type_n`>``)`

Checks whether the predicate holds on the given arguments.`fn``iter_``<`pred_name`>``(``&``self``)``->`impl`'``_``+``Iterator``<`Item =`<`arity`>``>`

Returns an iterator over tuples satisfying the predicate. The item type yielded by the iterator is a tuple type determined by the arity.`fn``insert_``<`pred_name`>``(``&``mut``self`,`arg_1``:`, ...,`<`Type_1`>``arg_n``:``<`Type_n`>``)`

Enforces that the predicate holds for the given arguments.

- For each function:
`fn``<`func_name`>``(``&``self``,`arg_1`:``<`Type_1`>``,``...``,`arg_n`:``<`Type_n`>``)``->``Option``<``<`ResultType`>``>`

Evaluates the function on the given arguments.`fn``iter_``<`func_name`>``(``&``self``)``->`impl`'``_``+``Iterator``<`Item =`<`arity`>``>`

Returns an iterator over tuples in the graph of the function. The item type yielded by the iterator is a tuple type determined by the arity.`fn``define_``<`func_name`>``(``&``mut``self`,`arg_1``:`, ...,`<`Type_1`>``arg_n``:``<`Type_n`>``)``->``<`ResultType`>`

Enforces that the function is defined on the given arguments, adjoining a new element if necessary.`fn``insert_``<`func_name`>``(``&``mut``self`,`arg_1``:`, ...,`<`Type_1`>``arg_n``:`,`<`Type_n`>``result``:``<`ResultType`>``)`

Insert a tuple into the graph of the function.

For example, these are the public declarations of the rust module generated from the semilattice theory:

`struct` `El``;`
`struct` `Semilattice``;`
`impl` `Semilattice` `{`
`fn` `new``(``)`` ``->` `Self``;`
`fn` `close``(``&``mut` `self``)``;`
`fn` `close_until``(``&``mut` `self`, `condition``:` impl Fn`(`&`Self``)` -> `bool``)`` ``->` `bool``;`
`fn` `new_el``(``&``mut` `self``)`` ``->` El`;`
`fn` `equate_el``(``&``mut` `self`, `lhs``:` El, `rhs``:` El`)``;`
`fn` `are_equal_el``(``&``self`, `lhs``:` El, `rhs``:` El`)`` ``->` `bool``;`
`fn` `root_el``(``&``self`, `el``:` El`)`` ``->` El`;`
`fn` `le``(``&``self`, `arg0``:` El, `arg1``:` El`)`` ``->` `bool``;`
`fn` `iter_le``(``&``self``)`` ``->` impl `'``_` `+` `Iterator``<`Item = `(`El, El`)``>``;`
`fn` `insert_le``(``&``mut` `self`, `arg0``:` El, `arg1``:` El`)``;`
`fn` `meet``(``&``self`, `arg0``:` El, `arg1``:` El`)`` ``->` `Option``<`El`>``;`
`fn` `iter_meet``(``&``self``)`` ``->` impl `'``_` `+` `Iterator``<`Item = `(`El, El, El`)``>``;`
`fn` `define_meet``(``&``mut` `self`, `arg0``:` El, `arg1``:` El`)`` ``->` El`;`
`fn` `insert_meet``(``&``mut` `self`, `arg0``:` El, `arg1``:` El, `result``:` El`)``;`
`}`

## Data model and algorithms

### Standard datalog features

The theory model structures generated by eqlog can be thought of as in-memory SQL databases, with schema given by type, predicate and function declarations.
Elements of a given type are simple integer ids, and the model structure maintains a list of valid ids for each type.
Every predicate

is represented as a table whose rows are the tuples of elements for which the predicate holds.
Functions are represented as graphs.
For example, if `p`

is a function in one argument, then the internal table for `f`

will consist of rows of the form `f`

.`(`x`,` `f``(`x`)``)`

The

function repeatedly enumerates all matches of if statements in rules and adjoins data corresponding to the then statements in the rule to the model.
Eventually, a fixed point is reached (unless the theory contains non-surjective rules, see below) and the algorithm stops.
For example, for the transitivity axiom`close`

`rule ``{`
`if` `le``(`x`,` y`)``;`
`if` `le``(`y`,` z`)``;`
then `le``(`x`,` z`)``;`
`}`

the close function enumerates all rows

and `(`x`,` y_0`)`

in the `(`y_1`,` z`)`

table such that `Le`

, and then adds the row `y_0 = y_1`

`(`x`,` z`)`

to `Le`

.
Eventually, the `Le`

table will already contain all the new rows `(`x`,` z`)`

we find, which means that we have reached the fixed point and can stop:
The `le`

predicate is transitive now.The enumeration of instances of if statements is known as (inner) join in SQL terminology.
SQL databases speed up inner joins using indices, and eqlog automatically selects and maintains indices to speed up the joins required to enumerate the rules listed in the eqlog file.
In each iteration, it is not necessary to enumerate if statements that were already enumerated in the previous iteration.
This optimization is known as *semi-naive evaluation*, and is again something that eqlog uses to speed up the

function.`close`

### Equalities

In addition to the standard datalog features discussed so far, eqlog supports equalities in

statements.
One example is the antisymmetry axiom of partial orders:`then`

`rule ``{`
`if` `le``(`x`,` y`)``;`
`if` `le``(`y`,` x`)``;`
then x `=` y`;`
`}`

Another source of equality in

statements are the implicit functionality axioms for functions:
For functions `then`

, if we find both `f`

and `(`x_1`,` `...``,` x_n`,` y`)`

in the graph of `(`x1`,` `,` `...``,` x_n`,` z`)`

, then we must derive `f`

.
If we denote the graph of `y = z`

`f`

by `g_f`

, then the implicit functionality axiom can be stated as follows:`rule ``{`
`if` `g_f``(`x_1`,` `...``,` x_n`,` y`)``;`
`if` `g_f``(`x_1`,` `...``,` x_n`,` z`)``;`
then y `=` z`;`
`}`

Note, however, that graphs of functions cannot be referred to directly in eqlog files.

To account for derived equalities, eqlog model structures maintain union-find data structures for each type.
When an equality

is derived during a call to `x = y`

`close`

, eqlog merges the equivalence classes of `x`

and `y`

.To speed up computation of the joins required when matching if statements, eqlog maintains indices for each table. However, these indices can only be used if element ids can be compared for equality directly instead of consulting a union find data structure. Eqlog thus maintains the invariant that all predicate and function graph tables contain canonical type elements only, i.e. only elements whose ids are root nodes with respect to the corresponding union find data structure.

This invariant is temporarily violated when eqlog merges the equivalence class of some element

into that of an element `x`

.
To restore the invariant, eqlog removes all rows from tables that contain `y`

, replaces `x`

by the new root id `x`

, and inserts the rows again.
To speed up this process, eqlog maintains a list for each root id that contains all rows in which the root id appears.`y`

### Non-surjective rules and non-termination

Recall that eqlog rules need to be surjective unless the exclamation mark operator

is used:
Every element in a then statement must be equal to some element in an if statement.
Closing model structures under surjective rules does not increase the number of elements in the model, which guarantees that the `!`

function eventually terminates.`close`

If there are non-surjective rules, then this is not guaranteed. Consider the following eqlog theory that formalizes natural numbers:

`type` `N``;`
zero`:` N`;`
succ`:` N `->` N`;`
rule `{`
then Zero`(``)``!``;`
`}`
rule `{`
`if` n`:` N`;`
then Succ`(`n`)``!``;`
`}`

Closing the empty model structure will first add an element

to the model, then the element `zero`

, then `succ``(`zero`)`

and so forth.
However, the presence of non-surjective rules does not necessarily mean that the close function must run indefinitely.
For example, the semilattice theory contains the non-surjective rule`succ``(``succ``(`zero`)``)`

`rule ``{`
`if` x`:` El`;`
`if` y `:` El`;`
then `meet``(`x`,` y`)``!``;`
`}`

but

nevertheless terminates.`close`

If a theory contains non-surjective rules, the generated

function will consist of nested close loops:
The outer loop is responsible for non-surjective rules, and the inner loop is responsible for surjective rules.
In pseudo-code, the `close`

function looks a bit like this:`close`

`//` Match `if` statements in a rule and change the model such that the
`//` corresponding `then` statements holds. Returns true if the model was
`//` changed, i.e. if new facts were concluded.
`fn` `adjoin_conclusions``(``ax``:` Axiom`)`` ``->` `bool` `{`
`...`
`}`
`fn` `close``(``)`` ``{`
`loop` `{`
`loop` `{`
`let` model_changed `=` surjective_rules`.``iter``(``)``.``any``(`adjoin_conclusion`)``;`
`if` `!`model_changed `{`
`break``;`
`}`
`}`
`let` model_changed `=` non_surjective_rules`.``iter``(``)``.``any``(`adjoin_conclusion`)``;`
`if` `!`model_changed `{`
`break``;`
`}`
`}`
`}`

## Background

For a more thorough explanation of how Eqlog works, have a look at these papers:

*An Evaluation Algorithm for Datalog with Equality*[Web] [PDF]*Algebraic Semantics of Datalog with Equality*[Web] [PDF]

Also have a look at the egglog project and the corresponding paper. Even though its surface language looks a bit different, Egglog is based on very similar ideas as those underlying Eqlog, and can be used for many of the same applications.

#### Dependencies

~4–6.5MB

~106K SLoC