10 releases

0.2.2 Mar 1, 2021
0.2.1 Jan 12, 2021
0.2.0 Dec 26, 2020
0.1.6 Oct 9, 2020
0.1.3 Sep 4, 2020

#655 in Math

MIT/Apache and LGPL-3.0+

4.5MB
2.5K SLoC

Rust implementation of LFSC

Installation Instructions

  1. Install the Rust toolchain using Rustup.
    • Be sure to add Cargo's bin directory to your path
      • i.e. by adding export PATH="$HOME/.cargo/bin/:$PATH" to your rc file.
  2. Run cargo install rlfsc

Possible improvements over the original LFSC

  • Tokenizer and checker are separate.
  • Tokenizer is not character-by-character
    • It's implemented using the logos library.
  • Automatically reference-counted expressions using std::rc::Rc.
  • No gotos/crazy stuff

State of Affairs

On our small test proof (pf), we're ~30% slower than LFSC (10.8ms vs 8.0ms) on my machine.

On the CVC4 signatures (sig), we're ~30% faster than LFSC (3.0ms vs 4.4ms) on my machine.

This suggests that our tokenization is indeed better, but our checking is slower.

We do lots of recursion, without tail recursion, causing us to blow small stacks.

Type checking algorithm

It's based on to Aaron's 2008 manuscript, "Proof Checking Technology for Satisfiability Modulo Theories", but, in general, has no optimizations.

Notably:

  • We always construct an AST for checked terms.
  • In-scope identifiers are kept in a global map
  • When an abstraction expression is instantiated, we substitute the provide value
    • e.g. (\x t) v becomes t[v / x], literally. I.e. we take t apart, looking for all the x's, replace them with v's, and rebuild t.
  • In code expressions, we do not do this.
    • e.g., when we evaluate (let x v t), we put x -> v in the environment, and evaluate t, rather than substituting.

Todos

  • Tail Recursion
    • C historically has had poor support for tail recursion, which is why the original LFSC manually implemented it with gotos.
    • We can't (and won't) do that in Rust.
    • We should implement a trampoline-based approach
  • Term destruction
    • When destructing terms (auto-implemented), we go into deep recursion.
    • We should manually implement some destructors to prevent this.
  • Implement a streaming tokenizer.
    • Right now we need to bring the whole input into memory before tokenizing.
    • Option 1: Modify logos to allow this by implementing logos::Source for std::io::Read.
      • Doesn't quite work. We'll need to modify the requirements of Source a bit.
    • Option 2: Switch to a different tokenizing library.
      • Perhaps nom? It's typically used for parsing, but it does support streaming.
    • Option 3: Roll our own.
      • I think this is a bad idea. If this blog post taught me anything, it's that high-performance text processing is more complex than most people think. I think it would be very hard to match the perf of a dedicated library like logos or nom.

Dependencies

~7.5MB
~105K SLoC