
bin+lib yair

A compiler framework written entirely in Rust

1 unstable release

0.1.0 Apr 1, 2021

CC0 license

260KB
5K SLoC

🦉 yair


Yet Another Intermediate Representation (pronounced Ya! IR) is a compiler intermediate representation written entirely in Rust. Key design decisions make the representation unique:

  • Static Single Assignment representation [1].
  • No Φ (phi) nodes, basic blocks take arguments instead [2].
  • Pointers are type agnostic - they are just addresses into memory [3].
  • Target agnostic representation for de-coupling of components.
  • Strong separation between library components (you don't need to build, link, or use components you don't need).
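
To illustrate the "basic blocks take arguments" decision above, here is a minimal, self-contained sketch. It is not yair's API: the `Block`, `Branch`, and `enter` names are hypothetical and exist only for this example. Each predecessor passes a value when it branches to the merge block, which is the job a Φ node would otherwise do inside the merge block.

```rust
use std::collections::HashMap;

/// A basic block that declares parameters instead of containing phi nodes.
/// (Hypothetical toy type, not part of the yair crate.)
struct Block {
    name: &'static str,
    params: Vec<&'static str>,
}

/// A branch to a target block, carrying one argument per block parameter.
/// (Hypothetical toy type, not part of the yair crate.)
struct Branch {
    target: &'static str,
    args: Vec<i64>,
}

/// Bind the branch arguments to the block parameters, as happens
/// conceptually when control enters the block from that branch.
fn enter(block: &Block, branch: &Branch) -> Result<HashMap<&'static str, i64>, String> {
    if branch.target != block.name {
        return Err(format!("branch targets '{}', not '{}'", branch.target, block.name));
    }
    if branch.args.len() != block.params.len() {
        return Err(format!(
            "block '{}' takes {} argument(s), got {}",
            block.name,
            block.params.len(),
            branch.args.len()
        ));
    }
    Ok(block
        .params
        .iter()
        .copied()
        .zip(branch.args.iter().copied())
        .collect())
}

fn main() {
    // The merge point of an if/else: `x` gets its value from whichever
    // predecessor branched here.
    let merge = Block { name: "merge", params: vec!["x"] };
    let from_then = Branch { target: "merge", args: vec![1] };
    let from_else = Branch { target: "merge", args: vec![2] };

    let env_then = enter(&merge, &from_then).unwrap();
    let env_else = enter(&merge, &from_else).unwrap();
    println!("x via then = {}, x via else = {}", env_then["x"], env_else["x"]);
}
```

In this toy model, a mismatch between arguments and parameters is caught at the branch itself, which is one reason this style can be simpler to check than Φ nodes.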

TODOs

  • Core:
    • Add per-domain functions and function multi-versioning.
  • Verifier:
    • When we have per-domain functions (CPU-only for instance) check for:
      • Recursion.
      • Calling a function in a conflicting domain (call GPU from CPU).
  • Add a cranelift code generation library.
  • Add an optimizer!

Features

The following features are present in the yair crate.

io

The 'io' feature is a default feature of the yair crate. It lets you consume and produce binary or textual intermediate representation files from yair. This allows for inspection, testing, and serialization of the intermediate representation.

When this feature is enabled, two additional binaries are produced alongside the library crate: yair-as and yair-dis. They assemble and disassemble the intermediate representation between the human-readable textual form and the binary form.

Additionally, there is a yair::io module that lets users read/write the textual or binary representation into a yair Library that they can work with.

.ya Files

The human-readable representation of yair is the .ya file. An example file is:

mod "😀" {
  fn foo(a : i64, b : i64) : i64 {
    bar(a : i64, b : i64):
      r = or a, b
      ret r
  }
}

Constants in .ya files are slightly strange. In the Library object, a constant is unique per value and type combination. In the intermediate representation, however, constants are treated like any other value within the body of a basic block:

mod "😀" {
  fn foo(a : i64) : i64 {
    bar(a : i64):
      b = const i64 4
      r = or a, b
      ret r
  }
}

This means that constants behave like regular SSA nodes for the purposes of the intermediate representation.
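
The "unique per value and type combination" behaviour described above can be sketched as a small interning table. This is only an illustration under assumptions: `ConstantPool`, `ConstId`, and `Type` are hypothetical names, and yair's Library may implement uniquing differently.

```rust
use std::collections::HashMap;

/// A tiny stand-in for IR types. (Hypothetical, not yair's type system.)
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Type {
    I32,
    I64,
}

/// Handle to a uniqued constant. (Hypothetical.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ConstId(usize);

/// Interns constants so that each (value, type) pair maps to one handle.
#[derive(Default)]
struct ConstantPool {
    ids: HashMap<(i64, Type), ConstId>,
}

impl ConstantPool {
    /// Return the existing handle for (value, ty), or create a new one.
    fn get(&mut self, value: i64, ty: Type) -> ConstId {
        let next = ConstId(self.ids.len());
        *self.ids.entry((value, ty)).or_insert(next)
    }
}

fn main() {
    let mut pool = ConstantPool::default();
    let a = pool.get(4, Type::I64);
    let b = pool.get(4, Type::I64); // same value and type: same handle as `a`
    let c = pool.get(4, Type::I32); // same value, different type: new handle
    println!("{:?} {:?} {:?}", a, b, c);
}
```

Under this model, two `const i64 4` lines in different blocks of a .ya file would refer to the same underlying constant, even though each reads like an ordinary SSA value in the text.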

References

References 1

Static single assignment form.

References 2

This approach is similar in some ways to the Swift Intermediate Language approach - Swift's High-Level IR: A Case Study of Complementing LLVM IR with Language-Specific Optimization.

References 3

This is similar to something that was proposed for LLVM in 2015 but not yet (as of January 2021) enacted - Moving towards a singular pointer type.
