3 releases (breaking)

Uses old Rust 2015

0.8.0 Sep 11, 2017
0.6.3 Sep 3, 2017
0.5.0 Aug 19, 2017

GPL-3.0+ AND BSD-3-Clause

145KB
3.5K SLoC

Citrus: Convert C to Rust

This is a tool that helps convert C programs to Rust programs. It transforms C syntax to Rust syntax, but ignores C semantics.

The generated programs will not run, and may not even compile. However, the tool produces readable source code that can be manually refactored into a Rust program.

Note: This is a very early version. There are still obvious bugs and key language features missing. Please contribute bug reports with test cases and, if you can, fixes!

Example

void gz_compress(FILE *in, gzFile out) {
    char buf[BUFLEN];
    int len;
    int err;

    for (;;) {
        len = fread(buf, 1, sizeof(buf), in);
        if (ferror(in)) {
            perror("fread");
            exit(1);
        }
        if (len == 0) break;
        if (gzwrite(out, buf, (unsigned)len) != len) error(gzerror(out, &err));
    }
    fclose(in);
    if (gzclose(out) != Z_OK) error("failed gzclose");
}

#[no_mangle]
unsafe extern "C" fn gz_compress(mut in_: *mut FILE, mut out: gzFile) {
    let mut buf: [i8; 16384];
    let mut len;
    let mut err;
    loop  {
        len = fread(buf, 1, std::mem::size_of_val(&buf), in_);
        if ferror(in_) { perror("fread"); exit(1); }
        if len == 0 { break  }
        if gzwrite(out, buf, len as c_uint) != len {
            error(gzerror(out, &mut err))
        };
    }
    fclose(in_);
    if gzclose(out) != Z_OK { error("failed gzclose") };
}

Installation

See releases for binary downloads.

Usage

Usage: citrus <file> [<compiler args…>]

citrus program.c -I./include

The typical workflow is:

  1. Clean up the C code — bad C code makes even worse Rust code.
  2. Convert C syntax to Rust syntax (this one is automated by Citrus!).
  3. Keep rewriting C-isms into Rust-isms until it compiles (some easy translations are also done by Citrus).
  4. Refactor into idiomatic Rust code.

C is very weird from Rust's perspective. The generated code will be very un-Rust-like. Please don't judge Rust by this :)
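
As a rough illustration of where steps 3 and 4 can lead, here is a hand-written sketch (not Citrus output) of the copy loop from the gz_compress example in safe Rust. The name copy_stream and the generic Read/Write parameters are assumptions made for this sketch; the gzip side would be whatever Write implementation your compression library provides.

```rust
use std::io::{self, Read, Write};

/// Same buffer size as the `[i8; 16384]` array in the generated code.
const BUFLEN: usize = 16384;

/// Copies everything from `input` to `output` and returns the number of bytes copied.
/// `copy_stream` is a hypothetical name; `output` stands in for the gzip stream.
fn copy_stream<R: Read, W: Write>(mut input: R, mut output: W) -> io::Result<u64> {
    let mut buf = [0u8; BUFLEN];
    let mut total: u64 = 0;
    loop {
        // `?` replaces the ferror/perror/exit sequence: errors go back to the caller.
        let len = input.read(&mut buf)?;
        if len == 0 {
            break; // end of input
        }
        // write_all retries short writes, which covers the `gzwrite(...) != len` check.
        output.write_all(&buf[..len])?;
        total += len as u64;
    }
    // Flush explicitly so that a failed final write is reported, like the gzclose check.
    output.flush()?;
    // Both streams are dropped when the function returns, so there is no explicit fclose.
    Ok(total)
}
```

You can try it without any files by passing a byte slice and a Vec, for example copy_stream(&b"hello"[..], &mut out). In real code the standard library's std::io::copy does the same job and is usually the better choice.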

Preparing C code for conversion

  • Use size_t for all lengths and array indexing (or ssize_t/ptrdiff_t if you need negative values). Rust is super picky about this.

  • Change as much as you can to const: variables, function arguments, globals. In Rust things are immutable by default.

    • Don't reuse variables, e.g. instead of one i reused throughout a function, (re)define it for each loop.
    • If you get a "discards qualifiers in nested pointer types" error, it's not you, it's a "bug" in the C language.
  • Minimize use of macros. Non-trivial macros are expanded during conversion and their high-level syntax is lost.

    • Change function-like macros to inline functions. For conversion you may even want to undefine assert, MAX, and offsetof.
    • If you use macros to generate several versions of a function (such as func_int, func_float), keep just one version with a unique typedef for the type. You'll be able to replace the typedefed name with a generic parameter later.
  • Replace int and long with types of specific size, such as <stdint.h>'s int32_t, int64_t or size_t.

    • Use bool from <stdbool.h>
    • Use signed char only when you really mean to use it for negative numbers.
    • unsigned char, short, float, double and long long can be left as-is.
  • In function arguments use arr[] for arrays, and *ptr for exactly one element.

    • Prefer array indexing over pointer arithmetic (arr[i] yes, ptr+i no).
    • Bonus points for f(size_t length, arr[static length]) (yes, it's valid C syntax).
  • Add __attribute__((nonnull)) to functions that should not be called with NULL arguments.

  • Change all internal functions to static. Syntax of extern C functions will become noisy.

  • Change for loops to be in format for(size_t i = start; i < end; i++).

    • If you can't, then use while instead (but avoid do..while).
  • Don't use var++ in expressions. Use ++var or put it on its own line. Rust only allows var += 1;

  • Remove all goto and its labels.

  • Remove "clever" micro-optimizations. They are really painful to convert, and most end up being not applicable.

    • Avoid tricks with pointers, unions, type casting, or reusing the same chunk of memory for different purposes.
    • Consider returning things by value instead of taking in-out pointers, e.g. if you need to return multiple things, return a small struct by value instead.
  • Having tests helps. Not only unit tests, but also a high-level test such as a known-good whole program output.
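
To show why these preparations pay off, here is a hand-written sketch (not Citrus output) of the Rust that well-prepared C can end up as after refactoring. Every name in it (count_above, range_of, largest, Range) and the C signature quoted in the comment are hypothetical, chosen only to illustrate the advice above.

```rust
/// Small struct returned by value instead of writing through in-out pointers.
#[derive(Debug, PartialEq)]
struct Range {
    min: i32,
    max: i32,
}

/// Near-literal translation of a prepared C loop. Hypothetical C original:
///   static size_t count_above(size_t length,
///                             const int32_t arr[static length], int32_t limit)
fn count_above(arr: &[i32], limit: i32) -> usize {
    let mut count: usize = 0; // the only binding that needs `mut`
    // `for (size_t i = 0; i < length; i++)` maps directly onto a range.
    for i in 0..arr.len() {
        // Indexing requires `usize`, which is what `size_t` corresponds to.
        if arr[i] > limit {
            count += 1; // Rust has no `++`
        }
    }
    count
}

/// Returns None for empty input, where C would need an error code.
fn range_of(arr: &[i32]) -> Option<Range> {
    let first = *arr.first()?;
    let mut range = Range { min: first, max: first };
    for &value in &arr[1..] {
        range.min = range.min.min(value);
        range.max = range.max.max(value);
    }
    Some(range)
}

/// One generic function instead of macro-generated func_int / func_float variants.
fn largest<T: PartialOrd + Copy>(arr: &[T]) -> Option<T> {
    let mut best = *arr.first()?;
    for &value in &arr[1..] {
        if value > best {
            best = value;
        }
    }
    Some(best)
}
```

The index loop in count_above is kept close to the C shape on purpose. Once it compiles, it can shrink further to arr.iter().filter(|&&v| v > limit).count().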

Cleanup post conversion

  • Verify code side-by-side to ensure nothing is missing or mistranslated.
  • Replace fixed-size arrays with slices or Vec. Rust's fixed-size arrays are PITA.
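
A minimal sketch of the array cleanup, assuming a converted function that takes a fixed-size array. The function names are hypothetical and the code is hand-written, not Citrus output.

```rust
/// Before cleanup: only arrays of exactly 4 elements are accepted.
fn sum_fixed(values: &[i32; 4]) -> i32 {
    values.iter().sum()
}

/// After cleanup: a slice accepts any length, whether it comes from
/// an array, a Vec, or a sub-range of either.
fn sum_slice(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// A Vec replaces a buffer whose size is only known at run time.
fn zeroed_buffer(len: usize) -> Vec<u8> {
    vec![0; len]
}
```

Existing callers keep working after the change, because a reference to an array such as &[1, 2, 3, 4] coerces to a slice automatically.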

Building the Citrus tool from scratch

Because of the C3 dependency, Citrus requires exactly LLVM 4.0 and a corresponding static Clang library (libclang.a + headers). You may need to build Clang from source for this (sorry). The stable C API of clang is not sufficient for the task, so Citrus has to use a fragile C++ clang API, which is not guaranteed to be compatible with anything.

Build LLVM 4 and static Clang from source. See more detailed build instructions. Set variables to LLVM and Clang locations:

# Must have 'libclang.a'
export LIBCLANG_STATIC_PATH=…/clang/build/lib/

# Path straight to the 'llvm-config' executable
export LLVM_CONFIG_PATH=…/llvm/bin/llvm-config

# Should contain 'clang' and 'clang-c' sub-directories
export LIBCLANG_INCLUDE_PATH=…/clang/include
cargo build

Dependencies

~5MB
~102K SLoC