2 unstable releases

| Version | Date |
|---|---|
| 0.10.2 | Oct 22, 2023 |
| 0.8.0 | Sep 11, 2017 |
| 0.6.3 | |
| 0.5.0 | |
Citrus: Convert C to Rust
This is a tool that helps convert C programs to Rust programs. It transforms C syntax to Rust syntax, but mostly ignores details of C semantics.
The generated programs may not run, and may not even compile. However, the tool produces readable source code that can be manually refactored into a Rust program.
Example
```c
void gz_compress(FILE *in, gzFile out) {
    char buf[BUFLEN];
    int len;
    int err;
    for (;;) {
        len = fread(buf, 1, sizeof(buf), in);
        if (ferror(in)) {
            perror("fread");
            exit(1);
        }
        if (len == 0) break;
        if (gzwrite(out, buf, (unsigned)len) != len) error(gzerror(out, &err));
    }
    fclose(in);
    if (gzclose(out) != Z_OK) error("failed gzclose");
}
```

```rust
#[no_mangle]
pub unsafe extern "C" fn gz_compress(mut in_: *mut FILE, mut out: gzFile) {
    let mut buf: [i8; 16384];
    let mut len;
    let mut err;
    loop {
        len = fread(buf, 1, std::mem::size_of_val(&buf), in_);
        if ferror(in_) != 0 { perror("fread"); exit(1); }
        if len == 0 { break ; }
        if gzwrite(out, buf, len as c_uint) != len {
            error(gzerror(out, &mut err));
        };
    }
    fclose(in_);
    if gzclose(out) != Z_OK { error("failed gzclose"); };
}
```
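For a sense of where the manual refactoring can end up, here is a minimal sketch of the same copy loop in idiomatic Rust. It is not Citrus output. The name `copy_chunks` is made up for illustration, the buffer size is taken from the generated code above, and the gzip part is left out on the assumption that the destination is any type implementing `std::io::Write`.

```rust
use std::io::{self, Read, Write};

/// Buffer size, matching the `[i8; 16384]` array in the generated code.
const BUFLEN: usize = 16384;

/// Hypothetical helper: copies everything from `input` to `out` in
/// fixed-size chunks and returns the number of bytes copied.
pub fn copy_chunks<R: Read, W: Write>(mut input: R, mut out: W) -> io::Result<u64> {
    let mut buf = [0u8; BUFLEN];
    let mut total = 0u64;
    loop {
        // `read` returns Ok(0) at end of input, replacing the `len == 0` check.
        // Errors are propagated with `?` instead of `perror` + `exit`.
        let len = input.read(&mut buf)?;
        if len == 0 {
            break;
        }
        // `write_all` keeps writing until the whole chunk is written,
        // replacing the `gzwrite(...) != len` check.
        out.write_all(&buf[..len])?;
        total += len as u64;
    }
    out.flush()?;
    Ok(total)
}
```

Calling it with a byte slice as the source and a `Vec<u8>` as the destination is enough to try it out, since both implement the standard I/O traits.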
Installation
See releases for binary downloads.
Requires `clang` to be installed. On macOS, this requires Xcode.
Usage
Converts one file at a time, prints to stdout:

```
citrus [<citrus options…>] <file.c> [<compiler args…>]
```

Options are:

- `--api=rust`: Allow Rust-only types in function arguments and don't export functions to C. Use this if you're porting all code at once.
- `--api=c`: Generate all function arguments for C interoperability. Use this if you're porting code function by function.

Compiler args are standard flags required to compile the C file, such as `-I<include dir>` and `-D<macro>`.

```sh
citrus program.c -I./include
citrus --api=rust program.c > program.rs
```
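To illustrate the difference between the two API styles, here is a hand-written sketch of one function in both forms. This is not actual Citrus output. The names `sum_c_api` and `sum_rust_api` are hypothetical, and the exact signatures the tool generates may differ.

```rust
use std::slice;

/// Rust-only style: a slice carries its own length and is not callable from C.
pub fn sum_rust_api(arr: &[i32]) -> i64 {
    arr.iter().map(|&x| i64::from(x)).sum()
}

/// C-interoperable style: raw pointer plus element count, C calling convention.
/// (A real export would also carry `#[no_mangle]`, as in the example above.)
///
/// # Safety
/// `arr` must point to `len` readable `i32` values, unless `len` is 0.
pub unsafe extern "C" fn sum_c_api(arr: *const i32, len: usize) -> i64 {
    if arr.is_null() || len == 0 {
        return 0;
    }
    // SAFETY: the caller guarantees `arr` is valid for `len` elements.
    let values = unsafe { slice::from_raw_parts(arr, len) };
    sum_rust_api(values)
}
```

The C-style wrapper is the kind of thing you would keep while porting function by function, and delete once every caller is Rust.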
The typical workflow is:
- Clean up the C code — bad C code makes even worse Rust code.
- Convert C syntax to Rust syntax (this one is automated by Citrus!).
- Keep rewriting C-isms into Rust-isms until it compiles (some easy translations are also done by Citrus).
- Refactor into idiomatic Rust code.
C is very weird from Rust's perspective, so the generated code will be very un-Rust-like. Please don't judge Rust by this :)
Preparing C code for conversion
- Use `size_t` for all lengths and array indexing (or `ssize_t`/`ptrdiff_t` if you need negative values). Rust is super picky about this.
- Change as much as you can to `const`: variables, function arguments, globals. In Rust things are immutable by default.
  - Don't reuse variables, e.g. instead of one `i` reused throughout a function, (re)define it for each loop.
  - If you get a "discards qualifiers in nested pointer types" error, it's not you, it's a "bug" in the C language.
- Minimize use of macros. Non-trivial macros are expanded during conversion and their high-level syntax is lost.
  - Change function-like macros to `inline` functions. For conversion you may even want to undefine `assert`, `MAX`, and `offsetof`.
  - If you use macros to generate several versions of a function (such as `func_int`, `func_float`), keep just one version with a unique typedef for the type. You'll be able to replace the typedefed name with a generic parameter later.
- Replace `int` and `long` with types of specific size, such as `<stdint.h>`'s `int32_t`, `int64_t` or `size_t`.
  - Use `bool` from `<stdbool.h>`.
  - Use `signed char` only when you really mean to use it for negative numbers.
  - `unsigned char`, `short`, `float`, `double` and `long long` can be left as-is.
- In function arguments use `arr[]` for arrays, and `*ptr` for exactly one element.
  - Prefer array indexing over pointer arithmetic (`arr[i]` yes, `ptr+i` no).
  - Bonus points for `f(size_t length, arr[static length])` (yes, it's valid C syntax).
- Add `__attribute__((nonnull))` to functions that should not be called with `NULL` arguments.
- Change `for` loops to be in the format `for(size_t i = start; i < end; i++)`.
  - If you can't, then use `while` instead (but avoid `do..while`).
- Don't use `var++` in expressions. Use `++var` or put it on its own line. Rust only allows `var += 1;`.
- Remove all `goto` and its labels.
- Remove "clever" micro-optimizations. They are really painful to convert, and most end up being not applicable.
  - Avoid tricks with pointers, unions, type casting, or reusing the same chunk of memory for different purposes.
  - Consider returning things by value instead of taking in-out pointers, e.g. if you need to return multiple things, return a small struct by value instead.
  - Make use of malloc/free clear and simple. Don't reuse/move pointers between different types of objects, because during conversion some of them may end up using libc malloc, and some Rust's `Vec`.
- Having tests helps a lot. Not only unit tests, but also a high-level test such as a known-good whole program output.
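To make a few of the guidelines above concrete, here is a rough sketch of the Rust they are steering towards. The functions `sum_range` and `count_zeros` are made up for illustration and are not Citrus output.

```rust
/// Sums `values[start..end]`, mirroring `for(size_t i = start; i < end; i++)`.
pub fn sum_range(values: &[u32], start: usize, end: usize) -> u64 {
    // Bindings are immutable unless marked `mut`, which is why
    // `const`-heavy C maps more directly onto Rust.
    let mut total: u64 = 0;
    // Slices are indexed by `usize` (C's `size_t`); an `i32` index
    // would be a compile error, not an implicit conversion.
    for i in start..end {
        total += u64::from(values[i]);
    }
    total
}

/// Counts zero elements; shows the replacement for C's `count++`.
pub fn count_zeros(values: &[u32]) -> usize {
    let mut count = 0;
    for &v in values {
        if v == 0 {
            // Rust has no `++`. `count += 1` yields `()`, so it cannot be
            // used as a value inside a larger expression like `arr[count++]`.
            count += 1;
        }
    }
    count
}
```

A C loop that already uses `size_t` and the `start`/`end` form translates into the range loop almost mechanically, which is the point of doing the cleanup on the C side first.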
Cleanup post conversion
- Add `use std::os::raw::*; use std::slice; use std::ptr;`.
- Verify code side-by-side to ensure nothing is missing or mistranslated.
- Watch out for implicit conversion to int and wrapping arithmetic.
- If you're going to port code function-by-function:
  - Make all C functions extern (i.e. remove `static`) and use `bindgen` to generate bindings for calling back to C.
  - Keep a note of which functions were really supposed to be public, because during transition everything ends up being `pub`.
- Replace fixed-size arrays with slices or `Vec`. Rust's fixed-size arrays are a PITA.
- Fix array indexing with `let foo = slice::from_raw_parts(foo_ptr, number_of_elements_not_bytes)`.
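Two of the points above are easy to get wrong, so here is a small sketch of each. The functions `max_of` and `checksum` are hypothetical examples, not something Citrus generates.

```rust
use std::slice;

/// Returns the largest element of a C-style array, or `None` if it is empty.
///
/// # Safety
/// `ptr` must be non-null, aligned, and point to `len` initialized `i32` values.
pub unsafe fn max_of(ptr: *const i32, len: usize) -> Option<i32> {
    // `len` counts elements, not bytes: for four `i32` values pass 4, not 16.
    let values = unsafe { slice::from_raw_parts(ptr, len) };
    // From here on, indexing and iteration are bounds-checked.
    values.iter().copied().max()
}

/// Sums bytes modulo 256. In C, an `unsigned char` accumulator wraps silently;
/// in Rust, `a + b` on `u8` panics on overflow in debug builds, so the
/// wrapping has to be requested explicitly.
pub fn checksum(bytes: &[u8]) -> u8 {
    bytes.iter().fold(0u8, |acc, &b| acc.wrapping_add(b))
}
```

Once a pointer has been turned into a slice at the boundary, the rest of the function can usually drop `unsafe` entirely.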
Building the Citrus tool from scratch
Because of the C3 dependency, it requires exactly LLVM 5.0 and a corresponding static Clang library (`libclang.a` + headers). You may need to build Clang from source for this (sorry). The stable C API of clang is not sufficient for the task, so Citrus has to use a fragile C++ clang API, which is not guaranteed to be compatible with anything.
Build LLVM 5 and static Clang from source. See more detailed build instructions. Set variables to LLVM and Clang locations:
```sh
# Must have 'libclang.a'
export LIBCLANG_STATIC_PATH=…/clang/build/lib/
# Path straight to the 'llvm-config' executable
export LLVM_CONFIG_PATH=…/llvm/bin/llvm-config
# Should contain 'clang' and 'clang-c' sub-directories
export LIBCLANG_INCLUDE_PATH=…/clang/include
cargo build
```