#syscalls #abi #operating-system #kernel #interface #define #calls

no-std syscall_encode

Traits and macros to help define a syscall interface for a kernel

5 releases

0.1.11 Aug 23, 2023
0.1.10 Aug 14, 2023

#563 in Encoding


64 downloads per month

Custom license

60KB
1.5K SLoC

Syscall Encode

So, the way we do function calls is pretty nice. In Rust, roughly,

fn some_functionality(some_arg: ArgType, ...) -> ReturnType...;

System calls, on the other hand, don't look like this. They are usually structured more like stuffing a bunch of things into some registers and then issuing a "jumpy" instruction. Of course, under the hood, normal function calls are "no different", but the thing is that the language doesn't provide a nice abstraction for syscalls like it does for normal functions.

Instead, we rely on the standard library (which may in turn rely on libc) or on other crates to issue syscalls, partly because they are such a pain to write. Fortunately, most people are writing code for, like, a real operating system that has real libraries. But... what if we are the ones writing the operating system? Wouldn't it be nice if we had a simple way to define syscall arguments, auto-encode them into registers, derive the syscall table semi-automatically, and just get away from the tedium of manually implementing syscall types?

That's this crate.

Here, a syscall is not a function but a collection of things:

  • A type that defines the arguments.
  • A type that defines the errors.
  • A type that defines the success return values.
  • A number.
  • A kernel side receiver.
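To make the list above concrete, here is a minimal sketch of how those five pieces might fit together. Every name in it (ReadClock, ClockValue, ClockError, the Syscall trait, handle_read_clock) is hypothetical and chosen for illustration; it is not this crate's actual API, which is built around SyscallApi and friends.

```rust
// Hypothetical sketch: none of these names come from the crate itself.

/// The type that defines the arguments.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ReadClock {
    pub clock_id: u32,
}

/// The type that defines the success return value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ClockValue {
    pub nanos: u64,
}

/// The type that defines the errors.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ClockError {
    NoSuchClock,
}

/// Ties an argument type to its number and its success and error types.
pub trait Syscall {
    const NUM: u64;
    type Ok;
    type Err;
}

impl Syscall for ReadClock {
    const NUM: u64 = 7; // arbitrary number picked for the example
    type Ok = ClockValue;
    type Err = ClockError;
}

/// The kernel-side receiver: takes the decoded arguments, returns a typed result.
pub fn handle_read_clock(call: ReadClock) -> Result<ClockValue, ClockError> {
    match call.clock_id {
        0 => Ok(ClockValue { nanos: 1_000 }),
        _ => Err(ClockError::NoSuchClock),
    }
}
```

The point of the shape is that the argument type carries everything else with it, so a caller who holds a ReadClock already knows which number to use and which result types to expect.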

That syscall can then be issued against some ABI that implements the SyscallAbi trait. On the kernel side, we define a handler that catches incoming syscalls and passes them to the code generated by the syscall_api macro.
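The split between an ABI on the user side and a handler on the kernel side can be sketched roughly as follows. This is an assumption-laden illustration: Abi, RawCall, RawReturn, LoopbackAbi, and kernel_dispatch are made-up names, the register layout is invented, and the real SyscallAbi trait and the code generated by syscall_api will look different.

```rust
// Hypothetical sketch: names and register layout are invented for illustration.

/// What one syscall carries across the boundary: a number and two argument words.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RawCall {
    pub num: u64,
    pub args: [u64; 2],
}

/// What comes back: a status word (0 means success) and a value word.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RawReturn {
    pub status: u64,
    pub value: u64,
}

/// User side: anything that can carry a raw call to the kernel.
pub trait Abi {
    fn issue(&self, call: RawCall) -> RawReturn;
}

/// Kernel side: catch the incoming call and route it by syscall number.
pub fn kernel_dispatch(call: RawCall) -> RawReturn {
    match call.num {
        // Syscall 1 in this toy table adds its two arguments.
        1 => RawReturn { status: 0, value: call.args[0].wrapping_add(call.args[1]) },
        // Anything else is an unknown syscall number.
        _ => RawReturn { status: 1, value: 0 },
    }
}

/// A test ABI that calls the dispatcher directly instead of trapping.
pub struct LoopbackAbi;

impl Abi for LoopbackAbi {
    fn issue(&self, call: RawCall) -> RawReturn {
        kernel_dispatch(call)
    }
}
```

A real ABI implementation would replace the body of issue with the architecture's trap instruction; a loopback like this one is only useful for exercising encoding and dispatch without entering a kernel.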

There are two advantages to defining syscalls in terms of types instead of functions. The first is that we can use the type system of a competent language to constrain the behavior and use of a syscall type, and that happens to be the second advantage as well.

For example, a common pattern in Rust is to ensure that your code is correct by construction. In this case, by limiting how a struct can even be created, you limit your API consumers' ability to do the Wrong Thing. If we have a syscall, Foo, which can only be created as a result of calling syscall Bar, we have just ensured that (without unsafe) the user cannot issue a call to Foo without first calling Bar. Now, of course the kernel needs to be a little more careful than just blindly assuming that. But it helps userspace code avoid some classes of bugs.
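The Foo-after-Bar idea can be sketched with nothing more than a private field. The module name, the token field, and the issue methods below are all hypothetical stand-ins; no real syscall is made.

```rust
mod syscalls {
    /// Stand-in for the `Bar` syscall from the text.
    pub struct Bar;

    /// Stand-in for `Foo`. Its field is private, so code outside this
    /// module cannot build a `Foo` with a struct literal.
    #[derive(Debug)]
    pub struct Foo {
        token: u64,
    }

    impl Bar {
        /// Pretends to issue `Bar`. This is the only way to obtain a `Foo`.
        pub fn issue(self) -> Foo {
            Foo { token: 42 } // placeholder value a real kernel would hand back
        }
    }

    impl Foo {
        /// Pretends to issue `Foo`, consuming the value so it cannot be reused.
        pub fn issue(self) -> u64 {
            self.token
        }
    }
}

/// Userspace flow: `Bar` must come first, because nothing else yields a `Foo`.
pub fn demo() -> u64 {
    // Writing `syscalls::Foo { token: 1 }` here would not compile:
    // the `token` field is private to the `syscalls` module.
    let foo = syscalls::Bar.issue();
    foo.issue()
}
```

As the text says, this only protects well-behaved userspace code. The kernel still has to validate whatever arrives, since a caller can always bypass the types with unsafe or hand-written assembly.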

Is it safe?

Well. It passes Miri with strict provenance. At least, in the test harness, which doesn't make actual syscalls. It does use unsafe, but each occurrence is documented. It's for syscalls, you gotta expect a little unsafe.

How fast is it?

We provide two ways of defining a syscall. One is the "normal" API (implementing the SyscallApi trait), which works for any type that implements the SyscallEncodable trait (which can be derived). The other is the "fast" API (implementing the SyscallFastApi trait), which requires the type to implement a number of Into and From conversions before it can be used as a syscall.

Use the SyscallFastApi trait if the syscall in question is performance-critical and can easily fit within the registers as defined by the ABI. Use SyscallApi otherwise (it's more ergonomic and easier to implement, but not as fast).
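As a rough picture of what "fits within the registers" means for the fast path, the sketch below converts a small argument struct to and from a register file with plain From impls. Regs and MapPage are hypothetical, and the four-word register file is an assumption; the actual conversions SyscallFastApi requires are defined by the crate and may differ.

```rust
// Hypothetical sketch: `Regs` and `MapPage` are invented for illustration.

/// A made-up register file of four 64-bit words.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Regs(pub [u64; 4]);

/// Arguments small enough to travel entirely in registers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MapPage {
    pub addr: u64,
    pub len: u64,
    pub writable: bool,
}

/// Encode: one field per register, no buffers and no general-purpose encoder.
impl From<MapPage> for Regs {
    fn from(call: MapPage) -> Self {
        Regs([call.addr, call.len, call.writable as u64, 0])
    }
}

/// Decode: read the fields back out of the same slots.
impl From<Regs> for MapPage {
    fn from(regs: Regs) -> Self {
        MapPage {
            addr: regs.0[0],
            len: regs.0[1],
            writable: regs.0[2] != 0,
        }
    }
}
```

Conversions like these compile down to a handful of moves, which is consistent with the gap between the two benchmark figures. The trade-off is that each type has to spell out its own layout by hand.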

Bench!

On my Mac M2:

     Running benches/encode.rs (target/release/deps/encode-913d3832e9f8a495)
encode_normal           time:   [175.53 ns 176.04 ns 176.59 ns]
                        change: [+0.2809% +0.6424% +1.0522%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

encode_fast             time:   [2.3385 ns 2.3446 ns 2.3514 ns]
                        change: [+0.6778% +1.0142% +1.3560%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

Dependencies

~0.3–1MB
~22K SLoC