
macro no-std cuneiform

Cache optimizations for Rust, revived from the slabs of Sumer

3 unstable releases

0.1.1 Feb 22, 2020
0.1.0 Dec 29, 2019
0.0.0 Dec 27, 2019

Used in 5 crates (via cuneiform-fields)

Apache-2.0/MIT

16KB
198 lines

In-memory optimizations for Rust, revived from the slabs of Sumer.


This crate provides proc-macro attributes that improve in-memory data access times.

Cuneiform's main macro accepts the following arguments in attribute position:

  • hermetic = true|false (defaults to true for a bare #[cuneiform])
    • Hermetic mode lets cuneiform detect cache sizes on operating systems that expose an API for querying them.
    • Currently, the hermetic argument works only on Linux kernel 2.6.32 and above.
    • On any other system, it falls back to slabs.
  • slab = "board_or_architecture_name" (e.g. #[cuneiform(slab = "powerpc_mpc8xx")])
    • Slabs are either embedded system boards or other specific architectures.
    • Slab checking proceeds in stages:
      • First, it checks whether the given board/architecture exists.
      • If not, it falls back to the architectures supported by Rust.
      • If the architecture still cannot be detected, it falls back to default values.
  • force = u8 (e.g. #[cuneiform(force = 16)])
    • Forces the given cache alignment, overriding every mechanism mentioned above.

Add cuneiform to your Cargo.toml:

[dependencies]
cuneiform = "0.1"
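
The precedence between the arguments described above can be sketched as a small resolution function. This is only an illustration of the documented order (force, then hermetic detection, then slab, then architecture, then defaults), not the crate's actual code: `Args`, `resolve_alignment`, `slab_alignment`, `arch_alignment`, `DEFAULT_ALIGNMENT` and every numeric value in the tables are hypothetical placeholders.

```rust
/// Hypothetical mirror of the attribute arguments; not the crate's own types.
pub struct Args<'a> {
    pub hermetic: bool,
    pub slab: Option<&'a str>,
    pub force: Option<u8>,
}

/// Placeholder default; the crate's real default value is an assumption here.
pub const DEFAULT_ALIGNMENT: usize = 64;

/// Illustrative slab table with a single entry; the value is a placeholder.
fn slab_alignment(name: &str) -> Option<usize> {
    match name {
        "powerpc_mpc8xx" => Some(16),
        _ => None,
    }
}

/// Illustrative per-architecture table keyed on the compilation target.
fn arch_alignment() -> Option<usize> {
    if cfg!(any(target_arch = "x86_64", target_arch = "aarch64")) {
        Some(64)
    } else {
        None
    }
}

/// Resolves an alignment in the documented order. `detected` stands in for
/// whatever the operating system reported, or `None` if it is unsupported.
pub fn resolve_alignment(args: &Args, detected: Option<usize>) -> usize {
    // `force` overrides everything else.
    if let Some(forced) = args.force {
        return usize::from(forced);
    }
    // Hermetic detection is used only when enabled and available.
    if args.hermetic {
        if let Some(size) = detected {
            return size;
        }
    }
    // A known slab comes next.
    if let Some(size) = args.slab.and_then(slab_alignment) {
        return size;
    }
    // Otherwise fall back to the architecture table, then to the default.
    arch_alignment().unwrap_or(DEFAULT_ALIGNMENT)
}
```

In this sketch the caller is expected to pass `hermetic: false` when a slab is named explicitly, mirroring the note that hermetic defaults to true only for a bare #[cuneiform].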

Examples

Basic usage:

use cuneiform::cuneiform;

// Defaults to `hermetic = true`
#[cuneiform]
pub struct Varying {
    data: u8,
    data_2: u16,
}

Targeting a specific architecture:

#[cuneiform(slab = "powerpc_mpc8xx")]
pub struct SlabBased {
    data: u8,
    data_2: u16,
}

Overriding the default cache alignment:

#[cuneiform(force = 16)]
pub struct Forced {
    data: u8,
    data_2: u16,
}
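
To see what a forced alignment means for layout, the following sketch writes the alignment by hand with the standard `#[repr(align(N))]` attribute. It assumes that `#[cuneiform(force = 16)]` has an equivalent effect on the struct, which is an assumption about the macro and not something confirmed here; `ForcedByHand`, `Plain` and `forced_layout` are hypothetical names.

```rust
use std::mem::{align_of, size_of};

// Hand-written stand-in for `#[cuneiform(force = 16)]`, assuming the
// attribute amounts to a 16-byte alignment requirement on the struct.
#[repr(align(16))]
pub struct ForcedByHand {
    data: u8,
    data_2: u16,
}

// The same fields without any alignment attribute, for comparison.
pub struct Plain {
    data: u8,
    data_2: u16,
}

/// Returns (alignment, size) of the aligned struct in bytes.
pub fn forced_layout() -> (usize, usize) {
    (align_of::<ForcedByHand>(), size_of::<ForcedByHand>())
}
```

The size of a type is always a multiple of its alignment, so the aligned struct occupies a full 16 bytes even though its fields need far fewer. That padding is the cost paid for keeping each instance on its own aligned boundary.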

Field level cache optimizations

Check out cuneiform-fields for field level optimizations.

Design choices

  • Cuneiform contains no instruction-specific or architecture-specific code.
  • Works with #![no_std] crates and eases the pain of cache optimizations. With an allocator, you can compile on the board too.
  • Not based on assumptions: alignment values come from the Linux tree, OS checks, provider manuals and related documentation.
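
As an illustration of an instruction-free "OS check", a build-time tool on Linux can read the cache line size from sysfs instead of executing CPU-specific instructions. The sketch below is one way to do that and is not taken from the crate's source; `LINE_SIZE_PATH`, `parse_line_size` and `detect_line_size` are hypothetical names. It uses std, which is acceptable for code that runs on the build host even when the target crate itself is #![no_std].

```rust
use std::fs;

/// sysfs file exposing the coherency line size of the first cache of CPU 0.
const LINE_SIZE_PATH: &str =
    "/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size";

/// Parses the file contents, accepting only power-of-two sizes because
/// Rust alignments must be powers of two.
pub fn parse_line_size(raw: &str) -> Option<usize> {
    raw.trim()
        .parse::<usize>()
        .ok()
        .filter(|size| size.is_power_of_two())
}

/// Returns the detected line size, or `None` when the file is missing or
/// malformed, so that the caller can fall back to another source.
pub fn detect_line_size() -> Option<usize> {
    fs::read_to_string(LINE_SIZE_PATH)
        .ok()
        .and_then(|raw| parse_line_size(&raw))
}
```

Returning `Option` rather than panicking keeps the fallback decision with the caller, which matches the fall-back-to-slabs behavior described for unsupported systems.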

Before opening a PR

  • If the target is a big.LITTLE architecture, put the two parts in separate slabs and apply the naming conventions.
  • Check the existing slabs before opening a PR, and update the list when you add one.
  • If you come up with an instructionless detection method for hermetic alignment, make sure tests are included and existing platforms are not broken.

Existing Slabs

  • powerpc_mpc8xx
  • powerpc64bridge
  • powerpc_e500mc
  • power_7
  • power_8
  • power_9
  • exynos_big
  • exynos_LITTLE
  • krait
  • neoverse_n1
