#parse #data #handling #untrusted

no-std dangerous

Safely and explicitly parse untrusted / dangerous data

5 releases (3 breaking)

new 0.3.0 Oct 27, 2020
0.2.0 Oct 15, 2020
0.1.1 Oct 8, 2020
0.1.0 Oct 8, 2020
0.0.0 Aug 30, 2020

#45 in Parser tooling

27 downloads per month

MIT license

155KB
3.5K SLoC


rust-dangerous

A Rust library for safely and explicitly handling untrusted (aka dangerous) data.
Documentation hosted on docs.rs.

dangerous = "0.3"

Goals

  • Fast parsing.
  • Fast to compile.
  • Zero panics [1].
  • Zero-cost abstractions.
  • Minimal dependencies [2].
  • Retry/stream protocol support.
  • no-std / suitable for embedded.
  • Zero heap-allocations on success paths [3].
  • Primitive type support.
  • Optional verbose errors.

[1] Panics due to OOM are out-of-scope. Disable heap-allocations if this is a concern.
[2] Zero dependencies when both unicode and bytecount features are disabled.
[3] Zero heap-allocations when the full-context feature is disabled.

This library aims to provide a simple interface for explicitly and safely parsing untrusted data. dangerous really shines when parsing binary or simple text data formats and protocols. It is not a deserialisation library like serde, but you could write a parser with dangerous and use it within a deserialiser.

Panics and unhandled/unacknowledged data are two footguns this library seeks to prevent. An optional, but solid, debugging interface with sane input formatting and helpful errors is included to weed out problems before or after they arise in production.

Usage

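// Assumes these items are exported at the crate root.
use dangerous::{Error, Invalid, Reader};

// Assumed shape of `Message`, which this snippet otherwise leaves undefined.
struct Message<'i> {
    body: &'i str,
}

fn decode_message<'i, E>(r: &mut Reader<'i, E>) -> Result<Message<'i>, E>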
where
    E: Error<'i>,
{
    r.context("message", |r| {
        // Expect version 1
        r.context("version", |r| r.consume_u8(0x01))?;
        // Read the body length
        let body_len = r.context("body len", |r| r.read_u8())?;
        // Take the body input
        let body = r.context("body", |r| {
            let body_input = r.take(body_len as usize)?;
            // Decode the body input as a UTF-8 str
            body_input.to_dangerous_str::<E>()
        })?;
        // We did it!
        Ok(Message { body })
    })
}

let input = dangerous::input(/* data */);
let result: Result<_, Invalid> = input.read_all(decode_message);
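
For comparison, here is a minimal sketch of the same wire format (a version byte of 0x01, a one-byte body length, then a UTF-8 body) decoded with only the standard library. It does not use dangerous at all: `DecodeError` and this `Message` are hypothetical names invented for illustration, and rejecting trailing bytes assumes that `read_all` requires the whole input to be consumed.

```rust
/// Hypothetical message type: the body borrows from the input buffer.
#[derive(Debug, PartialEq)]
pub struct Message<'i> {
    pub body: &'i str,
}

/// Hypothetical error type for this sketch (not part of `dangerous`).
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    /// Input ended before the named field was complete.
    UnexpectedEnd(&'static str),
    /// The version byte was not 0x01.
    BadVersion(u8),
    /// The body was not valid UTF-8.
    BadUtf8,
    /// Bytes were left over after the message.
    TrailingBytes(usize),
}

pub fn decode_message(input: &[u8]) -> Result<Message<'_>, DecodeError> {
    // Expect version 1.
    let (&version, rest) = input
        .split_first()
        .ok_or(DecodeError::UnexpectedEnd("version"))?;
    if version != 0x01 {
        return Err(DecodeError::BadVersion(version));
    }
    // Read the body length.
    let (&body_len, rest) = rest
        .split_first()
        .ok_or(DecodeError::UnexpectedEnd("body len"))?;
    let body_len = usize::from(body_len);
    // Take the body; `get` returns `None` instead of panicking when out of range.
    let body_bytes = rest
        .get(..body_len)
        .ok_or(DecodeError::UnexpectedEnd("body"))?;
    // Decode the body as a UTF-8 str.
    let body = core::str::from_utf8(body_bytes).map_err(|_| DecodeError::BadUtf8)?;
    // Refuse to silently ignore unconsumed input.
    let trailing = rest.len() - body_len;
    if trailing != 0 {
        return Err(DecodeError::TrailingBytes(trailing));
    }
    Ok(Message { body })
}
```

Every step has to check bounds by hand and name its own failure, and nothing records where in the input the failure happened. That bookkeeping is what the `context` calls in the version above are there to carry.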

Errors

Custom errors for protocols often provide little context about why and where a specific problem occurs within the input. Passing down errors as simple as core::str::Utf8Error may be enough to debug during development, but when such errors are written to logs without the input or context, they often amount to noise. At that stage you are almost better off with a simple input error.

This problem is amplified in any trivial recursive-descent parser, because the context around a sub-slice is lost, rendering any error offsets useless once they are passed back up to the root. dangerous fixes this by capturing the context around and above the error.

Ever tried working backwards from something like this?

[ERRO]: ahhh!: Utf8Error { valid_up_to: 2, error_len: Some(1) }

Wouldn't it be better if this was the alternative?

[ERRO]: ahhh!: error attempting to convert input to str: expected utf-8 code point
> ['h' 'e' ff 'l' 'o']
           ^^
additional:
  error offset: 2, input length: 5
backtrace:
  1. `read all`
  2. `read` (expected message)
  3. `read` (expected body)
  4. `convert input to str` (expected utf-8 code point)
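
The idea of capturing context around and above an error can be sketched with the standard library alone. This is not how dangerous is implemented: `Cursor`, `ContextError`, and `decode_body` are hypothetical names. The cursor keeps the root slice and an absolute position, so offsets stay meaningful at the root, and each enclosing context appends its name as the error travels back up.

```rust
/// Hypothetical error: an offset into the *root* input plus the contexts
/// that were active when it occurred (innermost first).
#[derive(Debug, PartialEq)]
pub struct ContextError {
    pub offset: usize,
    pub expected: &'static str,
    pub backtrace: Vec<&'static str>,
}

fn error(offset: usize, expected: &'static str) -> ContextError {
    // `Vec::new` does not allocate; allocation only happens on the error path.
    ContextError { offset, expected, backtrace: Vec::new() }
}

/// Hypothetical cursor that never hands out detached sub-slices for parsing,
/// so positions are always relative to the root input.
pub struct Cursor<'i> {
    root: &'i [u8],
    pos: usize,
}

impl<'i> Cursor<'i> {
    pub fn new(root: &'i [u8]) -> Self {
        Cursor { root, pos: 0 }
    }

    /// Run `f`, tagging any error it returns with `name`.
    pub fn context<T>(
        &mut self,
        name: &'static str,
        f: impl FnOnce(&mut Self) -> Result<T, ContextError>,
    ) -> Result<T, ContextError> {
        f(self).map_err(|mut err| {
            err.backtrace.push(name);
            err
        })
    }

    pub fn read_u8(&mut self) -> Result<u8, ContextError> {
        let pos = self.pos;
        let byte = *self.root.get(pos).ok_or_else(|| error(pos, "a byte"))?;
        self.pos = pos + 1;
        Ok(byte)
    }

    /// Take `len` bytes as UTF-8, reporting failures at their root offset.
    pub fn take_str(&mut self, len: usize) -> Result<&'i str, ContextError> {
        let root = self.root;
        let start = self.pos;
        let end = start.saturating_add(len);
        let bytes = root
            .get(start..end)
            .ok_or_else(|| error(root.len(), "more input"))?;
        let text = std::str::from_utf8(bytes)
            .map_err(|e| error(start + e.valid_up_to(), "utf-8 code point"))?;
        self.pos = end;
        Ok(text)
    }
}

/// Decode a one-byte length followed by a UTF-8 body.
pub fn decode_body(input: &[u8]) -> Result<&str, ContextError> {
    let mut cursor = Cursor::new(input);
    cursor.context("message", |r| {
        let len = r.context("body len", |r| r.read_u8())?;
        r.context("body", |r| r.take_str(usize::from(len)))
    })
}
```

Given the bytes `05 'h' 'e' ff 'l' 'o'`, `decode_body` fails with an offset of 3 (one length byte plus two valid body bytes) and a backtrace of `body` then `message`, which is enough to point at the offending byte in the original input.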

Safety

This library has an instance of unsafe required for wrapping a byte slice into the Input DST, and multiple instances required for str::from_utf8_unchecked, used in the display section module.
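
For readers unfamiliar with the pattern, wrapping a byte slice into a dynamically sized type (DST) usually looks something like the sketch below. `Bytes` is a hypothetical stand-in, not the library's `Input` type, and the library's own code may differ; the point is only to show why a single `unsafe` cast is needed.

```rust
/// Hypothetical stand-in for an `Input`-like dynamically sized type.
/// `#[repr(transparent)]` guarantees the same layout as the wrapped `[u8]`.
#[repr(transparent)]
pub struct Bytes([u8]);

impl Bytes {
    /// Reinterpret a borrowed byte slice as a borrowed `Bytes`.
    pub fn new(bytes: &[u8]) -> &Bytes {
        // SAFETY: `Bytes` is `#[repr(transparent)]` over `[u8]`, so the two
        // pointer types share layout and length metadata, and the returned
        // reference carries the same lifetime as `bytes`.
        unsafe { &*(bytes as *const [u8] as *const Bytes) }
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.0
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }
}
```

Safe Rust has no way to construct a reference to a custom unsized newtype from a slice, so the cast sits behind one small, audited constructor and everything built on top of it stays safe.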

Inspiration

This project was originally inspired by untrusted.
