2 unstable releases
Uses old Rust 2015
Version | Released
---|---
0.2.0 | Nov 26, 2017
0.1.0 | Nov 13, 2017
PEM Iterator
Iterate over PEM-encoded data.
Features
- Enables decoding PEM formatted data via iterators.
- Fast. Current benchmarks put it at about 2x-4x faster than the pem crate.
- No dependencies, no unsafe, no dynamic allocation, only requires core.
- Highly customizable encapsulation boundary parsing.
- Resilient parsing. Errors generated by the underlying stream don't lose state.
Usage
Cargo.toml:
[dependencies]
pem-iterator = "0.2"
Crate root:
extern crate pem_iterator;
Example
extern crate pem_iterator;
use pem_iterator::boundary::{BoundaryType, BoundaryParser, LabelMatcher};
use pem_iterator::body::Single;
const SAMPLE: &'static str = "-----BEGIN RSA PRIVATE KEY-----
MIIBPQIBAAJBAOsfi5AGYhdRs/x6q5H7kScxA0Kzzqe6WI6gf6+tc6IvKQJo5rQc
dWWSQ0nRGt2hOPDO+35NKhQEjBQxPh/v7n0CAwEAAQJBAOGaBAyuw0ICyENy5NsO
2gkT00AWTSzM9Zns0HedY31yEabkuFvrMCHjscEF7u3Y6PB7An3IzooBHchsFDei
AAECIQD/JahddzR5K3A6rzTidmAf1PBtqi7296EnWv8WvpfAAQIhAOvowIXZI4Un
DXjgZ9ekuUjZN+GUQRAVlkEEohGLVy59AiEA90VtqDdQuWWpvJX0cM08V10tLXrT
TTGsEtITid1ogAECIQDAaFl90ZgS5cMrL3wCeatVKzVUmuJmB/VAmlLFFGzK0QIh
ANJGc7AFk4fyFD/OezhwGHbWmo/S+bfeAiIh2Ss2FxKJ
-----END RSA PRIVATE KEY-----";
fn main() {
    let mut input = SAMPLE.chars().enumerate();

    // Parse the begin boundary, accumulating the label into a buffer
    let mut label_buf = String::new();
    {
        let mut parser = BoundaryParser::from_chars(BoundaryType::Begin, &mut input, &mut label_buf);
        assert_eq!(parser.next(), None);
        assert_eq!(parser.complete(), Ok(()));
    }
    println!("PEM label: {}", label_buf);

    // Parse the body
    let data: Result<Vec<u8>, _> = Single::from_chars(&mut input).collect();
    let data = data.unwrap();

    // Verify the end boundary has the same label as the begin boundary
    {
        let mut parser = BoundaryParser::from_chars(BoundaryType::End, &mut input, LabelMatcher(label_buf.chars()));
        assert_eq!(parser.next(), None);
        assert_eq!(parser.complete(), Ok(()));
    }
    println!("data: {:?}", data);
}
BoundaryParser and Label
The first task in parsing PEM formatted data is parsing the BEGIN boundary. Enter BoundaryParser. This iterator type takes three parameters to construct:
- An enum value for BEGIN vs END.
- The stream to get characters from.
- An object to deal with the label.
That third parameter holds a lot of power. As the parser encounters the label, it notifies this parameter of each character via the Label trait. This enables a bunch of different behaviors, such as:
- Accumulating the characters into a buffer (e.g. &mut String)
- Matching against known characters (e.g. LabelMatcher("CERTIFICATE".chars()))
- Discarding the characters completely (e.g. DiscardLabel)
In addition to a simple Mismatch error, this label processing also has the option to return custom, complex errors, which makes it versatile and extensible.
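To make the idea of a per-character label handler concrete, here is a minimal, self-contained sketch of the three strategies listed above. It does not use the crate: LabelSink, Matcher, Discard, feed_label and this Mismatch type are hypothetical stand-ins invented for illustration, and the crate's actual Label trait may have a different shape.

```rust
/// Hypothetical error type for this sketch; the crate's own errors may differ.
#[derive(Debug, PartialEq)]
pub struct Mismatch;

/// Hypothetical stand-in for the crate's `Label` trait: a parser would call
/// `accept` once per label character and `finish` when the label ends.
pub trait LabelSink {
    fn accept(&mut self, c: char) -> Result<(), Mismatch>;
    fn finish(&mut self) -> Result<(), Mismatch> {
        Ok(())
    }
}

/// Strategy 1: accumulate the label into a buffer.
impl LabelSink for String {
    fn accept(&mut self, c: char) -> Result<(), Mismatch> {
        self.push(c);
        Ok(())
    }
}

/// Strategy 2: match the label against a known sequence of characters.
pub struct Matcher<I>(pub I);

impl<I: Iterator<Item = char>> LabelSink for Matcher<I> {
    fn accept(&mut self, c: char) -> Result<(), Mismatch> {
        match self.0.next() {
            Some(expected) if expected == c => Ok(()),
            _ => Err(Mismatch),
        }
    }

    fn finish(&mut self) -> Result<(), Mismatch> {
        // The expected label must have been consumed completely.
        match self.0.next() {
            None => Ok(()),
            Some(_) => Err(Mismatch),
        }
    }
}

/// Strategy 3: ignore the label entirely.
pub struct Discard;

impl LabelSink for Discard {
    fn accept(&mut self, _c: char) -> Result<(), Mismatch> {
        Ok(())
    }
}

/// Toy driver standing in for a boundary parser: it feeds every character
/// of `label` to the sink and then signals the end of the label.
pub fn feed_label<L: LabelSink>(label: &str, sink: &mut L) -> Result<(), Mismatch> {
    for c in label.chars() {
        sink.accept(c)?;
    }
    sink.finish()
}
```

Because the driver only talks to the trait, swapping `&mut String::new()` for `&mut Matcher("CERTIFICATE".chars())` or `&mut Discard` changes the strictness without touching the parsing logic.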
Since parsing the BEGIN label is totally separate from parsing END, one can mix and match strategies to customize the level of strictness (e.g. BEGIN and END can have different labels).
Chunked vs Single
For parsing the body this crate provides 2 iterators, Chunked and Single. The basic difference is Chunked emits 3 bytes of output at a time (corresponding to 4 characters of input), while Single emits only 1 byte at a time.
There may be some performance differences between the two, but presently they seem nearly identical. Originally there was more of a distinction between the two and a trade-off in performance vs functionality, but at this point, the difference is largely an ergonomic one.
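The 4-characters-to-3-bytes ratio comes from base64 itself: each character carries 6 bits, so four characters carry 24 bits, or 3 bytes. The sketch below decodes one full group by hand. The names sextet and decode_group are hypothetical and are not part of this crate, and padding (=) is deliberately left out.

```rust
/// Map one base64 character (standard alphabet) to its 6-bit value.
fn sextet(c: char) -> Option<u32> {
    match c {
        'A'..='Z' => Some(c as u32 - 'A' as u32),
        'a'..='z' => Some(c as u32 - 'a' as u32 + 26),
        '0'..='9' => Some(c as u32 - '0' as u32 + 52),
        '+' => Some(62),
        '/' => Some(63),
        _ => None,
    }
}

/// Decode one full group of 4 base64 characters into 3 bytes.
/// Returns `None` for any character outside the alphabet, including `=`.
pub fn decode_group(group: [char; 4]) -> Option<[u8; 3]> {
    let mut acc = 0u32;
    for &c in group.iter() {
        // Shift in 6 bits per character, 24 bits in total.
        acc = (acc << 6) | sextet(c)?;
    }
    Some([(acc >> 16) as u8, (acc >> 8) as u8, acc as u8])
}
```

For instance, `decode_group(['T', 'W', 'F', 'u'])` yields the bytes of "Man". A chunk-at-a-time iterator would hand out such a 3-byte array per step, while a byte-at-a-time iterator would hand out its elements one by one.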
Resilient parsing
The major types of this crate (BoundaryParser, Chunked, and Single) are all iterators. It's obvious why the body parsers are iterators: they need to iterate over the bytes of output. But why is BoundaryParser?
Basically, it makes parsing more resilient. If the underlying stream emits an error, it can be forwarded to the caller and dealt with without losing parsing state. Is this useful? Probably not. Most of the time you'd just want to fail if the stream errors. But it is kind of neat.
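As an illustration of the pattern, here is a simplified parser written as an iterator over stream errors. It is a hypothetical sketch and not the crate's BoundaryParser: PrefixParser and this Mismatch type are invented for the example, and it only checks that a fallible character stream starts with a fixed prefix.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub struct Mismatch;

/// Hypothetical, simplified parser: checks that a fallible character stream
/// starts with `expected`. Iterating it yields the stream's own errors.
pub struct PrefixParser<'a, S> {
    stream: S,
    expected: std::str::Chars<'a>,
    failed: bool,
}

impl<'a, S> PrefixParser<'a, S> {
    pub fn new(stream: S, expected: &'a str) -> Self {
        PrefixParser {
            stream: stream,
            expected: expected.chars(),
            failed: false,
        }
    }

    /// `Ok` once the whole prefix has been matched, `Err` otherwise.
    pub fn complete(&self) -> Result<(), Mismatch> {
        if !self.failed && self.expected.as_str().is_empty() {
            Ok(())
        } else {
            Err(Mismatch)
        }
    }
}

impl<'a, S, E> Iterator for PrefixParser<'a, S>
where
    S: Iterator<Item = Result<char, E>>,
{
    type Item = E;

    fn next(&mut self) -> Option<E> {
        while !self.failed {
            // Peek at the next expected character without consuming it.
            let want = match self.expected.clone().next() {
                Some(c) => c,
                None => return None, // prefix fully matched
            };
            match self.stream.next() {
                // Stream error: forward it; the match position is untouched.
                Some(Err(e)) => return Some(e),
                Some(Ok(c)) if c == want => {
                    self.expected.next();
                }
                // Wrong character or premature end of input.
                _ => self.failed = true,
            }
        }
        None
    }
}
```

If the stream yields an error midway, `next()` returns it and a later call to `next()` resumes matching where it left off. When `next()` returns `None`, parsing has stopped and `complete()` reports whether the prefix matched.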