#rfc-5322 #parser-serializer #rfc2822 #address

addr-spec

A wicked fast UTF-8 email address parser and serializer

11 releases

0.8.1 Feb 11, 2024
0.7.1 Feb 6, 2023

#51 in Email

48 downloads per month

MIT/Apache

58KB
1.5K SLoC

addr-spec

crates.io license ci docs

A wicked fast UTF-8 email address parser and serializer. It provides

  • unopinionated, correct parsing of email addresses (defined as addr-spec in RFC 5322),
  • extremely fast serialization and deserialization using low-level memory management,
  • guarantees on the uniqueness of an email address,
  • UTF-8 support with NFC normalization as recommended in RFC 6532, and
  • format validation based on the grammar set out in Section 3.4.1, RFC 5322 and Section 3.2, RFC 6532 with position-accurate errors.

Features

This crate supports the following features:

  • normalization - This enables (NFC) normalization of addresses.
  • comments - This allows parsing (but not serialization; see Caveats) of comments.
  • literals - This allows parsing and serialization of literal domains.
  • white-spaces - This allows parsing (but not serialization; see Caveats) of whitepaces.

By default, normalization is enabled.

Caveats

Folding white spaces

Serializing folding white spaces (abbr. FWS) is not supported since it is dependent on the transport mechanism. If you need to break the address along foldable boundaries, you can use into_serialized_parts which returns serialized versions of the local part and domain.

Comments

Serializing comments is not supported since it is dependent on the transport mechanism. At the moment, comments are parsed, but skipped as there is no uniform way of handling them. If you would like comments to be stored, please file an issue with your use-case.

Alternatives

email_address

This crate provides a "newtype" EmailAddress which under the hood just validates and wraps an address string.

It does not support white spaces, comments, and UTF-8 normalization, nor does it support address normalization (e.g. "te st"@example.com is equivalent to test@example.com, but this cannot be distinguished with email_address).

Benchmarks

In scenarios supported by email_address (no comments, no white-spaces, no UTF-8/address normalization), email_address slightly outperforms addr_spec by about 5% with all features off which makes sense since email_address cannot distinguish equivalent addresses.

Migration

It's highly recommended to use only addr_spec in production since addr_spec provides guarantees on uniqueness for storage and lookup as well as other special perks (position-based errors, SMTP-style Display writer, etc.). If this is not feasible, we provide Into<EmailAddress> and Into<AddrSpec> for those coming from email_address. Note that Into<AddrSpec> is a only a right inverse of Into<EmailAddress>, i.e. AddrSpec -> EmailAddress -> AddrSpec will always yield the same AddrSpec, but EmailAddress -> AddrSpec -> EmailAddress may not yield the same EmailAddress.

Dependencies

~0–370KB
~13K SLoC