#normalization #archive #surt #web-archiving

bin+lib surt-rs

A Rust implementation of the Sort-friendly URI Reordering Transform (SURT)

2 releases

0.1.1 Apr 1, 2024
0.1.0 Mar 23, 2024

MIT license

245 lines


This library provides a Rust implementation for generating a Sort-friendly URI Reordering Transform (SURT) from a given URL. SURTs are predominantly used in web archiving to provide a normalized, sortable variant of a URL for use at replay time.


use surt_rs::generate_surt;

let url = "http://example.com/path?query=value#fragment";
let surt = generate_surt(url).unwrap();
println!("{}", surt);  // prints: "com,example)/path?query=value#fragment"
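To show where the `com,example)` prefix in the output above comes from, here is a minimal, standard-library-only sketch of the host reordering step that gives SURT its sort-friendly property. The helper name `reverse_host_labels` is hypothetical and is not part of surt-rs; the crate's actual implementation may differ.

```rust
/// Hypothetical helper (not part of surt-rs): reverses the dot-separated
/// labels of a host name, joins them with commas, and closes with ')'.
/// Reversing puts the most general label first, so URLs from the same
/// domain end up next to each other when sorted.
fn reverse_host_labels(host: &str) -> String {
    let mut labels: Vec<&str> = host.split('.').collect();
    labels.reverse();
    format!("{})", labels.join(","))
}
```

Under these assumptions, `reverse_host_labels("example.com")` yields `com,example)`, and a host with a subdomain such as `archive.example.org` yields `org,example,archive)`.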


generate_surt(url: &str) -> Result<String, ParseError>

Generates a SURT from the given URL. Returns a Result containing the SURT as a String if the URL is valid, or a ParseError if it is not.

normalize_surt(surt: &str) -> String

Normalizes the given SURT by replacing whitespace with '%20' and removing trailing slashes, except when the path is the root path.
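The two rules described above can be sketched as follows. This is an illustration only, assuming that "root path" means a single `/` directly after the closing `)` of the host part; the function name `normalize_surt_sketch` is hypothetical and the crate's own `normalize_surt` may handle edge cases differently.

```rust
/// Hypothetical sketch (not the crate's implementation) of the two rules:
/// 1. every whitespace character becomes "%20";
/// 2. trailing slashes are dropped, unless the slash is the root path,
///    assumed here to be a '/' immediately following ')'.
fn normalize_surt_sketch(surt: &str) -> String {
    let mut out = String::with_capacity(surt.len());
    for c in surt.chars() {
        if c.is_whitespace() {
            out.push_str("%20");
        } else {
            out.push(c);
        }
    }
    // Pop trailing slashes one at a time, stopping at the root path.
    while out.ends_with('/') && !out.ends_with(")/") {
        out.pop();
    }
    out
}
```

With this sketch, `com,example)/a b/` becomes `com,example)/a%20b`, while `com,example)/` is left unchanged.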

normalize_url(url: &str) -> String

Normalizes the given URL by removing trailing slashes and the 'www.' subdomain after the scheme.
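As a rough illustration of this kind of URL normalization, the sketch below strips trailing slashes and a leading `www.` that appears right after `scheme://`. The name `normalize_url_sketch` is hypothetical, and the behaviour for inputs without a scheme is an assumption; the crate's `normalize_url` may differ.

```rust
/// Hypothetical sketch (not the crate's implementation): removes trailing
/// slashes, then removes a "www." prefix that directly follows "://".
/// Inputs without "://" only have their trailing slashes removed.
fn normalize_url_sketch(url: &str) -> String {
    let trimmed = url.trim_end_matches('/');
    match trimmed.split_once("://") {
        Some((scheme, rest)) => {
            let rest = rest.strip_prefix("www.").unwrap_or(rest);
            format!("{}://{}", scheme, rest)
        }
        None => trimmed.to_string(),
    }
}
```

Under these assumptions, `http://www.example.com/` becomes `http://example.com`, and `https://example.com/path/` becomes `https://example.com/path`.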


This project is licensed under the MIT License.

