#algorithm #weight #kondrak-aline #aline #c-vwl

bin+lib kondrak-aline

Implementation of Kondrak's ALINE alignment algorithm

2 releases

Uses new Rust 2024

new 0.1.3 May 19, 2025
0.1.2-1 May 19, 2025
0.1.1 May 18, 2025
0.1.0 May 18, 2025

#12 in #weight


97 downloads per month

MIT license

16KB
372 lines

Aline-rs

This is an ALINE implementation based on the Python NLTK implementation.


lib.rs:

ALINE https://webdocs.cs.ualberta.ca/~kondrak/ Copyright 2002 by Grzegorz Kondrak.

ALINE is an algorithm for aligning phonetic sequences, described in [1]. This module is a port of Kondrak's (2002) ALINE. It provides functions for phonetic sequence alignment and similarity analysis. These are useful in historical linguistics, sociolinguistics and synchronic phonology.

ALINE has parameters that can be tuned for desired output. These parameters are:

  • C_skip, C_sub, C_exp, C_vwl
  • Salience weights
  • Segmental features

In this implementation, some parameters differ from the default values given in [1] so that published results can be replicated. All changes are noted in comments.
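As a rough illustration of how the four cost constants enter the scoring, here is a minimal sketch of the skip, substitution and expansion scores as they are usually stated for ALINE. The `Costs` struct, the function names, and the idea of passing the feature distance in as a plain number are all hypothetical and are not this crate's API. The real feature-weighted distance is driven by the salience weights and segmental features and is not reproduced here.

```rust
/// Hypothetical container for the four tunable cost constants.
/// No default values are given; the crate's own values live in its source.
struct Costs {
    c_skip: f64,
    c_sub: f64,
    c_exp: f64,
    c_vwl: f64,
}

/// Vowel penalty V(p): zero for consonants, C_vwl for vowels.
fn vowel_penalty(c: &Costs, is_vowel: bool) -> f64 {
    if is_vowel { c.c_vwl } else { 0.0 }
}

/// Score for skipping a segment (insertion or deletion).
fn sigma_skip(c: &Costs) -> f64 {
    c.c_skip
}

/// Score for substituting p with q, given their feature distance `delta_pq`
/// (assumed to be computed elsewhere from salience-weighted features).
fn sigma_sub(c: &Costs, delta_pq: f64, p_is_vowel: bool, q_is_vowel: bool) -> f64 {
    c.c_sub - delta_pq - vowel_penalty(c, p_is_vowel) - vowel_penalty(c, q_is_vowel)
}

/// Score for expanding p into the two segments q1 q2.
fn sigma_exp(
    c: &Costs,
    delta_pq1: f64,
    delta_pq2: f64,
    p_is_vowel: bool,
    q1_is_vowel: bool,
    q2_is_vowel: bool,
) -> f64 {
    let v_q = vowel_penalty(c, q1_is_vowel).max(vowel_penalty(c, q2_is_vowel));
    c.c_exp - delta_pq1 - delta_pq2 - vowel_penalty(c, p_is_vowel) - v_q
}
```

Keeping the constants in one struct makes it easy to compare a tuned parameter set against the defaults from [1] without touching the scoring functions.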

Get optimal alignment of two phonetic sequences

use aline::align;

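// The third argument is 0.0 here. In the NLTK version this position holds the
// epsilon threshold, where 0 keeps only optimal alignments; that it means the
// same in this port is an assumption.
let alignment = align("θin", "tenwis", 0.0);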

assert_eq!(
    alignment,
    vec![
        vec![
            ("θ", "t"),
            ("i", "e"),
            ("n", "n")
        ].iter()
        .map(|(a, b)| (a.to_string(), b.to_string()))
        .collect::<Vec<(String, String)>>()
    ]
);
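The assertion above implies that `align` returns a list of alignments, each a list of `(String, String)` segment pairs. Assuming that shape, a small helper like the following can render one alignment as two rows for inspection. The name `render_alignment` is hypothetical and not part of the crate.

```rust
/// Renders one alignment (a slice of segment pairs) as two space-separated rows:
/// the first sequence's segments on top, the second sequence's segments below.
fn render_alignment(pairs: &[(String, String)]) -> String {
    let top: Vec<&str> = pairs.iter().map(|(a, _)| a.as_str()).collect();
    let bottom: Vec<&str> = pairs.iter().map(|(_, b)| b.as_str()).collect();
    format!("{}\n{}", top.join(" "), bottom.join(" "))
}
```

With the result from the example, `render_alignment(&alignment[0])` would give the row `θ i n` above the row `t e n`.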

[1] G. Kondrak. Algorithms for Language Reconstruction. PhD dissertation, University of Toronto, 2002.

Dependencies

~1–2MB
~38K SLoC