#diff #lcs #diffn #ndiff

no-std multidiff

Diff an arbitary number of inputs

1 unstable release

0.0.0 Dec 23, 2023

#15 in #lcs

Zlib OR Apache-2.0 OR MIT

5KB
54 lines

syn-merge

Latest version Documentation CI

Merge syn structures by adding cfgs.

Thoughts about diffing

Should operate directly on Rust source code (not on some custom AST).

Able to operate on multiple files (in contrast to most diffing out there).

Longest common subsequence algorithm?

How do we handle ties? Some kind of weighting?

Basically:

let contents = vec![...];
let files = contents.iter().map(syn::parse_file).flatten()?;
let merged = syn_merge::merge(files)?;
file.write(prettyplease::unparse(merged))?;

Literature and prior work

The similar crate implements various diffing algorithms.

difftastic/src/diff also has a few algorithms.

The wu-diff crate, which implements Wu, Sun; Manber, Udi; Myers, Gene (1989). "An O(NP) Sequence Comparison Algorithm".

Hunt, James W; Szymanski, Thomas G. (1977). "A fast algorithm for computing longest common subsequences".

Department of Mathematics and Computer Science. University of Southern Denmark (January 12, 2017). "The Hunt-Szymanski Algorithm for LCS".


However, there is very little research on diffing between multiple files.

Khanna, Sanjeev & Kunal, Keshav & Pierce, Benjamin. (2007). A Formal Investigation of Diff3.

"Generalized Longest Common Subsequence / LCS"


lib.rs:

Diff an arbitary number of inputs

Fundamentally, it works by constructing an input matrix like:

Key Data
0 aab
1 baa
2 bc
3 abc

And producing an output matrix like (may vary depending on algorithm details):

Char Appears In
a 0, 3
a 0
b 0, 1, 2, 3
a 1
a 1
c 2, 3

Note how this can encode the same information as a diff between just two elements:

Key Data
old aabc
new baac
Char Appears In
b new
a old, new
a old, new
b old
c old, new
+b
 a
 a
-b
 c

Dependencies

~185KB