Hunt-Szymanski diff

A port of the Hunt-Szymanski diff algorithm to Elixir, with a focus on performance and memory efficiency.

This library is particularly suited for longer texts such as wiki pages or blog articles.

Credits

This library was generated with assistance from ChatGPT.
While I guided its development and reviewed the implementation, the core algorithm was recreated by AI.

Usage

left = """
Roses are red,
Violets are blue,
Sugar is sweet,
And so are you.
"""

right = """
Roses are blue,
Violets are blue,
I love fluffy clouds,
And so are you.
"""

# 1. Diff two strings

diff = HSDiff.diff(left, right)

# => 
# [
#   del: ["Roses are red,"],
#   ins: ["Roses are blue,"],
#   eq: ["Violets are blue,"],
#   del: ["Sugar is sweet,"],
#   ins: ["I love fluffy clouds,"],
#   eq: ["And so are you."],
#   eq: [""]
# ]


# 2. Optimize the diff result

optimized = HSDiff.optimize(diff)

# => 
# [
#   del: 1,
#   ins: ["Roses are blue,"],
#   eq: 1,
#   del: 1,
#   ins: ["I love fluffy clouds,"],
#   eq: 2
# ]


# 3. Patch 

restored = HSDiff.patch(left, optimized) 

# => "Roses are blue,\nViolets are blue,\nI love fluffy clouds,\nAnd so are you.\n"

Why Use Hunt-Szymanski Diff?

Elixir’s standard library already provides String.myers_difference/2 to generate diffs between strings (and lists). However:

High Memory Usage: For larger inputs (e.g., a typical article or blog post), String.myers_difference/2 can quickly use gigabytes of memory. (Note: this can be alleviated by chunking the input into smaller pieces)
Small vs Large Inputs: String.myers_difference/2 performs well for short strings or larger strings with small changes. But for large-scale diffs—such as comparing entire documents—the Hunt–Szymanski line-based diff is more memory-efficient and performs better.

In short, if you need a diff algorithm for big textual data, Hunt–Szymanski can help you avoid the heavy memory footprint that comes with other approaches.

Installation

Add hunt_szymanski_diff to your list of dependencies in mix.exs:

def deps do
  [
    {:hunt_szymanski_diff, "~> 0.1.0"}
  ]
end

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
config		config
lib		lib
test		test
.formatter.exs		.formatter.exs
.gitignore		.gitignore
.tool-versions		.tool-versions
LICENSE		LICENSE
README.md		README.md
mix.exs		mix.exs
mix.lock		mix.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Hunt-Szymanski diff

Credits

Usage

Why Use Hunt-Szymanski Diff?

Installation

About

Uh oh!

Releases

Packages

Languages

License

danielres/hunt_szymanski_diff

Folders and files

Latest commit

History

Repository files navigation

Hunt-Szymanski diff

Credits

Usage

Why Use Hunt-Szymanski Diff?

Installation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages