Crate wu_manber

source ·
Expand description

This crate gives an implementation of Wu and Manber’s algorithm for finding one of several strings (which we will call “needles”) in a much larger string (the “haystack”). This is not to be confused with Wu and Manber’s algorithm for fuzzy matching.

The Wu-Manber algorithm is very efficient when all of the strings to be matched are long. It requires a pre-processing step with a fair amount of memory overhead – currently about 32kb in this implementation, but future improvements may reduce that when there are not too many needles.

This implementation supports a maximum of 65536 needles, each of which can be at most 65536 bytes long. These requirements may be relaxed in the future.

§Example

use wu_manber::{Match, TwoByteWM};
let needles = vec!["quick", "brown", "lazy", "wombat"];
let haystack = "The quick brown fox jumps over the lazy dog.";
let searcher = TwoByteWM::new(&needles);
let mat = searcher.find(haystack).next().unwrap();
assert_eq!(mat, Match { start: 4, end: 9, pat_idx: 0 });

Structs§

  • TwoByteWM stores the precomputed tables needed for a two-byte-wide implementation of the Wu-Manber algorithm.