
common_substrings_rust

public
6 stars
2 forks
0 issues

Commits

List of commits on branch master.
Verified
bd703f149535136d76d3f3c9213ec39ed666f635

Update README.md

hanwencheng committed 5 years ago
Verified
a47ea6659b44aeb3716980c8dd719d11be6fc0fc

Update README.md

hanwencheng committed 5 years ago
Unverified
26c1ece4f7fa396f3939677ca01b8fc469e2e6ef

rephrase readme

hanwencheng committed 5 years ago
Unverified
b219765e80fdc85362a2ac16c587033234dbb1d8

remove unused files

hanwencheng committed 5 years ago
Unverified
fde4d42ebb8bdf88ce7110b3bd9ec4dce6336544

minors 🍺

hanwencheng committed 5 years ago
Verified
b080352ad5279e6f92d67e38cd3d8c5c80fdd0a4

Merge pull request #1 from hanwencheng/skip-short-strings

hanwencheng committed 5 years ago

README


Find all common substrings


A method for finding all common substrings in a list of strings. Check it on Crates.io.

The algorithm uses a two-dimensional trie to collect all the fragments. The vertical dimension is a standard suffix trie, but all the nodes of the last word in each suffix are also linked to one another; I call these virtual horizontal links.
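To make the vertical dimension concrete, here is a minimal sketch of a plain suffix trie shared by several input strings. It is not the crate's implementation: the names `Node`, `build_suffix_trie`, and `sources_of` are hypothetical, and the horizontal links described above are not reproduced. Each node simply records which input strings contain the path leading to it.

```rust
use std::collections::{BTreeSet, HashMap};

/// One node of a suffix trie (hypothetical type, for illustration only).
/// `sources` holds the indices of the input strings that contain the
/// path from the root to this node.
#[derive(Default)]
struct Node {
    children: HashMap<char, Node>,
    sources: BTreeSet<usize>,
}

/// Insert every suffix of every input string into one shared trie.
fn build_suffix_trie(inputs: &[&str]) -> Node {
    let mut root = Node::default();
    for (index, text) in inputs.iter().enumerate() {
        let chars: Vec<char> = text.chars().collect();
        for start in 0..chars.len() {
            // Walk down from the root, creating nodes as needed.
            let mut node = &mut root;
            for &c in &chars[start..] {
                node = node.children.entry(c).or_default();
                node.sources.insert(index);
            }
        }
    }
    root
}

/// Follow `path` from the root and return the sources recorded at the
/// node it ends on, or `None` if no input contains `path`.
fn sources_of<'a>(root: &'a Node, path: &str) -> Option<&'a BTreeSet<usize>> {
    let mut node = root;
    for c in path.chars() {
        node = node.children.get(&c)?;
    }
    Some(&node.sources)
}
```

Any path from the root spells a substring of at least one input, so a node whose `sources` set has two or more entries marks a substring shared by several inputs.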

Usage

Use the function get_substrings to get all the common substrings in a list of strings.

Example

use common_substrings::get_substrings;
let input_strings = vec!["java", "javascript", "typescript", "coffeescript", "coffee"];
let result_substrings = get_substrings(input_strings, 2, 3);

This gives the following result list:

Substring(sources: {2, 3}, name: escript, weight: 14)
Substring(sources: {1, 0}, name: java, weight: 8)
Substring(sources: {4, 3}, name: coffee, weight: 12)

Arguments

  • input - The target input string vector.
  • min_occurrences - The minimum number of occurrences a common substring must have to be captured.
  • min_length - The minimum length a common substring must have to be captured.
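As a rough illustration of how the two thresholds filter candidates, the brute-force sketch below enumerates every substring and keeps those that pass both limits. It is not how the crate works: `candidate_substrings` is a hypothetical helper, and it assumes that an occurrence means a distinct input string containing the substring. It also returns every candidate that qualifies, whereas the example output above shows the crate reporting a narrower selection.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical brute-force helper: map each substring of at least
/// `min_length` characters to the indices of the inputs containing it,
/// keeping only those found in at least `min_occurrences` inputs.
fn candidate_substrings(
    inputs: &[&str],
    min_occurrences: usize,
    min_length: usize,
) -> BTreeMap<String, BTreeSet<usize>> {
    let mut found: BTreeMap<String, BTreeSet<usize>> = BTreeMap::new();
    for (index, text) in inputs.iter().enumerate() {
        let chars: Vec<char> = text.chars().collect();
        for start in 0..chars.len() {
            // Only consider pieces that already satisfy min_length.
            for end in (start + min_length)..=chars.len() {
                let piece: String = chars[start..end].iter().collect();
                found.entry(piece).or_default().insert(index);
            }
        }
    }
    // Assumption: occurrences are counted per distinct input string.
    found.retain(|_, sources| sources.len() >= min_occurrences);
    found
}
```

With the inputs from the example and thresholds 2 and 3, this helper finds "escript" in inputs 2 and 3 and "java" in inputs 0 and 1, matching the sources listed above. It also keeps shorter candidates such as "script", which the crate's result list does not include.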

Algorithm

Explanation here

Other implementations

License

Apache-2.0