|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
java.lang.Object edu.northwestern.at.utils.corpuslinguistics.LevensteinDistance
public class LevensteinDistance
Computes the Levenstein edit distance between two strings.
The Levenstein edit distance is the number of insertions, deletions, substitutions, and adjacent transpositions required to transform one string into another. The larger the Levenstein distance, the more different the strings are.
The edit distance between two strings s1 and s2 can be converted to a similarity measure as follows:
max_length = max( length of s1 , length of s2 ) edit_distance = edit distance between s1 and s2 similarity = 1.0 - ( edit_distance / max_length )
This implementation of Levenstein distance is based upon one by Michael Gilleland and Charles Emerick.
Constructor Summary | |
---|---|
LevensteinDistance()
|
Method Summary | |
---|---|
static boolean |
areAlike(java.lang.String s1,
java.lang.String s2)
Are two strings alike based upon edit distance. |
static int |
editDistance(java.lang.String s1,
java.lang.String s2)
Compute Levenstein edit distance between two strings. |
static double |
similarity(java.lang.String s1,
java.lang.String s2)
Compute similarity between two strings. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public LevensteinDistance()
Method Detail |
---|
public static int editDistance(java.lang.String s1, java.lang.String s2)
s1
- First string.s2
- Second string.,
public static double similarity(java.lang.String s1, java.lang.String s2)
s1
- First string.s2
- Second string.
The similarity is computed from the edit distance between s1 and s2 as follows:
max_length = max( length of s1 , length of s2 ) edit_distance = edit distance between s1 and s2 similarity = 1.0 - ( edit_distance / max_length )
public static boolean areAlike(java.lang.String s1, java.lang.String s2)
s1
- First string.s2
- Second string.
|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |