Package net.sf.okapi.lib.translation
Class TextMatcher
- java.lang.Object
-
- net.sf.okapi.lib.translation.TextMatcher
-
public class TextMatcher extends Object
Provides a simple way to compare two lists of tokens using basic fuzzy matching algorithms.
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- static int IGNORE_CASE: Flag indicating to ignore case differences.
- static int IGNORE_PUNCTUATION: Flag indicating to ignore punctuation differences.
- static int IGNORE_WHITESPACES: Flag indicating to ignore whitespace differences.
-
Constructor Summary
Constructors (Constructor, Description):
- TextMatcher(LocaleId locale1, LocaleId locale2): Creates a new TextMatcher object.
-
Method Summary
Methods (Modifier and Type, Method, Description):
- int compare(TextFragment frag1, TextFragment frag2, int options): Compares the content of two TextFragment objects.
- int compareToBaseTokens(String text1, List<String> tokens1, TextFragment frag2): Compares a list of tokens to a TextFragment object.
- protected static short minimum(int value1, int value2, int value3): Returns the minimum of three given values.
- List<String> prepareBaseTokens(String plainText): Creates a list of tokens from a string, for use with compareToBaseTokens(String, List, TextFragment).
-
-
-
Field Detail
-
IGNORE_CASE
public static final int IGNORE_CASE
Flag indicating to ignore case differences.
See Also:
- Constant Field Values
-
IGNORE_WHITESPACES
public static final int IGNORE_WHITESPACES
Flag indicating to ignore whitespace differences.
See Also:
- Constant Field Values
-
IGNORE_PUNCTUATION
public static final int IGNORE_PUNCTUATION
Flag indicating to ignore punctuation differences.
See Also:
- Constant Field Values
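
The three flags are plain int constants, which suggests they are meant to be combined into a single options value. The sketch below shows one way such flags could be combined and tested. It is an illustration only: the numeric values, the class name MatchOptionsSketch, and the helpers isSet and normalize are assumptions, not part of TextMatcher. The real values are listed under Constant Field Values.

```java
import java.util.Locale;

public class MatchOptionsSketch {
    // Hypothetical values chosen as distinct bits; the real ones may differ.
    static final int IGNORE_CASE = 0x01;
    static final int IGNORE_WHITESPACES = 0x02;
    static final int IGNORE_PUNCTUATION = 0x04;

    // Returns true when the given flag bit is set in options.
    static boolean isSet(int options, int flag) {
        return (options & flag) != 0;
    }

    // Applies the selected normalizations to a string before comparison.
    static String normalize(String text, int options) {
        String result = text;
        if (isSet(options, IGNORE_CASE)) {
            result = result.toLowerCase(Locale.ROOT);
        }
        if (isSet(options, IGNORE_PUNCTUATION)) {
            result = result.replaceAll("\\p{Punct}", "");
        }
        if (isSet(options, IGNORE_WHITESPACES)) {
            result = result.replaceAll("\\s+", "");
        }
        return result;
    }

    public static void main(String[] args) {
        // Combine two flags with bitwise OR.
        int options = IGNORE_CASE | IGNORE_PUNCTUATION;
        System.out.println(normalize("Hello, World!", options)); // prints "hello world"
    }
}
```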
-
-
Method Detail
-
minimum
protected static short minimum(int value1, int value2, int value3)
Returns the minimum of three given values.
Parameters:
- value1: the first given value.
- value2: the second given value.
- value3: the third given value.
Returns:
- the minimum of the three given values.
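
A three-way minimum of this kind is the usual building block of edit-distance calculations, where each cell takes the cheapest of a deletion, an insertion, or a substitution. Whether TextMatcher uses it exactly this way is an assumption. The sketch below is a generic Levenshtein distance counted in whole tokens; the class name TokenDistanceSketch and the distance method are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class TokenDistanceSketch {
    // Same shape as the documented signature: three ints in, a short out.
    static short minimum(int value1, int value2, int value3) {
        return (short) Math.min(value1, Math.min(value2, value3));
    }

    // Levenshtein distance where the unit of editing is a token, not a character.
    static int distance(List<String> a, List<String> b) {
        int[][] d = new int[a.size() + 1][b.size() + 1];
        for (int i = 0; i <= a.size(); i++) {
            d[i][0] = i; // delete all tokens of a
        }
        for (int j = 0; j <= b.size(); j++) {
            d[0][j] = j; // insert all tokens of b
        }
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                int cost = a.get(i - 1).equals(b.get(j - 1)) ? 0 : 1;
                d[i][j] = minimum(
                        d[i - 1][j] + 1,         // deletion
                        d[i][j - 1] + 1,         // insertion
                        d[i - 1][j - 1] + cost); // substitution or match
            }
        }
        return d[a.size()][b.size()];
    }

    public static void main(String[] args) {
        List<String> base = Arrays.asList("open", "the", "file");
        List<String> other = Arrays.asList("open", "a", "file", "now");
        System.out.println(distance(base, other)); // prints 2
    }
}
```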
-
compare
public int compare(TextFragment frag1, TextFragment frag2, int options)
Compares the content of two TextFragment objects.
Parameters:
- frag1: the base fragment.
- frag2: the fragment to compare against the base fragment.
- options: the comparison options.
Returns:
- A score between 0 (no match) and 100 (exact match).
-
prepareBaseTokens
public List<String> prepareBaseTokens(String plainText)
Creates a list of tokens from a string, for use with compareToBaseTokens(String, List, TextFragment).
Parameters:
- plainText: the base text.
Returns:
- the list of tokens for the given text.
-
compareToBaseTokens
public int compareToBaseTokens(String text1, List<String> tokens1, TextFragment frag2)
Compares a list of tokens to a TextFragment object.
Parameters:
- text1: the original plain text.
- tokens1: the list of tokens.
- frag2: the fragment to compare against the list of tokens.
Returns:
- A score between 0 (no match) and 100 (exact match).
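
Splitting token preparation from comparison lets a caller tokenize the base text once and then score many candidates against it. The sketch below illustrates that pattern without using the Okapi classes. The tokenizer and the Dice-style overlap score are stand-ins chosen for brevity, not the algorithm TextMatcher uses, and the names BaseTokenReuseSketch, prepareTokens, and score are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class BaseTokenReuseSketch {
    // Stand-in tokenizer: lower-cases, then splits on runs of non-letter, non-digit characters.
    static List<String> prepareTokens(String plainText) {
        List<String> tokens = new ArrayList<>();
        for (String part : plainText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Dice-style overlap scaled to 0..100 (a stand-in for the real scoring).
    static int score(List<String> base, List<String> candidate) {
        if (base.isEmpty() && candidate.isEmpty()) {
            return 100;
        }
        List<String> remaining = new ArrayList<>(candidate);
        int shared = 0;
        for (String token : base) {
            if (remaining.remove(token)) { // consume each candidate token at most once
                shared++;
            }
        }
        return (200 * shared) / (base.size() + candidate.size());
    }

    public static void main(String[] args) {
        // The base text is tokenized once and reused for every candidate.
        List<String> base = prepareTokens("Save the file.");
        for (String candidate : Arrays.asList("Save the file!", "Save a file", "Close window")) {
            System.out.println(candidate + " -> " + score(base, prepareTokens(candidate)));
        }
        // prints 100, 66 and 0 for the three candidates
    }
}
```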
-
-