Package net.sf.okapi.steps.gcaligner
Class GaleAndChurch<T>
java.lang.Object
    net.sf.okapi.steps.gcaligner.GaleAndChurch<T>

All Implemented Interfaces:
    AlignmentScorer<T>

public class GaleAndChurch<T> extends Object implements AlignmentScorer<T>

Constructor Summary
Constructors
    GaleAndChurch()

Method Summary
All Methods (all are concrete instance methods)

int contractionScore(T p_sourceTuv1, T p_sourceTuv2, T p_targetTuv)
    Calculate the cost of contracting two source segments to one target segment.
int deletionScore(T p_sourceTuv)
    Calculate the cost of deletion of source segment.
int expansionScore(T p_sourceTuv, T p_targetTuv1, T p_targetTuv2)
    Calculate the cost of expanding one source segment to two target segments.
int insertionScore(T p_targetTuv)
    Calculate the cost of insertion of target segment.
int match(int len1, int len2)
    Return -100 * log probability that a source sentence of length len1 is a translation of a foreign sentence of length len2.
int meldingScore(T p_sourceTuv1, T p_sourceTuv2, T p_targetTuv1, T p_targetTuv2)
    Calculate the cost of melding of two source segments to two target segments.
double prob(int len1, int len2)
    Return the probability that a source sentence of length len1 is a translation of a foreign sentence of length len2.
void setLocales(LocaleId p_sourceLocale, LocaleId p_targetLocale)
    Set source and target locales.
int substitutionScore(T p_sourceTuv, T p_targetTuv)
    Calculate the cost of substitution of source segment by target segment.

Method Detail
-
setLocales
public void setLocales(LocaleId p_sourceLocale, LocaleId p_targetLocale)
Set source and target locales.
Specified by:
    setLocales in interface AlignmentScorer<T>
Parameters:
    p_sourceLocale - Source locale
    p_targetLocale - Target locale
-
substitutionScore
public int substitutionScore(T p_sourceTuv, T p_targetTuv)
Calculate the cost of substitution of source segment by target segment.
Specified by:
    substitutionScore in interface AlignmentScorer<T>
Parameters:
    p_sourceTuv - Source TUV. Source is in X sequence in the DP map.
    p_targetTuv - Target TUV. Target is in Y sequence in the DP map.
Returns:
    cost of the substitution
-
deletionScore
public int deletionScore(T p_sourceTuv)
Calculate the cost of deletion of source segment.
Specified by:
    deletionScore in interface AlignmentScorer<T>
Parameters:
    p_sourceTuv - Source TUV. Source is in X sequence in the DP map.
Returns:
    cost of the deletion
-
insertionScore
public int insertionScore(T p_targetTuv)
Calculate the cost of insertion of target segment.
Specified by:
    insertionScore in interface AlignmentScorer<T>
Parameters:
    p_targetTuv - Target TUV. Target is in Y sequence in the DP map.
Returns:
    cost of the insertion
-
contractionScore
public int contractionScore(T p_sourceTuv1, T p_sourceTuv2, T p_targetTuv)
Calculate the cost of contracting two source segments to one target segment.
Specified by:
    contractionScore in interface AlignmentScorer<T>
Parameters:
    p_sourceTuv1 - Source TUV1. Source is in X sequence in the DP map.
    p_sourceTuv2 - Source TUV2. Source is in X sequence in the DP map.
    p_targetTuv - Target TUV. Target is in Y sequence in the DP map.
Returns:
    cost of the contraction
-
expansionScore
public int expansionScore(T p_sourceTuv, T p_targetTuv1, T p_targetTuv2)
Calculate the cost of expanding one source segment to two target segments.
Specified by:
    expansionScore in interface AlignmentScorer<T>
Parameters:
    p_sourceTuv - Source TUV. Source is in X sequence in the DP map.
    p_targetTuv1 - Target TUV1. Target is in Y sequence in the DP map.
    p_targetTuv2 - Target TUV2. Target is in Y sequence in the DP map.
Returns:
    cost of the expansion
-
meldingScore
public int meldingScore(T p_sourceTuv1, T p_sourceTuv2, T p_targetTuv1, T p_targetTuv2)
Calculate the cost of melding of two source segments to two target segments.
Specified by:
    meldingScore in interface AlignmentScorer<T>
Parameters:
    p_sourceTuv1 - Source TUV1. Source is in X sequence in the DP map.
    p_sourceTuv2 - Source TUV2. Source is in X sequence in the DP map.
    p_targetTuv1 - Target TUV1. Target is in Y sequence in the DP map.
    p_targetTuv2 - Target TUV2. Target is in Y sequence in the DP map.
Returns:
    cost of the melding
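
The six cost methods above correspond to the six alignment categories of the Gale and Church algorithm (1-1, 1-0, 0-1, 2-1, 1-2, 2-2). The following is a minimal sketch of how such costs can be composed from a length-based match distance plus a category penalty. It is not taken from the GaleAndChurch source: the class name AlignmentCostSketch, the use of plain segment lengths in place of T, and the penalty constants (450, 230, 440, assumed from the category priors reported by Gale and Church) are all assumptions made for illustration.

```java
import java.util.function.IntBinaryOperator;

/**
 * Hypothetical sketch: alignment costs expressed over segment lengths.
 * The real class works on T and is not documented at this level of detail.
 */
final class AlignmentCostSketch {
    // Assumed penalties, roughly -100 * ln(P(category) / P(1-1)).
    static final int PENALTY_01 = 450; // insertion or deletion (1-0, 0-1)
    static final int PENALTY_21 = 230; // contraction or expansion (2-1, 1-2)
    static final int PENALTY_22 = 440; // melding (2-2)

    // Length-based distance, e.g. a function shaped like match(len1, len2).
    private final IntBinaryOperator match;

    public AlignmentCostSketch(IntBinaryOperator match) {
        this.match = match;
    }

    /** 1-1: one source segment aligned with one target segment. */
    public int substitution(int src, int tgt) {
        return match.applyAsInt(src, tgt);
    }

    /** 1-0: a source segment with no target counterpart. */
    public int deletion(int src) {
        return match.applyAsInt(src, 0) + PENALTY_01;
    }

    /** 0-1: a target segment with no source counterpart. */
    public int insertion(int tgt) {
        return match.applyAsInt(0, tgt) + PENALTY_01;
    }

    /** 2-1: two source segments aligned with one target segment. */
    public int contraction(int src1, int src2, int tgt) {
        return match.applyAsInt(src1 + src2, tgt) + PENALTY_21;
    }

    /** 1-2: one source segment aligned with two target segments. */
    public int expansion(int src, int tgt1, int tgt2) {
        return match.applyAsInt(src, tgt1 + tgt2) + PENALTY_21;
    }

    /** 2-2: two source segments aligned with two target segments. */
    public int melding(int src1, int src2, int tgt1, int tgt2) {
        return match.applyAsInt(src1 + src2, tgt1 + tgt2) + PENALTY_22;
    }
}
```

A dynamic-programming aligner would evaluate these costs at each cell of the DP map (source on X, target on Y) and keep the cheapest path. How GaleAndChurch<T> derives a length from T is not described on this page, so the sketch leaves that step to the caller.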
-
match
public int match(int len1, int len2)
Return -100 * log probability that a source sentence of length len1 is a translation of a foreign sentence of length len2. The probability is based on two parameters, the mean and variance of the number of foreign characters per source character. Gale and Church hardcoded foreign_chars_per_eng_char as 1, which apparently works well for aligning European languages. We take the coefficient as a parameter so that non-European languages can be aligned as well.
-
prob
public double prob(int len1, int len2)
Return the probability that a source sentence of length len1 is a translation of a foreign sentence of length len2. The probability is based on two parameters, the mean and variance of the number of foreign characters per source character. Gale and Church hardcoded foreign_chars_per_eng_char as 1, which apparently works well for aligning European languages. We take the coefficient as a parameter so that non-European languages can be aligned as well.
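
To make the relationship between prob and match concrete, here is a minimal sketch of the length-based score in the form published by Gale and Church. It may differ from what this class actually does: the class name LengthScoreSketch, the constructor parameters, the variance value 6.8 used in the example, and the fallback constant BIG_DISTANCE are assumptions and do not come from this page.

```java
/**
 * Hypothetical sketch of a Gale-and-Church style length score.
 * c  = expected number of target characters per source character
 * s2 = variance of that ratio (Gale and Church report 6.8)
 */
final class LengthScoreSketch {
    // Assumed cost returned when the probability underflows to zero.
    private static final int BIG_DISTANCE = 2500;

    private final double c;
    private final double s2;

    public LengthScoreSketch(double c, double s2) {
        this.c = c;
        this.s2 = s2;
    }

    /** Standard normal CDF for z >= 0 (Abramowitz and Stegun 26.2.17). */
    static double pnorm(double z) {
        double t = 1.0 / (1.0 + 0.2316419 * z);
        double poly = ((((1.330274429 * t - 1.821255978) * t
                + 1.781477937) * t - 0.356563782) * t + 0.319381530) * t;
        return 1.0 - 0.3989423 * Math.exp(-z * z / 2.0) * poly;
    }

    /** Two-sided tail probability of the observed length difference. */
    public double prob(int len1, int len2) {
        if (len1 == 0 && len2 == 0) {
            return 1.0;
        }
        double mean = (len1 + len2 / c) / 2.0;
        double z = Math.abs((c * len1 - len2) / Math.sqrt(s2 * mean));
        // Clamp: the polynomial approximation can exceed 1 by rounding at z = 0.
        return Math.min(1.0, 2.0 * (1.0 - pnorm(z)));
    }

    /** -100 * ln(prob), truncated to int; BIG_DISTANCE if prob is zero. */
    public int match(int len1, int len2) {
        if (len1 == 0 && len2 == 0) {
            return 0;
        }
        double pd = prob(len1, len2);
        return pd > 0.0 ? (int) (-100.0 * Math.log(pd)) : BIG_DISTANCE;
    }

    public static void main(String[] args) {
        LengthScoreSketch scorer = new LengthScoreSketch(1.0, 6.8);
        System.out.println("match(10, 12) = " + scorer.match(10, 12));
        System.out.println("match(10, 40) = " + scorer.match(10, 40));
    }
}
```

With c = 1, segments of similar length receive a low cost and the cost grows as the lengths diverge. For a language pair where the target is typically longer or shorter than the source, c would be set to the expected character ratio.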