Match Types
When working with TM systems, MT engines, leveraging steps, and other components that try to match an existing translation to a given source, all Okapi resources use the same categories to identify the types of match.
For example, in the XLIFF files generated by Okapi components, the type of match is reported in the okp:matchType attribute of the <alt-trans> element. The match-quality attribute provides a percentage.
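As an illustration, the two attributes could be read from an XLIFF 1.2 file along these lines. This is only a sketch: the sample document is invented, the `okp` namespace URI (`okapi-framework:xliff-extensions`) is assumed here and should be checked against your own files, and `read_matches` is a hypothetical helper, not part of Okapi.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
# Assumption: namespace URI bound to the "okp" prefix; verify it in your files.
OKP_NS = "okapi-framework:xliff-extensions"

# Invented sample, for illustration only.
SAMPLE = """<xliff version="1.2"
    xmlns="urn:oasis:names:tc:xliff:document:1.2"
    xmlns:okp="okapi-framework:xliff-extensions">
 <file original="demo.txt" source-language="en" target-language="fr"
       datatype="plaintext">
  <body>
   <trans-unit id="1">
    <source>Hello world</source>
    <alt-trans match-quality="100" okp:matchType="EXACT">
     <source>Hello world</source>
     <target>Bonjour le monde</target>
    </alt-trans>
    <alt-trans match-quality="85" okp:matchType="FUZZY">
     <source>Hello, world</source>
     <target>Bonjour monde</target>
    </alt-trans>
   </trans-unit>
  </body>
 </file>
</xliff>"""


def read_matches(xml_text):
    """Return (unit id, match type, match-quality, target text) per <alt-trans>."""
    root = ET.fromstring(xml_text)
    results = []
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        for alt in unit.findall(f"{{{XLIFF_NS}}}alt-trans"):
            # Fall back to UNKNOWN when the extension attribute is absent.
            match_type = alt.get(f"{{{OKP_NS}}}matchType", "UNKNOWN")
            quality = alt.get("match-quality")  # kept as the raw string
            target = alt.findtext(f"{{{XLIFF_NS}}}target")
            results.append((unit.get("id"), match_type, quality, target))
    return results


if __name__ == "__main__":
    for row in read_matches(SAMPLE):
        print(row)
```

The match-quality value is left as a string because files are not guaranteed to store it as a bare integer.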
The following table shows the different match types in decreasing order (best type first):
Type of Match | Description |
HUMAN_RECOMMENDED | Improved translation edited by a human. |
EXACT_UNIQUE_ID | Matches EXACT and matches a unique id. |
EXACT_PREVIOUS_VERSION | Matches EXACT and comes from the immediately preceding version of the same document (i.e., if v4 is being leveraged, the match must come from v3, not from v2 or v1). |
EXACT_LOCAL_CONTEXT | Matches EXACT and a small number of segments before and/or after. |
EXACT_DOCUMENT_CONTEXT | Matches EXACT and comes from the same document, either the existing version or a different one. See also EXACT_PREVIOUS_VERSION. |
EXACT_STRUCTURAL | Matches EXACT and the structural type of the segment (title, paragraph, list element, etc.). |
EXACT | Matches text and codes exactly. |
EXACT_TEXT_ONLY_UNIQUE_ID | Matches EXACT_TEXT_ONLY and matches a unique id. |
EXACT_TEXT_ONLY_PREVIOUS_VERSION | Matches EXACT_TEXT_ONLY and comes from a previous version of the same document. |
EXACT_TEXT_ONLY | Matches text exactly, but there is a difference in one or more codes. |
EXACT_REPAIRED | Matches text and codes exactly, but only after some automated repair (e.g., number replacement, code repair, capitalization, punctuation, etc.) was applied. |
FUZZY_UNIQUE_ID | Matches FUZZY and matches a unique id. |
FUZZY_PREVIOUS_VERSION | Matches FUZZY and comes from a previous version of the same document. |
FUZZY | Matches text and/or codes partially. |
FUZZY_REPAIRED | Matches text and/or codes partially, and some automated repair (e.g., number replacement, code repair, capitalization, punctuation, etc.) was applied to the target. |
PHRASE_ASSEMBLED | Matches assembled from phrases in the TM or other resources (different algorithms could be used). |
MT | Indicates a translation coming from an MT engine. |
UNKNOWN | Unknown match type. Used as the default value only when the match cannot be identified as another type. An UNKNOWN type always sorts below all other matches. |
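The ordering in the table can be turned into a simple ranking for sorting candidate matches. The following is a minimal sketch, not Okapi's own comparator: `MATCH_TYPES`, `RANK`, and `sort_matches` are hypothetical names, and using the score as a tie-breaker within one type is an assumption of this example.

```python
# Match types in the order of the table above (best type first).
MATCH_TYPES = [
    "HUMAN_RECOMMENDED",
    "EXACT_UNIQUE_ID",
    "EXACT_PREVIOUS_VERSION",
    "EXACT_LOCAL_CONTEXT",
    "EXACT_DOCUMENT_CONTEXT",
    "EXACT_STRUCTURAL",
    "EXACT",
    "EXACT_TEXT_ONLY_UNIQUE_ID",
    "EXACT_TEXT_ONLY_PREVIOUS_VERSION",
    "EXACT_TEXT_ONLY",
    "EXACT_REPAIRED",
    "FUZZY_UNIQUE_ID",
    "FUZZY_PREVIOUS_VERSION",
    "FUZZY",
    "FUZZY_REPAIRED",
    "PHRASE_ASSEMBLED",
    "MT",
    "UNKNOWN",
]

# Lower rank means a better match type.
RANK = {name: position for position, name in enumerate(MATCH_TYPES)}


def sort_matches(matches):
    """Sort (match_type, score) pairs: best type first, then highest score.

    Unrecognized type names are ranked like UNKNOWN, i.e. below all others.
    """
    return sorted(
        matches,
        key=lambda m: (RANK.get(m[0], RANK["UNKNOWN"]), -m[1]),
    )


if __name__ == "__main__":
    candidates = [("MT", 95), ("FUZZY", 80), ("EXACT", 100), ("FUZZY", 90)]
    for match_type, score in sort_matches(candidates):
        print(match_type, score)
```

Note that the type takes precedence over the score here: an MT result scored 95 still sorts below a FUZZY match scored 80, which follows the table's ordering.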