ITS Components

From Okapi Framework

Latest revision as of 08:30, 5 June 2016

Overview

This page summarizes which Okapi components implement ITS 1.0 and ITS 2.0, and to what degree.

The specification for ITS 1.0 is here: http://www.w3.org/TR/its/

The specification for ITS 2.0 is here: http://www.w3.org/TR/its20/

XML and HTML5

Okapi offers extensive ITS support for the XML Filter and the HTML5-ITS Filter, both for global and local markup.

Legend:

  • ITS Engine - A Yes indicates that the data category is processed and that the resulting information is available through the ITraversal interface.
    It also indicates that the implementation passes the ITS Test Suite for that data category.
  • Read - The markup existing in the input document is interpreted and represented in the extracted Okapi resources.
  • Modify - If the Okapi representation of that data category is modified, it is modified in the output document as well.
  • Remove - If the Okapi representation of that data category is removed, it is removed in the output document as well.
  • Add - If an Okapi representation of that data category is added, the corresponding markup is also added in the output document.
  • Global for structural - Denotes the capabilities for the data category when defined in a global rule and when related to an element that is not "within text".
  • Global for inline - Denotes the capabilities for the data category when defined in a global rule and when related to an element that is declared as "within text".
  • Local on structural - Denotes the capabilities for the data category when defined locally on an element that is not "within text".
  • Local on inline - Denotes the capabilities for the data category when defined locally on an element that is declared as "within text".
  • TBD - Means "To be Decided". TBI - Means "To Be Improved". N/A - Means "Not Applicable".
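
To make the "Local on structural" and "Local on inline" scopes more concrete for the Translate data category, here is a minimal Python sketch that resolves local its:translate attributes, including inheritance by child elements. It is not Okapi code and does not use the ITraversal interface. The sample document, its element names and the function resolve_translate are assumptions made for this illustration, and global rules are deliberately ignored.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TRANSLATE_ATTR = f"{{{ITS_NS}}}translate"


def resolve_translate(root):
    """Return {element: bool} giving the effective Translate value per element.

    Hypothetical helper: only local its:translate attributes are considered.
    """
    result = {}

    def walk(elem, inherited):
        local = elem.get(TRANSLATE_ATTR)
        # A local its:translate overrides the inherited value;
        # without one, the element inherits from its parent.
        effective = inherited if local is None else (local == "yes")
        result[elem] = effective
        for child in elem:
            walk(child, effective)

    # The ITS default for element content is "translatable".
    walk(root, True)
    return result


# Made-up sample document for the illustration.
SAMPLE = """<doc xmlns:its="http://www.w3.org/2005/11/its">
  <title>Installation guide</title>
  <code its:translate="no">okapi <b>--help</b></code>
  <para its:translate="no">Draft <note its:translate="yes">Check this</note></para>
</doc>"""

root = ET.fromstring(SAMPLE)
flags = {elem.tag: value for elem, value in resolve_translate(root).items()}
for tag, value in flags.items():
    print(f"{tag}: {'translate' if value else 'do not translate'}")
```

In this sketch the <b> element inherits "no" from <code>, while <note> overrides the "no" set on <para>. Whether such an element ends up "not extracted" or as an "inline code" in Okapi depends on whether it is declared as "within text", which this sketch does not model.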


| Data Category | Scope | ITS Engine | XML Filter: Read | XML Filter: Modify | XML Filter: Remove | XML Filter: Add | HTML5-ITS Filter: Read | HTML5-ITS Filter: Modify | HTML5-ITS Filter: Remove | HTML5-ITS Filter: Add | Okapi Representation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Translate | Global for structural | Yes | Yes | N/A | N/A | N/A | Yes | N/A | N/A | N/A | Not to translate: not extracted |
| | Global for inline | Yes | Yes | N/A | N/A | N/A | Yes | N/A | N/A | N/A | Not to translate: inline code |
| | Local on structural | Yes | Yes | No | No | TBD | Yes | No | No | TBD | Not to translate: not extracted |
| | Local on inline | Yes | Yes | No | No | TBD | Yes | No | No | TBD | Not to translate: inline code |
| Localization Note | Global for structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | Note property on TextUnit |
| | Global for inline | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | LOCNOTE annotation on <mrk> |
| | Local on structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | Note property on TextUnit |
| | Local on inline | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | LOCNOTE annotation on <mrk> |
| Terminology | Global for structural | Yes | Yes, TBI | TBD | TBD | TBD | Yes, TBI | TBD | TBD | TBD | TermAnnotation on source TextContainer |
| | Global for inline | Yes | Yes, TBI | TBD | TBD | TBD | Yes, TBI | TBD | TBD | TBD | TermAnnotation on source TextContainer |
| | Local on structural | Yes | Yes, TBI | TBD | TBD | TBD | Yes, TBI | TBD | TBD | TBD | TermAnnotation on source TextContainer |
| | Local on inline | Yes | Yes, TBI | TBD | TBD | TBD | Yes, TBI | TBD | TBD | TBD | TermAnnotation on source TextContainer |
| Directionality | Global for structural | Yes | Not supported | | | | | | | | |
| | Global for inline | Yes | Not supported | | | | | | | | |
| | Local on structural | Yes | Not supported | | | | | | | | |
| | Local on inline | Yes | Not supported | | | | | | | | |
| Language Information | Global for structural | Yes | Not supported | | | | | | | | |
| | Global for inline | Yes | Not supported | | | | | | | | |
| | Local on structural | Yes | Not supported | | | | | | | | |
| | Local on inline | Yes | Not supported | | | | | | | | |
| Element Within Text | Global for structural | Yes | Yes, partially | TBD | TBD | TBD | Yes, partially | TBD | TBD | TBD | TextUnit or inline code |
| | Global for inline | Yes | Yes, partially | TBD | TBD | TBD | Yes, partially | TBD | TBD | TBD | TextUnit or inline code |
| | Local on structural | Yes | Yes, partially | TBD | TBD | TBD | Yes, partially | TBD | TBD | TBD | TextUnit or inline code |
| | Local on inline | Yes | Yes, partially | TBD | TBD | TBD | Yes, partially | TBD | TBD | TBD | TextUnit or inline code |
| Domain | Global for structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | DOMAIN annotation on TextUnit |
| | Global for inline | Yes | Not supported | | | | | | | | |
| Text Analysis | Global for structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | TA annotation on source TextContainer |
| | Global for inline | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | TA annotation on Code |
| | Local on structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | TA annotation on source TextContainer |
| | Local on inline | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | Yes | TA annotation on Code |
| Locale Filter | Global for structural | Yes | Yes | N/A | N/A | N/A | Yes | N/A | N/A | N/A | Not to localize: not extracted |
| | Global for inline | Yes | Yes | N/A | N/A | N/A | Yes | N/A | N/A | N/A | Not to localize: inline code |
| | Local on structural | Yes | Yes | No | No | No | Yes | No | No | No | Not to localize: not extracted |
| | Local on inline | Yes | Yes | No | No | No | Yes | No | No | No | Not to localize: inline code |
| Provenance | Global for structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | PROV annotation on TextUnit |
| | Global for inline | Yes | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
| | Local on structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | PROV annotation on TextUnit |
| | Local on inline | Yes | TBD | TBD | TBD | TBD | TBD | TBD | TBD | Yes, TBI | PROV annotation on Code |
| External Resource | Global for structural | Yes | Yes | No | No | No | Yes | No | No | No | EXTERNALRES annotation on TextUnit |
| | Global for inline | Yes | Yes | No | No | No | Yes | No | No | No | EXTERNALRES annotation on Code |
| Target Pointer | Global for structural | Yes | TBD | No | No | No | TBD | No | No | No | Target content of the TextUnit |
| | Global for inline | Yes | TBD | No | No | No | TBD | No | No | No | Target content of the Code |
| Id Value | Global for structural | Yes | Yes | N/A | N/A | N/A | Yes | N/A | N/A | N/A | TextUnit name |
| | Global for inline | Yes | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| | Local on structural | Yes | Yes | N/A | N/A | N/A | Yes | N/A | N/A | N/A | TextUnit name |
| | Local on inline | Yes | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Preserve Space | Global for structural | Yes | Yes | No | No | No | Yes | N/A | N/A | N/A | Preserve Space on TextUnit |
| | Global for inline | Yes | TBD | No | No | No | TBD | N/A | N/A | N/A | PRESERVEWS annotation on Code |
| | Local on structural | Yes | Yes | No | No | No | No | N/A | N/A | N/A | Preserve Space on TextUnit |
| | Local on inline | Yes | TBD | No | No | No | TBD | N/A | N/A | N/A | PRESERVEWS annotation on Code |
| Localization Quality Issue | Global for structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | LQI annotation on TextUnit |
| | Global for inline | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | LQI annotation on source/target TextContainer or inline code |
| | Local on structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | LQI annotation on TextUnit |
| | Local on inline | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | Yes, TBI | LQI annotation on source/target TextContainer or inline code |
| Localization Quality Rating | Local on structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | LQR annotation on TextUnit |
| | Local on inline | Yes | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
| MT Confidence | Global for structural | Yes | Yes, TBI | TBD | TBD | TBD | Yes, TBI | TBD | TBD | TBD | MTCONFIDENCE annotation on source TextContainer |
| | Global for inline | Yes | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
| | Local on structural | Yes | Yes, TBI | TBD | TBD | TBD | Yes, TBI | TBD | TBD | TBD | MTCONFIDENCE annotation on source TextContainer |
| | Local on inline | Yes | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
| Allowed Characters | Global for structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | ALLOWEDCHARS annotation on TextUnit |
| | Global for inline | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | ALLOWEDCHARS annotation on Code |
| | Local on structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | ALLOWEDCHARS annotation on TextUnit |
| | Local on inline | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | Yes | ALLOWEDCHARS annotation on Code |
| Storage Size | Global for structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | STORAGESIZE annotation on TextUnit |
| | Global for inline | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | STORAGESIZE annotation on Code |
| | Local on structural | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | TBD | STORAGESIZE annotation on TextUnit |
| | Local on inline | Yes | Yes | TBD | TBD | TBD | Yes | TBD | TBD | Yes | STORAGESIZE annotation on Code |

You can find more information about ITS on this page.

XLIFF 1.2

Okapi provides two main components for XLIFF 1.2:

  • the XLIFF Filter, which reads an existing XLIFF 1.2 document, extracts its content and writes the modified document back out.
  • the XLIFF Writer, which provides a way to create an XLIFF document from the API.

Both components have extensive ITS support.

Notes:

  • Not all ITS data categories can be used in all XLIFF elements. ITS markup that is not at the locations defined in the table is not processed.
  • For the XLIFF Filter, the Modify, Add and Remove actions apply only to annotations in the target container.
  • The Modify, Add and Remove actions listed here are for the XLIFF Filter only. That is, to perform the same action on the original document, the filter used to create the XLIFF document must also support those actions.

ITS is implemented as follows:

| Data Category | XLIFF 1.2 Markup | XLIFF 1.2 Filter: Read | XLIFF 1.2 Filter: Rewrite | XLIFF 1.2 Filter: Modify | XLIFF 1.2 Filter: Remove | XLIFF 1.2 Filter: Add | Okapi Representation | XLIFF 1.2 Writer |
|---|---|---|---|---|---|---|---|---|
| Translate | translate in <trans-unit> | Yes | Yes | No | No | No | ITextUnit.[is/setIs]Translatable() | Yes |
| | mtype='protected' in <mrk> or inline code | Yes | Yes | No | No | Yes | Inline code or TRANSLATE annotation on Code | Yes |
| Localization Note | <note> element in the text unit | Yes | Yes | TBD | TBD | TBD | NOTE property on TextUnit | Yes |
| | comment='TEXT' and itsxlf:locNoteType='alert' or 'description' in <mrk> | Yes | Yes | Yes | Yes | Yes | LOCNOTE annotation on Code | Yes |
| Terminology | mtype='term' and itsxlf:termInfo, itsxlf:termInfoRef and itsxlf:termConfidence in <mrk> | Yes, TBI | Yes, TBI | TBD | TBD | TBD | TERM annotation on Code | Yes |
| Directionality | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Language Information | xml:lang in <mrk> | Yes | Yes | Yes | Yes | Yes | LANG annotation on Code | Yes |
| Element Within Text | Inline codes | Yes | Yes | Yes | Yes | Yes | Inline codes | Yes |
| Domain | itsxlf:domains in <trans-unit> | Yes | Yes | TBD | TBD | Yes | DOMAIN annotation on TextUnit | Yes |
| Text Analysis | ITS attributes in <mrk> | Yes, TBI | Yes, TBI | TBD | TBD | Yes | TA annotation on Code | Yes |
| Locale Filter | ITS attributes in <trans-unit> (and possibly translate="no") | Yes | Yes | No | No | No | LOCFILTER annotation on TextUnit | Yes |
| | ITS attributes in <mrk> (and possibly mtype="protected") | Yes | Yes | TBD | TBD | Yes | LOCFILTER annotation on Code | Yes |
| Provenance | ITS attributes in <file>, <group>, <trans-unit>, <source>, <target> | Yes | Yes | Yes | Yes | Yes | PROV annotation (ITSProvenanceAnnotations) on StartSubDocument, StartGroup, TextUnit or TextContainer | Yes |
| | ITS attributes in <mrk> | Yes | Yes | TBD | TBD | Yes | PROV annotation (ITSProvenanceAnnotations) on Code | Yes |
| External Resource | itsxlf:externalResourceRef in <trans-unit> | Yes | Yes | TBD | TBD | TBD | EXTERNALRES annotation on TextUnit | Yes |
| | itsxlf:externalResourceRef in inline code | Yes | Yes | TBD | TBD | Yes | EXTERNALRES annotation on Code | Yes |
| Id Value | resname in <trans-unit> | Yes | Yes | No | No | No | ITextUnit.[get/set]Name() | Yes |
| Preserve Space | xml:space in <trans-unit> | Yes | Yes | No | No | No | ITextUnit.[preserve/setPreserve]Whitespaces() | Yes |
| | xml:space in <mrk> | Yes | Yes | TBD | TBD | Yes | PRESERVEWS annotation on Code | Yes |
| Localization Quality Issue | ITS attributes in <source> or <target> | Yes | Yes | Yes | Yes | Yes | LQI annotation (ITSLQIAnnotations) on TextContainer | Yes |
| | ITS attributes in <mrk> | Yes | Yes | TBD | TBD | Yes | LQI annotation (ITSLQIAnnotations) on Code | Yes |
| Localization Quality Rating | ITS attribute in <target> | Yes | Yes | TBD | TBD | TBD | LQR annotation on TextContainer | Yes |
| | ITS attribute in <mrk mtype="seg"> | Yes | Yes | TBD | TBD | TBD | LQR annotation on Segment | Yes |
| | ITS attribute in <mrk> | Yes | Yes | TBD | TBD | Yes | LQR annotation on Code | Yes |
| MT Confidence | ITS attribute in <target> | Yes | Yes | TBD | TBD | TBD | MTCONFIDENCE annotation on TextContainer | Yes, TBI |
| | ITS attribute in <mrk mtype="seg"> | Yes | Yes | TBD | TBD | TBD | MTCONFIDENCE annotation on Segment | Yes, TBI |
| Allowed Characters | ITS attribute in <source> or <target> | Yes | Yes | TBD | TBD | TBD | ALLOWEDCHARS annotation on TextContainer | Yes |
| | ITS attribute in <mrk> | Yes | Yes | Yes | Yes | Yes | ALLOWEDCHARS annotation on Code | Yes |
| Storage Size | ITS attributes in <source> or <target> | Yes | Yes | TBD | TBD | TBD | STORAGESIZE annotation on TextContainer | Yes |
| | ITS attributes in <mrk> | Yes | Yes | Yes | Yes | Yes | STORAGESIZE annotation on Code | Yes |
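
As a rough illustration of where some of this markup sits in an XLIFF 1.2 document, the following Python sketch reads the translate, resname and xml:space attributes of each <trans-unit>, plus any <mrk mtype="protected"> spans in the source. It uses only the Python standard library and is not how the XLIFF Filter is implemented. The sample document and the read_units helper are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

# Made-up sample document for the illustration.
SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
 <file original="demo.html" source-language="en" datatype="html">
  <body>
   <trans-unit id="1" resname="title" translate="no">
    <source>Okapi</source>
   </trans-unit>
   <trans-unit id="2" resname="intro" xml:space="preserve">
    <source>Run <mrk mtype="protected">tikal</mrk> first.</source>
   </trans-unit>
  </body>
 </file>
</xliff>"""


def read_units(xliff_text):
    """Hypothetical helper: collect ITS-related information per <trans-unit>."""
    root = ET.fromstring(xliff_text)
    units = []
    for tu in root.iterfind(".//x:trans-unit", NS):
        source = tu.find("x:source", NS)
        units.append({
            "id": tu.get("id"),
            # Id Value: carried by resname.
            "name": tu.get("resname"),
            # Translate: the XLIFF 1.2 default for translate is "yes".
            "translatable": tu.get("translate", "yes") == "yes",
            # Preserve Space: xml:space="preserve".
            "preserve_space": tu.get(XML_SPACE) == "preserve",
            # Inline Translate: protected spans directly inside <source>.
            "protected": [m.text for m in source.iterfind("x:mrk[@mtype='protected']", NS)],
        })
    return units


units = read_units(SAMPLE_XLIFF)
for unit in units:
    print(unit)
```

The dictionary keys loosely mirror the Okapi representations listed in the table (translatable flag, name, whitespace preservation), but the real filter exposes them through ITextUnit and annotations rather than plain dictionaries.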

You can find more information on the XLIFF 1.2 Filter on this page.

OpenOffice Filter

Okapi provides support for several data categories in the OpenOffice Filter.

The ODFFilter class implements support for Translate, Localization Note, Terminology and Locale Filter data categories, for local markup.

Enrycher Step

Support for the Enrycher Web service is implemented in the Enrycher Step.

This step allows you to mark up the source content of text units with Text Analysis annotations (TA annotation on inline codes).

LanguageTool Step

The LanguageTool library is used by the LanguageTool Step to annotate extracted content with Localization Quality Issue items.

The step can be used separately or from within CheckMate, an application dedicated to quality verification.

Microsoft Batch Translation Step

The Domain data category can be used to select which Microsoft Translator Hub engine the Microsoft Batch Translation Step uses.

Quality Check Step

The Quality Check Step implements support for the Allowed Characters, Storage Size and Localization Quality Issue data categories.

The step can be used separately or from within CheckMate, an application dedicated to quality verification.
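
To give an idea of what verifying the Allowed Characters and Storage Size data categories involves, here is a minimal Python sketch. It is not the Quality Check Step's implementation. The function names are hypothetical, the allowed-characters value is assumed to be a simple character class whose syntax is the same in Python's re module and in the ITS 2.0 regular-expression subset, and the storage size is assumed to be measured in bytes with UTF-8 as the default encoding, as in ITS 2.0.

```python
import re


def disallowed_characters(text, allowed_pattern):
    """Return the characters of `text` that the allowed-characters pattern rejects.

    `allowed_pattern` is assumed to be a simple character class such as
    "[a-zA-Z0-9_.]".
    """
    allowed = re.compile(allowed_pattern)
    return [ch for ch in text if not allowed.fullmatch(ch)]


def exceeds_storage_size(text, max_bytes, encoding="utf-8"):
    """Return True if `text`, once encoded, needs more than `max_bytes` bytes."""
    return len(text.encode(encoding)) > max_bytes


# Hypothetical values, as they might appear in allowed-characters and
# storage-size markup.
bad = disallowed_characters("file_name-1.txt", "[a-zA-Z0-9_.]")
too_long = exceeds_storage_size("caf\u00e9", 4)  # "é" takes two bytes in UTF-8
fits = exceeds_storage_size("cafe", 4)

print(bad, too_long, fits)
```

A real check would also need to handle the line-break type defined for Storage Size and the full regular-expression subset permitted for Allowed Characters, both of which this sketch leaves out.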

Terminology Extraction Step

The Text Analysis and the Terminology data categories can be utilized by the Term Extraction Step to extract term candidates.