MultilingualWeb-LT D3.1.4

From Okapi Framework
Revision as of 08:56, 5 June 2016 by Ysavourel (talk | contribs)

Between January 2012 and December 2013, the European Commission funded the MultilingualWeb-LT (LT-Web) Project.

Several companies, universities and research centers worked to define the Internationalization Tag Set 2.0 (ITS 2.0) through the W3C MultilingualWeb-LT Working Group.

One of the deliverables aimed at implementing the Web-LT metadata (i.e. ITS 2.0) within the Okapi Framework, especially in connection with XLIFF-related components.

This page provides an online summary of that deliverable: D3.1.4.

Goals and Results

The following table shows the initial goals for D3.1.4 as stated in the project’s Description of Work document and the corresponding results.

Goal: Improved existing Okapi filters (components able to extract to and merge from XLIFF) to implement metadata support when possible.
Result: The XML Filter and the HTML5-ITS Filter implement support for all relevant ITS data categories. The OpenOffice Filter implements support for some data categories.

Goal: XLIFF Reader with metadata support.
Result: The XLIFF Filter now implements support for ITS annotations. The implementation follows the ITS/XLIFF Mapping best practices document being produced by the ITS Interest Group.

Goal: XLIFF Writer with metadata support.
Result: The XLIFFWriter and XLIFFContent classes provide support for ITS 2.0 metadata. The implementation follows the ITS/XLIFF Mapping best practices document being produced by the ITS Interest Group.

Goal: Okapi pipeline steps to add, remove, modify and otherwise manipulate metadata; other necessary components as needed.
Result: Several steps have been updated or created to take advantage of the ITS 2.0 metadata. For example: the Enrycher Step, the Quality Check Step, the Term Extraction Step, etc.

Goal: The components will be accessible through Okapi tools such as Rainbow, Longhorn and Tikal, as well as directly from other applications using the Okapi libraries (including Maven packages) and Web services (e.g. Longhorn clients).
Result: The ITS-enabled components are accessible through Rainbow, Longhorn, Tikal, CheckMate, etc. They are also available as Maven artifacts for any Java application or library that needs them.

The mapping of ITS 2.0 in XLIFF 1.2 is defined by the Internationalization Tag Set Interest Group and is publicly available here:


Downloads of the latest version of the running applications:

General documentation of the Okapi tools and components:

Maven artifacts (releases):

Maven artifacts (snapshots, i.e. development version):

Google Code project for the Okapi Framework:

Source code of the Okapi Framework:

Site of the continuous build:

Business Benefits

Having ITS 2.0 support in the Okapi Framework provides several business benefits:

From the document authors’ viewpoint:

  • The case for providing ITS metadata inside XML or HTML5 documents is now easier to make: a set of low-level and high-level components is available across platforms, under an open-source license, and can be used to build processes that use such metadata.

From the localization tools developers’ viewpoint:

  • Java developers can easily take advantage of the ITS Engine to access ITS metadata at the node level in both XML and HTML5 documents. Because this access is low-level, it can be leveraged to construct many types of applications at a higher level. The developers do not have to process or even know much about global and local rules: they have the resolved data at their fingertips with just a few lines of code.
  • The ITS-enabled filters also provide a simple way to access ITS information along with extracted data. Quite a few in-house tools and commercial applications use Okapi’s filters. Using these filters, they can now access ITS data without any extra development work. From there they can integrate that information into their own workflows and take advantage of it.
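
To give a feel for what "resolved data at the node level" means, here is a minimal sketch that uses only the JDK's DOM API. It is not the Okapi ITS Engine and does not reproduce its API: the class name ItsTranslateSketch and its methods are hypothetical, and it handles only the local its:translate attribute of the ITS Translate data category in XML (inheritance from the nearest ancestor, default "yes" for elements). Global rules, which the ITS Engine also resolves, are deliberately left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the Okapi Framework.
class ItsTranslateSketch {

    // Namespace URI of the ITS markup.
    static final String ITS_NS = "http://www.w3.org/2005/11/its";

    /** Parses an XML string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getAttributeNS()
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    /**
     * Resolves the Translate value of an element from local markup only:
     * the nearest its:translate on the element or an ancestor wins,
     * and the default for elements is "yes".
     */
    static boolean isTranslatable(Element element) {
        for (Node n = element; n instanceof Element; n = n.getParentNode()) {
            String value = ((Element) n).getAttributeNS(ITS_NS, "translate");
            if ("yes".equals(value)) return true;
            if ("no".equals(value)) return false;
        }
        return true; // no local markup found: ITS default for elements
    }

    /** Returns "name=value" for every element, in document order. */
    static List<String> resolveAll(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            result.add(e.getNodeName() + "=" + isTranslatable(e));
        }
        return result;
    }

    public static void main(String[] args) {
        String xml =
            "<doc xmlns:its=\"http://www.w3.org/2005/11/its\" its:version=\"2.0\">"
          + "<head its:translate=\"no\"><title>Id-42</title></head>"
          + "<body><p>Hello <code its:translate=\"no\">x</code></p></body>"
          + "</doc>";
        // One line per element with its resolved Translate value.
        resolveAll(parse(xml)).forEach(System.out::println);
    }
}
```

A caller never looks at the markup itself; it only asks whether a given node is translatable. A full ITS processor offers the same kind of per-node answer, but across all data categories and with global rules applied.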

From the localization tools users’ viewpoint:

  • Localization tools users can use Okapi’s applications out of the box to create pipelines that take advantage of the ITS capabilities of many components. For example: add Text Analysis annotations, mark up Terminology, use standardized Localization Quality Issue reporting, perform Domain-sensitive machine translation, and much more.