Rainbow TKit - OmegaT Project

From Okapi Framework

Overview

The OmegaT Project package is one of the types of translation packages you can create with the Rainbow Translation Kit Creation Step. Such a package can be post-processed using the Rainbow Translation Kit Merging Step.


In this package the input documents are placed in an OmegaT project. OmegaT is a cross-platform open-source translation tool.

  • The project is created. If one already exists it is overwritten. The input documents are extracted into XLIFF documents that are placed in the source sub-directory of the project.
  • Existing and leveraged translations are output into the tm sub-directory of the project, or directly into the TMX of the project.
  • Segmented text units are represented in XLIFF and handled properly by OmegaT.
  • By default, inline codes are represented using the placeholder notation (<g>/<x/>/<bx>/<ex>) in the XLIFF documents, which are represented as numbered letter codes in OmegaT (e.g. <g1>/<x2/>/<b3>/<e3>). Optionally you may use encapsulation notation (<bpt>/<ept>/<ph>/<it>). See below for notes on compatibility with various versions of OmegaT.

This package can be opened directly with OmegaT.

Note that you can also use some of the filters of the Okapi Framework directly from OmegaT (i.e. without creating this package) by working with the Okapi Filters Plugin for OmegaT.

Options

Use <g></g> and <x/> notation: Set this option to create an XLIFF output where inline codes are written as placeholders that do not include the original data. Some file formats have inline codes that correspond to large chunks of data that are not useful to the translator; this notation lets you leave them out of the XLIFF output. This setting is suitable for versions of OmegaT prior to 3.0, or for OmegaT 3.0 or higher with the XLIFF filter set to Compatibility with OmegaT 2.6 (see Options > File Filters... > XLIFF > Options...).

Disabling this option will result in encapsulation-style codes (<bpt>/<ept>/<ph>/<it>); please use OmegaT 3.0 or later with default settings (Compatibility with OmegaT 2.6 option off for XLIFF filter) to ensure correct handling.

Allow segmentation in the OmegaT project: Set this option to set the sentence_seg flag of the project to true. This option is ignored, and the flag is always set to false, if there are segmented text units in the extracted text, that is, if a Segmentation Step is active before the creation of the translation kit, or if one of the input files has pre-segmented entries.

If you want to override the pre-segmentation, you can use the Desegmentation Step before the creation of the translation kit.
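
The rule for the sentence_seg flag can be summarised in a few lines. The sketch below is illustrative only: the function and parameter names are hypothetical and are not part of the Okapi API.

```python
def sentence_seg_flag(allow_segmentation: bool, has_segmented_text_units: bool) -> bool:
    """Return the value the sentence_seg flag would take under the rule described above.

    Both parameter names are hypothetical labels for this sketch.
    """
    # Pre-segmented content always wins: the option is ignored and the flag is false.
    if has_segmented_text_units:
        return False
    # Otherwise the flag simply follows the option.
    return allow_segmentation
```

With pre-segmented input the function returns False regardless of the option, which is why a Desegmentation Step is needed to hand segmentation over to OmegaT.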

Include post-processing hook: Set this option to include a hook in the OmegaT project that allows OmegaT's Create Translated Documents command to automatically invoke the Rainbow Translation Kit Merging process. For the command to be invoked, you must check the Also allow per-project external commands checkbox in the Options > Save... panel of OmegaT.

Note: Support for this feature requires version 2.6.2 or higher of OmegaT.

The hook is in the form of a CLI command:

 java -jar /path/to/okapi/lib/rainbow.jar -x TranslationKitPostProcessing -np "${projectRoot}manifest.rkm" -fc okf_rainbowkit-noprompt
  • /path/to/okapi will be an absolute path to the user's Okapi installation or it will be ${OKAPI_HOME} if a valid OKAPI_HOME environment variable is set.
  • ${projectRoot} and ${OKAPI_HOME} are expanded by OmegaT upon execution.
  • On Mac OS X an additional argument, -XstartOnFirstThread, is included. If you move a translation kit to or from an OS X system, you may have to manually either add or remove this argument. See below for instructions.
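
Adding or removing the OS X argument amounts to a small edit of the command string. The sketch below is one hedged way to do it; the function name is hypothetical, and it assumes that the argument is a JVM option placed directly after the java executable (before -jar) and that the executable name contains no spaces. Check the command generated on an actual OS X system before relying on this placement.

```python
MAC_FLAG = "-XstartOnFirstThread"


def set_mac_flag(command: str, on_mac: bool) -> str:
    """Return the hook command with the OS X argument added or removed.

    Hypothetical helper. Assumption: the first space-separated token is the
    java executable and the flag belongs immediately after it.
    """
    # Split on single spaces so that quotes and ${...} variables stay untouched.
    parts = [p for p in command.split(" ") if p != MAC_FLAG]
    if on_mac:
        parts.insert(1, MAC_FLAG)
    return " ".join(parts)


hook = (
    "java -jar /path/to/okapi/lib/rainbow.jar -x TranslationKitPostProcessing "
    '-np "${projectRoot}manifest.rkm" -fc okf_rainbowkit-noprompt'
)
mac_hook = set_mac_flag(hook, on_mac=True)
```

Calling set_mac_flag(mac_hook, on_mac=False) gives back the original command, so the same helper covers a kit moved in either direction.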

Note: The hook will fail if rainbow.jar cannot be found at the specified path. This may happen in the following situations:

  • You move your Okapi installation
  • You send the translation kit to a machine with Okapi installed at a different location, or not installed at all

In this case you can either:

  • Re-create the translation kit on the target machine, or
  • Manually fix the path in OmegaT: Go to Project > Properties > External Post-processing Command.
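
Before sending a kit to another machine, it can help to check whether the path in the hook points at an existing rainbow.jar. The following is a minimal sketch with hypothetical function names; it assumes POSIX-style paths and expands ${OKAPI_HOME} from the current environment rather than the way OmegaT does it.

```python
import os
import shlex


def rainbow_jar_path(command: str) -> str:
    """Extract the path that follows -jar in the hook command.

    Hypothetical helper: raises ValueError if the command has no -jar argument.
    """
    tokens = shlex.split(command)  # honours the quotes used in the hook
    jar = tokens[tokens.index("-jar") + 1]
    # Expand ${OKAPI_HOME} if it is set; unknown variables are left as-is.
    return os.path.expandvars(jar)


def hook_can_run(command: str) -> bool:
    """Return True if the jar referenced by the hook exists on this machine."""
    return os.path.isfile(rainbow_jar_path(command))
```

If hook_can_run returns False, either re-create the kit or correct the path in the OmegaT project properties as described above.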

Details

Sub-Directories

The extracted files are stored in the same directory structure as the original files, relative to the root of the file set.

For example, if you have two files named index.html in two different sub-directories, both will be extracted as index.html.xlf, each in its corresponding sub-directory.
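
The mapping from an input file to its extracted XLIFF document can be pictured as a simple path computation. This is a sketch of the naming rule only, not Rainbow's implementation; extracted_path is a hypothetical helper.

```python
from pathlib import PurePosixPath


def extracted_path(input_file: str, input_root: str) -> PurePosixPath:
    """Return where the extracted XLIFF would sit inside the package.

    Hypothetical helper: raises ValueError if input_file is not under input_root.
    """
    # Keep the directory structure relative to the root of the file set.
    rel = PurePosixPath(input_file).relative_to(input_root)
    # Extracted documents go under source/ with .xlf appended to the full name.
    return PurePosixPath("source") / rel.with_name(rel.name + ".xlf")
```

For instance, main/index.html and main/subDir/index.html with root main map to source/index.html.xlf and source/subDir/index.html.xlf, so the two files never collide.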

Inline Codes

By default, inline codes are represented using the placeholder notation (<g>/<x/>/<bx>/<ex>) in the XLIFF documents, which are represented as numbered letter codes in OmegaT that are independent of the content they contain (e.g. <g1>/<x2/>/<b3>/<e3>).

Optionally you may use encapsulation notation (<bpt>/<ept>/<ph>/<it>). In this mode, the appearance of the tags in OmegaT depends on the content of the tag (e.g. <bpt ...>foo</bpt> will show as <f0>).

The content of the inline codes in the TMX files for this package is output in the same notation as in the XLIFF documents to ensure proper matching.

See above for notes on compatibility with various versions of OmegaT.
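
To decide which OmegaT compatibility setting applies to an existing kit, you can check which notation its XLIFF content uses. The sketch below inspects the inner content of a single source or target element; the function name and the sample fragments are made up for illustration.

```python
import xml.etree.ElementTree as ET

PLACEHOLDER = {"g", "x", "bx", "ex"}
ENCAPSULATION = {"bpt", "ept", "ph", "it"}


def inline_notations(fragment: str) -> set:
    """Return the inline-code notations found in an XLIFF content fragment."""
    # Wrap the fragment so that mixed text and inline elements parse as one element.
    root = ET.fromstring(f"<source>{fragment}</source>")
    found = set()
    for element in root.iter():
        name = element.tag.split("}")[-1]  # drop a namespace prefix, if any
        if name in PLACEHOLDER:
            found.add("placeholder")
        elif name in ENCAPSULATION:
            found.add("encapsulation")
    return found


# Made-up fragments showing the same text in each notation.
placeholder_sample = 'Click <g id="1">here</g><x id="2"/>'
encapsulation_sample = 'Click <bpt id="1">&lt;b&gt;</bpt>here<ept id="1">&lt;/b&gt;</ept>'
```

A result of {"placeholder"} points to the OmegaT 2.6 compatibility mode, while {"encapsulation"} points to the OmegaT 3.0 default settings.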

Segmentation

Segmentation is supported: OmegaT handles the XLIFF standard segmentation.

If the input is not segmented, you can use the Allow segmentation in the OmegaT project option to control whether OmegaT segments the text when it opens the project.

Pre-Translation

Translatable entries can be pre-translated in different ways. In this package, the result of pre-translation is as follows:

  • Existing translations marked as approved are put in the project_save.tmx file, located in the omegat sub-directory.
  • Existing translations without the approved flag are put in the unapproved.tmx file, located in the tm sub-directory.
  • Existing translations coming from alternate-translation constructs in the original file (for example the <alt-trans> elements of an XLIFF document) are put in the alternates.tmx file, located in the tm sub-directory.
  • Translations coming from some leveraging steps are put in the leverage.tmx file, located in the tm sub-directory.
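
The routing above can be written down as a lookup table. The category keys below are hypothetical labels chosen for this sketch; only the file names and sub-directories come from the list above.

```python
# Hypothetical category labels mapped to the TMX files named above.
TMX_DESTINATIONS = {
    "approved": "omegat/project_save.tmx",
    "unapproved": "tm/unapproved.tmx",
    "alternate": "tm/alternates.tmx",
    "leveraged": "tm/leverage.tmx",
}


def tmx_for(kind: str) -> str:
    """Return the package-relative TMX file for a kind of pre-translation."""
    try:
        return TMX_DESTINATIONS[kind]
    except KeyError:
        raise ValueError(f"unknown pre-translation kind: {kind!r}") from None
```

Only approved translations land in the project's own TMX; everything else stays in tm, where OmegaT treats it as reference material.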

Package Layout

Assuming that your package name is pack1, your input root ends with main, the target language is French, you have selected to use the same filenames as the input files for the output files, and you have the following source files:

--- main
    |
    +--- index.html
    +--- myFile.idml
    +--- subDir
         |
         +--- index.html 

The layout of this package after creation will be:

--- pack1
    |
    +--- omegat.project
    +--- manifest.rkm
    +--- glossary
    +--- tm
    |    |
    |    +--- *.tmx
    |    
    +--- omegat
    |    |
    |    +--- project_save.tmx
    |
    +--- original
    |    |
    |    +--- index.html
    |    +--- myFile.idml
    |    +--- subDir
    |         |
    |         +--- index.html
    |
    +--- source
    |    |
    |    +--- index.html.xlf
    |    +--- myFile.idml.xlf
    |    +--- subDir
    |         |
    |         +--- index.html.xlf
    |
    +--- target
  • original contains a copy of the original source documents. You need those files for post-processing.
  • source contains the documents that are to be translated.
  • target contains no files. This is where the translated files should go. They will be created by OmegaT with the command Create Translated Documents.
  • tm contains any TMX files generated by the step.
  • omegat contains the project's TMX file.
  • glossary contains nothing. This is where you can place your glossaries.

For post-processing, the target sub-directory must contain the translated XLIFF documents. After post-processing, a new done sub-directory is created:

--- pack1
    |
    +--- target
    |    |
    |    +--- index.html.xlf
    |    +--- myFile.idml.xlf
    |    +--- subDir
    |         |
    |         +--- index.html.xlf
    |
    +--- done
    |    |
    |    +--- index.html
    |    +--- myFile.idml
    |    +--- subDir
    |         |
    |         +--- index.html
    |
    +... (same as after creation)
  • done contains the merged translated documents. This directory is created during post-processing.
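
The relationship between the translated XLIFF files and the merged output can be sketched as a path mapping. This is only an illustration, assuming the option to reuse the input filenames is selected as in the layout above; merged_path is a hypothetical helper, not part of Rainbow.

```python
from pathlib import PurePosixPath


def merged_path(translated_file: str) -> PurePosixPath:
    """Map a translated XLIFF under target/ to the merged document under done/.

    Hypothetical helper: raises ValueError for paths outside target/ or
    without the .xlf extension.
    """
    rel = PurePosixPath(translated_file).relative_to("target")
    if rel.suffix != ".xlf":
        raise ValueError(f"not an XLIFF document: {translated_file}")
    # Dropping the trailing .xlf restores the original file name.
    return PurePosixPath("done") / rel.with_suffix("")
```

For example, target/subDir/index.html.xlf maps to done/subDir/index.html, mirroring the original directory structure.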

Limitations

This package is BETA