Id-Based Copy Step

From Okapi Framework

Overview

This step copies into a destination file (first input file) the text of a reference file (second input file) for text units that have the same id. The ids are taken from the name (TextUnit.getName()) of each text unit.

If you are working with lists of input files, they should be paired. For example, in Rainbow, the first file in Input List 1 must correspond to the first file in Input List 2, and so forth.

Any filter that produces unique names (i.e., ids) for its text units will work with this step, for example the Properties Filter.

Takes: filter events. Sends: filter events.

For each pair of input files:

  1. The reference file (second input file) is read into a table indexed on the name of each text unit.
  2. The destination file (first input file) is processed:
    For each text unit in the destination file:
    1. The step searches the reference table for an entry with the same name.
    2. If a match is found: If the reference file is monolingual (like a properties file), the source text of the matching table entry is copied into the target of the destination text unit. If the reference file is multilingual (like a PO file), the target text of the entry is used instead.
      If no match is found: The text unit of the destination file remains untouched.
  3. After processing each pair of input files, the entries in the table that were not used are listed as warnings.
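
The steps above can be sketched in a few lines of Java. This is only an illustration of the lookup logic, not the step's actual implementation: the Unit class, the copyById method, and the referenceIsMultilingual flag are hypothetical stand-ins and are not part of the Okapi API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class IdBasedCopySketch {

    // Hypothetical stand-in for a text unit; not the Okapi TextUnit class.
    static final class Unit {
        final String name;   // the id used for matching
        final String source; // source text
        String target;       // target text, null when there is none

        Unit(String name, String source, String target) {
            this.name = name;
            this.source = source;
            this.target = target;
        }
    }

    // Copies text from the reference units into the destination units that
    // have the same name, and returns the names of the reference entries
    // that were never used (the ones the step would report as warnings).
    static List<String> copyById(List<Unit> destination, List<Unit> reference,
                                 boolean referenceIsMultilingual) {
        // Step 1: read the reference into a table indexed on the unit name.
        Map<String, Unit> table = new LinkedHashMap<>();
        for (Unit ref : reference) {
            table.put(ref.name, ref);
        }

        // Step 2: process each text unit of the destination.
        Set<String> used = new HashSet<>();
        for (Unit dest : destination) {
            Unit ref = table.get(dest.name);
            if (ref == null) {
                continue; // no match: the destination unit stays untouched
            }
            // Monolingual reference: use its source. Multilingual: use its target.
            dest.target = referenceIsMultilingual ? ref.target : ref.source;
            used.add(ref.name);
        }

        // Step 3: collect the table entries that were not used.
        List<String> unused = new ArrayList<>();
        for (String name : table.keySet()) {
            if (!used.contains(name)) {
                unused.add(name);
            }
        }
        return unused;
    }
}
```

For example, with destination units named a and b and a monolingual reference holding units named b and c, only b receives a new target, a is left as it was, and c is returned as unused.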

A few additional notes:

  • Entries in the destination file that are set to not translatable are not modified.
  • The process copies the complete content of the text units; no adjustment is made for segmented entries. Note that you can remove segmentation with the Desegmentation Step.
  • The entries in the reference and destination files do not have to be sorted.
  • Technically, the reference file can be in a different format than the destination file, but no adjustment is made for the inline codes in the content copied from the reference file into the destination file. In other words: the result may or may not be fine, depending on the file formats, whether there are inline codes or not, and your expectations.
  • Both files must have text units with unique names (e.g. unique resname in XLIFF).
  • No warning is generated if an entry of the destination file is not found in the reference file.
  • A warning generated because an entry is found in the reference file but not in the destination file does not necessarily mean there is a problem: the entry may simply be obsolete and have been removed from the newer file.
  • If a destination file is not associated with a reference file, a warning is generated, and the file is processed as if its reference file were empty.
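
The pairing of input lists and the handling of a destination file without a reference file might look like the following sketch. It assumes that a missing reference simply means Input List 2 is shorter than Input List 1. The Pair class, the pairInputs method, and the warning text are hypothetical and do not come from Okapi.

```java
import java.util.ArrayList;
import java.util.List;

class InputPairingSketch {

    // One destination file and its reference file.
    // reference is null when no reference file was supplied (assumption).
    static final class Pair {
        final String destination;
        final String reference;

        Pair(String destination, String reference) {
            this.destination = destination;
            this.reference = reference;
        }
    }

    // Pairs the files by position: the first file of list 1 goes with the
    // first file of list 2, and so forth. A destination file without a
    // counterpart gets a warning and is later treated as having an empty
    // reference, so none of its text units would be modified.
    static List<Pair> pairInputs(List<String> list1, List<String> list2,
                                 List<String> warnings) {
        List<Pair> pairs = new ArrayList<>();
        for (int i = 0; i < list1.size(); i++) {
            String reference = i < list2.size() ? list2.get(i) : null;
            if (reference == null) {
                // Hypothetical message; the real wording may differ.
                warnings.add("No reference file for " + list1.get(i));
            }
            pairs.add(new Pair(list1.get(i), reference));
        }
        return pairs;
    }
}
```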

Parameters

If the text unit has a match, the following options are available:

Set the text unit as non-translatable — Set this option to set the text unit as non-translatable.

Set the target property 'approved' to 'yes' — Set this option to set the target property approved to yes.

Entries that do not have a match have these properties left as they were.
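
As a rough illustration, the two options could be applied as shown below. The Options and Unit classes and their field names are hypothetical; the actual parameter names and the Okapi property API are not shown here.

```java
import java.util.HashMap;
import java.util.Map;

class MatchOptionsSketch {

    // Hypothetical holder for the two options described above.
    static final class Options {
        boolean markAsNonTranslatable; // "Set the text unit as non-translatable"
        boolean markAsApproved;        // "Set the target property 'approved' to 'yes'"
    }

    // Hypothetical stand-in for a text unit with target properties.
    static final class Unit {
        boolean translatable = true;
        final Map<String, String> targetProperties = new HashMap<>();
    }

    // Applies the options to a unit, but only when the unit had a match.
    static void applyOptions(Unit unit, boolean matched, Options options) {
        if (!matched) {
            return; // unmatched entries keep their properties as they were
        }
        if (options.markAsNonTranslatable) {
            unit.translatable = false;
        }
        if (options.markAsApproved) {
            unit.targetProperties.put("approved", "yes");
        }
    }
}
```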

Limitations

  • Because this step uses a table in memory, you may run into memory limitations if you are working with very large reference files.