YAML Filter

From Okapi Framework

Overview

The YAML Filter is an Okapi component that implements the IFilter interface for YAML files and supports Ruby on Rails message variables. The filter is implemented in the class net.sf.okapi.filters.yaml.YamlFilter of the Okapi library.

A YAML file looks like the example below. The highlighted parts are extractable:

fr:
  activerecord:
    errors:
      template:
        header:
          list: [one, two, three]
          map: {key: value, key2: value2}
          one: "Impossible d'enregistrer {{model}}: 1 erreur"
          other: "Impossible d'enregistrer {{model}}: {{count}} erreurs."
        body: "Veuillez vérifier les champs suivants :"
      messages:
        inclusion: "n'est pas inclus(e) dans la liste"
        exclusion: "n'est pas disponible"
        invalid: "n'est pas valide"
        confirmation: "ne concorde pas avec la confirmation"

Processing Details

Input Encoding

The filter decides which encoding to use for the input document using the following logic:

  • If the file has a Unicode Byte-Order-Mark:
    • Then, the corresponding encoding (e.g. UTF-8, UTF-16, etc.) is used.
  • Otherwise, the input encoding used is the default encoding that was specified when opening the document.
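The decision above can be sketched as follows. This is an illustrative stand-alone version, not the filter's actual code: the function name pick_encoding and the table _BOMS are hypothetical, and only the UTF-8, UTF-16 and UTF-32 byte-order marks are considered.

```python
import codecs

# BOM signatures (illustrative table). The UTF-32 LE BOM begins with the
# same two bytes as the UTF-16 LE BOM, so UTF-32 is tested before UTF-16.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def pick_encoding(raw: bytes, default_encoding: str) -> str:
    """Return the encoding named by a leading BOM, else the default."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    # No BOM: fall back to the encoding given when opening the document.
    return default_encoding
```

Here default_encoding plays the role of the encoding specified when the document is opened; it is used only when no byte-order mark is found.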

Line-Breaks

The output uses the same type of line break as the original input.
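One way to picture this behavior is the sketch below. It is an assumption about how such preservation can be done, not a description of the filter's internals; the function names detect_line_break and restore_line_breaks are hypothetical.

```python
def detect_line_break(text: str) -> str:
    """Guess the line-break type of a text (defaults to LF)."""
    if "\r\n" in text:
        return "\r\n"  # Windows style
    if "\r" in text:
        return "\r"    # old Mac style
    return "\n"        # Unix style, also used when no break is found


def restore_line_breaks(output: str, original: str) -> str:
    """Rewrite the output so it uses the original's line-break type."""
    # Normalize everything to LF first, then convert to the detected type.
    normalized = output.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", detect_line_break(original))
```

When trying this on real files in Python, open them with newline="" so that the line breaks are not translated on read or write.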

Identifier

Each extracted entry is named with the sequence of all its parents' identifiers, separated by slashes. For instance, in the YAML example above, the name of the text unit with the content "n'est pas disponible" is fr/activerecord/errors/messages/exclusion.
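The naming scheme can be illustrated with a small sketch that walks an already-parsed document. It assumes the YAML has been loaded into nested dictionaries (a fragment of the example above is written out by hand here), and it covers only mappings and strings; how the filter names list items is not described on this page, so lists are left out. The function name iter_named_strings is hypothetical.

```python
from typing import Iterator, Tuple


def iter_named_strings(node, parents=()) -> Iterator[Tuple[str, str]]:
    """Yield (name, text) pairs, where name is the slash-joined key path."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_named_strings(value, parents + (str(key),))
    elif isinstance(node, str):
        yield "/".join(parents), node
    # Other node types (lists, numbers, ...) are ignored in this sketch.


# Hand-written fragment of the example document.
doc = {"fr": {"activerecord": {"errors": {"messages": {
    "inclusion": "n'est pas inclus(e) dans la liste",
    "exclusion": "n'est pas disponible",
}}}}}

names = {text: name for name, text in iter_named_strings(doc)}
print(names["n'est pas disponible"])
```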

Parameters

Options Tab

Stand-alone strings

Extract strings without associated key — Set this option to extract strings that are not directly associated with a key.

Strings with keys

Extract all key/strings pairs — Set this option to extract all strings that have an associated key. If a regular expression for exceptions is defined, the strings that have a key matching the expression are not extracted.

Do not extract key/string pairs — Set this option to not extract any string that has an associated key. If a regular expression for exceptions is defined, the strings that have a key matching the expression are extracted.

Excepted when the key matches the following regular expression — Enter a regular expression that matches the keys whose behavior should be the inverse of the default behavior you have selected for key/string pairs.

Use the key as the resname — Set this option to use the key as the name of the extracted item (resname in XLIFF).

Use the full key path — Set this option to use the full key path in the resname. For example: menu/value/popup/menuitem/value. The Use the key as the resname option must be set for this option to take effect. If this option is enabled, the exception regular expression is applied to the full path.
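The interaction between the default behavior, the exception expression and the full key path can be summarized in a short sketch. The function should_extract and its parameter names are hypothetical, and two points are assumptions: the expression is tested with a search rather than a full match, and use_full_key_path is taken to mean that both key-related options are set.

```python
import re
from typing import Optional, Sequence


def should_extract(key_path: Sequence[str],
                   extract_all_pairs: bool,
                   exception_regex: Optional[str] = None,
                   use_full_key_path: bool = False) -> bool:
    """Decide whether the string stored under key_path is extracted."""
    # The exception is tested against the full path or the last key only.
    candidate = "/".join(key_path) if use_full_key_path else key_path[-1]
    is_exception = (bool(exception_regex)
                    and re.search(exception_regex, candidate) is not None)
    # An exception inverts the default behavior chosen for key/string pairs.
    return extract_all_pairs != is_exception
```

For the key path menu/popup/value, the expression ^value$ excludes the string when all pairs are extracted by default, and includes it when no pairs are extracted by default.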

Content Processing Tab

Process text content with this sub-filter — Specify an Okapi filter ID (e.g. okf_html) to process the content of all translatable text with that filter. Leave this field blank for default behavior.

Find inline codes by patterns defined below — Set this option to use the specified regular expressions on the text of the extracted items. Any match will be converted to an inline code.

Note: This option cannot be used together with the sub-filtering option.

By default the expression is:

((%(([-0+#]?)[-0+#]?)((\d\$)?)(([\d\*]*)(\.[\d\*]*)?)[dioxXucsfeEgGpn])
|((\\r\\n)|\\a|\\b|\\f|\\n|\\r|\\t|\\v)
|(\{\d.*?\}))
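To see what the default expression matches, it can be tried outside the filter. The sketch below assumes that the line breaks in the expression above are for display only, so the three lines are joined into one pattern, and it uses Python's regular expression engine, which accepts this pattern unchanged. The function mark_codes is hypothetical; it wraps each match in <> brackets instead of creating real inline codes.

```python
import re

# The default expression, with its three display lines joined into one.
DEFAULT_CODE_PATTERN = re.compile(
    r"((%(([-0+#]?)[-0+#]?)((\d\$)?)(([\d\*]*)(\.[\d\*]*)?)[dioxXucsfeEgGpn])"
    r"|((\\r\\n)|\\a|\\b|\\f|\\n|\\r|\\t|\\v)"
    r"|(\{\d.*?\}))"
)


def mark_codes(text: str) -> str:
    """Wrap every match of the default expression in <> brackets."""
    return DEFAULT_CODE_PATTERN.sub(lambda m: "<" + m.group(0) + ">", text)


# The raw string holds a literal backslash followed by "n",
# as it would appear in the text of an extracted item.
print(mark_codes(r"Saved %d files to {0}\n"))
# prints: Saved <%d> files to <{0}><\n>
```

The three alternatives cover printf-style specifiers such as %d or %1$s, escaped control sequences such as \n written as two characters, and numbered placeholders such as {0}.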

Add — Click this button to add a new rule.

Remove — Click this button to remove the current rule.

Move Up — Click this button to move the current rule upward.

Move Down — Click this button to move the current rule downward.

[Top-right text box] — Enter the regular expression for the current rule. Use the Modify button to enter the edit mode. The expression must be a valid regular expression. Its syntax and its effect are checked automatically: the rule is tested against the test data in the text box below, and the result is shown in the bottom-right text box.

Modify — Click this button to edit the expression of the current rule. This button is labeled Accept when you are in edit mode.

Accept — Click this button to save any changes you have made to the expression and leave the edit mode. This button is labeled Modify when you are not in edit mode.

Discard — Click this button to leave the edit mode and revert the current rule to the expression it had before you started the edit mode.

Patterns — Click this button to display some help on regular expression patterns.

Test using all rules — Set this option to test all the rules of the set at the same time; when it is not set, only the current rule is tested. The syntax of the current rule is checked automatically. The result of the test is displayed in the bottom-right text box, where the parts of the sample text that match the expressions are shown in <> brackets.

[Middle-right text box] — Optional test data to test the regular expression for the current rule or all rules depending on the Test using all rules option.

[Bottom-right text box] — Shows the result of the regular expression applied to the test data.

Limitations

Wrapped lines in the original source document are unwrapped in the target document; that is, all of the text is placed on a single line. If an illegal character is introduced, the user must manually add double or single quotes so that the target YAML file is valid. The filter issues a warning if it can detect this situation.