Rainbow - Byte-Order-Mark Conversion

From Okapi Framework

Overview

Note: Starting in M10, this utility has been replaced by a pre-defined pipeline that uses the BOM Conversion Step.

This utility allows you to add or remove the Byte-Order-Mark (BOM) to or from UTF-8 and UTF-16 files.

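The Byte-Order-Mark is the Unicode character U+FEFF placed at the very start of a file. In UTF-16, the order of its two bytes indicates whether the file is big-endian (bytes FE FF) or little-endian (bytes FF FE). In UTF-8, it is encoded as the bytes EF BB BF and serves only as an encoding signature, since UTF-8 has no byte-order variants. For more information on the BOM see http://www.unicode.org/faq/utf_bom.html.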
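As an illustration of how a BOM can be recognized, the following minimal Python sketch checks the leading bytes of a file's content against the UTF-8 and UTF-16 BOM byte sequences. It is not the utility's actual code: the function name `detect_bom` and the `BOMS` table are hypothetical, and UTF-32 is deliberately ignored because this utility only handles UTF-8 and UTF-16.

```python
import codecs

# Byte sequences produced by U+FEFF at the start of a file.
# Assumption: only UTF-8 and UTF-16 are considered; a UTF-32LE file
# (FF FE 00 00) would be reported as UTF-16LE by this sketch.
BOMS = (
    (codecs.BOM_UTF8, "UTF-8"),         # EF BB BF
    (codecs.BOM_UTF16_BE, "UTF-16BE"),  # FE FF
    (codecs.BOM_UTF16_LE, "UTF-16LE"),  # FF FE
)


def detect_bom(data: bytes):
    """Return (encoding_name, bom_bytes) if data starts with a known BOM, else None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, bom
    return None
```

For example, `detect_bom(open(path, "rb").read(4))` would be enough to classify a file, since no BOM handled here is longer than three bytes.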

This utility does not use filters.

Caller Parameters

  • The list of the input documents to modify (Input list 1).
  • The default input encoding (used when adding a BOM).
  • The names and locations of the output documents.

Parameters

Options Tab

Actions on the Byte-Order-Mark

Remove the Byte-Order-Mark if it is present: Select this option to remove the BOM from the input files if one is detected. By default, only the BOM of UTF-8 files is removed.

Remove also UTF-16 BOMs: Select this option to also remove the BOM from UTF-16 files if one is detected. This is not recommended: UTF-16 files should have a BOM so that readers can determine their byte order.

Add the Byte-Order-Mark if it is not already present: Select this option to add a BOM to the input files if one is not detected. Note that the input files must already be in UTF-8 or UTF-16. When using this option, you must also specify the encoding of each file, so the utility can add the proper type of BOM.
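The behavior of these three options can be sketched roughly as follows in Python. This is an assumption-laden illustration, not the utility's implementation: the names `remove_bom`, `add_bom`, `also_utf16`, and `BOM_FOR_ENCODING` are hypothetical, and the functions work on raw bytes without re-encoding the content.

```python
import codecs

UTF8_BOM = codecs.BOM_UTF8
UTF16_BOMS = (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)


def remove_bom(data: bytes, also_utf16: bool = False) -> bytes:
    """Strip a leading BOM; UTF-16 BOMs are removed only if also_utf16 is True."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    if also_utf16:
        for bom in UTF16_BOMS:
            if data.startswith(bom):
                return data[len(bom):]
    return data


# Hypothetical mapping from the declared encoding to the BOM bytes to add.
BOM_FOR_ENCODING = {
    "utf-8": codecs.BOM_UTF8,
    "utf-16be": codecs.BOM_UTF16_BE,
    "utf-16le": codecs.BOM_UTF16_LE,
}


def add_bom(data: bytes, encoding: str) -> bytes:
    """Prepend the BOM for the declared encoding unless a BOM is already present."""
    bom = BOM_FOR_ENCODING[encoding.lower()]  # KeyError for unsupported encodings
    if data.startswith((UTF8_BOM,) + UTF16_BOMS):
        return data  # already has a BOM; leave the content untouched
    return bom + data
```

The sketch shows why the encoding must be supplied when adding a BOM: content without a BOM does not reliably reveal whether it is UTF-8, UTF-16BE, or UTF-16LE, so the caller has to state which byte sequence is the right one to prepend.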