Tikal - Miscellaneous Commands

From Okapi Framework

Segment Files

This command applies SRX segmentation rules to the input files. If the file format supports segmented output (e.g. XLIFF, TTX), the result of the segmentation is written to the output files.

You can use the -seg option to specify that the extracted text should be segmented. Use -seg without a file name to apply the default segmentation rules, or use "-seg myRules.srx" to specify your own rules. The rules file must be in SRX format.

The output files have a .out extension prepended to the original extension. For example, if your original file is myFile.html, the output document is myFile.out.html.
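
The naming rule above can be sketched in Python. This is only an illustration, not Tikal's own code: the helper name output_name is hypothetical, and the handling of names with no extension or with several dots is an assumption.

```python
import os.path


def output_name(input_path: str) -> str:
    """Insert ".out" before the last extension of input_path.

    Assumption: only the last extension counts, and a name without any
    extension simply gets ".out" appended. Tikal itself may behave differently.
    """
    root, ext = os.path.splitext(input_path)
    return f"{root}.out{ext}"


print(output_name("myFile.html"))  # myFile.out.html
print(output_name("myFile.xlf"))   # myFile.out.xlf
```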

The syntax of this command is:

-s [options] inputFile [inputFile2...]

Where the options are:

-fc configId The identifier of the filter configuration to use for the extraction.
-ie encoding The encoding name of the input files. This is used only if the filter cannot detect the encoding from the input file itself.
-sl srcLang The code of the source language of the input files. See more details...
-tl trgLang The code of the target language for the output (also used in the input if the input documents are multilingual). See more details...
-seg [srxFile] The segmentation rules to utilize. To specify the default rules that come with the installation, use -seg without filename. The default rules are in config/defaultSegmentation.srx in your Okapi main directory.
-rd rootDirectory The root directory (by default the user's home directory).

For example:

tikal -s myFile.xlf

Creates an output document named myFile.out.xlf from the input document myFile.xlf. The entries in the output have been segmented according to the default segmentation rules.
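
If you call Tikal from a script, it may help to assemble the argument list programmatically. The following Python sketch only builds the list, following the "-s [options] inputFile [inputFile2...]" syntax; the function name build_segment_command and the language code "en" are illustrative assumptions, and only options documented in this section are used.

```python
from typing import List, Optional, Sequence


def build_segment_command(
    input_files: Sequence[str],
    srx_file: Optional[str] = None,
    src_lang: Optional[str] = None,
    config_id: Optional[str] = None,
) -> List[str]:
    """Assemble 'tikal -s [options] inputFile [inputFile2...]' as an argv list."""
    if not input_files:
        raise ValueError("at least one input file is required")
    cmd = ["tikal", "-s"]
    if config_id is not None:
        cmd += ["-fc", config_id]   # filter configuration identifier
    if src_lang is not None:
        cmd += ["-sl", src_lang]    # source language code
    if srx_file is not None:
        cmd += ["-seg", srx_file]   # custom SRX rules; omitted to keep the defaults
    cmd += list(input_files)        # options come first, input files last
    return cmd


print(build_segment_command(["myFile.xlf"]))
print(build_segment_command(["myFile.html"], srx_file="myRules.srx", src_lang="en"))
```

The resulting list can be passed to subprocess.run(cmd, check=True), assuming the tikal launcher is on your PATH.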

List Filter Configurations

This command lists all the filter configurations available for Tikal. The configurations listed are the ones you can use as filter configurations for the input files (-fc option). A filter configuration indicates how to extract the document.

The syntax of this command is:

-lfc | -listconf

For example:

tikal -listconf

Lists all the filter configurations currently available.

Edit Filter Configurations

This command edits or views filter configurations.

Note: This command requires access to UI editors that are available only if you have one of the okapi-apps platform-specific distributions. If you run this command from the okapi-lib cross-platform distribution, you will get an error. To edit filter configurations in the okapi-lib distribution, open the .fprm files directly. Make sure to always save your modifications in UTF-8.
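
If you script changes to a .fprm file instead of using an editor, state the encoding explicitly so the file stays in UTF-8 regardless of the platform default. The Python sketch below is one way to do that; the helper name update_fprm is hypothetical, and it treats the file as plain text without assuming anything about its internal format.

```python
from pathlib import Path


def update_fprm(path: str, old: str, new: str) -> None:
    """Replace text in a .fprm file, reading and writing it as UTF-8."""
    fprm = Path(path)
    # Explicit encoding: do not rely on the platform's default code page.
    text = fprm.read_text(encoding="utf-8")
    fprm.write_text(text.replace(old, new), encoding="utf-8")
```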

The syntax of this command is:

-e [[-fc] configId]

For example:

tikal -e okf_regex@myConfig

Edits the filter configuration okf_regex@myConfig. This is a user configuration for the Regex Filter.

tikal -e

Opens the Filter Configurations dialog box, where all the available configurations are listed and can be viewed or edited, and from where you can create new configurations.
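
The identifier okf_regex@myConfig in the example above suggests a two-part form: a filter part before the @ and a user configuration name after it. The Python sketch below splits an identifier on that basis. The function name split_config_id is hypothetical, and the two-part convention is inferred from the example rather than from a formal specification.

```python
from typing import Optional, Tuple


def split_config_id(config_id: str) -> Tuple[str, Optional[str]]:
    """Split 'okf_regex@myConfig' into ('okf_regex', 'myConfig').

    Assumption: the text before '@' names the filter and the text after it
    names the user configuration; an identifier without '@' has no user part.
    """
    base, sep, custom = config_id.partition("@")
    return base, (custom if sep else None)


print(split_config_id("okf_regex@myConfig"))  # ('okf_regex', 'myConfig')
print(split_config_id("okf_regex"))           # ('okf_regex', None)
```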

Output Scoping Report

This command outputs a scoping report that includes word counts and matching statistics.

The report is written to stdout. The content is the same as the Scoping Report Step's default template:

  • Date
  • File list
  • Total word count
  • If leveraging is used:
    • Exact Local Context match word count
    • 100% Match word count
    • Fuzzy Match word count
    • Repetition word count

The syntax of this command is:

-sr [options] inputFile [inputFile2...]

Available options:

-fc configId The identifier of the filter configuration to use for the extraction.
-ie encoding The encoding name of the input files. This is used only if the filter cannot detect the encoding from the input file itself.
-sl srcLang The code of the source language of the input files. See more details...
-tl trgLang The code of the target language for the output (also used in the input if the input documents are multilingual). See more details...
-seg [srxFile] The segmentation rules to utilize. To specify the default rules that come with the installation, use -seg without filename. The default rules are in config/defaultSegmentation.srx in your Okapi main directory.
-pen tmDirectory|
-tt [hostname[:port]]|
-gs configFile|
-mm [key]|
-gg configFile|
-apertium [configFile]|
-ms configFile|
-tda configFile|
-lingo24 configFile|
-mmt url [context]|
-bi bilingualFile
A translation resource connector to use to translate the document: -pen for the Pensieve TM Connector, -tt for the Translate Toolkit TM Connector, -gs for the GlobalSight TM Connector, -mm for MyMemory TM Connector, -gg for the Google MT v2 Connector, -apertium for the Apertium MT Connector, -ms for the Microsoft Translator Connector, -tda for the TDA Translation Repository Connector, -lingo24 for the Lingo24 Premium MT Connector, -mmt for the ModernMT API Connector and -bi for the Bilingual File Connector.

The leveraging occurs after segmentation, if you have specified segmentation rules.

Note that some Internet-based resources may be slow and result in lengthy processing times. Also be aware that some translation resources may not always handle inline codes well.

-opt threshold TM query option: The threshold is a number between 0 and 100. If this option is not set, the default is 95. Note that this option may be limited for some search engines because of the way they are configured.
-maketmx [tmxFile] Generates a TMX document with all the entries leveraged. You can specify the name of the document; if you do not, it will be named pretrans.tmx.
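
As a rough illustration of how these options combine, the Python sketch below assembles a scoping report command line and checks the threshold range before anything is run. It is a sketch, not part of Tikal: the function name build_scoping_report_command and the sample values (myTM, leveraged.tmx) are hypothetical. The connector flags and names are copied from the option list above.

```python
from typing import List, Optional, Sequence

# Connector flags and names, as given in the option list of this section.
CONNECTORS = {
    "-pen": "Pensieve TM Connector",
    "-tt": "Translate Toolkit TM Connector",
    "-gs": "GlobalSight TM Connector",
    "-mm": "MyMemory TM Connector",
    "-gg": "Google MT v2 Connector",
    "-apertium": "Apertium MT Connector",
    "-ms": "Microsoft Translator Connector",
    "-tda": "TDA Translation Repository Connector",
    "-lingo24": "Lingo24 Premium MT Connector",
    "-mmt": "ModernMT API Connector",
    "-bi": "Bilingual File Connector",
}


def build_scoping_report_command(
    input_files: Sequence[str],
    connector: Optional[str] = None,
    connector_args: Sequence[str] = (),
    threshold: Optional[int] = None,
    tmx_file: Optional[str] = None,
) -> List[str]:
    """Assemble 'tikal -sr [options] inputFile [inputFile2...]' as an argv list."""
    if not input_files:
        raise ValueError("at least one input file is required")
    cmd = ["tikal", "-sr"]
    if connector is not None:
        if connector not in CONNECTORS:
            raise ValueError(f"unknown connector option: {connector}")
        cmd += [connector, *connector_args]
    if threshold is not None:
        # The threshold must be a number between 0 and 100.
        if not 0 <= threshold <= 100:
            raise ValueError("threshold must be between 0 and 100")
        cmd += ["-opt", str(threshold)]
    if tmx_file is not None:
        cmd += ["-maketmx", tmx_file]
    cmd += list(input_files)
    return cmd


print(build_scoping_report_command(
    ["myFile.html"],
    connector="-pen",
    connector_args=["myTM"],
    threshold=80,
    tmx_file="leveraged.tmx",
))
```

Leaving threshold unset omits -opt entirely, so the documented default of 95 applies.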