Ratel - Groups and Options

This dialog box allows you to create, edit and remove groups of rules and language maps.

Options

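Cascade language maps matching — Set this option to apply the rules of every language map whose pattern matches the language code, in the order the maps are listed. If this option is not set, only the first matching language map is used.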
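The effect of cascading can be sketched as follows. This is a minimal illustration, not the Okapi implementation: the `select_rule_groups` helper, the sample maps and the choice of matching the whole language code are assumptions made for the example.

```python
import re

def select_rule_groups(language_code, language_maps, cascade):
    """Return the names of the rule groups that apply to language_code.

    language_maps is an ordered list of (pattern, group_name) pairs.
    Assumption: a pattern must match the whole language code.
    """
    selected = []
    for pattern, group_name in language_maps:
        if re.fullmatch(pattern, language_code):
            selected.append(group_name)
            if not cascade:
                break  # without cascading, the first matching map wins
    return selected

# Hypothetical language maps, in the order they appear in the list.
maps = [
    ("[Ee][Nn].*", "English"),
    ("[Ff][Rr].*", "French"),
    (".*", "Default"),
]

print(select_rule_groups("en-US", maps, cascade=False))  # ['English']
print(select_rule_groups("en-US", maps, cascade=True))   # ['English', 'Default']
```

Because the first match wins when cascading is off, the order of the language maps in the list matters in both modes.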

Segment sub-flow items — Set this option to set the SRX segmentsubflows attribute to yes. Some tools use this attribute to decide whether to segment text runs embedded inside other text runs.
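In an SRX 2.0 file, both this setting and the cascade setting are stored as attributes of the header element. The sketch below reads them back with the Python standard library. The sample document is hypothetical and deliberately minimal (it is not a complete SRX file), and `read_header_flags` is a name invented for the example.

```python
import xml.etree.ElementTree as ET

SRX_NS = "{http://www.lisa.org/srx20}"

# Hypothetical, minimal SRX 2.0 skeleton used only for this illustration.
SAMPLE = """<srx xmlns="http://www.lisa.org/srx20" version="2.0">
  <header segmentsubflows="yes" cascade="no"/>
  <body/>
</srx>"""

def read_header_flags(srx_text):
    """Return the cascade and segmentsubflows header attributes as booleans."""
    header = ET.fromstring(srx_text).find(SRX_NS + "header")
    return {
        "cascade": header.get("cascade") == "yes",
        "segmentsubflows": header.get("segmentsubflows") == "yes",
    }

print(read_header_flags(SAMPLE))
```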

Include opening in-line codes — Set this option to include any opening in-line code located just before a break in the segment being closed. If this option is not set, such an in-line code should be placed in the next segment. For example, assuming that a period followed by a space has a break after the period:

 ON: Sentence one.<b> Sentence two.</b> ==> [Sentence one.<b>][ Sentence two.</b>]
OFF: Sentence one.<b> Sentence two.</b> ==> [Sentence one.][<b> Sentence two.</b>]

Include closing in-line codes — Set this option to include any closing in-line code located just before a break in the segment being closed. If this option is not set, such an in-line code should be placed in the next segment. For example, assuming that a period followed by a space has a break after the period:

 ON: Sentence <b>one.</b> Sentence two. ==> [Sentence <b>one.</b>][ Sentence two.]
OFF: Sentence <b>one.</b> Sentence two. ==> [Sentence <b>one.][</b> Sentence two.]

Include isolated in-line codes — Set this option to include any isolated in-line code located just before a break in the segment being closed. If this option is not set, such an in-line code should be placed in the next segment. For example, assuming that a period followed by a space has a break after the period:

 ON: Sentence one.<br/> Sentence two. ==> [Sentence one.<br/>][ Sentence two.]
OFF: Sentence one.<br/> Sentence two. ==> [Sentence one.][<br/> Sentence two.]
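The three in-line code options can be pictured as moving the break position past any code that sits at the break, provided the option for that kind of code is set. The following is a rough sketch under stated assumptions, not the Okapi segmenter: codes are written as XML-like tags, `split_at_break` and `classify` are hypothetical helpers, and the scan stops at the first code whose kind is not included.

```python
import re

# Simplified pattern for an XML-like in-line code such as <b>, </b> or <br/>.
CODE = re.compile(r"<(/?)(\w+)[^>]*?(/?)>")

def classify(match):
    """Return 'closing', 'isolated' or 'opening' for a matched code."""
    if match.group(1):
        return "closing"
    if match.group(3):
        return "isolated"
    return "opening"

def split_at_break(text, break_pos, opening=False, closing=False, isolated=False):
    """Split text at break_pos, keeping the enabled kinds of codes on the left."""
    include = {"opening": opening, "closing": closing, "isolated": isolated}
    pos = break_pos
    while True:
        match = CODE.match(text, pos)
        # Assumption: stop at the first code whose kind is not included.
        if match is None or not include[classify(match)]:
            break
        pos = match.end()
    return text[:pos], text[pos:]

text = "Sentence one.<b> Sentence two.</b>"
break_pos = text.index(".") + 1  # the rule breaks right after the period
print(split_at_break(text, break_pos, opening=True))
# ('Sentence one.<b>', ' Sentence two.</b>')
print(split_at_break(text, break_pos, opening=False))
# ('Sentence one.', '<b> Sentence two.</b>')
```

The same function reproduces the closing and isolated examples when called with `closing=True` or `isolated=True` on the corresponding sample sentences.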

Extensions

The options in this group are not standard SRX options, and using them may prevent your rules from being fully interoperable with other SRX-compatible tools.

Use Java regular expressions engine — Set this option to have the document's rules interpreted with the Java regular expression engine rather than the standard ICU engine. The Okapi segmenter did not implement the ICU patterns until M16. See the page "SRX and Java" for more details about the differences.

Include all text for single segments — Set this option to force any entry that has only a single segment to enclose all the text of the entry in that segment.

Trim leading white-spaces — Set this option to exclude any leading white space from a segment. The white space is placed just before the segment instead.

Trim trailing white-spaces — Set this option to exclude any trailing white space from a segment. The white space is placed just after the segment instead.
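The two trimming options amount to shrinking the segment boundaries so that white space at the edges falls outside the segment. A minimal sketch, assuming a segment is represented as a start and end offset into the entry text (`trim_segment` is a hypothetical helper, not an Okapi API):

```python
def trim_segment(text, start, end, trim_leading=False, trim_trailing=False):
    """Shrink the span [start, end) so that edge white space falls outside it."""
    if trim_leading:
        while start < end and text[start].isspace():
            start += 1
    if trim_trailing:
        while end > start and text[end - 1].isspace():
            end -= 1
    return start, end

text = "Sentence one. Sentence two."
# The second segment initially starts at the space that follows the period.
start, end = trim_segment(text, 13, len(text), trim_leading=True)
print(repr(text[start:end]))  # 'Sentence two.'
```

No text is lost: the skipped white space simply sits between the segments rather than inside one of them.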

Header comments — XML comment located just after the <srx> start tag in the SRX document.

Language Rules

List of all the groups of rules in the document.

Add — Click this button to add a group of rules to the list.

Rename — Click this button to rename the group of rules currently selected. Note that references to that group in the language maps are not updated automatically.

Remove — Click this button to remove the group of rules currently selected. The program will ask for confirmation. Note that references to that group in the language maps are not updated automatically.
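Since renaming or removing a group leaves the language maps untouched, it can be useful to check for maps that point to a group that no longer exists. The sketch below shows one way to do that on plain Python data. The `find_dangling_maps` helper and the sample names are hypothetical and only illustrate the idea.

```python
def find_dangling_maps(group_names, language_maps):
    """Return the language maps that refer to a group that no longer exists."""
    known = set(group_names)
    return [(pattern, group) for pattern, group in language_maps
            if group not in known]

# Hypothetical state after the "French" group was renamed or removed.
groups = ["English", "Default"]
maps = [("[Ff][Rr].*", "French"), (".*", "Default")]

print(find_dangling_maps(groups, maps))  # [('[Ff][Rr].*', 'French')]
```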

Language Maps

List of all the language maps in the document.

Note: In the Okapi Framework, language codes are usually normalized to lowercase at runtime, but other software may not do this. To keep your language maps portable, make sure each regular expression accounts for letter case. For example, use '[Ee][Nn]' instead of 'en' or 'EN'.
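The difference between the two kinds of pattern is easy to check. The snippet below is only an illustration with Python's `re` module, under the assumption that a pattern must match the whole language code. `map_matches` is a hypothetical helper.

```python
import re

def map_matches(pattern, language_code):
    # Assumption: the pattern must match the whole language code.
    return re.fullmatch(pattern, language_code) is not None

for code in ("en", "EN", "En"):
    # Columns: language code, result for 'en', result for '[Ee][Nn]'
    print(code, map_matches("en", code), map_matches("[Ee][Nn]", code))
```

Only the case-insensitive form matches all three spellings of the code.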

Add — Click this button to open the Edit Language Map dialog box to add a language map to the list.

Edit — Click this button to open the Edit Language Map dialog box to edit the language map currently selected.

Remove — Click this button to remove the language map currently selected.

Move Up — Click this button to move the language map currently selected upward in the list.

Move Down — Click this button to move the language map currently selected downward in the list.