MIF Filter

Overview

This filter allows you to process MIF documents. MIF (Maker Interchange Format) is the format generated and read by Adobe FrameMaker. The specification for MIF 9.0 can be found on the Adobe Web site.

Processing Details

Input Encoding

The encoding of the input MIF document is automatically detected based on the version of the file and other information in the document.

MIF v8 and above normally use UTF-8.

For MIF v7 and below, things are more complicated. At this time, only files with text in MacRoman are processed correctly. The MIF encoding FrameRoman is not an encoding supported by Java; further work is being done to support it, as well as to detect changes in encoding based on font selection.
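
As a rough illustration of version-based detection, the sketch below reads the `<MIFFile ...>` statement that opens a MIF file and picks a decoder from the version number. The helper names `guess_mif_encoding` and `decode_mif` and the simple version threshold are assumptions made for this example. The filter's actual detection also uses other information in the document and is not reproduced here. Python is used only for brevity (`mac_roman` is Python's name for the MacRoman codec).

```python
import re

# Matches the statement that opens a MIF file, e.g. "<MIFFile 9.00>".
_MIF_HEADER = re.compile(r"<MIFFile\s+(\d+)(?:\.\d+)?>")


def guess_mif_encoding(first_line: bytes) -> str:
    """Guess a Python codec name from the first line of a MIF file.

    Hypothetical helper: v8 and above -> UTF-8, older -> MacRoman.
    """
    # The header statement is plain ASCII, so it can be read before
    # the real encoding of the document is known.
    header = first_line.decode("ascii", errors="replace")
    match = _MIF_HEADER.search(header)
    if match is None:
        raise ValueError("not a MIF file: missing <MIFFile ...> statement")
    major = int(match.group(1))
    return "utf-8" if major >= 8 else "mac_roman"


def decode_mif(data: bytes) -> str:
    """Decode a whole document with the guessed codec."""
    first_line = data.split(b"\n", 1)[0]
    return data.decode(guess_mif_encoding(first_line))
```

A version-only check like this cannot handle FrameRoman text or encoding changes driven by font selection, which is why support for older files remains limited.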

Output Encoding

MIF v8 and above are automatically output in UTF-8.

Parameters

Options Tab

Extract variables — Set this option to extract the definitions of the variables.

Extract index markers — Set this option to extract the index markers in the extractable pages. The text of each index entry is extracted in a separate text unit, before the text unit that contains the index marker.

Extract links — Set this option to extract the URLs of the links in the extractable pages. Each URL is extracted in a separate text unit, before the text unit that contains the hypertext marker.
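
The ordering described for index markers and links can be pictured with a small model. This is only a sketch: `Marker`, `Paragraph`, and `text_units` are hypothetical names invented for the example and are not part of the filter's API. It shows only that the content of each marker comes out as its own text unit ahead of the paragraph that holds the marker.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Marker:
    """Hypothetical stand-in for an index or hypertext marker."""
    kind: str  # "index" or "link"
    text: str  # index entry text, or the URL of the link


@dataclass
class Paragraph:
    """Hypothetical stand-in for one extractable paragraph."""
    text: str
    markers: List[Marker] = field(default_factory=list)


def text_units(paragraph, extract_index=True, extract_links=True):
    """Return the text units for a paragraph, marker content first."""
    wanted = set()
    if extract_index:
        wanted.add("index")
    if extract_links:
        wanted.add("link")
    # One separate text unit per selected marker, in document order.
    units = [m.text for m in paragraph.markers if m.kind in wanted]
    # The paragraph that contains the markers comes after them.
    units.append(paragraph.text)
    return units
```

With both options cleared, only the paragraph text itself is returned.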

Type of page to extract

Body pages — Set this option to extract the body pages.

Hidden pages — Set this option to extract the hidden pages.

Master pages — Set this option to extract the master pages.

Reference pages — Set this option to extract the reference pages. Note that by default FrameMaker creates its new documents with several reference pages that contain text.

Inline Codes Tab

Has inline codes as defined below — Set this option to use the specified regular expressions on the text of the extracted items. Any match will be converted to an inline code. By default the expression is:

<\$.*?>

Add — Click this button to add a new rule.

Remove — Click this button to remove the current rule.

Move Up — Click this button to move the current rule upward.

Move Down — Click this button to move the current rule downward.

[Top-right text box] — Enter the regular expression for the current rule. Use the Modify button to enter edit mode. The expression must be a valid regular expression. The syntax (and the effect of the rule) is checked automatically: the expression is tested against the test data in the text box below, and the result is shown in the bottom-right text box.

Modify — Click this button to edit the expression of the current rule. This button is labeled Accept when you are in edit mode.

Accept — Click this button to save any changes you have made to the expression and leave the edit mode. This button is labeled Modify when you are not in edit mode.

Discard — Click this button to leave the edit mode and revert the current rule to the expression it had before you entered edit mode.

Patterns — Click this button to display some help on regular expression patterns.

Test using all rules — Set this option to test all the rules at the same time. The syntax of the current rule is checked automatically, and its effect on the sample text is shown in the bottom-right result box. The parts of the text that match the expressions are displayed in <> brackets. If the Test using all rules option is set, the test takes all the rules of the set into account; if it is not set, only the current rule is tested.

[Middle-right text box] — Optional test data to test the regular expression for the current rule or all rules depending on the Test using all rules option.

[Bottom-right text box] — Shows the result of the regular expression applied to the test data.
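
To give a feel for what the result box shows, the following sketch applies the default expression and one extra rule to a sample string and wraps every match in <> brackets. The function `preview`, the second rule `%\d+`, and the way the rules are joined into a single alternation are assumptions made for this illustration, not a description of the filter's internals. Python's `re` module is used here; for this simple pattern it behaves the same way as a Java regular expression.

```python
import re

DEFAULT_RULE = r"<\$.*?>"  # default expression of the filter
EXTRA_RULE = r"%\d+"       # hypothetical second rule, for illustration


def preview(sample, rules, current=0, use_all_rules=False):
    """Wrap each match in <> brackets, as the result box does."""
    active = rules if use_all_rules else [rules[current]]
    # Join the active rules into one alternation so that matches are
    # found in a single left-to-right pass over the sample.
    combined = re.compile("|".join(f"(?:{r})" for r in active))
    return combined.sub(lambda m: f"<{m.group(0)}>", sample)


rules = [DEFAULT_RULE, EXTRA_RULE]
sample = "Page <$curpagenum> of %1"

print(preview(sample, rules))                      # Page <<$curpagenum>> of %1
print(preview(sample, rules, use_all_rules=True))  # Page <<$curpagenum>> of <%1>
```

The `.*?` in the default expression is non-greedy, so each `<$...>` building block is matched on its own instead of everything between the first `<$` and the last `>` on the line.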

Limitations

  • This filter is BETA.
  • Support for MIF 7.0 with non-Latin-1 languages is limited: you may get corrupted extended characters in some cases.
  • You may run into Java heap memory issues if the document includes very large embedded insets (e.g. images). The workaround is to link to external objects rather than embed them.
  • The filter does not do font mapping yet, so if the translated file is in a language not supported by the fonts used in the source document, you need to update the paragraph and character catalogs to use fonts providing the proper support.