OpenXML Filter

From Okapi Framework

Overview

This filter processes documents created by Microsoft Office 2007 and later, such as DOCX (text documents), XLSX (spreadsheets), and PPTX (presentations). These documents are based on the OpenXML format, as opposed to the binary formats used by pre-2007 versions of Office.
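Because OpenXML documents are ZIP packages of XML parts, while the older binary formats are not, the two families can be told apart without opening Office. The following is a minimal Python sketch, independent of Okapi, that guesses the document type from the conventional location of the main part. The function name `detect_office_type` is hypothetical, and a strict reader would resolve the main part through the package relationships rather than rely on these default paths.

```python
import io
import zipfile

# Conventional location of the main part for each document type.
MAIN_PARTS = {
    "word/document.xml": "docx",
    "xl/workbook.xml": "xlsx",
    "ppt/presentation.xml": "pptx",
}


def detect_office_type(source):
    """Return 'docx', 'xlsx', 'pptx', or None for a path or file-like object."""
    if not zipfile.is_zipfile(source):
        return None  # not a ZIP package, e.g. a pre-2007 binary file
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
    for part, kind in MAIN_PARTS.items():
        if part in names:
            return kind
    return None  # a ZIP file, but not a recognized Office package


if __name__ == "__main__":
    # Build a tiny stand-in package in memory to exercise the function.
    sample = io.BytesIO()
    with zipfile.ZipFile(sample, "w") as package:
        package.writestr("xl/workbook.xml", "<workbook/>")
    sample.seek(0)
    print(detect_office_type(sample))
```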

Parameters

The filter parameters are divided into General Options, which apply to all formats, and format-specific options.

General Options

Translate Document Properties
When checked, exposes the following document properties for translation: title, subject, creator, description, category, keywords, content status. Default: on.
Translate Comments
When checked, exposes document comments for translation. Default: on.
Clean Tags Aggressively
When checked, strips additional formatting tags related to text spacing. This is meant to improve filtering in cases where Office documents were converted from other formats (in particular, PDF), and imperfect conversion added a lot of extra formatting noise. Default: off.
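To see which text the Translate Document Properties option is likely to expose for a given file, the core properties can be inspected directly. The sketch below is independent of Okapi and assumes the properties live in the conventional `docProps/core.xml` part. The mapping from the labels above to XML elements follows the standard core-properties schema, but whether the filter reads exactly these elements is an assumption. The name `read_core_properties` is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Label used in the option description -> element in docProps/core.xml.
CORE_PROPERTIES = {
    "title": "dc:title",
    "subject": "dc:subject",
    "creator": "dc:creator",
    "description": "dc:description",
    "category": "cp:category",
    "keywords": "cp:keywords",
    "content status": "cp:contentStatus",
}


def read_core_properties(source):
    """Return the non-empty core properties of an OpenXML package as a dict."""
    with zipfile.ZipFile(source) as package:
        try:
            xml_bytes = package.read("docProps/core.xml")
        except KeyError:
            return {}  # the package has no core properties part
    root = ET.fromstring(xml_bytes)
    found = {}
    for label, tag in CORE_PROPERTIES.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[label] = element.text
    return found
```

Calling `read_core_properties("report.docx")` on a real file (the file name here is only an example) returns a dictionary such as `{"title": ..., "creator": ...}` containing only the properties that are actually set.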

Word Options

Translate Headers and Footers
When checked, exposes header and footer content for translation. Default: on.
Translate Hidden Text
When checked, exposes hidden text for translation. Default: on.
Exclude Graphical Metadata
When not checked, labels associated with drawings and word art are exposed for translation. When checked, these labels (which are frequently not displayed in the document) are suppressed. Default: off.
Styles to Exclude
Text using any of the selected styles will not be exposed for translation. Default: none.

Excel Options

Translate Hidden Rows and Columns
When checked, hidden rows and columns are exposed for translation. Default: off.
Exclude Marked Columns in Each Sheet
When checked, columns selected in the "Sheet # Columns to Exclude" lists are excluded from translation. Sheets 1 and 2 can be configured individually; sheets 3 and higher must be configured as a single group. Default: off.
Colors to Exclude
Text with a foreground color matching any of the selected colors is excluded from translation. The selectable colors correspond to the standard color palette of Excel 2010. The configuration itself stores these values as RGB, so colors not listed here can also be excluded by editing the .fprm file by hand. Default: none.
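The column and color rules above can be summarized in a small decision model. This is a Python sketch only: the class `ExcelExclusions`, its field names, and `normalize_rgb` are hypothetical and are not Okapi's API or the .fprm key names. It assumes 1-based sheet numbers, column letters, and colors compared as six-digit hexadecimal RGB strings; how the .fprm file actually encodes the RGB values should be checked against a saved configuration.

```python
from dataclasses import dataclass, field


def normalize_rgb(value):
    """Normalize 'ff0000', '#FF0000', or (255, 0, 0) to 'FF0000'."""
    if isinstance(value, tuple):
        return "%02X%02X%02X" % value
    return value.lstrip("#").upper()


@dataclass
class ExcelExclusions:
    """Hypothetical in-memory model of the Excel options described above."""
    exclude_marked_columns: bool = False                       # default: off
    sheet1_columns: set = field(default_factory=set)           # sheet 1 only
    sheet2_columns: set = field(default_factory=set)           # sheet 2 only
    sheet3_plus_columns: set = field(default_factory=set)      # sheets 3 and up
    excluded_colors: set = field(default_factory=set)          # 'RRGGBB' strings

    def columns_for_sheet(self, sheet_number):
        """Sheets 1 and 2 have their own list; all later sheets share one."""
        if sheet_number == 1:
            return self.sheet1_columns
        if sheet_number == 2:
            return self.sheet2_columns
        return self.sheet3_plus_columns

    def is_excluded(self, sheet_number, column, rgb=None):
        """Return True if a cell would be skipped under this model."""
        if (self.exclude_marked_columns
                and column.upper() in self.columns_for_sheet(sheet_number)):
            return True
        return rgb is not None and normalize_rgb(rgb) in self.excluded_colors
```

For example, with `sheet3_plus_columns={"C"}` and the option enabled, column C is skipped on sheet 3, sheet 4, and every later sheet, because those sheets cannot be configured separately.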

PowerPoint Options

Translate Notes
When checked, exposes slide notes for translation. Default: off.
Translate Masters
When checked, exposes master slides for translation. This also exposes content from layouts that are in use by at least one slide. Default: off.

Limitations