<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://okapiframework.org/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Ctingley</id>
	<title>Okapi Framework - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="http://okapiframework.org/wiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Ctingley"/>
	<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php/Special:Contributions/Ctingley"/>
	<updated>2026-04-15T08:17:57Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.38.2</generator>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Filters&amp;diff=1064</id>
		<title>Filters</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Filters&amp;diff=1064"/>
		<updated>2026-03-23T23:23:04Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* List of the Filters */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Filters are the components that convert input documents from their native file format into a common internal set of [[Glossary#Resource|resources]] that all Okapi components use. The extracted content can be re-written into the original file format. When using the steps, the extraction is done by the [[Raw Document to Filter Events Step]] and the re-writing by the [[Filter Events to Raw Document Step]].&lt;br /&gt;
&lt;br /&gt;
Note: The [[Okapi Filters Plugin for OmegaT]] allows you to use some of the filters directly from [http://www.omegat.org OmegaT].&lt;br /&gt;
&lt;br /&gt;
==List of the Filters==&lt;br /&gt;
&lt;br /&gt;
The framework distribution comes with the following filters:&lt;br /&gt;
&lt;br /&gt;
{| cellpadding=&amp;quot;8&amp;quot; width=100%&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
|&lt;br /&gt;
* [[Archive Filter]]&lt;br /&gt;
* [[DTD Filter]]&lt;br /&gt;
* [[Doxygen Filter]]&lt;br /&gt;
* [[DXF Filter]]&lt;br /&gt;
* [[EPUB Filter]]&lt;br /&gt;
* [[HTML Filter]]&lt;br /&gt;
* [[HTML5-ITS Filter]]&lt;br /&gt;
* [[ICML Filter]]&lt;br /&gt;
* [[IDML Filter]]&lt;br /&gt;
* [[JSON Filter]]&lt;br /&gt;
* [[Markdown Filter]]&lt;br /&gt;
* [[Message Format Filter]]&lt;br /&gt;
* [[MIF Filter]]&lt;br /&gt;
* [[Moses Text Filter]]&lt;br /&gt;
* [[Multi-Parsers Filter]]&lt;br /&gt;
* [[OpenOffice Filter]]&lt;br /&gt;
* [[OpenXML Filter|OpenXML (MS Office) Filter]]&lt;br /&gt;
|&lt;br /&gt;
* [[PDF Filter]]&lt;br /&gt;
* [[Pensieve TM Filter]]&lt;br /&gt;
* [[PHP Content Filter]]&lt;br /&gt;
* [[Plain Text Filter]]&lt;br /&gt;
* [[PO Filter]]&lt;br /&gt;
* [[Properties Filter]]&lt;br /&gt;
* [[Rainbow Translation Kit Filter]]&lt;br /&gt;
* [[Regex Filter]]&lt;br /&gt;
* [[SDL Trados Package Filter]]&lt;br /&gt;
* [[Simplification Filter]]&lt;br /&gt;
* [[Table Filter]]&lt;br /&gt;
* [[TMX Filter]]&lt;br /&gt;
* [[Trados-Tagged RTF Filter]]&lt;br /&gt;
|&lt;br /&gt;
* [[Transifex Filter]]&lt;br /&gt;
* [[TS Filter]]&lt;br /&gt;
* [[TTX Filter]]&lt;br /&gt;
* [[TXML Filter]]&lt;br /&gt;
* [[Wiki Filter]]&lt;br /&gt;
* [[WSXZ Package Filter]]&lt;br /&gt;
* [[Vignette Filter]]&lt;br /&gt;
* [[XLIFF Filter]]&lt;br /&gt;
* [[XLIFF-2 Filter]]&lt;br /&gt;
* [[XML Filter]]&lt;br /&gt;
* [[XML Stream Filter]]&lt;br /&gt;
* [[YAML Filter]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Supported File Formats==&lt;br /&gt;
&lt;br /&gt;
The following is a list of some of the file formats supported by the distribution through [[Understanding Filter Configurations|pre-defined configurations]]:&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;6&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|+&lt;br /&gt;
| '''Format''' || '''Extensions''' || '''Pre-Defined Configuration''' || '''Filter''' || '''Notes'''&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Android Strings || .xml || &amp;lt;code&amp;gt;okf_xml-AndroidStrings&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Apple Stringsdict || .stringsdict || &amp;lt;code&amp;gt;okf_xml-AppleStringsdict&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Archive || .zip || &amp;lt;code&amp;gt;okf_archive&amp;lt;/code&amp;gt; || [[Archive Filter]] || Meta filter that processes zip files with various formats as one file.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Auto Xliff || .xlf, .xliff || &amp;lt;code&amp;gt;okf_autoxliff&amp;lt;/code&amp;gt; || [[Auto Xliff Filter]] || Detects the version of an XLIFF file and then hands parsing off to the appropriate filter &lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| AutoCAD DXF || .dxf || &amp;lt;code&amp;gt;okf_dxf&amp;lt;/code&amp;gt; || [[DXF Filter]] || Only supports textual DXF, not binary DXF&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| CSV (Comma-separated values files) || .csv, .txt || &amp;lt;code&amp;gt;okf_table_csv&amp;lt;/code&amp;gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| CSV (Multiple complex sub-formats) || .csv || &amp;lt;code&amp;gt;okf_multiparsers&amp;lt;/code&amp;gt; || [[Multi-Parsers Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DITA || .dita, .ditamap, .xml || &amp;lt;code&amp;gt;okf_xmlstream-dita&amp;lt;/code&amp;gt; || [[XML Stream Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DocBook v5.0 || .xml || &amp;lt;code&amp;gt;okf_xml-docbook&amp;lt;/code&amp;gt; || [[XML Filter]] || Since Okapi 1.42. &amp;amp;lt;footnote&amp;gt; is not handled properly.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DokuWiki pages || .txt || &amp;lt;code&amp;gt;okf_wiki&amp;lt;/code&amp;gt; || [[Wiki Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Doxygen-commented files || .c, .h, .cpp || &amp;lt;code&amp;gt;okf_doxygen&amp;lt;/code&amp;gt; || [[Doxygen Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DTD || .dtd || &amp;lt;code&amp;gt;okf_dtd&amp;lt;/code&amp;gt; || [[DTD Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| EPUB || .epub || &amp;lt;code&amp;gt;okf_epub&amp;lt;/code&amp;gt; || [[EPUB Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Fixed-Width Columns Table || .txt || &amp;lt;code&amp;gt;okf_table_fwc&amp;lt;/code&amp;gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Idiom WorldServer XLIFF || .xlf || &amp;lt;code&amp;gt;okf_xliff-iws&amp;lt;/code&amp;gt; || [[XLIFF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| InCopy ICML || .wcml || &amp;lt;code&amp;gt;okf_icml&amp;lt;/code&amp;gt; || [[ICML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| InDesign IDML || .idml || &amp;lt;code&amp;gt;okf_idml&amp;lt;/code&amp;gt; || [[IDML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| iOS/Mac Strings || .strings || &amp;lt;code&amp;gt;okf_regex-macStrings&amp;lt;/code&amp;gt; || [[Regex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java Properties || .properties || &amp;lt;code&amp;gt;okf_properties&amp;lt;/code&amp;gt; || [[Properties Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java Properties (Output not escaped) || .properties || &amp;lt;code&amp;gt;okf_properties-outputNotEscaped&amp;lt;/code&amp;gt; || [[Properties Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java XML Properties || .xml || &amp;lt;code&amp;gt;okf_xml-JavaProperties&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java XML Properties (HTML strings) || .xml || &amp;lt;code&amp;gt;okf_xmlstream-JavaPropertiesHTML&amp;lt;/code&amp;gt; || [[XML Stream Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| JSON || .json || &amp;lt;code&amp;gt;okf_json&amp;lt;/code&amp;gt; || [[JSON Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Haiku CatKeys || .catkeys || &amp;lt;code&amp;gt;okf_table_catkeys&amp;lt;/code&amp;gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| HTML (any) || .html, .htm || &amp;lt;code&amp;gt;okf_html&amp;lt;/code&amp;gt; || [[HTML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| HTML (Well-formed, and XHTML) || .html, .htm|| &amp;lt;code&amp;gt;okf_html-wellFormed&amp;lt;/code&amp;gt; || [[HTML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| HTML5 (and XHTML5) || .html, .htm|| &amp;lt;code&amp;gt;okf_itshtml5&amp;lt;/code&amp;gt; || [[HTML5-ITS Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Markdown || .md || &amp;lt;code&amp;gt;okf_markdown&amp;lt;/code&amp;gt; || [[Markdown Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft Excel 2007/2010 || .xlsx, .xlsm, .xltx, .xltm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft PowerPoint 2007/2010 || .pptx, .pptm, .potx, .potm, .ppsx, .ppsm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft Visio || .vsdx, .vsdm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft Word 2007/2010 || .docx, .docm, .dotx, .dotm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| MIF || .mif || &amp;lt;code&amp;gt;okf_mif&amp;lt;/code&amp;gt; || [[MIF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Moses Text || .txt || &amp;lt;code&amp;gt;okf_mosestext&amp;lt;/code&amp;gt; || [[Moses Text Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Calc || .ods, .ots || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Draw || .odg, .otg || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Impress || .odp, .otp || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Writer || .odt, .ott || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PDF || .pdf || &amp;lt;code&amp;gt;okf_pdf&amp;lt;/code&amp;gt; || [[PDF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[Pensieve TM]] || .pentm || &amp;lt;code&amp;gt;okf_pensieve&amp;lt;/code&amp;gt; || [[Pensieve TM Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PHP Content || .php || &amp;lt;code&amp;gt;okf_phpcontent&amp;lt;/code&amp;gt; || [[PHP Content Filter]] || Can be used as a subfilter only&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Plain Text (Line = text unit) || .txt || &amp;lt;code&amp;gt;okf_plaintext&amp;lt;/code&amp;gt; || [[Plain Text Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Plain Text (Paragraph = text unit) || .txt || &amp;lt;code&amp;gt;okf_plaintext_paragraphs&amp;lt;/code&amp;gt; || [[Plain Text Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PO || .po || &amp;lt;code&amp;gt;okf_po&amp;lt;/code&amp;gt; || [[PO Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PO (Monolingual style) || .po || &amp;lt;code&amp;gt;okf_po-monolingual&amp;lt;/code&amp;gt; || [[PO Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Rainbow Translation Kit manifests || .rkm || &amp;lt;code&amp;gt;okf_rainbowkit&amp;lt;/code&amp;gt; || [[Rainbow Translation Kit Filter]] || Used as a tkit reader only&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Regex (Any text-based format) || .txt || &amp;lt;code&amp;gt;okf_regex&amp;lt;/code&amp;gt; || [[Regex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| RDF (Mozilla RDF) || .rdf || &amp;lt;code&amp;gt;okf_xml-MozillaRDF&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| RESX || .resx || &amp;lt;code&amp;gt;okf_xml-resx&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SDLPPX || .sdlppx || &amp;lt;code&amp;gt;okf_sdlpackage&amp;lt;/code&amp;gt; || [[SDL Trados Package Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SDLRPX || .sdlrpx || &amp;lt;code&amp;gt;okf_sdlpackage&amp;lt;/code&amp;gt; || [[SDL Trados Package Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SDL[[XLIFF]] || .sdlxlf || &amp;lt;code&amp;gt;okf_xliff-sdl&amp;lt;/code&amp;gt; || [[XLIFF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Skype Language Files || .lang || &amp;lt;code&amp;gt;okf_properties-skypeLang&amp;lt;/code&amp;gt; || [[Properties Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SRT (Sub-Rip Text, sub-titles files) || .srt || &amp;lt;code&amp;gt;okf_regex-srt&amp;lt;/code&amp;gt; || [[Regex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Tab-Delimited files || .tsv, .txt || &amp;lt;code&amp;gt;okf_table_tsv&amp;lt;/code&amp;gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Tex files || .tex || &amp;lt;code&amp;gt;okf_tex&amp;lt;/code&amp;gt; || [[TEX Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[TMX]] || .tmx || &amp;lt;code&amp;gt;okf_tmx&amp;lt;/code&amp;gt; || [[TMX Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Transifex project || .txp || &amp;lt;code&amp;gt;okf_transifex&amp;lt;/code&amp;gt; || [[Transifex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Trados-Tagged RTF || .rtf || &amp;lt;code&amp;gt;okf_tradosrtf&amp;lt;/code&amp;gt; || [[Trados-Tagged RTF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| TS - Qt TS files || .ts || &amp;lt;code&amp;gt;okf_ts&amp;lt;/code&amp;gt; || [[TS Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| TTX - Trados TagEditor TTX files || .ttx || &amp;lt;code&amp;gt;okf_ttx&amp;lt;/code&amp;gt; || [[TTX Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| TXML - Wordfast Pro TXML files || .txml || &amp;lt;code&amp;gt;okf_txml&amp;lt;/code&amp;gt; || [[TXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Vignette Export/Import Content || .xml || &amp;lt;code&amp;gt;okf_vignette&amp;lt;/code&amp;gt; || [[Vignette Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| WSXZ Package Filter || .wsxz || &amp;lt;code&amp;gt;okf_wsxzpackage&amp;lt;/code&amp;gt; || [[WSXZ Package Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| XHTML || .html, .htm || &amp;lt;code&amp;gt;okf_html-wellFormed&amp;lt;/code&amp;gt; || [[HTML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| WIX (Windows Installer XML) localization files || .wix || &amp;lt;code&amp;gt;okf_xml-WixLocalization&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[XLIFF]] v1.2 || .xlf, .xliff || &amp;lt;code&amp;gt;okf_xliff&amp;lt;/code&amp;gt; || [[XLIFF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[XLIFF]] v2 || .xlf || &amp;lt;code&amp;gt;okf_xliff2&amp;lt;/code&amp;gt; || [[XLIFF-2 Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| XML (Generic, using [[ITS]] defaults) || .xml || &amp;lt;code&amp;gt;okf_xml&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| XML (Generic, using stream reader) || .xml || &amp;lt;code&amp;gt;okf_xmlstream&amp;lt;/code&amp;gt; || [[XML Stream Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| YAML (Generic YAML filter) || .yml, .yaml || &amp;lt;code&amp;gt;okf_yaml&amp;lt;/code&amp;gt; || [[YAML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Message Format (ICU Message Format Filter) || Any container format that supports subfilters || &amp;lt;code&amp;gt;okf_messageformat&amp;lt;/code&amp;gt; || [[Message Format Filter]] ||&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Note that most filters allow you to [[Understanding Filter Configurations|create your own configurations]] to support more file formats.&lt;br /&gt;
&lt;br /&gt;
==Code Simplification Rules==&lt;br /&gt;
&lt;br /&gt;
There are two levels of code simplification: filter and step (the [[Inline Codes Simplifier Step]] and [[Post-segmentation Inline Codes Removal Step]]). There are several ways to configure it:&lt;br /&gt;
&lt;br /&gt;
Firstly, the extraction pipeline can contain just:&lt;br /&gt;
: - [[Raw Document to Filter Events Step]]&lt;br /&gt;
&lt;br /&gt;
Currently, only the [[IDML Filter]], [[XML Filter]] and [[Simplification Filter]] support this. Note that the last one acts as a wrapper around another filter.&lt;br /&gt;
&lt;br /&gt;
Secondly, the extraction pipeline can look like this:&lt;br /&gt;
: - [[Raw Document to Filter Events Step]]&lt;br /&gt;
: - [[Inline Codes Simplifier Step]]&lt;br /&gt;
&lt;br /&gt;
This is the only option for filters that do not perform their own code simplification. Use it with care, because the final merge may not always handle it correctly. The [[IDML Filter]] and [[XML Filter]] mentioned above can perform their own simplification, so adding the [[Inline Codes Simplifier Step]] should not affect the events they produce.&lt;br /&gt;
&lt;br /&gt;
Thirdly, the extraction pipeline can consist of:&lt;br /&gt;
: - [[Raw Document to Filter Events Step]]&lt;br /&gt;
: - [[Segmentation Step]]&lt;br /&gt;
: - [[Post-segmentation Inline Codes Removal Step]]&lt;br /&gt;
&lt;br /&gt;
Here, the [[Post-segmentation Inline Codes Removal Step]] performs code simplification after segmentation rules are applied, and it may be useful for skipping extra codes between segments.&lt;br /&gt;
&lt;br /&gt;
By default, the [[Inline Codes Simplifier Step]] and [[Post-segmentation Inline Codes Removal Step]] maximise the trimming and merging (aka simplification) of inline codes. This can be tuned via the following string parameters:&lt;br /&gt;
: - &amp;lt;code&amp;gt;removeLeadingTrailingCodes&amp;lt;/code&amp;gt; - &amp;lt;code&amp;gt;true&amp;lt;/code&amp;gt; by default&lt;br /&gt;
: - &amp;lt;code&amp;gt;mergeCodes&amp;lt;/code&amp;gt; - &amp;lt;code&amp;gt;true&amp;lt;/code&amp;gt; by default&lt;br /&gt;
: - &amp;lt;code&amp;gt;rules&amp;lt;/code&amp;gt; - empty by default&lt;br /&gt;
&lt;br /&gt;
Only the [[Inline Codes Simplifier Step]] configuration can be overridden by a filter, through the following optional filter parameters:&lt;br /&gt;
: - &amp;lt;code&amp;gt;moveLeadingAndTrailingCodesToSkeleton&amp;lt;/code&amp;gt; - maps to &amp;lt;code&amp;gt;removeLeadingTrailingCodes&amp;lt;/code&amp;gt;&lt;br /&gt;
: - &amp;lt;code&amp;gt;mergeAdjacentCodes&amp;lt;/code&amp;gt; - maps to &amp;lt;code&amp;gt;mergeCodes&amp;lt;/code&amp;gt;&lt;br /&gt;
: - &amp;lt;code&amp;gt;simplifierRules&amp;lt;/code&amp;gt; - maps to &amp;lt;code&amp;gt;rules&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The simplification rules let you prevent specific codes from being trimmed or merged.&lt;br /&gt;
&lt;br /&gt;
===General Syntax===&lt;br /&gt;
&lt;br /&gt;
The rules parser ignores irrelevant whitespace. Rules can be separated by spaces, newlines or nothing. This makes it easier to accommodate various container formats and their whitespace normalization rules. When a rule applies, it means &amp;quot;do not simplify the matched code&amp;quot;. Uppercase tokens are constants predefined by the rule parser. Multiple rules are always OR'ed together.&lt;br /&gt;
&lt;br /&gt;
For more details, see the JavaCC grammar: &amp;lt;code&amp;gt;../okapi/core/src/main/javacc/SimplifierRules.jj&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Rule Examples===&lt;br /&gt;
&lt;br /&gt;
If the code has any of these flags, don't simplify it:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DELETABLE or ADDABLE or CLONEABLE;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;quot;=&amp;quot; is a string match.&lt;br /&gt;
TAGTYPE matches the basic tag types: opening, closing or standalone.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DATA = &amp;quot;a&amp;quot; and TAGTYPE = OPENING;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;quot;~&amp;quot; is regex match&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DATA ~ &amp;quot;a.*&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can negate any of the match operators.&lt;br /&gt;
For example, don't simplify if the DATA does not match the regex:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DATA !~ &amp;quot;a.*&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Match on type (a line break in this case) and don't simplify:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if TYPE = &amp;quot;lb&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Don't simplify any rich text types&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if TYPE = &amp;quot;bold&amp;quot; or TYPE = &amp;quot;italic&amp;quot; or TYPE = &amp;quot;underline&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Expressions can be recursive (supports embedded parens)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if TYPE = &amp;quot;bold&amp;quot; or (DATA = &amp;quot;bar&amp;quot; or (DATA = &amp;quot;foo&amp;quot; and TYPE = &amp;quot;underline&amp;quot;));&amp;lt;/pre&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
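The way rules combine can be sketched outside Okapi. The Python snippet below is only an illustration, not the actual parser: the &amp;lt;code&amp;gt;Code&amp;lt;/code&amp;gt; class, its field names and the tag type strings are hypothetical stand-ins for the rule tokens. It models three of the rules above as predicates and shows that a single matching rule is enough to keep a code from being simplified.&lt;br /&gt;
&lt;br /&gt;
```python
from dataclasses import dataclass


@dataclass
class Code:
    # Hypothetical stand-in for an inline code; fields mirror the rule tokens.
    data: str
    tagtype: str  # "OPENING" or "CLOSING" here; the strings are illustrative
    type: str = ""
    deletable: bool = False
    addable: bool = False
    cloneable: bool = False


# Each predicate models one rule from the examples above.
RULES = [
    lambda c: c.deletable or c.addable or c.cloneable,   # if DELETABLE or ADDABLE or CLONEABLE;
    lambda c: c.data == "a" and c.tagtype == "OPENING",  # if DATA = "a" and TAGTYPE = OPENING;
    lambda c: c.type == "lb",                            # if TYPE = "lb";
]


def is_protected(code):
    """Rules are OR'ed: one matching rule exempts the code from simplification."""
    return any(rule(code) for rule in RULES)


print(is_protected(Code("a", "OPENING")))            # prints True
print(is_protected(Code("x", "CLOSING", type="lb")))   # prints True
print(is_protected(Code("b", "CLOSING", type="bold")))  # prints False
```
&lt;br /&gt;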
&lt;br /&gt;
===Filter Config Examples===&lt;br /&gt;
&lt;br /&gt;
The following examples show simplifier rules in the filter configuration formats used by Okapi.&lt;br /&gt;
&lt;br /&gt;
'''YAML:'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
simplifierRules: |&lt;br /&gt;
  if ADDABLE or DELETABLE or CLONEABLE;&lt;br /&gt;
  if DATA = &amp;quot;&amp;lt;br/&amp;gt;&amp;quot; or DATA = &amp;quot;&amp;lt;font&amp;gt;&amp;quot; or DATA = &amp;quot;&amp;lt;/font&amp;gt;&amp;quot; or DATA = &amp;quot;&amp;lt;/a&amp;gt;&amp;quot;;&lt;br /&gt;
  if DATA ~ &amp;quot;\\&amp;lt;font.+&amp;quot; or DATA ~ &amp;quot;\\&amp;lt;img.+&amp;quot; or DATA ~ &amp;quot;\\&amp;lt;a.+&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''ITS:'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;its:rules xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot; version=&amp;quot;1.0&amp;quot; xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot; xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;!-- See ITS specification at: http://www.w3.org/TR/its/ --&amp;gt;&lt;br /&gt;
 &amp;lt;its:translateRule selector=&amp;quot;//*&amp;quot; translate=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;its:withinTextRule selector=&amp;quot;//codeph&amp;quot; withinText=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;its:withinTextRule selector=&amp;quot;//ph&amp;quot; withinText=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;okp:simplifierRules moveLeadingAndTrailingCodesToSkeleton=&amp;quot;yes&amp;quot; mergeAdjacentCodes=&amp;quot;yes&amp;quot;&amp;gt;&lt;br /&gt;
 if ADDABLE or DELETABLE or CLONEABLE; if DATA ~ &amp;quot;.+&amp;quot;;&lt;br /&gt;
 &amp;lt;/okp:simplifierRules&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''FPRM (Parameters):'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#v1&lt;br /&gt;
extractNotes.b=true&lt;br /&gt;
simplifierRules=if ADDABLE or DELETABLE or CLONEABLE; if DATA ~ &amp;quot;.+&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Font Mapping==&lt;br /&gt;
&lt;br /&gt;
Font mapping is a filter's ability to automatically substitute font information in the target document on the fly, according to a provided configuration. This helps to reduce the amount of reformatting and post-translation DTP. It is currently supported by the IDML and OpenXML (DOCX, PPTX and XLSX documents) filters.&lt;br /&gt;
&lt;br /&gt;
The following font mapping configuration options are available:&lt;br /&gt;
* The source locale regular expression pattern: &amp;lt;code&amp;gt;.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;en.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;en-UK&amp;lt;/code&amp;gt;, etc. It can be omitted to apply the mapping to any source locale.&lt;br /&gt;
* The target locale regular expression pattern: &amp;lt;code&amp;gt;.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ru.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ru-RU&amp;lt;/code&amp;gt;, etc. It can be omitted to apply the mapping to any target locale.&lt;br /&gt;
* The source font name regular expression pattern: &amp;lt;code&amp;gt;.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Arial.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt;, etc. It can be omitted to apply the mapping to any source font name found.&lt;br /&gt;
* The target font name: &amp;lt;code&amp;gt;Arial&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt;, etc. It must not be empty; if it is, the mapping configuration is ignored.&lt;br /&gt;
&lt;br /&gt;
The configured font mappings are applied in the order they are stated, and the final target font value is determined by sequential substitution of the source font values. For example, given these two mappings:&lt;br /&gt;
# &amp;lt;code&amp;gt;Arial&amp;lt;/code&amp;gt; -&gt; &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt;&lt;br /&gt;
# &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt; -&gt; &amp;lt;code&amp;gt;Sans Serif&amp;lt;/code&amp;gt;&lt;br /&gt;
the first mapping produces &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt;, and the second one is then applied to this new value, ending up with &amp;lt;code&amp;gt;Sans Serif&amp;lt;/code&amp;gt;.&lt;br /&gt;
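&lt;br /&gt;
The sequential substitution can be sketched in a few lines of Python. This is a minimal illustration, not the filters' actual code: the &amp;lt;code&amp;gt;FontMapping&amp;lt;/code&amp;gt; class and &amp;lt;code&amp;gt;map_font&amp;lt;/code&amp;gt; function are hypothetical, and it assumes each pattern must match the whole value, which this page does not state.&lt;br /&gt;
&lt;br /&gt;
```python
import re
from dataclasses import dataclass


@dataclass
class FontMapping:
    # Hypothetical model of one mapping entry; omitted patterns default to ".*".
    target_font: str
    source_font_pattern: str = ".*"
    source_locale_pattern: str = ".*"
    target_locale_pattern: str = ".*"


def map_font(font, source_locale, target_locale, mappings):
    """Apply the mappings in their stated order, feeding each result into the next."""
    for m in mappings:
        if not m.target_font:
            continue  # an empty target font means the mapping is ignored
        # Assumption: patterns are matched against the whole value.
        if (re.fullmatch(m.source_locale_pattern, source_locale)
                and re.fullmatch(m.target_locale_pattern, target_locale)
                and re.fullmatch(m.source_font_pattern, font)):
            font = m.target_font
    return font


mappings = [
    FontMapping(target_font="Times New Roman", source_font_pattern="Arial"),
    FontMapping(target_font="Sans Serif", source_font_pattern="Times New Roman"),
]
print(map_font("Arial", "en-US", "ru-RU", mappings))  # prints Sans Serif
```
&lt;br /&gt;
Because each mapping sees the output of the previous one, the order of the entries matters: swapping the two mappings above would leave &amp;lt;code&amp;gt;Arial&amp;lt;/code&amp;gt; mapped to &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt; in this sketch.&lt;br /&gt;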
&lt;br /&gt;
The parameters serialisation format looks like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fontMappings.0.sourceLocalePattern=en.*&lt;br /&gt;
fontMappings.0.targetLocalePattern=ru.*&lt;br /&gt;
fontMappings.0.sourceFontPattern=Times.*&lt;br /&gt;
fontMappings.0.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.1.sourceLocalePattern=ru&lt;br /&gt;
fontMappings.1.targetLocalePattern=fr&lt;br /&gt;
fontMappings.1.sourceFontPattern=The Sims Sans&lt;br /&gt;
fontMappings.1.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.number.i=2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When source locale, target locale and source font are omitted:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fontMappings.0.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.number.i=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The following is equivalent to the above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fontMappings.0.sourceLocalePattern=.*&lt;br /&gt;
fontMappings.0.targetLocalePattern=.*&lt;br /&gt;
fontMappings.0.sourceFontPattern=.*&lt;br /&gt;
fontMappings.0.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.number.i=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
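&lt;br /&gt;
The equivalence of the two forms can be checked with a small Python sketch. It is not Okapi's parameter reader: the &amp;lt;code&amp;gt;parse_font_mappings&amp;lt;/code&amp;gt; helper is hypothetical and simply collects the &amp;lt;code&amp;gt;fontMappings.N.key=value&amp;lt;/code&amp;gt; lines, filling omitted patterns with &amp;lt;code&amp;gt;.*&amp;lt;/code&amp;gt; as described above.&lt;br /&gt;
&lt;br /&gt;
```python
def parse_font_mappings(text):
    """Collect fontMappings.N.key=value lines into a list of dicts ordered by N."""
    entries = {}
    count = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("fontMappings."):
            continue
        key, _, value = line.partition("=")
        parts = key.split(".")
        if parts[1] == "number":
            count = int(value)  # fontMappings.number.i holds the entry count
            continue
        entries.setdefault(int(parts[1]), {})[parts[2]] = value
    # Omitted patterns fall back to ".*", i.e. they match anything.
    defaults = {
        "sourceLocalePattern": ".*",
        "targetLocalePattern": ".*",
        "sourceFontPattern": ".*",
    }
    indices = range(count) if count is not None else sorted(entries)
    return [{**defaults, **entries.get(i, {})} for i in indices]


short = "fontMappings.0.targetFont=Arial Unicode MS\nfontMappings.number.i=1"
explicit = (
    "fontMappings.0.sourceLocalePattern=.*\n"
    "fontMappings.0.targetLocalePattern=.*\n"
    "fontMappings.0.sourceFontPattern=.*\n"
    "fontMappings.0.targetFont=Arial Unicode MS\n"
    "fontMappings.number.i=1"
)
print(parse_font_mappings(short) == parse_font_mappings(explicit))  # prints True
```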
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Filters&amp;diff=1063</id>
		<title>Filters</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Filters&amp;diff=1063"/>
		<updated>2026-03-23T23:22:53Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Supported File Formats */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Filters are the components that convert input documents from their native file format into a common internal set of [[Glossary#Resource|resources]] that all Okapi components use. The extracted content can be re-written into the original file format. When using the steps, the extraction is done by the [[Raw Document to Filter Events Step]] and the re-writing by the [[Filter Events to Raw Document Step]].&lt;br /&gt;
&lt;br /&gt;
Note: The [[Okapi Filters Plugin for OmegaT]] allows you to use some of the filters directly from [http://www.omegat.org OmegaT].&lt;br /&gt;
&lt;br /&gt;
==List of the Filters==&lt;br /&gt;
&lt;br /&gt;
The framework distribution comes with the following filters:&lt;br /&gt;
&lt;br /&gt;
{| cellpadding=&amp;quot;8&amp;quot; width=100%&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
|&lt;br /&gt;
* [[Archive Filter]]&lt;br /&gt;
* [[DTD Filter]]&lt;br /&gt;
* [[Doxygen Filter]]&lt;br /&gt;
* DXF Filter&lt;br /&gt;
* [[EPUB Filter]]&lt;br /&gt;
* [[HTML Filter]]&lt;br /&gt;
* [[HTML5-ITS Filter]]&lt;br /&gt;
* [[ICML Filter]]&lt;br /&gt;
* [[IDML Filter]]&lt;br /&gt;
* [[JSON Filter]]&lt;br /&gt;
* [[Markdown Filter]]&lt;br /&gt;
* [[Message Format Filter]]&lt;br /&gt;
* [[MIF Filter]]&lt;br /&gt;
* [[Moses Text Filter]]&lt;br /&gt;
* [[Multi-Parsers Filter]]&lt;br /&gt;
* [[OpenOffice Filter]]&lt;br /&gt;
* [[OpenXML Filter|OpenXML (MS Office) Filter]]&lt;br /&gt;
|&lt;br /&gt;
* [[PDF Filter]]&lt;br /&gt;
* [[Pensieve TM Filter]]&lt;br /&gt;
* [[PHP Content Filter]]&lt;br /&gt;
* [[Plain Text Filter]]&lt;br /&gt;
* [[PO Filter]]&lt;br /&gt;
* [[Properties Filter]]&lt;br /&gt;
* [[Rainbow Translation Kit Filter]]&lt;br /&gt;
* [[Regex Filter]]&lt;br /&gt;
* [[SDL Trados Package Filter]]&lt;br /&gt;
* [[Simplification Filter]]&lt;br /&gt;
* [[Table Filter]]&lt;br /&gt;
* [[TMX Filter]]&lt;br /&gt;
* [[Trados-Tagged RTF Filter]]&lt;br /&gt;
|&lt;br /&gt;
* [[Transifex Filter]]&lt;br /&gt;
* [[TS Filter]]&lt;br /&gt;
* [[TTX Filter]]&lt;br /&gt;
* [[TXML Filter]]&lt;br /&gt;
* [[Wiki Filter]]&lt;br /&gt;
* [[WSXZ Package Filter]]&lt;br /&gt;
* [[Vignette Filter]]&lt;br /&gt;
* [[XLIFF Filter]]&lt;br /&gt;
* [[XLIFF-2 Filter]]&lt;br /&gt;
* [[XML Filter]]&lt;br /&gt;
* [[XML Stream Filter]]&lt;br /&gt;
* [[YAML Filter]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Supported File Formats==&lt;br /&gt;
&lt;br /&gt;
The following is a list of some of the file formats supported by the distribution through [[Understanding Filter Configurations|pre-defined configurations]]:&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;6&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|+&lt;br /&gt;
| '''Format''' || '''Extensions''' || '''Pre-Defined Configuration''' || '''Filter''' || '''Notes'''&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Android Strings || .xml || &amp;lt;code&amp;gt;okf_xml-AndroidStrings&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Apple Stringsdict || .stringsdict || &amp;lt;code&amp;gt;okf_xml-AppleStringsdict&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Archive || .zip || &amp;lt;code&amp;gt;okf_archive&amp;lt;/code&amp;gt; || [[Archive Filter]] || Meta filter that processes zip files with various formats as one file.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Auto Xliff || .xlf, .xliff || &amp;lt;code&amp;gt;okf_autoxliff&amp;lt;/code&amp;gt; || [[Auto Xliff Filter]] || Detects the version of an XLIFF file and then hands parsing off to the appropriate filter &lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| AutoCAD DXF || .dxf || &amp;lt;code&amp;gt;okf_dxf&amp;lt;/code&amp;gt; || [[DXF Filter]] || Only supports textual DXF, not binary DXF&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| CSV (Comma-separated values files) || .csv, .txt || &amp;lt;code&amp;gt;okf_table_csv&amp;lt;/code&amp;gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| CSV (Multiple complex sub-formats) || .csv || &amp;lt;code&amp;gt;okf_multiparsers&amp;lt;/code&amp;gt; || [[Multi-Parsers Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DITA || .dita, .ditamap, .xml || &amp;lt;code&amp;gt;okf_xmlstream-dita&amp;lt;/code&amp;gt; || [[XML Stream Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DocBook v5.0 || .xml || &amp;lt;code&amp;gt;okf_xml-docbook&amp;lt;/code&amp;gt; || [[XML Filter]] || Since Okapi 1.42. &amp;amp;lt;footnote&amp;gt; is not handled properly.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DokuWiki pages || .txt || &amp;lt;code&amp;gt;okf_wiki&amp;lt;/code&amp;gt; || [[Wiki Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Doxygen-commented files || .c, .h, .cpp || &amp;lt;code&amp;gt;okf_doxygen&amp;lt;/code&amp;gt; || [[Doxygen Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DTD || .dtd || &amp;lt;code&amp;gt;okf_dtd&amp;lt;/code&amp;gt; || [[DTD Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| EPUB || .epub || &amp;lt;code&amp;gt;okf_epub&amp;lt;/code&amp;gt; || [[EPUB Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Fixed-Width Columns Table || .txt || &amp;lt;code&amp;gt;okf_table_fwc&amp;lt;/code&amp;gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Idiom WorldServer XLIFF || .xlf || &amp;lt;code&amp;gt;okf_xliff-iws&amp;lt;/code&amp;gt; || [[XLIFF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| InCopy ICML || .wcml || &amp;lt;code&amp;gt;okf_icml&amp;lt;/code&amp;gt; || [[ICML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| InDesign IDML || .idml || &amp;lt;code&amp;gt;okf_idml&amp;lt;/code&amp;gt; || [[IDML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| iOS/Mac Strings|| .strings || &amp;lt;code&amp;gt;okf_regex-macStrings&amp;lt;/code&amp;gt; || [[Regex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java Properties || .properties || &amp;lt;code&amp;gt;okf_properties&amp;lt;/code&amp;gt; || [[Properties Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java Properties (Output not escaped) || .properties || &amp;lt;code&amp;gt;okf_properties-outputNotEscaped&amp;lt;/code&amp;gt; || [[Properties Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java XML Properties || .xml || &amp;lt;code&amp;gt;okf_xml-JavaProperties&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java XML Properties (HTML strings) || .xml || &amp;lt;code&amp;gt;okf_xmlstream-JavaPropertiesHTML&amp;lt;/code&amp;gt; || [[XML Stream Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| JSON || .json || &amp;lt;code&amp;gt;okf_json&amp;lt;/code&amp;gt; || [[JSON Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Haiku CatKeys || .catkeys || &amp;lt;code&amp;gt;okf_table_catkeys&amp;lt;/code&amp;gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| HTML (any) || .html, .htm || &amp;lt;code&amp;gt;okf_html&amp;lt;/code&amp;gt; || [[HTML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| HTML (Well-formed, and XHTML) || .html, .htm|| &amp;lt;code&amp;gt;okf_html-wellFormed&amp;lt;/code&amp;gt; || [[HTML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| HTML5 (and XHTML5) || .html, .htm|| &amp;lt;code&amp;gt;okf_itshtml5&amp;lt;/code&amp;gt; || [[HTML5-ITS Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Markdown || .md || &amp;lt;code&amp;gt;okf_markdown&amp;lt;/code&amp;gt; || [[Markdown Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft Excel 2007/2010 || .xlsx, .xlsm, .xltx, .xltm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft PowerPoint 2007/2010 || .pptx, .pptm, .potx, .potm, .ppsx, .ppsm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft Visio || .vsdx, .vsdm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft Word 2007/2010 || .docx, .docm, .dotx, .dotm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| MIF || .mif || &amp;lt;code&amp;gt;okf_mif&amp;lt;/code&amp;gt; || [[MIF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Moses Text || .txt || &amp;lt;code&amp;gt;okf_mosestext&amp;lt;/code&amp;gt; || [[Moses Text Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Calc || .ods, .ots || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Draw || .odg, .otg || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Impress || .odp, .otp || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Writer || .odt, .ott || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PDF || .pdf || &amp;lt;code&amp;gt;okf_pdf&amp;lt;/code&amp;gt; || [[PDF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[Pensieve TM]] || .pentm || &amp;lt;code&amp;gt;okf_pensieve&amp;lt;/code&amp;gt; || [[Pensieve TM Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PHP Content || .php || &amp;lt;code&amp;gt;okf_phpcontent&amp;lt;/code&amp;gt; || [[PHP Content Filter]] || Can be used as a subfilter only&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Plain Text (Line = text unit) || .txt || &amp;lt;code&amp;gt;okf_plaintext&amp;lt;/code&amp;gt; || [[Plain Text Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Plain Text (Paragraph = text unit) || .txt || &amp;lt;code&amp;gt;okf_plaintext_paragraphs&amp;lt;/code&amp;gt; || [[Plain Text Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PO || .po || &amp;lt;code&amp;gt;okf_po&amp;lt;/code&amp;gt; || [[PO Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PO (Monolingual style) || .po || &amp;lt;code&amp;gt;okf_po-monolingual&amp;lt;/code&amp;gt; || [[PO Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Rainbow Translation Kit manifests || .rkm || &amp;lt;code&amp;gt;okf_rainbowkit&amp;lt;/code&amp;gt; || [[Rainbow Translation Kit Filter]] || Used as a tkit reader only&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Regex (Any text-based format) || .txt || &amp;lt;code&amp;gt;okf_regex&amp;lt;/code&amp;gt; || [[Regex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| RDF (Mozilla RDF) || .rdf || &amp;lt;code&amp;gt;okf_xml-MozillaRDF&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| RESX || .resx || &amp;lt;code&amp;gt;okf_xml-resx&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SDLPPX || .sdlppx || &amp;lt;code&amp;gt;okf_sdlpackage&amp;lt;/code&amp;gt; || [[SDL Trados Package Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SDLRPX || .sdlrpx || &amp;lt;code&amp;gt;okf_sdlpackage&amp;lt;/code&amp;gt; || [[SDL Trados Package Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SDL[[XLIFF]] || .sdlxlf || &amp;lt;code&amp;gt;okf_xliff-sdl&amp;lt;/code&amp;gt; || [[XLIFF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Skype Language Files || .lang || &amp;lt;code&amp;gt;okf_properties-skypeLang&amp;lt;/code&amp;gt; || [[Properties Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SRT (Sub-Rip Text, sub-titles files) || .srt || &amp;lt;code&amp;gt;okf_regex-srt&amp;lt;/code&amp;gt; || [[Regex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Tab-delimited files || .tsv, .txt || &amp;lt;code&amp;gt;okf_table_tsv&amp;lt;/code&amp;gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| TeX files || .tex || &amp;lt;code&amp;gt;okf_tex&amp;lt;/code&amp;gt; || [[TEX Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[TMX]] || .tmx || &amp;lt;code&amp;gt;okf_tmx&amp;lt;/code&amp;gt; || [[TMX Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Transifex project || .txp || &amp;lt;code&amp;gt;okf_transifex&amp;lt;/code&amp;gt; || [[Transifex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Trados-Tagged RTF || .rtf || &amp;lt;code&amp;gt;okf_tradosrtf&amp;lt;/code&amp;gt; || [[Trados-Tagged RTF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| TS - Qt TS files || .ts || &amp;lt;code&amp;gt;okf_ts&amp;lt;/code&amp;gt; || [[TS Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| TTX - Trados TagEditor TTX files || .ttx || &amp;lt;code&amp;gt;okf_ttx&amp;lt;/code&amp;gt; || [[TTX Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| TXML - Wordfast Pro TXML files || .txml || &amp;lt;code&amp;gt;okf_txml&amp;lt;/code&amp;gt; || [[TXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Vignette Export/Import Content || .xml || &amp;lt;code&amp;gt;okf_vignette&amp;lt;/code&amp;gt; || [[Vignette Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| WSXZ Package Filter || .wsxz || &amp;lt;code&amp;gt;okf_wsxzpackage&amp;lt;/code&amp;gt; || [[WSXZ Package Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| XHTML || .html, .htm || &amp;lt;code&amp;gt;okf_html-wellFormed&amp;lt;/code&amp;gt; || [[HTML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| WIX (Windows Installer XML) localization files || .wxl || &amp;lt;code&amp;gt;okf_xml-WixLocalization&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[XLIFF]] v1.2 || .xlf, .xliff || &amp;lt;code&amp;gt;okf_xliff&amp;lt;/code&amp;gt; || [[XLIFF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[XLIFF]] v2 || .xlf || &amp;lt;code&amp;gt;okf_xliff2&amp;lt;/code&amp;gt; || [[XLIFF-2 Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| XML (Generic, using [[ITS]] defaults) || .xml || &amp;lt;code&amp;gt;okf_xml&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| XML (Generic, using stream reader) || .xml || &amp;lt;code&amp;gt;okf_xmlstream&amp;lt;/code&amp;gt; || [[XML Stream Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| YAML (Generic YAML filter) || .yml, .yaml || &amp;lt;code&amp;gt;okf_yaml&amp;lt;/code&amp;gt; || [[YAML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Message Format (ICU Message Format Filter) || Any container format that supports subfilters || &amp;lt;code&amp;gt;okf_messageformat&amp;lt;/code&amp;gt; || [[Message Format Filter]] ||&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Note that most filters allow you to [[Understanding Filter Configurations|create your own configurations]] to support more file formats.&lt;br /&gt;
&lt;br /&gt;
==Code Simplification Rules==&lt;br /&gt;
&lt;br /&gt;
Code simplification can happen at two levels: in the filter itself, or in a step (the [[Inline Codes Simplifier Step]] or the [[Post-segmentation Inline Codes Removal Step]]). There are three ways to configure it.&lt;br /&gt;
&lt;br /&gt;
Firstly, the extraction pipeline can contain just:&lt;br /&gt;
: - [[Raw Document to Filter Events Step]]&lt;br /&gt;
&lt;br /&gt;
At the moment, only the [[IDML Filter]], the [[XML Filter]] and the [[Simplification Filter]] support this. The last one acts as a wrapper around another filter.&lt;br /&gt;
&lt;br /&gt;
Secondly, the extraction pipeline can look like this:&lt;br /&gt;
: - [[Raw Document to Filter Events Step]]&lt;br /&gt;
: - [[Inline Codes Simplifier Step]]&lt;br /&gt;
&lt;br /&gt;
This is the only option for filters that do not support their own code simplification. Use it with care, because the final merge may not always handle the simplified codes correctly. The [[IDML Filter]] and [[XML Filter]] mentioned above can perform their own simplification, in which case the added [[Inline Codes Simplifier Step]] should not affect the events produced.&lt;br /&gt;
&lt;br /&gt;
Thirdly, the extraction pipeline can consist of:&lt;br /&gt;
: - [[Raw Document to Filter Events Step]]&lt;br /&gt;
: - [[Segmentation Step]]&lt;br /&gt;
: - [[Post-segmentation Inline Codes Removal Step]]&lt;br /&gt;
&lt;br /&gt;
Here, the [[Post-segmentation Inline Codes Removal Step]] performs code simplification after segmentation rules are applied, and it may be useful for skipping extra codes between segments.&lt;br /&gt;
&lt;br /&gt;
By default, the [[Inline Codes Simplifier Step]] and [[Post-segmentation Inline Codes Removal Step]] maximise the trimming and merging (aka simplification) of inline codes. This can be tuned via the following string parameters:&lt;br /&gt;
: - &amp;lt;code&amp;gt;removeLeadingTrailingCodes&amp;lt;/code&amp;gt; - &amp;lt;code&amp;gt;true&amp;lt;/code&amp;gt; by default&lt;br /&gt;
: - &amp;lt;code&amp;gt;mergeCodes&amp;lt;/code&amp;gt; - &amp;lt;code&amp;gt;true&amp;lt;/code&amp;gt; by default&lt;br /&gt;
: - &amp;lt;code&amp;gt;rules&amp;lt;/code&amp;gt; - empty by default&lt;br /&gt;
&lt;br /&gt;
Only the [[Inline Codes Simplifier Step]] configuration can be overridden, through the following optional filter parameters:&lt;br /&gt;
: - &amp;lt;code&amp;gt;moveLeadingAndTrailingCodesToSkeleton&amp;lt;/code&amp;gt; - maps to &amp;lt;code&amp;gt;removeLeadingTrailingCodes&amp;lt;/code&amp;gt;&lt;br /&gt;
: - &amp;lt;code&amp;gt;mergeAdjacentCodes&amp;lt;/code&amp;gt; - maps to &amp;lt;code&amp;gt;mergeCodes&amp;lt;/code&amp;gt;&lt;br /&gt;
: - &amp;lt;code&amp;gt;simplifierRules&amp;lt;/code&amp;gt; - maps to &amp;lt;code&amp;gt;rules&amp;lt;/code&amp;gt;&lt;br /&gt;
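&lt;br /&gt;
For a filter that stores its configuration in the FPRM parameter format, such an override might look like the sketch below. The &amp;lt;code&amp;gt;.b&amp;lt;/code&amp;gt; suffix on the two boolean keys is an assumption, based on how other boolean parameters such as &amp;lt;code&amp;gt;extractNotes.b&amp;lt;/code&amp;gt; are serialised; check the documentation of the filter concerned before relying on these exact key names:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#v1&lt;br /&gt;
moveLeadingAndTrailingCodesToSkeleton.b=false&lt;br /&gt;
mergeAdjacentCodes.b=true&lt;br /&gt;
simplifierRules=if TYPE = &amp;quot;lb&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
With these values, leading and trailing codes would stay inside the text unit, adjacent codes would still be merged, and line-break codes would be left unsimplified.&lt;br /&gt;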
&lt;br /&gt;
The simplification rules make it possible to prevent specific codes from being trimmed or merged.&lt;br /&gt;
&lt;br /&gt;
===General Syntax===&lt;br /&gt;
&lt;br /&gt;
The rules parser ignores irrelevant whitespace. Rules can be separated by spaces, newlines or nothing. This makes it easier to accommodate various container formats and their whitespace normalization rules. When a rule applies, it means &amp;quot;do not simplify the matched code&amp;quot;. Uppercase tokens are constants predefined by the rule parser. Multiple rules are always OR'ed together.&lt;br /&gt;
&lt;br /&gt;
For more details, see the JavaCC grammar: &amp;lt;code&amp;gt;../okapi/core/src/main/javacc/SimplifierRules.jj&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Rule Examples===&lt;br /&gt;
&lt;br /&gt;
If a code has any of these flags, don't simplify it:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DELETABLE or ADDABLE or CLONEABLE;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;quot;=&amp;quot; is a string match.&lt;br /&gt;
TAGTYPE matches the basic tag types: opening, closing or standalone.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DATA = &amp;quot;a&amp;quot; and TAGTYPE = OPENING;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;quot;~&amp;quot; is a regex match:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DATA ~ &amp;quot;a.*&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can negate any of the match operators.&lt;br /&gt;
Don't simplify if the DATA does not match the regex:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DATA !~ &amp;quot;a.*&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Match on type (a line break in this case) and don't simplify:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if TYPE = &amp;quot;lb&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Don't simplify any rich text types&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if TYPE = &amp;quot;bold&amp;quot; or TYPE = &amp;quot;italic&amp;quot; or TYPE = &amp;quot;underline&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Expressions can be nested (embedded parentheses are supported):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if TYPE = &amp;quot;bold&amp;quot; or (DATA = &amp;quot;bar&amp;quot; or (DATA = &amp;quot;foo&amp;quot; and TYPE = &amp;quot;underline&amp;quot;));&amp;lt;/pre&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&lt;br /&gt;
===Filter Config Examples===&lt;br /&gt;
&lt;br /&gt;
The following examples show simplifier rules in the filter configuration formats used by Okapi.&lt;br /&gt;
&lt;br /&gt;
'''YAML:'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
simplifierRules: |&lt;br /&gt;
  if ADDABLE or DELETABLE or CLONEABLE;&lt;br /&gt;
  if DATA = &amp;quot;&amp;lt;br/&amp;gt;&amp;quot; or DATA = &amp;quot;&amp;lt;font&amp;gt;&amp;quot; or DATA = &amp;quot;&amp;lt;/font&amp;gt;&amp;quot; or DATA = &amp;quot;&amp;lt;/a&amp;gt;&amp;quot;;&lt;br /&gt;
  if DATA ~ &amp;quot;\\&amp;lt;font.+&amp;quot; or DATA ~ &amp;quot;\\&amp;lt;img.+&amp;quot; or DATA ~ &amp;quot;\\&amp;lt;a.+&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''ITS:'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;its:rules xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot; version=&amp;quot;1.0&amp;quot; xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot; xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;!-- See ITS specification at: http://www.w3.org/TR/its/ --&amp;gt;&lt;br /&gt;
 &amp;lt;its:translateRule selector=&amp;quot;//*&amp;quot; translate=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;its:withinTextRule selector=&amp;quot;//codeph&amp;quot; withinText=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;its:withinTextRule selector=&amp;quot;//ph&amp;quot; withinText=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;okp:simplifierRules moveLeadingAndTrailingCodesToSkeleton=&amp;quot;yes&amp;quot; mergeAdjacentCodes=&amp;quot;yes&amp;quot;&amp;gt;&lt;br /&gt;
 if ADDABLE or DELETABLE or CLONEABLE; if DATA ~ &amp;quot;.+&amp;quot;;&lt;br /&gt;
 &amp;lt;/okp:simplifierRules&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''FPRM (Parameters):'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#v1&lt;br /&gt;
extractNotes.b=true&lt;br /&gt;
simplifierRules=if ADDABLE or DELETABLE or CLONEABLE; if DATA ~ &amp;quot;.+&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Font Mapping==&lt;br /&gt;
&lt;br /&gt;
Font mapping is a filter's ability to automatically substitute font information in the target document, according to a provided configuration. This helps to reduce the amount of reformatting and post-translation DTP. At the moment it is supported by the IDML and OpenXML (DOCX, PPTX and XLSX documents) filters.&lt;br /&gt;
&lt;br /&gt;
The following font mapping configuration options are available:&lt;br /&gt;
* The source locale regular expression pattern: &amp;lt;code&amp;gt;.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;en.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;en-UK&amp;lt;/code&amp;gt;, etc. It can be omitted to apply the mapping to any source locale.&lt;br /&gt;
* The target locale regular expression pattern: &amp;lt;code&amp;gt;.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ru.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;ru-RU&amp;lt;/code&amp;gt;, etc. It can be omitted to apply the mapping to any target locale.&lt;br /&gt;
* The source font name regular expression pattern: &amp;lt;code&amp;gt;.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Arial.*&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt;, etc. It can be omitted to apply the mapping to any source font name found.&lt;br /&gt;
* The target font name: &amp;lt;code&amp;gt;Arial&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt;, etc. It must not be empty; if it is, the mapping configuration is ignored.&lt;br /&gt;
&lt;br /&gt;
The configured font mappings are applied in the order they are stated, and the final target font is determined by sequential substitution of the source font values. For example, if there are two mappings:&lt;br /&gt;
# &amp;lt;code&amp;gt;Arial&amp;lt;/code&amp;gt; -&amp;gt; &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt;&lt;br /&gt;
# &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt; -&amp;gt; &amp;lt;code&amp;gt;Sans Serif&amp;lt;/code&amp;gt;&lt;br /&gt;
then the first mapping replaces &amp;lt;code&amp;gt;Arial&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt;, and the second one is applied to this new value, so the final font is &amp;lt;code&amp;gt;Sans Serif&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The serialised parameters can look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fontMappings.0.sourceLocalePattern=en.*&lt;br /&gt;
fontMappings.0.targetLocalePattern=ru.*&lt;br /&gt;
fontMappings.0.sourceFontPattern=Times.*&lt;br /&gt;
fontMappings.0.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.1.sourceLocalePattern=ru&lt;br /&gt;
fontMappings.1.targetLocalePattern=fr&lt;br /&gt;
fontMappings.1.sourceFontPattern=The Sims Sans&lt;br /&gt;
fontMappings.1.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.number.i=2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When source locale, target locale and source font are omitted:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fontMappings.0.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.number.i=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is equivalent to the previous example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fontMappings.0.sourceLocalePattern=.*&lt;br /&gt;
fontMappings.0.targetLocalePattern=.*&lt;br /&gt;
fontMappings.0.sourceFontPattern=.*&lt;br /&gt;
fontMappings.0.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.number.i=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
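&lt;br /&gt;
As a further sketch, the two chained mappings listed earlier (&amp;lt;code&amp;gt;Arial&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt;, then &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt; to &amp;lt;code&amp;gt;Sans Serif&amp;lt;/code&amp;gt;) could probably be serialised as shown here. It assumes the same key names as the examples above and leaves both locale patterns out so that the mappings apply to any locale pair; it has not been checked against a particular filter version:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fontMappings.0.sourceFontPattern=Arial&lt;br /&gt;
fontMappings.0.targetFont=Times New Roman&lt;br /&gt;
fontMappings.1.sourceFontPattern=Times New Roman&lt;br /&gt;
fontMappings.1.targetFont=Sans Serif&lt;br /&gt;
fontMappings.number.i=2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Because the mappings are applied in order, text set in Arial would be expected to end up in Sans Serif, while swapping the two entries would leave it in Times New Roman.&lt;br /&gt;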
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Filters&amp;diff=1062</id>
		<title>Filters</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Filters&amp;diff=1062"/>
		<updated>2026-03-23T23:21:42Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: Add stub entry for DXF filter&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Filters are the components that convert input documents from their native file format into a common internal set of [[Glossary#Resource|resources]] that all Okapi components use. The extracted content can be re-written into the original file format. When using the steps, the extraction is done by the [[Raw Document to Filter Events Step]] and the re-writing by the [[Filter Events to Raw Document Step]].&lt;br /&gt;
&lt;br /&gt;
Note: The [[Okapi Filters Plugin for OmegaT]] allows you to use some of the filters directly from [http://www.omegat.org OmegaT].&lt;br /&gt;
&lt;br /&gt;
==List of the Filters==&lt;br /&gt;
&lt;br /&gt;
The framework distribution comes with the following filters:&lt;br /&gt;
&lt;br /&gt;
{| cellpadding=&amp;quot;8&amp;quot; width=100%&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
|&lt;br /&gt;
* [[Archive Filter]]&lt;br /&gt;
* [[DTD Filter]]&lt;br /&gt;
* [[Doxygen Filter]]&lt;br /&gt;
* DXF Filter&lt;br /&gt;
* [[EPUB Filter]]&lt;br /&gt;
* [[HTML Filter]]&lt;br /&gt;
* [[HTML5-ITS Filter]]&lt;br /&gt;
* [[ICML Filter]]&lt;br /&gt;
* [[IDML Filter]]&lt;br /&gt;
* [[JSON Filter]]&lt;br /&gt;
* [[Markdown Filter]]&lt;br /&gt;
* [[Message Format Filter]]&lt;br /&gt;
* [[MIF Filter]]&lt;br /&gt;
* [[Moses Text Filter]]&lt;br /&gt;
* [[Multi-Parsers Filter]]&lt;br /&gt;
* [[OpenOffice Filter]]&lt;br /&gt;
* [[OpenXML Filter|OpenXML (MS Office) Filter]]&lt;br /&gt;
|&lt;br /&gt;
* [[PDF Filter]]&lt;br /&gt;
* [[Pensieve TM Filter]]&lt;br /&gt;
* [[PHP Content Filter]]&lt;br /&gt;
* [[Plain Text Filter]]&lt;br /&gt;
* [[PO Filter]]&lt;br /&gt;
* [[Properties Filter]]&lt;br /&gt;
* [[Rainbow Translation Kit Filter]]&lt;br /&gt;
* [[Regex Filter]]&lt;br /&gt;
* [[SDL Trados Package Filter]]&lt;br /&gt;
* [[Simplification Filter]]&lt;br /&gt;
* [[Table Filter]]&lt;br /&gt;
* [[TMX Filter]]&lt;br /&gt;
* [[Trados-Tagged RTF Filter]]&lt;br /&gt;
|&lt;br /&gt;
* [[Transifex Filter]]&lt;br /&gt;
* [[TS Filter]]&lt;br /&gt;
* [[TTX Filter]]&lt;br /&gt;
* [[TXML Filter]]&lt;br /&gt;
* [[Wiki Filter]]&lt;br /&gt;
* [[WSXZ Package Filter]]&lt;br /&gt;
* [[Vignette Filter]]&lt;br /&gt;
* [[XLIFF Filter]]&lt;br /&gt;
* [[XLIFF-2 Filter]]&lt;br /&gt;
* [[XML Filter]]&lt;br /&gt;
* [[XML Stream Filter]]&lt;br /&gt;
* [[YAML Filter]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Supported File Formats==&lt;br /&gt;
&lt;br /&gt;
The following is a list of some of the file formats supported by the distribution through [[Understanding Filter Configurations|pre-defined configurations]]:&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;6&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|+&lt;br /&gt;
| '''Format''' || '''Extensions''' || '''Pre-Defined Configuration''' || '''Filter''' || '''Notes'''&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Android Strings || .xml || &amp;lt;code&amp;gt;okf_xml-AndroidStrings&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Apple Stringsdict || .stringsdict || &amp;lt;code&amp;gt;okf_xml-AppleStringsdict&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Archive || .zip || &amp;lt;code&amp;gt;okf_archive&amp;lt;/code&amp;gt; || [[Archive Filter]] || Meta filter that processes zip files with various formats as one file.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Auto Xliff || .xlf, .xliff || &amp;lt;code&amp;gt;okf_autoxliff&amp;lt;/code&amp;gt; || [[Auto Xliff Filter]] || Detects the version of an XLIFF file and then hands parsing off to the appropriate filter.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| CSV (Comma-separated values files) || .csv, .txt || &amp;lt;code&amp;gt;okf_table_csv&amp;lt;/code&amp;gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| CSV (Multiple complex sub-formats) || .csv || &amp;lt;code&amp;gt;okf_multiparsers&amp;lt;/code&amp;gt; || [[Multi-Parsers Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DITA || .dita, .ditamap, .xml || &amp;lt;code&amp;gt;okf_xmlstream-dita&amp;lt;/code&amp;gt; || [[XML Stream Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DocBook v5.0 || .xml || &amp;lt;code&amp;gt;okf_xml-docbook&amp;lt;/code&amp;gt; || [[XML Filter]] || Since Okapi 1.42. &amp;amp;lt;footnote&amp;gt; is not handled properly.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DokuWiki pages || .txt || &amp;lt;code&amp;gt;okf_wiki&amp;lt;/code&amp;gt; || [[Wiki Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Doxygen-commented files || .c, .h, .cpp || &amp;lt;code&amp;gt;okf_doxygen&amp;lt;/code&amp;gt; || [[Doxygen Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| DTD || .dtd || &amp;lt;code&amp;gt;okf_dtd&amp;lt;/code&amp;gt; || [[DTD Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| EPUB || .epub || &amp;lt;code&amp;gt;okf_epub&amp;lt;/code&amp;gt; || [[EPUB Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Fixed-Width Columns Table || .txt || &amp;lt;code&amp;gt;okf_table_fwc&amp;lt;/code&amp;gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Idiom WorldServer XLIFF || .xlf || &amp;lt;code&amp;gt;okf_xliff-iws&amp;lt;/code&amp;gt; || [[XLIFF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| InCopy ICML || .wcml || &amp;lt;code&amp;gt;okf_icml&amp;lt;/code&amp;gt; || [[ICML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| InDesign IDML || .idml || &amp;lt;code&amp;gt;okf_idml&amp;lt;/code&amp;gt; || [[IDML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| iOS/Mac Strings|| .strings || &amp;lt;code&amp;gt;okf_regex-macStrings&amp;lt;/code&amp;gt; || [[Regex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java Properties || .properties || &amp;lt;code&amp;gt;okf_properties&amp;lt;/code&amp;gt; || [[Properties Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java Properties (Output not escaped) || .properties || &amp;lt;code&amp;gt;okf_properties-outputNotEscaped&amp;lt;/code&amp;gt; || [[Properties Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java XML Properties || .xml || &amp;lt;code&amp;gt;okf_xml-JavaProperties&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Java XML Properties (HTML strings) || .xml || &amp;lt;code&amp;gt;okf_xmlstream-JavaPropertiesHTML&amp;lt;/code&amp;gt; || [[XML Stream Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| JSON || .json || &amp;lt;code&amp;gt;okf_json&amp;lt;/code&amp;gt; || [[JSON Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Haiku CatKeys || .catkeys || &amp;lt;code&amp;gt;okf_table_catkeys&amp;lt;/code&amp;gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| HTML (any) || .html, .htm || &amp;lt;code&amp;gt;okf_html&amp;lt;/code&amp;gt; || [[HTML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| HTML (Well-formed, and XHTML) || .html, .htm|| &amp;lt;code&amp;gt;okf_html-wellFormed&amp;lt;/code&amp;gt; || [[HTML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| HTML5 (and XHTML5) || .html, .htm|| &amp;lt;code&amp;gt;okf_itshtml5&amp;lt;/code&amp;gt; || [[HTML5-ITS Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Markdown || .md || &amp;lt;code&amp;gt;okf_markdown&amp;lt;/code&amp;gt; || [[Markdown Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft Excel 2007/2010 || .xlsx, .xlsm, .xltx, .xltm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft PowerPoint 2007/2010 || .pptx, .pptm, .potx, .potm, .ppsx, .ppsm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft Visio || .vsdx, .vsdm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Microsoft Word 2007/2010 || .docx, .docm, .dotx, .dotm || &amp;lt;code&amp;gt;okf_openxml&amp;lt;/code&amp;gt; || [[OpenXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| MIF || .mif || &amp;lt;code&amp;gt;okf_mif&amp;lt;/code&amp;gt; || [[MIF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Moses Text || .txt || &amp;lt;code&amp;gt;okf_mosestext&amp;lt;/code&amp;gt; || [[Moses Text Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Calc || .ods, .ots || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Draw || .odg, .otg || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Impress || .odp, .otp || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| OpenOffice.org Writer || .odt, .ott || &amp;lt;code&amp;gt;okf_odf&amp;lt;/code&amp;gt; || [[OpenOffice Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PDF || .pdf || &amp;lt;code&amp;gt;okf_pdf&amp;lt;/code&amp;gt; || [[PDF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[Pensieve TM]] || .pentm || &amp;lt;code&amp;gt;okf_pensieve&amp;lt;/code&amp;gt; || [[Pensieve TM Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PHP Content || .php || &amp;lt;code&amp;gt;okf_phpcontent&amp;lt;/code&amp;gt; || [[PHP Content Filter]] || Can be used as a subfilter only&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Plain Text (Line = text unit) || .txt || &lt;code&gt;okf_plaintext&lt;/code&gt; || [[Plain Text Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Plain Text (Paragraph = text unit) || .txt || &amp;lt;code&amp;gt;okf_plaintext_paragraphs&amp;lt;/code&amp;gt; || [[Plain Text Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PO || .po || &amp;lt;code&amp;gt;okf_po&amp;lt;/code&amp;gt; || [[PO Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| PO (Monolingual style) || .po || &amp;lt;code&amp;gt;okf_po-monolingual&amp;lt;/code&amp;gt; || [[PO Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Rainbow Translation Kit manifests || .rkm || &amp;lt;code&amp;gt;okf_rainbowkit&amp;lt;/code&amp;gt; || [[Rainbow Translation Kit Filter]] || Used as a tkit reader only&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Regex (Any text-based format) || .txt || &amp;lt;code&amp;gt;okf_regex&amp;lt;/code&amp;gt; || [[Regex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| RDF (Mozilla RDF) || .rdf || &amp;lt;code&amp;gt;okf_xml-MozillaRDF&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| RESX || .resx || &amp;lt;code&amp;gt;okf_xml-resx&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SDLPPX || .sdlppx || &amp;lt;code&amp;gt;okf_sdlpackage&amp;lt;/code&amp;gt; || [[SDL Trados Package Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SDLRPX || .sdlrpx || &amp;lt;code&amp;gt;okf_sdlpackage&amp;lt;/code&amp;gt; || [[SDL Trados Package Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SDL[[XLIFF]] || .sdlxlf || &amp;lt;code&amp;gt;okf_xliff-sdl&amp;lt;/code&amp;gt; || [[XLIFF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Skype Language Files || .lang || &amp;lt;code&amp;gt;okf_properties-skypeLang&amp;lt;/code&amp;gt; || [[Properties Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| SRT (Sub-Rip Text, sub-titles files) || .srt || &amp;lt;code&amp;gt;okf_regex-srt&amp;lt;/code&amp;gt; || [[Regex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Tab-Delimited files || .tsv, .txt || &lt;code&gt;okf_table_tsv&lt;/code&gt; || [[Table Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| TeX files || .tex || &lt;code&gt;okf_tex&lt;/code&gt; || [[TEX Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[TMX]] || .tmx || &amp;lt;code&amp;gt;okf_tmx&amp;lt;/code&amp;gt; || [[TMX Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Transifex project || .txp || &amp;lt;code&amp;gt;okf_transifex&amp;lt;/code&amp;gt; || [[Transifex Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Trados-Tagged RTF || .rtf || &amp;lt;code&amp;gt;okf_tradosrtf&amp;lt;/code&amp;gt; || [[Trados-Tagged RTF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| TS - Qt TS files || .ts || &amp;lt;code&amp;gt;okf_ts&amp;lt;/code&amp;gt; || [[TS Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| TTX - Trados TagEditor TTX files || .ttx || &amp;lt;code&amp;gt;okf_ttx&amp;lt;/code&amp;gt; || [[TTX Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| TXML - Wordfast Pro TXML files || .txml || &amp;lt;code&amp;gt;okf_txml&amp;lt;/code&amp;gt; || [[TXML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Vignette Export/Import Content || .xml || &amp;lt;code&amp;gt;okf_vignette&amp;lt;/code&amp;gt; || [[Vignette Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| WSXZ Package Filter || .wsxz || &amp;lt;code&amp;gt;okf_wsxzpackage&amp;lt;/code&amp;gt; || [[WSXZ Package Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| XHTML || .html, .htm || &amp;lt;code&amp;gt;okf_html-wellFormed&amp;lt;/code&amp;gt; || [[HTML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| WIX (Windows Installer XML) localization files || .wix || &amp;lt;code&amp;gt;okf_xml-WixLocalization&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[XLIFF]] v1.2 || .xlf, .xliff || &amp;lt;code&amp;gt;okf_xliff&amp;lt;/code&amp;gt; || [[XLIFF Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| [[XLIFF]] v2 || .xlf || &amp;lt;code&amp;gt;okf_xliff2&amp;lt;/code&amp;gt; || [[XLIFF-2 Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| XML (Generic, using [[ITS]] defaults) || .xml || &amp;lt;code&amp;gt;okf_xml&amp;lt;/code&amp;gt; || [[XML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| XML (Generic, using stream reader) || .xml || &amp;lt;code&amp;gt;okf_xmlstream&amp;lt;/code&amp;gt; || [[XML Stream Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| YAML (Generic YAML filter) || .yml, .yaml || &amp;lt;code&amp;gt;okf_yaml&amp;lt;/code&amp;gt; || [[YAML Filter]] ||&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| Message Format (ICU Message Format Filter) || Any container format that supports subfilters || &amp;lt;code&amp;gt;okf_messageformat&amp;lt;/code&amp;gt; || [[Message Format Filter]] ||&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Note that most filters allow you to [[Understanding Filter Configurations|create your own configurations]] to support more file formats.&lt;br /&gt;
&lt;br /&gt;
==Code Simplification Rules==&lt;br /&gt;
&lt;br /&gt;
Code simplification works at two levels: the filter and the step (the [[Inline Codes Simplifier Step]] and the [[Post-segmentation Inline Codes Removal Step]]). It can be configured in several ways:&lt;br /&gt;
&lt;br /&gt;
Firstly, the extraction pipeline can contain just:&lt;br /&gt;
: - [[Raw Document to Filter Events Step]]&lt;br /&gt;
&lt;br /&gt;
At the moment, only the [[IDML Filter]], [[XML Filter]] and [[Simplification Filter]] support this. Note that the last one acts as a wrapper around another filter.&lt;br /&gt;
&lt;br /&gt;
Secondly, the extraction pipeline can look like this:&lt;br /&gt;
: - [[Raw Document to Filter Events Step]]&lt;br /&gt;
: - [[Inline Codes Simplifier Step]]&lt;br /&gt;
&lt;br /&gt;
This is the only option for filters that do not support their own code simplification. Use it with care, because the final merge may not always handle it correctly. The aforementioned [[IDML Filter]] and [[XML Filter]] can perform their own simplification, so adding the [[Inline Codes Simplifier Step]] should not affect the events they produce.&lt;br /&gt;
&lt;br /&gt;
Thirdly, the extraction pipeline can consist of:&lt;br /&gt;
: - [[Raw Document to Filter Events Step]]&lt;br /&gt;
: - [[Segmentation Step]]&lt;br /&gt;
: - [[Post-segmentation Inline Codes Removal Step]]&lt;br /&gt;
&lt;br /&gt;
Here, the [[Post-segmentation Inline Codes Removal Step]] performs code simplification after segmentation rules are applied, and it may be useful for skipping extra codes between segments.&lt;br /&gt;
&lt;br /&gt;
By default, the [[Inline Codes Simplifier Step]] and [[Post-segmentation Inline Codes Removal Step]] maximise the trimming and merging (aka simplification) of inline codes. This can be tuned via the following string parameters:&lt;br /&gt;
: - &amp;lt;code&amp;gt;removeLeadingTrailingCodes&amp;lt;/code&amp;gt; - &amp;lt;code&amp;gt;true&amp;lt;/code&amp;gt; by default&lt;br /&gt;
: - &amp;lt;code&amp;gt;mergeCodes&amp;lt;/code&amp;gt; - &amp;lt;code&amp;gt;true&amp;lt;/code&amp;gt; by default&lt;br /&gt;
: - &amp;lt;code&amp;gt;rules&amp;lt;/code&amp;gt; - empty by default&lt;br /&gt;
&lt;br /&gt;
Only the [[Inline Codes Simplifier Step]] configuration can be overridden by optional filter-level parameters:&lt;br /&gt;
: - &lt;code&gt;moveLeadingAndTrailingCodesToSkeleton&lt;/code&gt; - maps to &lt;code&gt;removeLeadingTrailingCodes&lt;/code&gt;&lt;br /&gt;
: - &lt;code&gt;mergeAdjacentCodes&lt;/code&gt; - maps to &lt;code&gt;mergeCodes&lt;/code&gt;&lt;br /&gt;
: - &lt;code&gt;simplifierRules&lt;/code&gt; - maps to &lt;code&gt;rules&lt;/code&gt;&lt;br /&gt;
&lt;br /&gt;
The simplification rules let you prevent specific codes from being trimmed or merged.&lt;br /&gt;
&lt;br /&gt;
===General Syntax===&lt;br /&gt;
&lt;br /&gt;
The rules parser ignores irrelevant whitespace. Rules can be separated by spaces, newlines or nothing. This makes it easier to accommodate various container formats and their whitespace normalization rules. When a rule applies, it means &quot;do not simplify the matched code&quot;. Uppercase tokens are constants predefined by the rule parser. Multiple rules are always OR'ed together.&lt;br /&gt;
&lt;br /&gt;
For more details, see the JavaCC grammar: &amp;lt;code&amp;gt;../okapi/core/src/main/javacc/SimplifierRules.jj&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Rule Examples===&lt;br /&gt;
&lt;br /&gt;
If Code has any of these flags, then don't simplify &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DELETABLE or ADDABLE or CLONEABLE;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;quot;=&amp;quot; is string match&lt;br /&gt;
Match basic TAGTYPE opening, closing or standalone &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DATA = &amp;quot;a&amp;quot; and TAGTYPE = OPENING;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;quot;~&amp;quot; is regex match&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DATA ~ &amp;quot;a.*&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can negate any of the match operators &lt;br /&gt;
Don't simplify if the DATA does not match the regex &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if DATA !~ &amp;quot;a.*&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Match on type, linebreak in this case, don't simplify &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if TYPE = &amp;quot;lb&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Don't simplify any rich text types&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if TYPE = &amp;quot;bold&amp;quot; or TYPE = &amp;quot;italic&amp;quot; or TYPE = &amp;quot;underline&amp;quot;;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Expressions can be recursive (supports embedded parens)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;if TYPE = &amp;quot;bold&amp;quot; or (DATA = &amp;quot;bar&amp;quot; or (DATA = &amp;quot;foo&amp;quot; and TYPE = &amp;quot;underline&amp;quot;));&amp;lt;/pre&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
&lt;br /&gt;
===Filter Config Examples===&lt;br /&gt;
&lt;br /&gt;
Examples of using simplifier rules within the filter config formats used by Okapi.&lt;br /&gt;
&lt;br /&gt;
'''YAML:'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
simplifierRules: |&lt;br /&gt;
  if ADDABLE or DELETABLE or CLONEABLE;&lt;br /&gt;
  if DATA = &amp;quot;&amp;lt;br/&amp;gt;&amp;quot; or DATA = &amp;quot;&amp;lt;font&amp;gt;&amp;quot; or DATA = &amp;quot;&amp;lt;/font&amp;gt;&amp;quot; or DATA = &amp;quot;&amp;lt;/a&amp;gt;&amp;quot;;&lt;br /&gt;
  if DATA ~ &amp;quot;\\&amp;lt;font.+&amp;quot; or DATA ~ &amp;quot;\\&amp;lt;img.+&amp;quot; or DATA ~ &amp;quot;\\&amp;lt;a.+&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''ITS:'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;UTF-8&amp;quot;?&amp;gt;&lt;br /&gt;
&amp;lt;its:rules xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot; version=&amp;quot;1.0&amp;quot; xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot; xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;!-- See ITS specification at: http://www.w3.org/TR/its/ --&amp;gt;&lt;br /&gt;
 &amp;lt;its:translateRule selector=&amp;quot;//*&amp;quot; translate=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;its:withinTextRule selector=&amp;quot;//codeph&amp;quot; withinText=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;its:withinTextRule selector=&amp;quot;//ph&amp;quot; withinText=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;okp:simplifierRules moveLeadingAndTrailingCodesToSkeleton=&amp;quot;yes&amp;quot; mergeAdjacentCodes=&amp;quot;yes&amp;quot;&amp;gt;&lt;br /&gt;
 if ADDABLE or DELETABLE or CLONEABLE; if DATA ~ &amp;quot;.+&amp;quot;;&lt;br /&gt;
 &amp;lt;/okp:simplifierRules&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
'''FPRM (Parameters):'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#v1&lt;br /&gt;
extractNotes.b=true&lt;br /&gt;
simplifierRules=if ADDABLE or DELETABLE or CLONEABLE; if DATA ~ &amp;quot;.+&amp;quot;;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Font Mapping==&lt;br /&gt;
&lt;br /&gt;
Font mapping is a filter's ability to automatically substitute font information in the target document on the fly, according to a provided configuration. This helps reduce the amount of reformatting and post-translation DTP. It is currently supported by the IDML and OpenXML (DOCX, PPTX and XLSX documents) filters.&lt;br /&gt;
&lt;br /&gt;
The following font mapping configuration options are available:&lt;br /&gt;
* The source locale regular expression pattern: &lt;code&gt;.*&lt;/code&gt;, &lt;code&gt;en.*&lt;/code&gt;, &lt;code&gt;en-UK&lt;/code&gt;, etc. It can be omitted to apply the mapping to any source locale.&lt;br /&gt;
* The target locale regular expression pattern: &lt;code&gt;.*&lt;/code&gt;, &lt;code&gt;ru.*&lt;/code&gt;, &lt;code&gt;ru-RU&lt;/code&gt;, etc. It can be omitted to apply the mapping to any target locale.&lt;br /&gt;
* The source font name regular expression pattern: &lt;code&gt;.*&lt;/code&gt;, &lt;code&gt;Arial.*&lt;/code&gt;, &lt;code&gt;Times New Roman&lt;/code&gt;, etc. It can be omitted to apply the mapping to any source font name found.&lt;br /&gt;
* The target font name: &lt;code&gt;Arial&lt;/code&gt;, &lt;code&gt;Times New Roman&lt;/code&gt;, etc. It must not be empty; if it is, the mapping configuration is ignored.&lt;br /&gt;
&lt;br /&gt;
The configured font mappings are applied in the order they are listed, and the final target font value is determined by sequential substitution of the source font values. That is, if there is more than one mapping:&lt;br /&gt;
# &amp;lt;code&amp;gt;Arial&amp;lt;/code&amp;gt; -&amp;gt; &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt;&lt;br /&gt;
# &amp;lt;code&amp;gt;Times New Roman&amp;lt;/code&amp;gt; -&amp;gt; &amp;lt;code&amp;gt;Sans Serif&amp;lt;/code&amp;gt;&lt;br /&gt;
then the first mapping produces the &lt;code&gt;Times New Roman&lt;/code&gt; replacement and the second one is applied to this new value, ending up with &lt;code&gt;Sans Serif&lt;/code&gt;.&lt;br /&gt;
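&lt;br /&gt;
The sequential substitution described above can be sketched in Python. This is an illustration only, not the filters' actual implementation: the function name &lt;code&gt;map_font&lt;/code&gt;, the dictionary layout, and the use of whole-string regular expression matching are all assumptions made for the example.&lt;br /&gt;

```python
import re


def map_font(font, source_locale, target_locale, mappings):
    """Apply font mappings in order; each one sees the result of the previous."""
    for mapping in mappings:
        target_font = mapping.get("targetFont", "")
        if not target_font:
            # A mapping with an empty target font is ignored.
            continue
        # Omitted patterns match anything (assumed equivalent to ".*").
        patterns_and_values = [
            (mapping.get("sourceLocalePattern", ".*"), source_locale),
            (mapping.get("targetLocalePattern", ".*"), target_locale),
            (mapping.get("sourceFontPattern", ".*"), font),
        ]
        if all(re.fullmatch(p, v) for p, v in patterns_and_values):
            font = target_font
    return font


mappings = [
    {"sourceFontPattern": "Arial", "targetFont": "Times New Roman"},
    {"sourceFontPattern": "Times New Roman", "targetFont": "Sans Serif"},
]
print(map_font("Arial", "en", "ru", mappings))  # Sans Serif
```

Because each mapping is tested against the current value rather than the original font name, the order of the mappings matters: swapping the two entries above would leave Arial mapped to Times New Roman.&lt;br /&gt;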
&lt;br /&gt;
The parameters serialisation format can look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fontMappings.0.sourceLocalePattern=en.*&lt;br /&gt;
fontMappings.0.targetLocalePattern=ru.*&lt;br /&gt;
fontMappings.0.sourceFontPattern=Times.*&lt;br /&gt;
fontMappings.0.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.1.sourceLocalePattern=ru&lt;br /&gt;
fontMappings.1.targetLocalePattern=fr&lt;br /&gt;
fontMappings.1.sourceFontPattern=The Sims Sans&lt;br /&gt;
fontMappings.1.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.number.i=2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When source locale, target locale and source font are omitted:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fontMappings.0.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.number.i=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is equivalent to the previous example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fontMappings.0.sourceLocalePattern=.*&lt;br /&gt;
fontMappings.0.targetLocalePattern=.*&lt;br /&gt;
fontMappings.0.sourceFontPattern=.*&lt;br /&gt;
fontMappings.0.targetFont=Arial Unicode MS&lt;br /&gt;
fontMappings.number.i=1&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=OpenXML_Filter&amp;diff=1035</id>
		<title>OpenXML Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=OpenXML_Filter&amp;diff=1035"/>
		<updated>2025-06-25T22:01:23Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* PowerPoint Options */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
This filter allows you to process the different types of documents of the Microsoft Office suite from 2007 and later, such as DOCX (text documents), XLSX (spreadsheets) and PPTX (presentations).  These documents are based on the OpenXML format, as opposed to the binary formats used by pre-2007 versions of Office.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
The filter parameters are divided into '''General Options''', which apply to all formats, and format-specific options.&lt;br /&gt;
&lt;br /&gt;
===General Options===&lt;br /&gt;
; Translate Document Properties&lt;br /&gt;
: When checked, exposes the following document properties for translation: title, subject, creator, description, category, keywords, content status. Default: on.&lt;br /&gt;
; Translate Comments&lt;br /&gt;
: When checked, exposes document comments for translation.  Default: on.&lt;br /&gt;
; Clean Tags Aggressively&lt;br /&gt;
: When checked, strips additional formatting tags related to text spacing.  This is meant to improve filtering in cases where Office documents were converted from other formats (in particular, PDF), and imperfect conversion added a lot of extra formatting noise.  Default: off.&lt;br /&gt;
; Ignore Whitespace Styles&lt;br /&gt;
: When checked while &quot;Clean Tags Aggressively&quot; is enabled, the styles (formatting) of whitespace characters are ignored and considered equal to the subsequent ones.  Default: off.&lt;br /&gt;
&lt;br /&gt;
=== Word Options ===&lt;br /&gt;
; Translate Headers and Footers&lt;br /&gt;
: When checked, exposes header and footer content for translation. Default: on.&lt;br /&gt;
; Translate Numbering Level Text&lt;br /&gt;
: When checked, exposes numbering-level text for translation. Default: off.&lt;br /&gt;
; Translate Hidden Text&lt;br /&gt;
: When checked, exposes hidden text for translation. Default: on.&lt;br /&gt;
; Exclude Graphical Metadata&lt;br /&gt;
: When not checked, labels associated with drawings and word art are exposed for translation.  When checked, these labels (which are frequently not displayed in the document) are suppressed. Default: off.&lt;br /&gt;
; Ignored Styles &amp;gt; Ignore Font Colours&lt;br /&gt;
: When checked, font colours will be ignored. Default: off.&lt;br /&gt;
: If &amp;lt;cite&amp;gt;Clean Tags Aggressively&amp;lt;/cite&amp;gt; and this option are checked and the ignorance thresholds are empty, the font colour run properties are removed from the document structure on filtering. This means that the font colour information is absent on merge as well.&lt;br /&gt;
; Ignored Styles &amp;gt; Font Colours Minimum Ignorance Threshold&lt;br /&gt;
: When defined, font colours will be ignored starting from the specified value. It can be empty (considered as a white colour by default), and contain preset colour values or RGB hex strings: black, Black, 000000 - thresholds in white. Default: none.&lt;br /&gt;
; Ignored Styles &amp;gt; Font Colours Maximum Ignorance Threshold&lt;br /&gt;
: When defined, font colours will be ignored ending by the specified value. It can be empty (considered as a white colour by default), and contain preset colour values or RGB hex strings: white, White, FFFFFF - thresholds in white. Default: none.&lt;br /&gt;
; Excluded/Included Styles&lt;br /&gt;
: Depending on the radio switch (exclude or include), text using any selected styles will be excluded or included for translation. Default: none.&lt;br /&gt;
; Excluded/Included Highlight Colors&lt;br /&gt;
: Depending on the radio switch (exclude or include), text using any selected colours will be excluded or included for translation. &lt;br /&gt;
* If the switch is set to &amp;quot;Include&amp;quot;, only text in the specified colors will be extracted for translation.&lt;br /&gt;
* If the switch is set to &amp;quot;Exclude&amp;quot;, all content &amp;lt;b&amp;gt;except&amp;lt;/b&amp;gt; for text in the specified colors will be extracted for translation.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;i&amp;gt;Note&amp;lt;/i&amp;gt;: Text that is excluded using this mechanism will be treated as hidden; that means the &amp;quot;Translate Everything Hidden&amp;quot; options will extract it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;i&amp;gt;Note&amp;lt;/i&amp;gt;: Starting in 1.48.0, this option also applies to content in PowerPoint files.&lt;br /&gt;
&lt;br /&gt;
Default: the switch is set to &amp;quot;Exclude&amp;quot; and no colors are selected, meaning that all visible content will be extracted for translation.&lt;br /&gt;
&lt;br /&gt;
; Excluded Font Colours&lt;br /&gt;
: Text using any selected colours will not be exposed for translation. Default: none.&lt;br /&gt;
; Allow Style Optimisation&lt;br /&gt;
: When checked, the optimisation of styles is allowed - common formatting of all runs in a paragraph is moved to the styles part. Default: on.&lt;br /&gt;
&lt;br /&gt;
=== Excel Options ===&lt;br /&gt;
; Translate Hidden Rows and Columns&lt;br /&gt;
: When checked, hidden rows and columns are exposed for translation.  Default: off.&lt;br /&gt;
; Colors to Exclude&lt;br /&gt;
: Text with a foreground or background color matching any of the selected colors in this option will be excluded from translation.  Default: none.&lt;br /&gt;
:* The named colors available in the UI correspond to the standard color palette of Excel 2010.  &lt;br /&gt;
:* The configuration itself also supports colors specified as RGB in the format &amp;lt;code&amp;gt;RRGGBB&amp;lt;/code&amp;gt;, so specific colors not explicitly listed in the UI may be excluded by modifying the .fprm file by hand.  For example, to exclude #69b3e7 (Pantone 292), you could modify the &amp;lt;code&amp;gt;tsExcelExcludedColors&amp;lt;/code&amp;gt; section of the configuration file like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tsExcelExcludedColors.i=1&lt;br /&gt;
ccc0=69b3e7&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
; Translate Cells Copied&lt;br /&gt;
: When checked, cell data are copied on extraction to allow contextualised and independent translations.  Default: on.&lt;br /&gt;
; Preserve Styles In Target Columns&lt;br /&gt;
: When checked, the cell styles in target columns are preserved.  Default: off.&lt;br /&gt;
; Extract Source And Target Columns Joined&lt;br /&gt;
: When checked, the source and target columns (cells in a row) are joined on extraction.  Default: off.&lt;br /&gt;
; Worksheet Configurations&lt;br /&gt;
: A list of configurations that, per worksheet name pattern, exclude rows and/or columns from translation and/or mark rows and/or columns as metadata.&lt;br /&gt;
: For one configuration it is possible to specify:&lt;br /&gt;
:* Name Pattern - a regular expression, by which all other operations are matched and applied. For formatting options please refer to &amp;lt;code&amp;gt;java.util.regex.Pattern&amp;lt;/code&amp;gt;. E.g.: &amp;lt;code&amp;gt;Sheet1&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Source Columns - a list of ALPHA-26 numbers, specifying columns that are copied over the target ones for translation/extraction. E.g.: &amp;lt;code&amp;gt;A,B&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Target Columns - a list of ALPHA-26 numbers, specifying columns that are overwritten by the source ones for translation/extraction. E.g.: &amp;lt;code&amp;gt;C,D&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Target Columns Max Characters - a list of decimal unsigned integers [0, 2^32]. When specified, the maxwidth and size-unit properties are attached to text units specified in the target columns. E.g.: &amp;lt;code&amp;gt;25,30&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Excluded Rows - a list of integers, pointing out row numbers that are excluded from translation/extraction. E.g.: &amp;lt;code&amp;gt;1,2&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Excluded Columns - a list of ALPHA-26 numbers, specifying columns that are excluded from translation/extraction. E.g.: &amp;lt;code&amp;gt;A,B&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Metadata Rows - a list of integers, pointing out row numbers that are treated and extracted as metadata. E.g.: &amp;lt;code&amp;gt;3,4&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Metadata Columns - a list of ALPHA-26 numbers, specifying columns that are treated and extracted as metadata. E.g.: &amp;lt;code&amp;gt;C,D&amp;lt;/code&amp;gt;.&lt;br /&gt;
: Let's consider a simple table as an example and find out what can be done with all those configurations.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;margin:auto&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! colspan=&amp;quot;2&amp;quot;|Metadata Header A1 !! colspan=&amp;quot;2&amp;quot;|Metadata Header C1&lt;br /&gt;
|-&lt;br /&gt;
! Metadata Header A2 !! Metadata Header B2 || Metadata Header C2 !! Metadata Header D2&lt;br /&gt;
|-&lt;br /&gt;
| A3 || B3 || C3 || Metadata D3&lt;br /&gt;
|-&lt;br /&gt;
| A4 || B4 || C4 || Metadata D4&lt;br /&gt;
|-&lt;br /&gt;
| A5 || B5 || C5 || Metadata D5&lt;br /&gt;
|}&lt;br /&gt;
: Firstly, let's suppose we would like to translate column A only and place the translation in column B. At the same time we do not want to translate the 1st and the 2nd rows.&lt;br /&gt;
: This requirement can be configured in the following way (using the &amp;lt;code&amp;gt;net.sf.okapi.common.ParametersString&amp;lt;/code&amp;gt; format as an example):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
worksheetConfigurations.number.i=1&lt;br /&gt;
worksheetConfigurations.0.namePattern=Sheet1&lt;br /&gt;
worksheetConfigurations.0.sourceColumns=A&lt;br /&gt;
worksheetConfigurations.0.targetColumns=B&lt;br /&gt;
worksheetConfigurations.0.excludedRows=1,2&lt;br /&gt;
worksheetConfigurations.0.excludedColumns=C,D&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: Then the XLIFF would look like this after extraction and translation:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;group id=&amp;quot;P76C545-sg1&amp;quot; resname=&amp;quot;Sheet1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg1&amp;quot; resname=&amp;quot;1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg2&amp;quot; resname=&amp;quot;2&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg3&amp;quot; resname=&amp;quot;3&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu1&amp;quot; resname=&amp;quot;Sheet1!B3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A3-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg4&amp;quot; resname=&amp;quot;4&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu2&amp;quot; resname=&amp;quot;Sheet1!B4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A4-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg5&amp;quot; resname=&amp;quot;5&amp;quot;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu3&amp;quot; resname=&amp;quot;Sheet1!B5&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A5&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A5-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: And the merged representation would be the following:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;margin:auto&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! colspan=&amp;quot;2&amp;quot;|Metadata Header A1 !! colspan=&amp;quot;2&amp;quot;|Metadata Header C1&lt;br /&gt;
|-&lt;br /&gt;
! Metadata Header A2 !! Metadata Header B2 || Metadata Header C2 !! Metadata Header D2&lt;br /&gt;
|-&lt;br /&gt;
| A3 || A3-tr || C3 || Metadata D3&lt;br /&gt;
|-&lt;br /&gt;
| A4 || A4-tr || C4 || Metadata D4&lt;br /&gt;
|-&lt;br /&gt;
| A5 || A5-tr || C5 || Metadata D5&lt;br /&gt;
|}&lt;br /&gt;
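: The effect of this first configuration on the sample table can be mimicked with a short Python sketch. It is only a model of the behaviour described here, not the filter's code: &lt;code&gt;column_index&lt;/code&gt; and &lt;code&gt;merge_translations&lt;/code&gt; are hypothetical helpers, the translation is faked by appending &lt;code&gt;-tr&lt;/code&gt;, the merged header cell B1 is represented as an empty string, and excluded columns are not modelled.&lt;br /&gt;

```python
def column_index(letters):
    """Convert a column name such as A, Z or AA to a zero-based index."""
    index = 0
    for letter in letters.upper():
        index = index * 26 + (ord(letter) - ord("A") + 1)
    return index - 1


def merge_translations(rows, source_columns, target_columns, excluded_rows, translate):
    """Copy translated source cells into the paired target columns."""
    merged = [list(row) for row in rows]
    # Row numbers are 1-based, as in the excludedRows parameter.
    for row_number, row in enumerate(merged, start=1):
        if row_number in excluded_rows:
            continue
        for source, target in zip(source_columns, target_columns):
            row[column_index(target)] = translate(row[column_index(source)])
    return merged


sheet = [
    ["Metadata Header A1", "", "Metadata Header C1", ""],
    ["Metadata Header A2", "Metadata Header B2", "Metadata Header C2", "Metadata Header D2"],
    ["A3", "B3", "C3", "Metadata D3"],
    ["A4", "B4", "C4", "Metadata D4"],
    ["A5", "B5", "C5", "Metadata D5"],
]
result = merge_translations(sheet, ["A"], ["B"], {1, 2}, lambda text: text + "-tr")
print([row[1] for row in result])
```

: Column B of rows 3 to 5 ends up holding the translated content of column A, while rows 1 and 2 are left untouched, matching the merged table above.&lt;br /&gt;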
&lt;br /&gt;
: Furthermore, let's suppose we would like to translate columns A and B, and treat column D as metadata for each of the translatable cells in a row. At the same time, we would like to consider the 1st and 2nd rows as metadata about the metadata in columns. Finally, we would like not to extract the 5th row.&lt;br /&gt;
: All these requirements can be written as the following configurations:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
worksheetConfigurations.number.i=1&lt;br /&gt;
worksheetConfigurations.0.namePattern=Sheet1&lt;br /&gt;
worksheetConfigurations.0.excludedRows=5&lt;br /&gt;
worksheetConfigurations.0.excludedColumns=C&lt;br /&gt;
worksheetConfigurations.0.metadataRows=1,2&lt;br /&gt;
worksheetConfigurations.0.metadataColumns=D&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: Then, the extraction to XLIFF should look like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;group id=&amp;quot;P76C545-sg1&amp;quot; resname=&amp;quot;Sheet1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg1&amp;quot; resname=&amp;quot;1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg2&amp;quot; resname=&amp;quot;2&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg3&amp;quot; resname=&amp;quot;3&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D3&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu1&amp;quot; resname=&amp;quot;Sheet1!A3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu2&amp;quot; resname=&amp;quot;Sheet1!B3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;B3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg4&amp;quot; resname=&amp;quot;4&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D4&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu3&amp;quot; resname=&amp;quot;Sheet1!A4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu4&amp;quot; resname=&amp;quot;Sheet1!B4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;B4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg5&amp;quot; resname=&amp;quot;5&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D5&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== PowerPoint Options ===&lt;br /&gt;
; Translate Document Properties&lt;br /&gt;
: When checked and the same option is checked under '''the General Options''' (''they will be separated after the next release''), the following document properties are exposed for translation: title, subject, creator, description, category, keywords, content status. Default: on.&lt;br /&gt;
; Reorder Document Properties&lt;br /&gt;
: When checked, the document properties are reordered and placed after the root relationship part (_rels/.rels). Default: off.&lt;br /&gt;
; Reorder Relationships&lt;br /&gt;
: When checked, the relationship parts are reordered and placed after the related slide or layout or master part. Default: off.&lt;br /&gt;
; Translate Diagram Data&lt;br /&gt;
: When checked, the diagram data are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Diagram Data&lt;br /&gt;
: When checked, the diagram data parts are reordered and placed after the related slide or layout or master part and after their relationship parts. Default: off.&lt;br /&gt;
; Translate Charts&lt;br /&gt;
: When checked, the charts are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Charts&lt;br /&gt;
: When checked, the chart parts are reordered and placed after the related slide or layout or master part and after their diagram data parts. Default: off.&lt;br /&gt;
; Translate Notes&lt;br /&gt;
: When checked, the slide notes are exposed for translation. Default: off.&lt;br /&gt;
; Reorder Notes&lt;br /&gt;
: When checked, the note parts are reordered and placed after the related slide part and after its chart parts. Default: off.&lt;br /&gt;
; Translate Comments&lt;br /&gt;
: When checked and the same option is checked under '''the General Options''' (''they will be separated after the next release''), the document comments are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Comments&lt;br /&gt;
: When checked, the comment parts are reordered and placed after the related slide part and after its note parts. Default: off.&lt;br /&gt;
; Translate Masters&lt;br /&gt;
: When checked, slide masters and notes masters are exposed for translation. Content from layouts that are currently in use by at least one slide is also exposed.  Default: on.&lt;br /&gt;
; Translate Graphic Metadata&lt;br /&gt;
: When checked, the graphic metadata (@name and @descr attribute values) are exposed for translation. Default: off.&lt;br /&gt;
; Excluded/Included Highlight Colors&lt;br /&gt;
: Starting in 1.48.0, the &amp;quot;Excluded/Included Highlight Colors&amp;quot; option from the Word configuration also affects PowerPoint content. See the docs in [[#Word Options]].&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* Various, see [https://bitbucket.org/okapiframework/okapi/issues?status=new&amp;amp;title=~OpenXML the issues list].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=OpenXML_Filter&amp;diff=1034</id>
		<title>OpenXML Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=OpenXML_Filter&amp;diff=1034"/>
		<updated>2025-06-25T21:58:38Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Word Options */ Update highlighting docs&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
This filter allows you to process the document types of Microsoft Office 2007 and later, such as DOCX (text documents), XLSX (spreadsheets) and PPTX (presentations).  These documents are based on the OpenXML format, as opposed to the binary formats used by pre-2007 versions of Office.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
The filter parameters are divided into '''General Options''', which apply to all formats, and format-specific options.&lt;br /&gt;
&lt;br /&gt;
===General Options===&lt;br /&gt;
; Translate Document Properties&lt;br /&gt;
: When checked, exposes the following document properties for translation: title, subject, creator, description, category, keywords, content status. Default: on.&lt;br /&gt;
; Translate Comments&lt;br /&gt;
: When checked, exposes document comments for translation.  Default: on.&lt;br /&gt;
; Clean Tags Aggressively&lt;br /&gt;
: When checked, strips additional formatting tags related to text spacing.  This is meant to improve filtering in cases where Office documents were converted from other formats (in particular, PDF), and imperfect conversion added a lot of extra formatting noise.  Default: off.&lt;br /&gt;
; Ignore Whitespace Styles&lt;br /&gt;
: When checked under the &amp;quot;Clean Tags Aggressively&amp;quot;, the whitespace character styles (formatting) are ignored and considered equal to the consequential ones.  Default: off.&lt;br /&gt;
&lt;br /&gt;
=== Word Options ===&lt;br /&gt;
; Translated Headers and Footers&lt;br /&gt;
: When checked, exposes header and footer content for translation. Default: on.&lt;br /&gt;
; Translate Numbering Level Text&lt;br /&gt;
: When checked, exposes numbering-level text for translation. Default: off.&lt;br /&gt;
; Translated Hidden Text&lt;br /&gt;
: When checked, exposes hidden text for translation. Default: on.&lt;br /&gt;
; Exclude Graphical Metadata&lt;br /&gt;
: When not checked, labels associated with drawings and word art are exposed for translation.  When checked, these labels (which are frequently not displayed in the document) are suppressed. Default: off.&lt;br /&gt;
; Ignored Styles &amp;gt; Ignore Font Colours&lt;br /&gt;
: When checked, font colours will be ignored. Default: off.&lt;br /&gt;
: If &amp;lt;cite&amp;gt;Clean Tags Aggressively&amp;lt;/cite&amp;gt; and this option are checked and the ignorance thresholds are empty, the font colour run properties are removed from the document structure on filtering. This means that the font colour information is absent on merge as well.&lt;br /&gt;
; Ignored Styles &amp;gt; Font Colours Minimum Ignorance Threshold&lt;br /&gt;
: When defined, font colours will be ignored starting from the specified value. It can be empty (considered as a white colour by default), and contain preset colour values or RGB hex strings: black, Black, 000000 - thresholds in white. Default: none.&lt;br /&gt;
; Ignored Styles &amp;gt; Font Colours Maximum Ignorance Threshold&lt;br /&gt;
: When defined, font colours will be ignored ending by the specified value. It can be empty (considered as a white colour by default), and contain preset colour values or RGB hex strings: white, White, FFFFFF - thresholds in white. Default: none.&lt;br /&gt;
; Excluded/Included Styles&lt;br /&gt;
: Depending on the radio switch (exclude or include), text using any selected styles will be excluded or included for translation. Default: none.&lt;br /&gt;
; Excluded/Included Highlight Colors&lt;br /&gt;
: Depending on the radio switch (exclude or include), text using any selected colours will be excluded or included for translation. &lt;br /&gt;
* If the switch is set to &amp;quot;Include&amp;quot;, only text in the specified colors will be extracted for translation.&lt;br /&gt;
* If the switch is set to &amp;quot;Exclude&amp;quot;, all content &amp;lt;b&amp;gt;except&amp;lt;/b&amp;gt; for text in the specified colors will be extracted for translation.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;i&amp;gt;Note&amp;lt;/i&amp;gt;: Text that is excluded using this mechanism will be treated as hidden; that means the &amp;quot;Translate Everything Hidden&amp;quot; options will extract it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;i&amp;gt;Note&amp;lt;/i&amp;gt;: Starting in 1.48.0, this option also applies to content in PowerPoint files.&lt;br /&gt;
&lt;br /&gt;
Default: the switch is set to &amp;quot;Exclude&amp;quot; and no colors are selected, meaning that all visible content will be extracted for translation.&lt;br /&gt;
&lt;br /&gt;
; Excluded Font Colours&lt;br /&gt;
: Text using any selected colours will not be exposed for translation. Default: none.&lt;br /&gt;
; Allow Style Optimisation&lt;br /&gt;
: When checked, the optimisation of styles is allowed - common formatting of all runs in a paragraph is moved to the styles part. Default: on.&lt;br /&gt;
&lt;br /&gt;
=== Excel Options ===&lt;br /&gt;
; Translate Hidden Rows and Columns&lt;br /&gt;
: When checked, hidden rows and columns are exposed for translation.  Default: off.&lt;br /&gt;
; Colors to Exclude&lt;br /&gt;
: Text with a foreground or background color matching any of the selected colors in this option will be excluded from translation.  Default: none.&lt;br /&gt;
:* The named colors available in the UI correspond to the standard color palette of Excel 2010.  &lt;br /&gt;
:* The configuration itself also supports colors specified as RGB in the format &amp;lt;code&amp;gt;RRGGBB&amp;lt;/code&amp;gt;, so specific colors not explicitly listed in the UI may be excluded by modifying the .fprm file by hand.  For example, to exclude #69b3e7 (Pantone 292), you could modify the &amp;lt;code&amp;gt;tsExcelExcludedColors&amp;lt;/code&amp;gt; section of the configuration file like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tsExcelExcludedColors.i=1&lt;br /&gt;
ccc0=69b3e7&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
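: As a minimal sketch of preparing such entries by script rather than by hand, the helper below strips the leading # and validates the RRGGBB form before writing the lines. The function name is hypothetical, and the assumption that further colours would use the keys ccc1, ccc2 and so on is inferred from the ccc0 example above, not confirmed here:&lt;br /&gt;

```python
import string


def excluded_color_lines(colors):
    """Build the tsExcelExcludedColors lines for a list of RRGGBB colours.

    Assumption: each additional colour uses the next key (ccc1, ccc2, ...),
    following the ccc0 example in the documentation.
    """
    values = []
    for color in colors:
        value = color.strip().lstrip("#")
        # Accept exactly six hexadecimal digits, as in the RRGGBB format.
        if len(value) != 6 or any(ch not in string.hexdigits for ch in value):
            raise ValueError("expected RRGGBB, got " + repr(color))
        values.append(value)
    lines = ["tsExcelExcludedColors.i=%d" % len(values)]
    lines.extend("ccc%d=%s" % (i, v) for i, v in enumerate(values))
    return lines


print("\n".join(excluded_color_lines(["#69b3e7"])))
```

: For the single colour above, this prints the same two lines as the hand-written example.&lt;br /&gt;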
; Translate Cells Copied&lt;br /&gt;
: When checked, cell data are copied on extraction to allow contextualised and independent translations.  Default: on.&lt;br /&gt;
; Preserve Styles In Target Columns&lt;br /&gt;
: When checked, the cell styles in target columns are preserved.  Default: off.&lt;br /&gt;
; Extract Source And Target Columns Joined&lt;br /&gt;
: When checked, the source and target columns (cells in a row) are joined on extraction.  Default: off.&lt;br /&gt;
; Worksheet Configurations&lt;br /&gt;
: A list of configurations, each of which applies to the worksheets whose names match a pattern. A configuration can exclude rows and/or columns from translation and mark rows and/or columns as metadata.&lt;br /&gt;
: Each configuration can specify:&lt;br /&gt;
:* Name Pattern - a regular expression matched against worksheet names; all other settings of the configuration apply to the matching worksheets. For the syntax, refer to &amp;lt;code&amp;gt;java.util.regex.Pattern&amp;lt;/code&amp;gt;. E.g.: &amp;lt;code&amp;gt;Sheet1&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Source Columns - a list of ALPHA-26 numbers, specifying columns that are copied over the target ones for translation/extraction. E.g.: &amp;lt;code&amp;gt;A,B&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Target Columns - a list of ALPHA-26 numbers, specifying columns that are overwritten by the source ones for translation/extraction. E.g.: &amp;lt;code&amp;gt;C,D&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Target Columns Max Characters - a list of decimal unsigned integers [0, 2^32]. When specified, the maxwidth and size-unit properties are attached to text units specified in the target columns. E.g.: &amp;lt;code&amp;gt;25,30&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Excluded Rows - a list of integers giving the numbers of the rows that are excluded from translation/extraction. E.g.: &amp;lt;code&amp;gt;1,2&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Excluded Columns - a list of ALPHA-26 numbers, specifying columns that are excluded from translation/extraction. E.g.: &amp;lt;code&amp;gt;A,B&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Metadata Rows - a list of integers giving the numbers of the rows that are treated and extracted as metadata. E.g.: &amp;lt;code&amp;gt;3,4&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Metadata Columns - a list of ALPHA-26 numbers, specifying columns that are treated and extracted as metadata. E.g.: &amp;lt;code&amp;gt;C,D&amp;lt;/code&amp;gt;.&lt;br /&gt;
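: The column lists above use ALPHA-26 numbers, that is, spreadsheet-style column letters. A minimal sketch of how such letters map to 1-based column indexes, assuming the usual spreadsheet convention (A is 1, Z is 26, AA is 27); the helper names are hypothetical and not part of Okapi:&lt;br /&gt;

```python
import string


def column_to_index(letters):
    """Convert a column name such as 'A' or 'AA' to a 1-based index."""
    index = 0
    for ch in letters.strip().upper():
        if ch not in string.ascii_uppercase:
            raise ValueError("not a column name: " + repr(letters))
        # Bijective base 26: each letter contributes a digit from 1 to 26.
        index = index * 26 + (ord(ch) - ord("A") + 1)
    if index == 0:
        raise ValueError("empty column name")
    return index


def parse_column_list(value):
    """Split a comma-separated list such as 'A,B' into 1-based indexes."""
    return [column_to_index(part) for part in value.split(",") if part.strip()]


print(parse_column_list("A,B"))  # [1, 2]
print(column_to_index("AA"))     # 27
```

: A helper like this can be useful for checking a column list before writing it into a configuration.&lt;br /&gt;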
: Let's consider a simple table as an example and find out what can be done with all those configurations.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;margin:auto&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! colspan=&amp;quot;2&amp;quot;|Metadata Header A1 !! colspan=&amp;quot;2&amp;quot;|Metadata Header C1&lt;br /&gt;
|-&lt;br /&gt;
! Metadata Header A2 !! Metadata Header B2 || Metadata Header C2 !! Metadata Header D2&lt;br /&gt;
|-&lt;br /&gt;
| A3 || B3 || C3 || Metadata D3&lt;br /&gt;
|-&lt;br /&gt;
| A4 || B4 || C4 || Metadata D4&lt;br /&gt;
|-&lt;br /&gt;
| A5 || B5 || C5 || Metadata D5&lt;br /&gt;
|}&lt;br /&gt;
: Firstly, let's suppose we would like to translate column A only and place the translation in column B. At the same time we do not want to translate the 1st and the 2nd rows.&lt;br /&gt;
: This requirement can be configured in the following way (using the &amp;lt;code&amp;gt;net.sf.okapi.common.ParametersString&amp;lt;/code&amp;gt; format as an example):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
worksheetConfigurations.number.i=1&lt;br /&gt;
worksheetConfigurations.0.namePattern=Sheet1&lt;br /&gt;
worksheetConfigurations.0.sourceColumns=A&lt;br /&gt;
worksheetConfigurations.0.targetColumns=B&lt;br /&gt;
worksheetConfigurations.0.excludedRows=1,2&lt;br /&gt;
worksheetConfigurations.0.excludedColumns=C,D&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: Then the XLIFF would look like this after extraction and translation:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;group id=&amp;quot;P76C545-sg1&amp;quot; resname=&amp;quot;Sheet1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg1&amp;quot; resname=&amp;quot;1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg2&amp;quot; resname=&amp;quot;2&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg3&amp;quot; resname=&amp;quot;3&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu1&amp;quot; resname=&amp;quot;Sheet1!B3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A3-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg4&amp;quot; resname=&amp;quot;4&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu2&amp;quot; resname=&amp;quot;Sheet1!B4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A4-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg5&amp;quot; resname=&amp;quot;5&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu3&amp;quot; resname=&amp;quot;Sheet1!B5&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A5&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A5-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: And the merged representation would be the following:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;margin:auto&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! colspan=&amp;quot;2&amp;quot;|Metadata Header A1 !! colspan=&amp;quot;2&amp;quot;|Metadata Header C1&lt;br /&gt;
|-&lt;br /&gt;
! Metadata Header A2 !! Metadata Header B2 || Metadata Header C2 !! Metadata Header D2&lt;br /&gt;
|-&lt;br /&gt;
| A3 || A3-tr || C3 || Metadata D3&lt;br /&gt;
|-&lt;br /&gt;
| A4 || A4-tr || C4 || Metadata D4&lt;br /&gt;
|-&lt;br /&gt;
| A5 || A5-tr || C5 || Metadata D5&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
: Furthermore, let's suppose we would like to translate columns A and B, and treat column D as metadata for each translatable cell in a row. At the same time, we would like to consider the 1st and 2nd rows as metadata about the metadata in columns. Finally, we do not want to extract the 5th row.&lt;br /&gt;
: All these requirements can be written as the following configuration:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
worksheetConfigurations.number.i=1&lt;br /&gt;
worksheetConfigurations.0.namePattern=Sheet1&lt;br /&gt;
worksheetConfigurations.0.excludedRows=5&lt;br /&gt;
worksheetConfigurations.0.excludedColumns=C&lt;br /&gt;
worksheetConfigurations.0.metadataRows=1,2&lt;br /&gt;
worksheetConfigurations.0.metadataColumns=D&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: Then, the extraction to XLIFF should look like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;group id=&amp;quot;P76C545-sg1&amp;quot; resname=&amp;quot;Sheet1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg1&amp;quot; resname=&amp;quot;1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg2&amp;quot; resname=&amp;quot;2&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg3&amp;quot; resname=&amp;quot;3&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D3&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu1&amp;quot; resname=&amp;quot;Sheet1!A3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu2&amp;quot; resname=&amp;quot;Sheet1!B3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;B3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg4&amp;quot; resname=&amp;quot;4&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D4&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu3&amp;quot; resname=&amp;quot;Sheet1!A4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu4&amp;quot; resname=&amp;quot;Sheet1!B4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;B4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg5&amp;quot; resname=&amp;quot;5&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D5&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
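: The worksheet configuration listings in this section follow a regular pattern: a count entry followed by indexed keys. A minimal sketch that writes such entries from plain dictionaries is given below. The function name is hypothetical, the key names are taken from the listings above, and the assumption that a second configuration would use the index 1 is inferred from the .0. prefix rather than confirmed here:&lt;br /&gt;

```python
def worksheet_config_lines(configs):
    """Render worksheet configurations as key=value lines.

    Assumption: configurations are numbered from 0 in list order, and
    list values are written as comma-separated strings.
    """
    lines = ["worksheetConfigurations.number.i=%d" % len(configs)]
    for i, config in enumerate(configs):
        for key, value in config.items():
            if isinstance(value, (list, tuple)):
                value = ",".join(str(v) for v in value)
            lines.append("worksheetConfigurations.%d.%s=%s" % (i, key, value))
    return lines


# The second example of this section: exclude row 5 and column C,
# treat rows 1 and 2 and column D as metadata.
example = [{
    "namePattern": "Sheet1",
    "excludedRows": [5],
    "excludedColumns": ["C"],
    "metadataRows": [1, 2],
    "metadataColumns": ["D"],
}]
print("\n".join(worksheet_config_lines(example)))
```

: For the example dictionary, the printed lines match the second configuration listing above.&lt;br /&gt;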
&lt;br /&gt;
=== PowerPoint Options ===&lt;br /&gt;
; Translate Document Properties&lt;br /&gt;
: When checked and the same option is checked under '''the General Options''' (''they will be separated after the next release''), the following document properties are exposed for translation: title, subject, creator, description, category, keywords, content status. Default: on.&lt;br /&gt;
; Reorder Document Properties&lt;br /&gt;
: When checked, the document properties are reordered and placed after the root relationship part (_rels/.rels). Default: off.&lt;br /&gt;
; Reorder Relationships&lt;br /&gt;
: When checked, the relationship parts are reordered and placed after the related slide or layout or master part. Default: off.&lt;br /&gt;
; Translate Diagram Data&lt;br /&gt;
: When checked, the diagram data are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Diagram Data&lt;br /&gt;
: When checked, the diagram data parts are reordered and placed after the related slide or layout or master part and after their relationship parts. Default: off.&lt;br /&gt;
; Translate Charts&lt;br /&gt;
: When checked, the charts are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Charts&lt;br /&gt;
: When checked, the chart parts are reordered and placed after the related slide or layout or master part and after their diagram data parts. Default: off.&lt;br /&gt;
; Translate Notes&lt;br /&gt;
: When checked, the slide notes are exposed for translation. Default: off.&lt;br /&gt;
; Reorder Notes&lt;br /&gt;
: When checked, the note parts are reordered and placed after the related slide part and after its chart parts. Default: off.&lt;br /&gt;
; Translate Comments&lt;br /&gt;
: When checked and the same option is checked under '''the General Options''' (''they will be separated after the next release''), the document comments are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Comments&lt;br /&gt;
: When checked, the comment parts are reordered and placed after the related slide part and after its note parts. Default: off.&lt;br /&gt;
; Translate Masters&lt;br /&gt;
: When checked, slide masters and notes masters are exposed for translation. Content from layouts that are currently in use by at least one slide is also exposed.  Default: on.&lt;br /&gt;
; Translate Graphic Metadata&lt;br /&gt;
: When checked, the graphic metadata (@name and @descr attribute values) are exposed for translation. Default: off.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* Various, see [https://bitbucket.org/okapiframework/okapi/issues?status=new&amp;amp;title=~OpenXML the issues list].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Distributions&amp;diff=1033</id>
		<title>Distributions</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Distributions&amp;diff=1033"/>
		<updated>2025-06-03T16:49:34Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Main Project */ Update link to snapshot builds&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== Main Project ==&lt;br /&gt;
&lt;br /&gt;
This includes platform-specific distributions for the applications (Rainbow, Tikal, Ratel, etc.) as well as the libraries-only package (all platforms).&lt;br /&gt;
&lt;br /&gt;
* Latest release ( &amp;lt;b&amp;gt;1.47.0 - Oct 5, 2024&amp;lt;/b&amp;gt; ) : [https://okapiframework.org/binaries/main/1.47.0 https://okapiframework.org/binaries/main/1.47.0]&lt;br /&gt;
* Recent releases: [https://okapiframework.org/binaries/main https://okapiframework.org/binaries/main]&lt;br /&gt;
* Release artifacts on Maven Central: https://search.maven.org/search?q=net.sf.okapi&lt;br /&gt;
* Changes log: [https://okapiframework.org/binaries/main/changes/files/1.47.0.html https://okapiframework.org/binaries/main/changes/files/1.47.0.html]&lt;br /&gt;
&lt;br /&gt;
Snapshots:&lt;br /&gt;
&lt;br /&gt;
* Latest Development Snapshots (nightly builds): [https://gitlab.com/okapiframework/Okapi/-/pipelines https://gitlab.com/okapiframework/Okapi/-/pipelines]&lt;br /&gt;
&lt;br /&gt;
== Longhorn ==&lt;br /&gt;
&lt;br /&gt;
This includes the Longhorn distributions (all platforms).&lt;br /&gt;
&lt;br /&gt;
* Latest release ( &amp;lt;b&amp;gt;1.44.0 - Jan 16, 2023&amp;lt;/b&amp;gt; ) : [https://okapiframework.org/binaries/longhorn/okapi-longhorn_all-platforms_1.44.0.zip okapi-longhorn_all-platforms_1.44.0.zip]&lt;br /&gt;
* Recent releases: [https://okapiframework.org/binaries/longhorn https://okapiframework.org/binaries/longhorn]&lt;br /&gt;
&lt;br /&gt;
== OmegaT Filter Plugin ==&lt;br /&gt;
&lt;br /&gt;
This includes the Filters Plugin for OmegaT (all platforms).&lt;br /&gt;
&lt;br /&gt;
* Latest release ( &amp;lt;b&amp;gt;1.13-1.45.0 - Feb 26, 2023&amp;lt;/b&amp;gt; ) : [https://okapiframework.org/binaries/omegat-plugin/okapiFiltersForOmegaT-1.13-1.45.0-dist.zip okapiFiltersForOmegaT-1.13-1.45.0-dist.zip]&lt;br /&gt;
* All releases: [https://okapiframework.org/binaries/omegat-plugin https://okapiframework.org/binaries/omegat-plugin]&lt;br /&gt;
&lt;br /&gt;
== Ocelot ==&lt;br /&gt;
&lt;br /&gt;
This includes the Review Workbench application Ocelot (all platforms).&lt;br /&gt;
&lt;br /&gt;
* Latest release ( &amp;lt;b&amp;gt;3.0 - Oct 17, 2017&amp;lt;/b&amp;gt; ) : [https://okapiframework.org/binaries/ocelot/Ocelot-3.0.jar Ocelot-3.0.jar]&lt;br /&gt;
* Recent releases: [https://okapiframework.org/binaries/ocelot https://okapiframework.org/binaries/ocelot]&lt;br /&gt;
&lt;br /&gt;
== Archives ==&lt;br /&gt;
&lt;br /&gt;
Older distributions that are not included above.&lt;br /&gt;
&lt;br /&gt;
* [https://okapiframework.org/binaries/archives Archives (https://okapiframework.org/binaries/archives)]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=OpenXML_Filter&amp;diff=1030</id>
		<title>OpenXML Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=OpenXML_Filter&amp;diff=1030"/>
		<updated>2025-03-28T16:26:30Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: Improve docs on color exclusion&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
This filter allows you to process the document types of Microsoft Office 2007 and later, such as DOCX (text documents), XLSX (spreadsheets) and PPTX (presentations).  These documents are based on the OpenXML format, as opposed to the binary formats used by pre-2007 versions of Office.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
The filter parameters are divided into '''General Options''', which apply to all formats, and format-specific options.&lt;br /&gt;
&lt;br /&gt;
===General Options===&lt;br /&gt;
; Translate Document Properties&lt;br /&gt;
: When checked, exposes the following document properties for translation: title, subject, creator, description, category, keywords, content status. Default: on.&lt;br /&gt;
; Translate Comments&lt;br /&gt;
: When checked, exposes document comments for translation.  Default: on.&lt;br /&gt;
; Clean Tags Aggressively&lt;br /&gt;
: When checked, strips additional formatting tags related to text spacing.  This is meant to improve filtering in cases where Office documents were converted from other formats (in particular, PDF), and imperfect conversion added a lot of extra formatting noise.  Default: off.&lt;br /&gt;
; Ignore Whitespace Styles&lt;br /&gt;
: When checked under the &amp;quot;Clean Tags Aggressively&amp;quot;, the whitespace character styles (formatting) are ignored and considered equal to the consequential ones.  Default: off.&lt;br /&gt;
&lt;br /&gt;
=== Word Options ===&lt;br /&gt;
; Translated Headers and Footers&lt;br /&gt;
: When checked, exposes header and footer content for translation. Default: on.&lt;br /&gt;
; Translate Numbering Level Text&lt;br /&gt;
: When checked, exposes numbering-level text for translation. Default: off.&lt;br /&gt;
; Translate Hidden Text&lt;br /&gt;
: When checked, exposes hidden text for translation. Default: on.&lt;br /&gt;
; Exclude Graphical Metadata&lt;br /&gt;
: When not checked, labels associated with drawings and word art are exposed for translation.  When checked, these labels (which are frequently not displayed in the document) are suppressed. Default: off.&lt;br /&gt;
; Ignored Styles &amp;gt; Ignore Font Colours&lt;br /&gt;
: When checked, font colours will be ignored. Default: off.&lt;br /&gt;
: If &amp;lt;cite&amp;gt;Clean Tags Aggressively&amp;lt;/cite&amp;gt; and this option are checked and the ignorance thresholds are empty, the font colour run properties are removed from the document structure on filtering. This means that the font colour information is absent on merge as well.&lt;br /&gt;
; Ignored Styles &amp;gt; Font Colours Minimum Ignorance Threshold&lt;br /&gt;
: When defined, font colours will be ignored starting from the specified value. It can be empty (considered as a white colour by default) or contain preset colour values or RGB hex strings: black, Black, 000000 - thresholds in white. Default: none.&lt;br /&gt;
; Ignored Styles &amp;gt; Font Colours Maximum Ignorance Threshold&lt;br /&gt;
: When defined, font colours will be ignored up to the specified value. It can be empty (considered as a white colour by default) or contain preset colour values or RGB hex strings: white, White, FFFFFF - thresholds in white. Default: none.&lt;br /&gt;
; Excluded/Included Styles&lt;br /&gt;
: Depending on the radio switch (exclude or include), text using any selected styles will be excluded or included for translation. Default: none.&lt;br /&gt;
; Excluded/Included Highlight Colors&lt;br /&gt;
: Depending on the radio switch (exclude or include), text using any selected colours will be excluded or included for translation. Default: none.&lt;br /&gt;
; Excluded Font Colours&lt;br /&gt;
: Text using any selected colours will not be exposed for translation. Default: none.&lt;br /&gt;
; Allow Style Optimisation&lt;br /&gt;
: When checked, the optimisation of styles is allowed - common formatting of all runs in a paragraph is moved to the styles part. Default: on.&lt;br /&gt;
&lt;br /&gt;
=== Excel Options ===&lt;br /&gt;
; Translate Hidden Rows and Columns&lt;br /&gt;
: When checked, hidden rows and columns are exposed for translation.  Default: off.&lt;br /&gt;
; Colors to Exclude&lt;br /&gt;
: Text with a foreground or background color matching any of the selected colors in this option will be excluded from translation.  Default: none.&lt;br /&gt;
:* The named colors available in the UI correspond to the standard color palette of Excel 2010.  &lt;br /&gt;
:* The configuration itself also supports colors specified as RGB in the format &amp;lt;code&amp;gt;RRGGBB&amp;lt;/code&amp;gt;, so specific colors not explicitly listed in the UI may be excluded by modifying the .fprm file by hand.  For example, to exclude #69b3e7 (Pantone 292), you could modify the &amp;lt;code&amp;gt;tsExcelExcludedColors&amp;lt;/code&amp;gt; section of the configuration file like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
tsExcelExcludedColors.i=1&lt;br /&gt;
ccc0=69b3e7&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
; Translate Cells Copied&lt;br /&gt;
: When checked, cell data are copied on extraction to allow contextualised and independent translations.  Default: on.&lt;br /&gt;
; Preserve Styles In Target Columns&lt;br /&gt;
: When checked, the cell styles in target columns are preserved.  Default: off.&lt;br /&gt;
; Extract Source And Target Columns Joined&lt;br /&gt;
: When checked, the source and target columns (cells in a row) are joined on extraction.  Default: off.&lt;br /&gt;
; Worksheet Configurations&lt;br /&gt;
: A list of configurations, each tied to a worksheet name pattern, that exclude rows and/or columns from translation or mark rows and/or columns as metadata.&lt;br /&gt;
: For one configuration it is possible to specify:&lt;br /&gt;
:* Name Pattern - a regular expression matched against worksheet names; the other settings of the configuration are applied to the worksheets it matches. For the pattern syntax, please refer to &amp;lt;code&amp;gt;java.util.regex.Pattern&amp;lt;/code&amp;gt;. E.g.: &amp;lt;code&amp;gt;Sheet1&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Source Columns - a list of ALPHA-26 numbers, specifying columns that are copied over the target ones for translation/extraction. E.g.: &amp;lt;code&amp;gt;A,B&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Target Columns - a list of ALPHA-26 numbers, specifying columns that are overwritten by the source ones for translation/extraction. E.g.: &amp;lt;code&amp;gt;C,D&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Target Columns Max Characters - a list of decimal unsigned integers [0, 2^32]. When specified, the maxwidth and size-unit properties are attached to text units specified in the target columns. E.g.: &amp;lt;code&amp;gt;25,30&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Excluded Rows - a list of integers, pointing out row numbers that are excluded from translation/extraction. E.g.: &amp;lt;code&amp;gt;1,2&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Excluded Columns - a list of ALPHA-26 numbers, specifying columns that are excluded from translation/extraction. E.g.: &amp;lt;code&amp;gt;A,B&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Metadata Rows - a list of integers, pointing out row numbers that are treated and extracted as metadata. E.g.: &amp;lt;code&amp;gt;3,4&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Metadata Columns - a list of ALPHA-26 numbers, specifying columns that are treated and extracted as metadata. E.g.: &amp;lt;code&amp;gt;C,D&amp;lt;/code&amp;gt;.&lt;br /&gt;
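: The column lists above use spreadsheet-style column letters. As a rough illustration only (not the filter's own code), the following Python sketch shows how such a list could be turned into 1-based column numbers. The function names &amp;lt;code&amp;gt;column_index&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;parse_columns&amp;lt;/code&amp;gt; are hypothetical, and the sketch assumes that ALPHA-26 labels continue past Z in the usual spreadsheet way (AA, AB, ...).&lt;br /&gt;

```python
import string


def column_index(label):
    """Convert a column label such as "A" or "AA" to a 1-based index.

    Assumption: labels follow the usual spreadsheet scheme, where
    A..Z map to 1..26 and AA follows Z.
    """
    index = 0
    for ch in label.strip().upper():
        if ch not in string.ascii_uppercase:
            raise ValueError("not a column label: " + repr(label))
        # Each letter is one "digit" with a value from 1 to 26.
        index = index * 26 + string.ascii_uppercase.index(ch) + 1
    if index == 0:
        raise ValueError("empty column label")
    return index


def parse_columns(value):
    """Split a comma-separated list such as "A,B" into column indexes."""
    return [column_index(part) for part in value.split(",") if part.strip()]


print(parse_columns("A,B"))  # [1, 2]
print(column_index("AA"))    # 27
```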
: Let's consider a simple table as an example and find out what can be done with all those configurations.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;margin:auto&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! colspan=&amp;quot;2&amp;quot;|Metadata Header A1 !! colspan=&amp;quot;2&amp;quot;|Metadata Header C1&lt;br /&gt;
|-&lt;br /&gt;
! Metadata Header A2 !! Metadata Header B2 !! Metadata Header C2 !! Metadata Header D2&lt;br /&gt;
|-&lt;br /&gt;
| A3 || B3 || C3 || Metadata D3&lt;br /&gt;
|-&lt;br /&gt;
| A4 || B4 || C4 || Metadata D4&lt;br /&gt;
|-&lt;br /&gt;
| A5 || B5 || C5 || Metadata D5&lt;br /&gt;
|}&lt;br /&gt;
: Firstly, let's suppose we would like to translate column A only and place the translation in column B. At the same time we do not want to translate the 1st and the 2nd rows.&lt;br /&gt;
: This requirement can be configured in the following way (using the &amp;lt;code&amp;gt;net.sf.okapi.common.ParametersString&amp;lt;/code&amp;gt; format as an example):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
worksheetConfigurations.number.i=1&lt;br /&gt;
worksheetConfigurations.0.namePattern=Sheet1&lt;br /&gt;
worksheetConfigurations.0.sourceColumns=A&lt;br /&gt;
worksheetConfigurations.0.targetColumns=B&lt;br /&gt;
worksheetConfigurations.0.excludedRows=1,2&lt;br /&gt;
worksheetConfigurations.0.excludedColumns=C,D&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: Then the XLIFF would look like this after extraction and translation:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;group id=&amp;quot;P76C545-sg1&amp;quot; resname=&amp;quot;Sheet1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg1&amp;quot; resname=&amp;quot;1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg2&amp;quot; resname=&amp;quot;2&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg3&amp;quot; resname=&amp;quot;3&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu1&amp;quot; resname=&amp;quot;Sheet1!B3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A3-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg4&amp;quot; resname=&amp;quot;4&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu2&amp;quot; resname=&amp;quot;Sheet1!B4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A4-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg5&amp;quot; resname=&amp;quot;5&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu3&amp;quot; resname=&amp;quot;Sheet1!B5&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A5&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A5-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: And the merged representation would be the following:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;margin:auto&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! colspan=&amp;quot;2&amp;quot;|Metadata Header A1 !! colspan=&amp;quot;2&amp;quot;|Metadata Header C1&lt;br /&gt;
|-&lt;br /&gt;
! Metadata Header A2 !! Metadata Header B2 || Metadata Header C2 !! Metadata Header D2&lt;br /&gt;
|-&lt;br /&gt;
| A3 || A3-tr || C3 || Metadata D3&lt;br /&gt;
|-&lt;br /&gt;
| A4 || A4-tr || C4 || Metadata D4&lt;br /&gt;
|-&lt;br /&gt;
| A5 || A5-tr || C5 || Metadata D5&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
: Furthermore, let's suppose we would like to translate columns A and B, and treat column D as metadata for each of the translatable cells in a row. At the same time, we would like to consider the 1st and 2nd rows as metadata about the metadata in columns. Finally, we do not want to extract the 5th row.&lt;br /&gt;
: All these requirements can be written as the following configurations:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
worksheetConfigurations.number.i=1&lt;br /&gt;
worksheetConfigurations.0.namePattern=Sheet1&lt;br /&gt;
worksheetConfigurations.0.excludedRows=5&lt;br /&gt;
worksheetConfigurations.0.excludedColumns=C&lt;br /&gt;
worksheetConfigurations.0.metadataRows=1,2&lt;br /&gt;
worksheetConfigurations.0.metadataColumns=D&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: Then the extraction to XLIFF should look like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;group id=&amp;quot;P76C545-sg1&amp;quot; resname=&amp;quot;Sheet1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg1&amp;quot; resname=&amp;quot;1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg2&amp;quot; resname=&amp;quot;2&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg3&amp;quot; resname=&amp;quot;3&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D3&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu1&amp;quot; resname=&amp;quot;Sheet1!A3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu2&amp;quot; resname=&amp;quot;Sheet1!B3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;B3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg4&amp;quot; resname=&amp;quot;4&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D4&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu3&amp;quot; resname=&amp;quot;Sheet1!A4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu4&amp;quot; resname=&amp;quot;Sheet1!B4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;B4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg5&amp;quot; resname=&amp;quot;5&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D5&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== PowerPoint Options ===&lt;br /&gt;
; Translate Document Properties&lt;br /&gt;
: When checked and the same option is checked under the '''General Options''' (''they will be separated after the next release''), the following document properties are exposed for translation: title, subject, creator, description, category, keywords, content status. Default: on.&lt;br /&gt;
; Reorder Document Properties&lt;br /&gt;
: When checked, the document properties are reordered and placed after the root relationship part (_rels/.rels). Default: off.&lt;br /&gt;
; Reorder Relationships&lt;br /&gt;
: When checked, the relationship parts are reordered and placed after the related slide or layout or master part. Default: off.&lt;br /&gt;
; Translate Diagram Data&lt;br /&gt;
: When checked, the diagram data are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Diagram Data&lt;br /&gt;
: When checked, the diagram data parts are reordered and placed after the related slide or layout or master part and after their relationship parts. Default: off.&lt;br /&gt;
; Translate Charts&lt;br /&gt;
: When checked, the charts are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Charts&lt;br /&gt;
: When checked, the chart parts are reordered and placed after the related slide or layout or master part and after their diagram data parts. Default: off.&lt;br /&gt;
; Translate Notes&lt;br /&gt;
: When checked, the slide notes are exposed for translation. Default: off.&lt;br /&gt;
; Reorder Notes&lt;br /&gt;
: When checked, the note parts are reordered and placed after the related slide part and after its chart parts. Default: off.&lt;br /&gt;
; Translate Comments&lt;br /&gt;
: When checked and the same option is checked under the '''General Options''' (''they will be separated after the next release''), the document comments are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Comments&lt;br /&gt;
: When checked, the comment parts are reordered and placed after the related slide part and after its note parts. Default: off.&lt;br /&gt;
; Translate Masters&lt;br /&gt;
: When checked, exposes slide masters and notes masters for translation. This also exposes for translation the content of layouts that are currently in use by at least one slide.  Default: on.&lt;br /&gt;
; Translate Graphic Metadata&lt;br /&gt;
: When checked, the graphic metadata (@name and @descr attribute values) are exposed for translation. Default: off.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* Various, see [https://bitbucket.org/okapiframework/okapi/issues?status=new&amp;amp;title=~OpenXML the issues list].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=OpenXML_Filter&amp;diff=1029</id>
		<title>OpenXML Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=OpenXML_Filter&amp;diff=1029"/>
		<updated>2025-03-24T17:24:06Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Excel Options */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
This filter allows you to process the different types of documents of the Microsoft Office suite from 2007 and later, such as DOCX (text documents), XLSX (spreadsheets) and PPTX (presentations).  These documents are based on the OpenXML format, as opposed to the binary formats used by pre-2007 versions of Office.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
The filter parameters are divided into '''General Options''', which apply to all formats, and format-specific options.&lt;br /&gt;
&lt;br /&gt;
===General Options===&lt;br /&gt;
; Translate Document Properties&lt;br /&gt;
: When checked, exposes the following document properties for translation: title, subject, creator, description, category, keywords, content status. Default: on.&lt;br /&gt;
; Translate Comments&lt;br /&gt;
: When checked, exposes document comments for translation.  Default: on.&lt;br /&gt;
; Clean Tags Aggressively&lt;br /&gt;
: When checked, strips additional formatting tags related to text spacing.  This is meant to improve filtering in cases where Office documents were converted from other formats (in particular, PDF), and imperfect conversion added a lot of extra formatting noise.  Default: off.&lt;br /&gt;
; Ignore Whitespace Styles&lt;br /&gt;
: When checked under &amp;quot;Clean Tags Aggressively&amp;quot;, whitespace character styles (formatting) are ignored and considered equal to the consequential ones.  Default: off.&lt;br /&gt;
&lt;br /&gt;
=== Word Options ===&lt;br /&gt;
; Translate Headers and Footers&lt;br /&gt;
: When checked, exposes header and footer content for translation. Default: on.&lt;br /&gt;
; Translate Numbering Level Text&lt;br /&gt;
: When checked, exposes numbering-level text for translation. Default: off.&lt;br /&gt;
; Translate Hidden Text&lt;br /&gt;
: When checked, exposes hidden text for translation. Default: on.&lt;br /&gt;
; Exclude Graphical Metadata&lt;br /&gt;
: When not checked, labels associated with drawings and word art are exposed for translation.  When checked, these labels (which are frequently not displayed in the document) are suppressed. Default: off.&lt;br /&gt;
; Ignored Styles &amp;gt; Ignore Font Colours&lt;br /&gt;
: When checked, font colours will be ignored. Default: off.&lt;br /&gt;
: If &amp;lt;cite&amp;gt;Clean Tags Aggressively&amp;lt;/cite&amp;gt; and this option are checked and the ignorance thresholds are empty, the font colour run properties are removed from the document structure on filtering. This means that the font colour information is absent on merge as well.&lt;br /&gt;
; Ignored Styles &amp;gt; Font Colours Minimum Ignorance Threshold&lt;br /&gt;
: When defined, font colours will be ignored starting from the specified value. It can be empty (considered as a white colour by default) or contain preset colour values or RGB hex strings: black, Black, 000000 - thresholds in white. Default: none.&lt;br /&gt;
; Ignored Styles &amp;gt; Font Colours Maximum Ignorance Threshold&lt;br /&gt;
: When defined, font colours will be ignored up to the specified value. It can be empty (considered as a white colour by default) or contain preset colour values or RGB hex strings: white, White, FFFFFF - thresholds in white. Default: none.&lt;br /&gt;
; Excluded/Included Styles&lt;br /&gt;
: Depending on the radio switch (exclude or include), text using any selected styles will be excluded or included for translation. Default: none.&lt;br /&gt;
; Excluded/Included Highlight Colors&lt;br /&gt;
: Depending on the radio switch (exclude or include), text using any selected colours will be excluded or included for translation. Default: none.&lt;br /&gt;
; Excluded Font Colours&lt;br /&gt;
: Text using any selected colours will not be exposed for translation. Default: none.&lt;br /&gt;
; Allow Style Optimisation&lt;br /&gt;
: When checked, the optimisation of styles is allowed - common formatting of all runs in a paragraph is moved to the styles part. Default: on.&lt;br /&gt;
&lt;br /&gt;
=== Excel Options ===&lt;br /&gt;
; Translate Hidden Rows and Columns&lt;br /&gt;
: When checked, hidden rows and columns are exposed for translation.  Default: off.&lt;br /&gt;
; Colors to Exclude&lt;br /&gt;
: Text with a foreground or background color matching any of the selected colors in this option will be excluded from translation.  These colors correspond to the standard color palette of Excel 2010.  The configuration itself stores these values as RGB, so specific colors not explicitly listed here may be excluded by modifying the .fprm file by hand.  Default: none.&lt;br /&gt;
; Translate Cells Copied&lt;br /&gt;
: When checked, cell data are copied on extraction to allow contextualised and independent translations.  Default: on.&lt;br /&gt;
; Preserve Styles In Target Columns&lt;br /&gt;
: When checked, the cell styles in target columns are preserved.  Default: off.&lt;br /&gt;
; Extract Source And Target Columns Joined&lt;br /&gt;
: When checked, the source and target columns (cells in a row) are joined on extraction.  Default: off.&lt;br /&gt;
; Worksheet Configurations&lt;br /&gt;
: A list of configurations, each tied to a worksheet name pattern, that exclude rows and/or columns from translation or mark rows and/or columns as metadata.&lt;br /&gt;
: For one configuration it is possible to specify:&lt;br /&gt;
:* Name Pattern - a regular expression matched against worksheet names; the other settings of the configuration are applied to the worksheets it matches. For the pattern syntax, please refer to &amp;lt;code&amp;gt;java.util.regex.Pattern&amp;lt;/code&amp;gt;. E.g.: &amp;lt;code&amp;gt;Sheet1&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Source Columns - a list of ALPHA-26 numbers, specifying columns that are copied over the target ones for translation/extraction. E.g.: &amp;lt;code&amp;gt;A,B&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Target Columns - a list of ALPHA-26 numbers, specifying columns that are overwritten by the source ones for translation/extraction. E.g.: &amp;lt;code&amp;gt;C,D&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Target Columns Max Characters - a list of decimal unsigned integers [0, 2^32]. When specified, the maxwidth and size-unit properties are attached to text units specified in the target columns. E.g.: &amp;lt;code&amp;gt;25,30&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Excluded Rows - a list of integers, pointing out row numbers that are excluded from translation/extraction. E.g.: &amp;lt;code&amp;gt;1,2&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Excluded Columns - a list of ALPHA-26 numbers, specifying columns that are excluded from translation/extraction. E.g.: &amp;lt;code&amp;gt;A,B&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Metadata Rows - a list of integers, pointing out row numbers that are treated and extracted as metadata. E.g.: &amp;lt;code&amp;gt;3,4&amp;lt;/code&amp;gt;.&lt;br /&gt;
:* Metadata Columns - a list of ALPHA-26 numbers, specifying columns that are treated and extracted as metadata. E.g.: &amp;lt;code&amp;gt;C,D&amp;lt;/code&amp;gt;.&lt;br /&gt;
: Let's consider a simple table as an example and find out what can be done with all those configurations.&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;margin:auto&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! colspan=&amp;quot;2&amp;quot;|Metadata Header A1 !! colspan=&amp;quot;2&amp;quot;|Metadata Header C1&lt;br /&gt;
|-&lt;br /&gt;
! Metadata Header A2 !! Metadata Header B2 !! Metadata Header C2 !! Metadata Header D2&lt;br /&gt;
|-&lt;br /&gt;
| A3 || B3 || C3 || Metadata D3&lt;br /&gt;
|-&lt;br /&gt;
| A4 || B4 || C4 || Metadata D4&lt;br /&gt;
|-&lt;br /&gt;
| A5 || B5 || C5 || Metadata D5&lt;br /&gt;
|}&lt;br /&gt;
: Firstly, let's suppose we would like to translate column A only and place the translation in column B. At the same time we do not want to translate the 1st and the 2nd rows.&lt;br /&gt;
: This requirement can be configured in the following way (using the &amp;lt;code&amp;gt;net.sf.okapi.common.ParametersString&amp;lt;/code&amp;gt; format as an example):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
worksheetConfigurations.number.i=1&lt;br /&gt;
worksheetConfigurations.0.namePattern=Sheet1&lt;br /&gt;
worksheetConfigurations.0.sourceColumns=A&lt;br /&gt;
worksheetConfigurations.0.targetColumns=B&lt;br /&gt;
worksheetConfigurations.0.excludedRows=1,2&lt;br /&gt;
worksheetConfigurations.0.excludedColumns=C,D&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: Then the XLIFF would look like this after extraction and translation:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;group id=&amp;quot;P76C545-sg1&amp;quot; resname=&amp;quot;Sheet1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg1&amp;quot; resname=&amp;quot;1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg2&amp;quot; resname=&amp;quot;2&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg3&amp;quot; resname=&amp;quot;3&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu1&amp;quot; resname=&amp;quot;Sheet1!B3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A3-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg4&amp;quot; resname=&amp;quot;4&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu2&amp;quot; resname=&amp;quot;Sheet1!B4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A4-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg5&amp;quot; resname=&amp;quot;5&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu3&amp;quot; resname=&amp;quot;Sheet1!B5&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A5&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;A5-tr&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: And the merged representation would be the following:&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot; style=&amp;quot;margin:auto&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! colspan=&amp;quot;2&amp;quot;|Metadata Header A1 !! colspan=&amp;quot;2&amp;quot;|Metadata Header C1&lt;br /&gt;
|-&lt;br /&gt;
! Metadata Header A2 !! Metadata Header B2 || Metadata Header C2 !! Metadata Header D2&lt;br /&gt;
|-&lt;br /&gt;
| A3 || A3-tr || C3 || Metadata D3&lt;br /&gt;
|-&lt;br /&gt;
| A4 || A4-tr || C4 || Metadata D4&lt;br /&gt;
|-&lt;br /&gt;
| A5 || A5-tr || C5 || Metadata D5&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
: Furthermore, let's suppose we would like to translate columns A and B, and treat column D as metadata for each of the translatable cells in a row. At the same time, we would like to consider the 1st and 2nd rows as metadata about the metadata in columns. Finally, we do not want to extract the 5th row.&lt;br /&gt;
: All these requirements can be written as the following configurations:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
worksheetConfigurations.number.i=1&lt;br /&gt;
worksheetConfigurations.0.namePattern=Sheet1&lt;br /&gt;
worksheetConfigurations.0.excludedRows=5&lt;br /&gt;
worksheetConfigurations.0.excludedColumns=C&lt;br /&gt;
worksheetConfigurations.0.metadataRows=1,2&lt;br /&gt;
worksheetConfigurations.0.metadataColumns=D&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
: Then the extraction to XLIFF should look like this:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;group id=&amp;quot;P76C545-sg1&amp;quot; resname=&amp;quot;Sheet1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg1&amp;quot; resname=&amp;quot;1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg2&amp;quot; resname=&amp;quot;2&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg3&amp;quot; resname=&amp;quot;3&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D3&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu1&amp;quot; resname=&amp;quot;Sheet1!A3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu2&amp;quot; resname=&amp;quot;Sheet1!B3&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;B3&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg4&amp;quot; resname=&amp;quot;4&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D4&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu3&amp;quot; resname=&amp;quot;Sheet1!A4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;A4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
    &amp;lt;trans-unit id=&amp;quot;P147242AB-tu4&amp;quot; resname=&amp;quot;Sheet1!B4&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;B4&amp;lt;/source&amp;gt;&lt;br /&gt;
      &amp;lt;target xml:lang=&amp;quot;es&amp;quot;&amp;gt;&amp;lt;/target&amp;gt;&lt;br /&gt;
    &amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
  &amp;lt;group id=&amp;quot;P132303AB-sg5&amp;quot; resname=&amp;quot;5&amp;quot;&amp;gt;&lt;br /&gt;
    &amp;lt;context-group name=&amp;quot;row-metadata&amp;quot;&amp;gt;&lt;br /&gt;
      &amp;lt;context context-type=&amp;quot;x-Metadata Header C1;Metadata Header D2&amp;quot;&amp;gt;Metadata D5&amp;lt;/context&amp;gt;&lt;br /&gt;
    &amp;lt;/context-group&amp;gt;&lt;br /&gt;
  &amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/group&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== PowerPoint Options ===&lt;br /&gt;
; Translate Document Properties&lt;br /&gt;
: When checked and the same option is checked under '''the General Options''' (''they will be separated after the next release''), the following document properties are exposed for translation: title, subject, creator, description, category, keywords, content status. Default: on.&lt;br /&gt;
; Reorder Document Properties&lt;br /&gt;
: When checked, the document properties are reordered and placed after the root relationship part (_rels/.rels). Default: off.&lt;br /&gt;
; Reorder Relationships&lt;br /&gt;
: When checked, the relationship parts are reordered and placed after the related slide or layout or master part. Default: off.&lt;br /&gt;
; Translate Diagram Data&lt;br /&gt;
: When checked, the diagram data are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Diagram Data&lt;br /&gt;
: When checked, the diagram data parts are reordered and placed after the related slide or layout or master part and after their relationship parts. Default: off.&lt;br /&gt;
; Translate Charts&lt;br /&gt;
: When checked, the charts are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Charts&lt;br /&gt;
: When checked, the chart parts are reordered and placed after the related slide or layout or master part and after their diagram data parts. Default: off.&lt;br /&gt;
; Translate Notes&lt;br /&gt;
: When checked, the slide notes are exposed for translation. Default: off.&lt;br /&gt;
; Reorder Notes&lt;br /&gt;
: When checked, the note parts are reordered and placed after the related slide part and after its chart parts. Default: off.&lt;br /&gt;
; Translate Comments&lt;br /&gt;
: When checked and the same option is checked under '''the General Options''' (''they will be separated after the next release''), the document comments are exposed for translation. Default: on.&lt;br /&gt;
; Reorder Comments&lt;br /&gt;
: When checked, the comment parts are reordered and placed after the related slide part and after its note parts. Default: off.&lt;br /&gt;
; Translate Masters&lt;br /&gt;
: When checked, the slide masters and notes masters are exposed for translation. Content from layouts that are currently in use by at least one slide is also exposed for translation. Default: on.&lt;br /&gt;
; Translate Graphic Metadata&lt;br /&gt;
: When checked, the graphic metadata (@name and @descr attribute values) are exposed for translation. Default: off.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* Various, see [https://bitbucket.org/okapiframework/okapi/issues?status=new&amp;amp;title=~OpenXML the issues list].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=FAQ&amp;diff=1013</id>
		<title>FAQ</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=FAQ&amp;diff=1013"/>
		<updated>2024-10-06T23:49:31Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: Update links to gitlab&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Capabilities==&lt;br /&gt;
&lt;br /&gt;
====What formats are supported?====&lt;br /&gt;
&lt;br /&gt;
The framework offers filters for many file formats, including XML, XLIFF, TMX, HTML, DOCX, ODT, Properties, PO, and many more.&amp;lt;br&amp;gt;&lt;br /&gt;
For a more complete list of the supported formats, see the &amp;quot;[[Filters]]&amp;quot; page.&lt;br /&gt;
&lt;br /&gt;
Note that you can also create your own filter configurations to support some formats. You can also create your own filters and use them seamlessly with the Okapi tools.&lt;br /&gt;
&lt;br /&gt;
====How do I extract text for translation?====&lt;br /&gt;
&lt;br /&gt;
See the article &amp;quot;[[How to Extract Text for Translation]]&amp;quot; in the [[Knowledge Base]].&lt;br /&gt;
&lt;br /&gt;
====Does Okapi provide a translation editor?====&lt;br /&gt;
&lt;br /&gt;
Not at this time. The Okapi tools allow you to create translation packages in various formats that can be opened in different translation editors such as OmegaT, MemoQ, Trados Workbench, Swordfish, Wordfast, etc.&lt;br /&gt;
&lt;br /&gt;
For translating XLIFF files see: &amp;quot;[[How to Translate XLIFF Documents]]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
====Does Okapi provide a TM (Translation Memory)?====&lt;br /&gt;
&lt;br /&gt;
Yes. There are currently two TM engines implemented in the framework:&lt;br /&gt;
&lt;br /&gt;
* [[Pensieve TM]] is the main TM engine.&lt;br /&gt;
* [[SimpleTM TM]] is a limited and older engine that '''is being progressively phased out'''.&lt;br /&gt;
&lt;br /&gt;
You can also use third-party TM engines through the different [[Connectors|connectors]] that the framework provides. For example: the [[Translate Toolkit TM Connector|Translate Toolkit TM]], [[GlobalSight TM Connector|GlobalSight TM]], the [[OpenTran Translation Repository Connector|OpenTran Translation Repository]], [[MyMemory TM Connector|MyMemory]], etc. For a complete list and more details see the &amp;quot;[[Connectors]]&amp;quot; page.&lt;br /&gt;
&lt;br /&gt;
====Does Okapi provide a MT (Machine Translation) system?====&lt;br /&gt;
&lt;br /&gt;
Not at this time. However, you can use various third-party MT systems through the connectors distributed with the framework. For example, you can work with [[Google MT v2 Connector|Google MT]], [[Apertium MT Connector|Apertium MT]], [[Microsoft Translator Connector|Microsoft Translator]], etc. For a complete list, see the [[Connectors|Connectors page]].&lt;br /&gt;
&lt;br /&gt;
====Why are there several distributions? Isn't Java cross-platform?====&lt;br /&gt;
&lt;br /&gt;
Yes, Java is cross-platform, and most of the Okapi code runs anywhere Java runs.&lt;br /&gt;
However, for better internationalization support and more seamless integration with each platform, we have chosen Eclipse SWT (http://www.eclipse.org/swt) as the foundation for the UI of our applications. That library requires a different distribution for each platform and architecture.&lt;br /&gt;
&lt;br /&gt;
Okapi's source code has been carefully designed to separate UI-dependent code from non-UI code, so most of the components (such as the [[Filters]], the [[Steps]] and the [[Connectors]]) can be used on any platform.&lt;br /&gt;
&lt;br /&gt;
====Can I change the Java VM settings when running the tools?====&lt;br /&gt;
&lt;br /&gt;
Yes. See [[How to Change the Java Parameters for Rainbow]]. You can follow the same steps for all Okapi tools.&lt;br /&gt;
&lt;br /&gt;
==Simple Troubleshooting==&lt;br /&gt;
&lt;br /&gt;
====Is there a Getting Started guide?====&lt;br /&gt;
&lt;br /&gt;
Yes. See the &amp;quot;[[Getting Started]]&amp;quot; page.&lt;br /&gt;
&lt;br /&gt;
====When I try to start Rainbow/Ratel/CheckMate nothing happens. What is wrong?====&lt;br /&gt;
&lt;br /&gt;
* Check that you have the proper version of Java (1.7 or above).&lt;br /&gt;
* Make sure you have installed the correct distribution for your platform.&lt;br /&gt;
* If your machine is 32-bit make sure to have installed the 32-bit distribution.&lt;br /&gt;
* If your machine is 64-bit make sure to have installed the 64-bit distribution.&lt;br /&gt;
&lt;br /&gt;
==Licenses==&lt;br /&gt;
&lt;br /&gt;
====Under what license is the Okapi Framework developed?====&lt;br /&gt;
&lt;br /&gt;
* The source code is under the [https://www.apache.org/licenses/LICENSE-2.0 Apache License version 2.0].&lt;br /&gt;
* The documentation is under [http://creativecommons.org/licenses/by-sa/3.0/ Creative Commons Attribution-ShareAlike License (CC-BY-SA)].&lt;br /&gt;
&lt;br /&gt;
====Can I use Okapi's components in my applications?====&lt;br /&gt;
&lt;br /&gt;
Yes. The project uses the Apache license, which allows open-source or commercial products to use our applications and components. See more information about the license at [https://www.apache.org/licenses/LICENSE-2.0].&lt;br /&gt;
&lt;br /&gt;
==Support==&lt;br /&gt;
&lt;br /&gt;
====Is there a users group or a support mailing list?====&lt;br /&gt;
&lt;br /&gt;
Yes. There are two main mailing lists. Both have public archives, and both require registration to post a message:&lt;br /&gt;
&lt;br /&gt;
* [https://groups.google.com/g/okapi-users Okapi users] is the group and mailing list '''for the end users'''.&lt;br /&gt;
* [http://groups.google.com/g/okapi-devel Okapi developers] is the group and mailing list '''for the developers''' working on the source code.&lt;br /&gt;
&lt;br /&gt;
====How do I report bugs or request enhancements?====&lt;br /&gt;
&lt;br /&gt;
* You can post a bug report or an enhancement request on [https://gitlab.com/okapiframework/Okapi/-/issues the issues tracking page] if you have a GitLab account (preferred).&lt;br /&gt;
&lt;br /&gt;
* You can post a message to [https://groups.google.com/g/okapi-users the Okapi users group] if you are part of the group.&lt;br /&gt;
&lt;br /&gt;
* You can just [mailto:okapitools@opentag.com?subject=Feedback send feedback by email].&lt;br /&gt;
&lt;br /&gt;
==Miscellaneous==&lt;br /&gt;
&lt;br /&gt;
====What does 'Okapi' mean?====&lt;br /&gt;
&lt;br /&gt;
An okapi is an African animal looking somewhat like [http://en.wikipedia.org/wiki/Okapi a cross between a zebra and a giraffe]. Okapi is pronounced [http://en.wikipedia.org/wiki/Wikipedia:IPA_for_English /oʊˈkɑːpɪ/] ([http://www.m-w.com/cgi-bin/audio.pl?okapi001.wav=okapi hear it])&lt;br /&gt;
&lt;br /&gt;
The use of this name for the framework has its roots in much older projects. At some point it was an acronym for &amp;quot;Open Kit API&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
====What happened to the .NET Okapi?====&lt;br /&gt;
&lt;br /&gt;
The older version of the Okapi Framework for .NET is no longer developed. Its distribution and source code are still available here: http://sourceforge.net/projects/okapi/. All new development is now done in the Java branch.&lt;br /&gt;
&lt;br /&gt;
====Where is Olifant?====&lt;br /&gt;
&lt;br /&gt;
Olifant, the TMX editor, is currently only part of the .NET Okapi. It is still available [http://sourceforge.net/projects/okapi/files/ from the SourceForge project]. Note that Olifant is for Windows only.&lt;br /&gt;
&lt;br /&gt;
==For developers==&lt;br /&gt;
&lt;br /&gt;
====Getting set up====&lt;br /&gt;
&lt;br /&gt;
* Check out the source code from GitLab using git clone: https://gitlab.com/okapiframework/Okapi&lt;br /&gt;
* Or, if you want to submit pull requests, first create a fork of the Okapi project. &lt;br /&gt;
* Import into your IDE. For example, in Eclipse go to File &amp;gt; Import &amp;gt; Maven &amp;gt; Existing Maven project. &lt;br /&gt;
If you want to keep several distinct Okapi repositories in the same Eclipse workspace (for instance, your fork and the main Okapi project), you need to assign a name template under the &amp;quot;Advanced&amp;quot; section in the first step of the import wizard. &lt;br /&gt;
* The &amp;quot;master&amp;quot; branch contains the latest release version. The &amp;quot;dev&amp;quot; branch contains the current work (the &amp;quot;snapshot&amp;quot; in Maven terms). &lt;br /&gt;
* See also: https://gitlab.com/okapiframework/Okapi/-/wikis/How-to-Contribute&lt;br /&gt;
Happy coding!&lt;br /&gt;
&lt;br /&gt;
====How to build okapi-lib locally====&lt;br /&gt;
&lt;br /&gt;
The Okapi Framework consists of Maven projects. However, in order to build the apps and lib projects locally, you need to use the Ant build configurations. &lt;br /&gt;
&lt;br /&gt;
For instance, to create a local version of okapi-lib.jar, go to &amp;lt;OKAPI_HOME&amp;gt;/deployment/maven/ and run ant -f build_okapi-lib.xml init okapiLib. The jar will be generated in &amp;lt;OKAPI_HOME&amp;gt;/deployment/maven/dist_common/lib/. &lt;br /&gt;
&lt;br /&gt;
If you use the default build.xml by running the above command without the -f option, platform-specific distributions of the apps will be created in addition to the platform-independent okapi-lib.jar.&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=FAQ&amp;diff=1012</id>
		<title>FAQ</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=FAQ&amp;diff=1012"/>
		<updated>2024-10-06T23:48:46Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: Update issues link to gitlab&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Capabilities==&lt;br /&gt;
&lt;br /&gt;
====What formats are supported?====&lt;br /&gt;
&lt;br /&gt;
The framework offers filters for many file formats, including XML, XLIFF, TMX, HTML, DOCX, ODT, Properties, PO, and many more.&amp;lt;br&amp;gt;&lt;br /&gt;
For a more complete list of the supported formats, see the &amp;quot;[[Filters]]&amp;quot; page.&lt;br /&gt;
&lt;br /&gt;
Note that you can also create your own filter configurations to support some formats. You can also create your own filters and use them seamlessly with the Okapi tools.&lt;br /&gt;
&lt;br /&gt;
====How do I extract text for translation?====&lt;br /&gt;
&lt;br /&gt;
See the article &amp;quot;[[How to Extract Text for Translation]]&amp;quot; in the [[Knowledge Base]].&lt;br /&gt;
&lt;br /&gt;
====Does Okapi provide a translation editor?====&lt;br /&gt;
&lt;br /&gt;
Not at this time. The Okapi tools allow you to create translation packages in various formats that can be opened in different translation editors such as OmegaT, MemoQ, Trados Workbench, Swordfish, Wordfast, etc.&lt;br /&gt;
&lt;br /&gt;
For translating XLIFF files see: &amp;quot;[[How to Translate XLIFF Documents]]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
====Does Okapi provide a TM (Translation Memory)?====&lt;br /&gt;
&lt;br /&gt;
Yes. There are currently two TM engines implemented in the framework:&lt;br /&gt;
&lt;br /&gt;
* [[Pensieve TM]] is the main TM engine.&lt;br /&gt;
* [[SimpleTM TM]] is a limited and older engine that '''is being progressively phased out'''.&lt;br /&gt;
&lt;br /&gt;
You can also use third-party TM engines through the different [[Connectors|connectors]] that the framework provides. For example: the [[Translate Toolkit TM Connector|Translate Toolkit TM]], [[GlobalSight TM Connector|GlobalSight TM]], the [[OpenTran Translation Repository Connector|OpenTran Translation Repository]], [[MyMemory TM Connector|MyMemory]], etc. For a complete list and more details see the &amp;quot;[[Connectors]]&amp;quot; page.&lt;br /&gt;
&lt;br /&gt;
====Does Okapi provide a MT (Machine Translation) system?====&lt;br /&gt;
&lt;br /&gt;
Not at this time. However, you can use various third-party MT systems through the connectors distributed with the framework. For example, you can work with [[Google MT v2 Connector|Google MT]], [[Apertium MT Connector|Apertium MT]], [[Microsoft Translator Connector|Microsoft Translator]], etc. For a complete list, see the [[Connectors|Connectors page]].&lt;br /&gt;
&lt;br /&gt;
====Why are there several distributions? Isn't Java cross-platform?====&lt;br /&gt;
&lt;br /&gt;
Yes, Java is cross-platform, and most of the Okapi code runs anywhere Java runs.&lt;br /&gt;
However, for better internationalization support and more seamless integration with each platform, we have chosen Eclipse SWT (http://www.eclipse.org/swt) as the foundation for the UI of our applications. That library requires a different distribution for each platform and architecture.&lt;br /&gt;
&lt;br /&gt;
Okapi's source code has been carefully designed to separate UI-dependent code from non-UI code, so most of the components (such as the [[Filters]], the [[Steps]] and the [[Connectors]]) can be used on any platform.&lt;br /&gt;
&lt;br /&gt;
====Can I change the Java VM settings when running the tools?====&lt;br /&gt;
&lt;br /&gt;
Yes. See [[How to Change the Java Parameters for Rainbow]]. You can follow the same steps for all Okapi tools.&lt;br /&gt;
&lt;br /&gt;
==Simple Troubleshooting==&lt;br /&gt;
&lt;br /&gt;
====Is there a Getting Started guide?====&lt;br /&gt;
&lt;br /&gt;
Yes. See the &amp;quot;[[Getting Started]]&amp;quot; page.&lt;br /&gt;
&lt;br /&gt;
====When I try to start Rainbow/Ratel/CheckMate nothing happens. What is wrong?====&lt;br /&gt;
&lt;br /&gt;
* Check that you have the proper version of Java (1.7 or above).&lt;br /&gt;
* Make sure you have installed the correct distribution for your platform.&lt;br /&gt;
* If your machine is 32-bit make sure to have installed the 32-bit distribution.&lt;br /&gt;
* If your machine is 64-bit make sure to have installed the 64-bit distribution.&lt;br /&gt;
&lt;br /&gt;
==Licenses==&lt;br /&gt;
&lt;br /&gt;
====Under what license is the Okapi Framework developed?====&lt;br /&gt;
&lt;br /&gt;
* The source code is under the [https://www.apache.org/licenses/LICENSE-2.0 Apache License version 2.0].&lt;br /&gt;
* The documentation is under [http://creativecommons.org/licenses/by-sa/3.0/ Creative Commons Attribution-ShareAlike License (CC-BY-SA)].&lt;br /&gt;
&lt;br /&gt;
====Can I use Okapi's components in my applications?====&lt;br /&gt;
&lt;br /&gt;
Yes. The project uses the Apache license, which allows open-source or commercial products to use our applications and components. See more information about the license at [https://www.apache.org/licenses/LICENSE-2.0].&lt;br /&gt;
&lt;br /&gt;
==Support==&lt;br /&gt;
&lt;br /&gt;
====Is there a users group or a support mailing list?====&lt;br /&gt;
&lt;br /&gt;
Yes. There are two main mailing lists. Both have public archives, and both require registration to post a message:&lt;br /&gt;
&lt;br /&gt;
* [https://groups.google.com/g/okapi-users Okapi users] is the group and mailing list '''for the end users'''.&lt;br /&gt;
* [http://groups.google.com/g/okapi-devel Okapi developers] is the group and mailing list '''for the developers''' working on the source code.&lt;br /&gt;
&lt;br /&gt;
====How do I report bugs or request enhancements?====&lt;br /&gt;
&lt;br /&gt;
* You can post a bug report or an enhancement request on [https://gitlab.com/okapiframework/Okapi/-/issues the issues tracking page] if you have a GitLab account (preferred).&lt;br /&gt;
&lt;br /&gt;
* You can post a message to [https://groups.google.com/g/okapi-users the Okapi users group] if you are part of the group.&lt;br /&gt;
&lt;br /&gt;
* You can just [mailto:okapitools@opentag.com?subject=Feedback send feedback by email].&lt;br /&gt;
&lt;br /&gt;
==Miscellaneous==&lt;br /&gt;
&lt;br /&gt;
====What does 'Okapi' mean?====&lt;br /&gt;
&lt;br /&gt;
An okapi is an African animal looking somewhat like [http://en.wikipedia.org/wiki/Okapi a cross between a zebra and a giraffe]. Okapi is pronounced [http://en.wikipedia.org/wiki/Wikipedia:IPA_for_English /oʊˈkɑːpɪ/] ([http://www.m-w.com/cgi-bin/audio.pl?okapi001.wav=okapi hear it])&lt;br /&gt;
&lt;br /&gt;
The use of this name for the framework has its roots in much older projects. At some point it was an acronym for &amp;quot;Open Kit API&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
====What happened to the .NET Okapi?====&lt;br /&gt;
&lt;br /&gt;
The older version of the Okapi Framework for .NET is no longer developed. Its distribution and source code are still available here: http://sourceforge.net/projects/okapi/. All new development is now done in the Java branch.&lt;br /&gt;
&lt;br /&gt;
====Where is Olifant?====&lt;br /&gt;
&lt;br /&gt;
Olifant, the TMX editor, is currently only part of the .NET Okapi. It is still available [http://sourceforge.net/projects/okapi/files/ from the SourceForge project]. Note that Olifant is for Windows only.&lt;br /&gt;
&lt;br /&gt;
==For developers==&lt;br /&gt;
&lt;br /&gt;
====Getting set up====&lt;br /&gt;
&lt;br /&gt;
* Check out the source code from Bitbucket using git clone: https://bitbucket.org/okapiframework/okapi&lt;br /&gt;
* Or, if you want to submit pull requests, first create a fork of the Okapi project. &lt;br /&gt;
* Import into your IDE. For example, in Eclipse go to File &amp;gt; Import &amp;gt; Maven &amp;gt; Existing Maven project. &lt;br /&gt;
If you want to keep several distinct Okapi repositories in the same Eclipse workspace (for instance, your fork and the main Okapi project), you need to assign a name template under the &amp;quot;Advanced&amp;quot; section in the first step of the import wizard. &lt;br /&gt;
* The &amp;quot;master&amp;quot; branch contains the latest release version. The &amp;quot;dev&amp;quot; branch contains the current work (the &amp;quot;snapshot&amp;quot; in Maven terms). &lt;br /&gt;
* See also: https://bitbucket.org/okapiframework/okapi/wiki/How%20to%20Contribute&lt;br /&gt;
Happy coding!&lt;br /&gt;
&lt;br /&gt;
====How to build okapi-lib locally====&lt;br /&gt;
&lt;br /&gt;
The Okapi Framework consists of Maven projects. However, in order to build the apps and lib projects locally, you need to use the Ant build configurations. &lt;br /&gt;
&lt;br /&gt;
For instance, to create a local version of okapi-lib.jar, go to &amp;lt;OKAPI_HOME&amp;gt;/deployment/maven/ and run ant -f build_okapi-lib.xml init okapiLib. The jar will be generated in &amp;lt;OKAPI_HOME&amp;gt;/deployment/maven/dist_common/lib/. &lt;br /&gt;
&lt;br /&gt;
If you use the default build.xml by running the above command without the -f option, platform-specific distributions of the apps will be created in addition to the platform-independent okapi-lib.jar.&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Distributions&amp;diff=959</id>
		<title>Distributions</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Distributions&amp;diff=959"/>
		<updated>2023-01-16T20:16:13Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: Update longhorn release&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&lt;br /&gt;
== Main Project ==&lt;br /&gt;
&lt;br /&gt;
This includes platform-specific distributions for the applications (Rainbow, Tikal, Ratel, etc.) as well as the libraries-only package (all platforms).&lt;br /&gt;
&lt;br /&gt;
* Latest release ( &amp;lt;b&amp;gt;1.44.0 - Aug 28, 2022&amp;lt;/b&amp;gt; ) : [https://okapiframework.org/binaries/main/1.44.0 https://okapiframework.org/binaries/main/1.44.0]&lt;br /&gt;
* Recent releases: [https://okapiframework.org/binaries/main https://okapiframework.org/binaries/main]&lt;br /&gt;
* Release artifacts on Maven Central: https://search.maven.org/search?q=net.sf.okapi&lt;br /&gt;
* Changes log: [https://okapiframework.org/binaries/main/changes.html https://okapiframework.org/binaries/main/changes.html]&lt;br /&gt;
&lt;br /&gt;
Snapshots:&lt;br /&gt;
&lt;br /&gt;
* Latest Development Snapshots (nightly builds): [https://gitlab.com/okapiframework/okapi/-/jobs/artifacts/dev/browse/deployment/maven/done?job=verification:jdk11 https://gitlab.com/okapiframework/okapi/-/jobs/artifacts/dev/browse/deployment/maven/done]&lt;br /&gt;
&lt;br /&gt;
== Longhorn ==&lt;br /&gt;
&lt;br /&gt;
This includes the Longhorn distributions (all platforms).&lt;br /&gt;
&lt;br /&gt;
* Latest release ( &amp;lt;b&amp;gt;1.44.0 - Jan 16, 2023&amp;lt;/b&amp;gt; ) : [https://okapiframework.org/binaries/longhorn/okapi-longhorn_all-platforms_1.44.0.zip okapi-longhorn_all-platforms_1.44.0.zip]&lt;br /&gt;
* Recent releases: [https://okapiframework.org/binaries/longhorn https://okapiframework.org/binaries/longhorn]&lt;br /&gt;
&lt;br /&gt;
== OmegaT Filter Plugin ==&lt;br /&gt;
&lt;br /&gt;
This includes the Filters Plugin for OmegaT (all platforms).&lt;br /&gt;
&lt;br /&gt;
* Latest release ( &amp;lt;b&amp;gt;1.12-1.44.0 - Nov 23, 2022&amp;lt;/b&amp;gt; ) : [https://okapiframework.org/binaries/omegat-plugin/okapiFiltersForOmegaT-1.12-1.44.0-dist.zip okapiFiltersForOmegaT-1.12-1.44.0-dist.zip]&lt;br /&gt;
* All releases: [https://okapiframework.org/binaries/omegat-plugin https://okapiframework.org/binaries/omegat-plugin]&lt;br /&gt;
&lt;br /&gt;
== Ocelot ==&lt;br /&gt;
&lt;br /&gt;
This includes the Review Workbench application Ocelot (all platforms).&lt;br /&gt;
&lt;br /&gt;
* Latest release ( &amp;lt;b&amp;gt;3.0 - Oct 17, 2017&amp;lt;/b&amp;gt; ) : [https://okapiframework.org/binaries/ocelot/Ocelot-3.0.jar Ocelot-3.0.jar]&lt;br /&gt;
* Recent releases: [https://okapiframework.org/binaries/ocelot https://okapiframework.org/binaries/ocelot]&lt;br /&gt;
&lt;br /&gt;
== Archives ==&lt;br /&gt;
&lt;br /&gt;
Older distributions that are not included above.&lt;br /&gt;
&lt;br /&gt;
* [https://okapiframework.org/binaries/archives Archives (https://okapiframework.org/binaries/archives)]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=952</id>
		<title>Markdown Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=952"/>
		<updated>2022-10-10T21:29:17Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Limitations */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
The Markdown Filter is an Okapi component for extracting translatable text from Markdown files. See https://en.wikipedia.org/wiki/Markdown for more information about the format.&lt;br /&gt;
Markdown is a family of formats, not all of them mutually compatible. This filter is designed to work with Markdown based on the [http://commonmark.org CommonMark] specification, with additional features to support [https://guides.github.com/features/mastering-markdown/ GitHub-flavored Markdown].&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
The filter decides which encoding to use for the input file using the following logic:&lt;br /&gt;
&lt;br /&gt;
If the file has a Unicode Byte-Order-Mark:&lt;br /&gt;
Then, the corresponding encoding (e.g. UTF-8, UTF-16, etc.) is used.&lt;br /&gt;
Otherwise, the input encoding used is the default encoding that was specified when setting the filter options.&lt;br /&gt;
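&lt;br /&gt;
The decision above can be sketched as follows. This is only a minimal illustration in Python, not the filter's actual implementation; the function name choose_encoding and the exact set of byte-order marks checked are assumptions made for the example.&lt;br /&gt;
&lt;br /&gt;
```python
import codecs

# Byte-order marks to look for (an assumed set for this sketch).
# UTF-32 is listed before UTF-16 because the UTF-32 LE mark begins
# with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def choose_encoding(data, default_encoding):
    """Return the encoding named by a leading BOM, else the default."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # No BOM found: fall back to the encoding set in the filter options.
    return default_encoding
```
&lt;br /&gt;
In this sketch, the second argument plays the role of the default encoding specified when setting the filter options.&lt;br /&gt;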
&lt;br /&gt;
===HTML Elements===&lt;br /&gt;
The HTML Inline Elements, i.e. the tags, and the HTML Block, a chunk of text sandwiched between a block-forming start tag and its corresponding end tag, are processed by the HTML filter. The HTML filter to use can be customized separately.&lt;br /&gt;
&lt;br /&gt;
===Inline Codes===&lt;br /&gt;
The [[HTML_Filter#Inline_Code_Finder|Inline Code Finder]] is supported by this filter. &lt;br /&gt;
&lt;br /&gt;
The code finder applies to the translatable text within the Markdown part of the document proper. It does not apply to the HTML inline tags or HTML blocks. For those, you need to enable and specify the inline code pattern for the HTML filter separately, name the configuration okf_html@''arbitrary-name''.fprm, and specify that name for the htmlSubfilter parameter.&lt;br /&gt;
&lt;br /&gt;
Note: support for the Inline Code Finder was temporarily unavailable in some snapshot builds of version 0.36, but it has been restored.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
; Translate Hyperlink URLs (translateUrls)&lt;br /&gt;
: By default, URLs in link and image statements are not exposed for translation.  If this option is enabled, they will be extracted as a subflow. Default: false&lt;br /&gt;
&lt;br /&gt;
; REGEX Pattern for Translatable URLs (urlToTranslatePattern)&lt;br /&gt;
: When translateUrls=true, only the URLs that match this REGEX will be extracted. Default: .+ (all URLs)&lt;br /&gt;
&lt;br /&gt;
; Translate Fenced Code Blocks (translateCodeBlocks)&lt;br /&gt;
: This option controls whether the contents of fenced code blocks are exposed for translation. Default: true&lt;br /&gt;
&lt;br /&gt;
; Translate Indented Code Blocks (translateIndentedCodeBlocks)&lt;br /&gt;
: This option controls whether the contents of indented code blocks are exposed for translation. Default: true&lt;br /&gt;
&lt;br /&gt;
; Translate Inline Code Blocks (translateInlineCodeBlocks)&lt;br /&gt;
: This option controls whether the contents of inline code blocks (i.e., text delimited by single backticks) are exposed for translation. Default: true&lt;br /&gt;
&lt;br /&gt;
; Translate YAML Metadata Header (translateHeaderMetadata)&lt;br /&gt;
: Some markdown formats support a [http://pandoc.org/MANUAL.html#extension-yaml_metadata_block YAML Metadata Header] that contains key/value data. By default, this header is not exposed for translation. When the &amp;quot;Translate YAML Metadata Header&amp;quot; option is enabled, the header will be parsed and the metadata values will be exposed for translation. Default: false&lt;br /&gt;
&lt;br /&gt;
; Translate Image Alt Text (translateImageAltText)&lt;br /&gt;
: The alt text for a graphic image in the form of &amp;lt;nowiki&amp;gt;![alt text](https://foo.com/images/bar.jpg)&amp;lt;/nowiki&amp;gt; or as the alt attribute of an img tag &amp;lt;nowiki&amp;gt;&amp;lt;img src=&amp;quot;https://foo.com/images/bar.jpg&amp;quot; alt=&amp;quot;alt text&amp;quot;&amp;gt;&amp;lt;/nowiki&amp;gt; will be extracted if this parameter is true. Default: true.&lt;br /&gt;
&lt;br /&gt;
; Generate anchors based on header text. (generateHeaderAnchors)&lt;br /&gt;
: Some markdown parsers support explicit named anchors in header markup, using the syntax &amp;lt;code&amp;gt;{#my-anchor}&amp;lt;/code&amp;gt;. When set, this option will automatically generate anchors for headings in the source document, for the purpose of providing a stable anchor for hyperlinks that reference a (translatable) header value. Default: false.&lt;br /&gt;
&lt;br /&gt;
; Parses out certain MDX expressions using regex. (parseMdx) [Experimental]&lt;br /&gt;
: When set, parses out multi-line &amp;lt;code&amp;gt;export&amp;lt;/code&amp;gt; blocks as skeleton. Default: false.&lt;br /&gt;
&lt;br /&gt;
; Enter a String of characters that will be escaped as HTML entities. (htmlEntitesToEscape)&lt;br /&gt;
: When set, encodes specific characters as HTML entities on export. Default: (none)&lt;br /&gt;
&lt;br /&gt;
; Support backslash escaping of punctuation (unescapeBackslashCharacters)&lt;br /&gt;
: When set, parses backslash-escaped punctuation in source documents. Default: false.&lt;br /&gt;
&lt;br /&gt;
; Enter a String of punctuation characters that will be escaped when the option above is enabled. (charactersToEscape)&lt;br /&gt;
: When &amp;lt;code&amp;gt;unescapeBackslashCharacters&amp;lt;/code&amp;gt; is enabled, characters listed in this option will be backslash-escaped on export. Default: &amp;lt;code&amp;gt;*_`{}[]&amp;amp;lt;&amp;amp;gt;()#+\-.!|&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
; HTML subfilter configuration ID (htmlSubfilter)&lt;br /&gt;
: The custom configuration ID of the HTML filter that will be called to process HTML contents within Markdown documents. The configuration file must be saved in a known location with ''.fprm'' suffix. Specify nothing to use the default HTML filter configuration tailored for the Markdown filter. Default: (empty)&lt;br /&gt;
&lt;br /&gt;
; YAML subfilter configuration ID (yamlSubfilter)&lt;br /&gt;
: The custom configuration ID of the YAML filter that will be called to process any YAML metadata header detected in the document. This allows for customization of the metadata fields extracted for translation. Default: (empty)&lt;br /&gt;
&lt;br /&gt;
; Enter non translatable block quotes (nonTranslateBlocks)&lt;br /&gt;
: This option excludes some block quotes from translation. Block quotes that start with one of the comma-separated strings entered here are not extracted. Default: (empty - contents in all block quotes will be extracted)&lt;br /&gt;
&lt;br /&gt;
; Use Code Finder (useCodeFinder)&lt;br /&gt;
: Determines whether to use the Inline Code Finder or not. Default: false&lt;br /&gt;
&lt;br /&gt;
; Number of Code Finder Rules (codeFinderRules.count)&lt;br /&gt;
: The number of rules, i.e. regular expression patterns. Default: 1&lt;br /&gt;
&lt;br /&gt;
; Code Finder Rule ''N'' (codeFinderRules.rule''N'') &lt;br /&gt;
: ''N''th matching pattern for codes where ''N''=0,1,2...&lt;br /&gt;
&lt;br /&gt;
; Sample Text (codeFinderRules.sample)&lt;br /&gt;
: Sample text to test the rules on UI. &lt;br /&gt;
&lt;br /&gt;
; Use All Rules (codeFinderRules.useAllRulesWhenTesting)&lt;br /&gt;
: Determines whether to apply all rules when testing on UI.&lt;br /&gt;
&lt;br /&gt;
==Notes==&lt;br /&gt;
&lt;br /&gt;
=== Translation of URLs as Subflows ===&lt;br /&gt;
&lt;br /&gt;
When there is a subflow of text in the middle of the main text, the subflow will be extracted before the segment that contains it.  For example, for this run of Markdown text:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Please click ![The Information desk logo](images/circled-i.jpg) for help.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The extracted text in the XLIFF file will look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;lt;trans-unit id=&amp;quot;tu2&amp;quot; restype=&amp;quot;x-img-link&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;images/circled-i.jpg&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
&amp;lt;trans-unit id=&amp;quot;tu1&amp;quot; xml:space=&amp;quot;preserve&amp;quot;&amp;gt;&lt;br /&gt;
&amp;lt;source xml:lang=&amp;quot;en&amp;quot;&amp;gt;Please click &amp;lt;bpt id=&amp;quot;1&amp;quot;&amp;gt;![&amp;lt;/bpt&amp;gt;The Information desk logo&amp;lt;ept id=&amp;quot;1&amp;quot;&amp;gt;]([#$tu2])&amp;lt;/ept&amp;gt; for help.&amp;lt;/source&amp;gt;&lt;br /&gt;
&amp;lt;/trans-unit&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
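&lt;br /&gt;
As a rough illustration of how the main segment refers back to the subflow, the sketch below substitutes each &amp;lt;code&amp;gt;[#$id]&amp;lt;/code&amp;gt; marker with the text of the referenced unit. It is written in Python rather than against the Okapi Java API. The function name &amp;lt;code&amp;gt;resolve_subflows&amp;lt;/code&amp;gt; and the dictionary of units are hypothetical, and the marker syntax is assumed from the example above only.&lt;br /&gt;

```python
import re

# Reference marker as it appears in the example above, e.g. [#$tu2].
# Assumption: the id is everything between "[#$" and the closing "]".
REF_MARKER = re.compile(r"\[#\$([^\]]+)\]")


def resolve_subflows(text, units):
    """Replace each [#$id] marker with the text of the referenced unit."""
    return REF_MARKER.sub(lambda m: units[m.group(1)], text)


# Hypothetical data mirroring the two trans-units shown above.
units = {"tu2": "images/circled-i.jpg"}
main = "Please click ![The Information desk logo]([#$tu2]) for help."
print(resolve_subflows(main, units))
```

Because the URL lives in its own unit, a translator can change it without touching the sentence that contains it.&lt;br /&gt;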
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=951</id>
		<title>Markdown Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=951"/>
		<updated>2022-10-10T21:20:42Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Parameters */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
The Markdown Filter is an Okapi component for extracting translatable text from Markdown files. See https://en.wikipedia.org/wiki/Markdown for more information about the format.&lt;br /&gt;
Markdown is a family of formats, not all of them mutually compatible.  This filter is designed to work with markdown based on the [http://commonmark.org CommonMark] specification, with additional features to support [https://guides.github.com/features/mastering-markdown/ GitHub-flavored Markdown].&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
The filter decides which encoding to use for the input file using the following logic:&lt;br /&gt;
&lt;br /&gt;
* If the file has a Unicode Byte-Order-Mark:&lt;br /&gt;
** Then, the corresponding encoding (e.g. UTF-8, UTF-16, etc.) is used.&lt;br /&gt;
* Otherwise, the input encoding used is the default encoding that was specified when setting the filter options.&lt;br /&gt;
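&lt;br /&gt;
The encoding decision described above can be sketched as follows. This is a minimal Python illustration, not the filter's actual implementation: the function name &amp;lt;code&amp;gt;pick_encoding&amp;lt;/code&amp;gt; is hypothetical, and the set of Byte-Order-Marks checked is an assumption.&lt;br /&gt;

```python
import codecs

# Longer BOMs are checked first, because the UTF-32 LE BOM begins with
# the same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def pick_encoding(data, default_encoding):
    """Return the BOM's encoding if one is present, else the default."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return default_encoding


print(pick_encoding(codecs.BOM_UTF8 + b"# Title", "windows-1252"))
print(pick_encoding(b"# Title", "windows-1252"))
```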
&lt;br /&gt;
===HTML Elements===&lt;br /&gt;
The HTML Inline Elements, i.e. the tags, and the HTML Block, a chunk of text sandwiched between a block-forming start tag and its corresponding end tag, are processed by the HTML filter. The HTML filter to use can be customized separately.&lt;br /&gt;
&lt;br /&gt;
===Inline Codes===&lt;br /&gt;
The [[HTML_Filter#Inline_Code_Finder|Inline Code Finder]] is supported by this filter. &lt;br /&gt;
&lt;br /&gt;
The Inline Code Finder applies only to the translatable text in the Markdown parts of the document. It does not apply to HTML inline tags or HTML blocks. To find inline codes there, enable the code finder and specify its patterns in a separate HTML filter configuration, name the configuration file okf_html@''arbitrary-name''.fprm, and specify that name for the htmlSubfilter parameter.&lt;br /&gt;
&lt;br /&gt;
Note, the support of the Inline Code Finder was temporarily unavailable in some snapshot builds of version 0.36, but it has been restored.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
; Translate Hyperlink URLs (translateUrls)&lt;br /&gt;
: By default, URLs in link and image statements are not exposed for translation.  If this option is enabled, they will be extracted as a subflow. Default: false&lt;br /&gt;
&lt;br /&gt;
; REGEX Pattern for Translatable URLs (urlToTranslatePattern)&lt;br /&gt;
: When translateUrls=true, only the URLs that match this REGEX will be extracted. Default: .+ (all URLs)&lt;br /&gt;
&lt;br /&gt;
; Translate Fenced Code Blocks (translateCodeBlocks)&lt;br /&gt;
: This option controls whether the contents of fenced code blocks are exposed for translation. Default: true&lt;br /&gt;
&lt;br /&gt;
; Translate Indented Code Blocks (translateIndentedCodeBlocks)&lt;br /&gt;
: This option controls whether the contents of indented code blocks are exposed for translation. Default: true&lt;br /&gt;
&lt;br /&gt;
; Translate Inline Code Blocks (translateInlineCodeBlocks)&lt;br /&gt;
: This option controls whether the contents of inline code blocks (i.e., text delimited by single backticks) are exposed for translation. Default: true&lt;br /&gt;
&lt;br /&gt;
; Translate YAML Metadata Header (translateHeaderMetadata)&lt;br /&gt;
: Some markdown formats support a [http://pandoc.org/MANUAL.html#extension-yaml_metadata_block YAML Metadata Header] that contains key/value data. By default, this header is not exposed for translation. When the &amp;quot;Translate YAML Metadata Header&amp;quot; option is enabled, the header will be parsed and the metadata values will be exposed for translation. Default: false&lt;br /&gt;
&lt;br /&gt;
; Translate Image Alt Text (translateImageAltText)&lt;br /&gt;
: The alt text for a graphic image in the form of &amp;lt;nowiki&amp;gt;![alt text](https://foo.com/images/bar.jpg)&amp;lt;/nowiki&amp;gt; or as the alt attribute of an img tag &amp;lt;nowiki&amp;gt;&amp;lt;img src=&amp;quot;https://foo.com/images/bar.jpg&amp;quot; alt=&amp;quot;alt text&amp;quot;&amp;gt;&amp;lt;/nowiki&amp;gt; will be extracted if this parameter is true. Default: true.&lt;br /&gt;
&lt;br /&gt;
; Generate anchors based on header text. (generateHeaderAnchors)&lt;br /&gt;
: Some markdown parsers support explicit named anchors in header markup, using the syntax &amp;lt;code&amp;gt;{#my-anchor}&amp;lt;/code&amp;gt;. When set, this option will automatically generate anchors for headings in the source document, for the purpose of providing a stable anchor for hyperlinks that reference a (translatable) header value. Default: false.&lt;br /&gt;
&lt;br /&gt;
; Parses out certain MDX expressions using regex. (parseMdx) [Experimental]&lt;br /&gt;
: When set, parses out multi-line &amp;lt;code&amp;gt;export&amp;lt;/code&amp;gt; blocks as skeleton. Default: false.&lt;br /&gt;
&lt;br /&gt;
; Enter a String of characters that will be escaped as HTML entities. (htmlEntitesToEscape)&lt;br /&gt;
: When set, encodes specific characters as HTML entities on export. Default: (none)&lt;br /&gt;
&lt;br /&gt;
; Support backslash escaping of punctuation (unescapeBackslashCharacters)&lt;br /&gt;
: When set, parses backslash-escaped punctuation in source documents. Default: false.&lt;br /&gt;
&lt;br /&gt;
; Enter a String of punctuation characters that will be escaped when the option above is enabled. (charactersToEscape)&lt;br /&gt;
: When &amp;lt;code&amp;gt;unescapeBackslashCharacters&amp;lt;/code&amp;gt; is enabled, characters listed in this option will be backslash-escaped on export. Default: &amp;lt;code&amp;gt;*_`{}[]&amp;amp;lt;&amp;amp;gt;()#+\-.!|&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
; HTML subfilter configuration ID (htmlSubfilter)&lt;br /&gt;
: The custom configuration ID of the HTML filter that will be called to process HTML contents within Markdown documents. The configuration file must be saved in a known location with ''.fprm'' suffix. Specify nothing to use the default HTML filter configuration tailored for the Markdown filter. Default: (empty)&lt;br /&gt;
&lt;br /&gt;
; YAML subfilter configuration ID (yamlSubfilter)&lt;br /&gt;
: The custom configuration ID of the YAML filter that will be called to process any YAML metadata header detected in the document. This allows for customization of the metadata fields extracted for translation. Default: (empty)&lt;br /&gt;
&lt;br /&gt;
; Enter non translatable block quotes (nonTranslateBlocks)&lt;br /&gt;
: This option excludes some block quotes from translation. Block quotes that start with one of the comma-separated strings entered here are not extracted. Default: (empty - contents in all block quotes will be extracted)&lt;br /&gt;
&lt;br /&gt;
; Use Code Finder (useCodeFinder)&lt;br /&gt;
: Determines whether to use the Inline Code Finder or not. Default: false&lt;br /&gt;
&lt;br /&gt;
; Number of Code Finder Rules (codeFinderRules.count)&lt;br /&gt;
: The number of rules, i.e. regular expression patterns. Default: 1&lt;br /&gt;
&lt;br /&gt;
; Code Finder Rule ''N'' (codeFinderRules.rule''N'') &lt;br /&gt;
: ''N''th matching pattern for codes where ''N''=0,1,2...&lt;br /&gt;
&lt;br /&gt;
; Sample Text (codeFinderRules.sample)&lt;br /&gt;
: Sample text to test the rules on UI. &lt;br /&gt;
&lt;br /&gt;
; Use All Rules (codeFinderRules.useAllRulesWhenTesting)&lt;br /&gt;
: Determines whether to apply all rules when testing on UI.&lt;br /&gt;
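&lt;br /&gt;
To illustrate how &amp;lt;code&amp;gt;translateUrls&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;urlToTranslatePattern&amp;lt;/code&amp;gt; interact, here is a small Python sketch. It is not the filter's code: the function name &amp;lt;code&amp;gt;urls_to_extract&amp;lt;/code&amp;gt; is hypothetical, and it assumes the pattern must match the whole URL, which the description above does not state.&lt;br /&gt;

```python
import re


def urls_to_extract(urls, translate_urls=False, pattern=".+"):
    """Return the URLs that would be exposed for translation.

    Defaults mirror the documented ones: translateUrls=false and
    urlToTranslatePattern=.+ (all URLs).
    Assumption: the pattern is applied as a full match.
    """
    if not translate_urls:
        return []
    regex = re.compile(pattern)
    return [u for u in urls if regex.fullmatch(u)]


urls = ["images/circled-i.jpg", "https://example.com/docs/en/intro"]
print(urls_to_extract(urls))
print(urls_to_extract(urls, translate_urls=True))
print(urls_to_extract(urls, True, r"https://example\.com/docs/.+"))
```

With the default pattern every URL is extracted once the option is on, so a narrower pattern is the way to limit extraction to, say, localized documentation links.&lt;br /&gt;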
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
=== Subflows are Not Supported ===&lt;br /&gt;
&lt;br /&gt;
When there is a subflow of text in the middle of the main text, the subflow will be inter-mixed with the main flow of text.  For example, for this run of Markdown text:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Please click ![The Information desk logo](images/circled-i.jpg) for help.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The extracted text in the XLIFF file will look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Please click &amp;lt;x id=&amp;quot;1&amp;quot;/&amp;gt;The Information desk logo&amp;lt;x id=&amp;quot;2&amp;quot;/&amp;gt; for help.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=950</id>
		<title>Markdown Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=950"/>
		<updated>2022-10-10T21:15:07Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Parameters */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
The Markdown Filter is an Okapi component for extracting translatable text from Markdown files. See https://en.wikipedia.org/wiki/Markdown for more information about the format.&lt;br /&gt;
Markdown is a family of formats, not all of them mutually compatible.  This filter is designed to work with markdown based on the [http://commonmark.org CommonMark] specification, with additional features to support [https://guides.github.com/features/mastering-markdown/ GitHub-flavored Markdown].&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
The filter decides which encoding to use for the input file using the following logic:&lt;br /&gt;
&lt;br /&gt;
* If the file has a Unicode Byte-Order-Mark:&lt;br /&gt;
** Then, the corresponding encoding (e.g. UTF-8, UTF-16, etc.) is used.&lt;br /&gt;
* Otherwise, the input encoding used is the default encoding that was specified when setting the filter options.&lt;br /&gt;
&lt;br /&gt;
===HTML Elements===&lt;br /&gt;
The HTML Inline Elements, i.e. the tags, and the HTML Block, a chunk of text sandwiched between a block-forming start tag and its corresponding end tag, are processed by the HTML filter. The HTML filter to use can be customized separately.&lt;br /&gt;
&lt;br /&gt;
===Inline Codes===&lt;br /&gt;
The [[HTML_Filter#Inline_Code_Finder|Inline Code Finder]] is supported by this filter. &lt;br /&gt;
&lt;br /&gt;
The Inline Code Finder applies only to the translatable text in the Markdown parts of the document. It does not apply to HTML inline tags or HTML blocks. To find inline codes there, enable the code finder and specify its patterns in a separate HTML filter configuration, name the configuration file okf_html@''arbitrary-name''.fprm, and specify that name for the htmlSubfilter parameter.&lt;br /&gt;
&lt;br /&gt;
Note, the support of the Inline Code Finder was temporarily unavailable in some snapshot builds of version 0.36, but it has been restored.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
; Translate Hyperlink URLs (translateUrls)&lt;br /&gt;
: By default, URLs in link and image statements are not exposed for translation.  If this option is enabled, they will be extracted as a subflow. Default: false&lt;br /&gt;
&lt;br /&gt;
; REGEX Pattern for Translatable URLs (urlToTranslatePattern)&lt;br /&gt;
: When translateUrls=true, only the URLs that match this REGEX will be extracted. Default: .+ (all URLs)&lt;br /&gt;
&lt;br /&gt;
; Translate Fenced Code Blocks (translateCodeBlocks)&lt;br /&gt;
: This option controls whether the contents of fenced code blocks are exposed for translation. Default: true&lt;br /&gt;
&lt;br /&gt;
; Translate Indented Code Blocks (translateIndentedCodeBlocks)&lt;br /&gt;
: This option controls whether the contents of indented code blocks are exposed for translation. Default: true&lt;br /&gt;
&lt;br /&gt;
; Translate Inline Code Blocks (translateInlineCodeBlocks)&lt;br /&gt;
: This option controls whether the contents of inline code blocks (i.e., text delimited by single backticks) are exposed for translation. Default: true&lt;br /&gt;
&lt;br /&gt;
; Translate YAML Metadata Header (translateHeaderMetadata)&lt;br /&gt;
: Some markdown formats support a [http://pandoc.org/MANUAL.html#extension-yaml_metadata_block YAML Metadata Header] that contains key/value data. By default, this header is not exposed for translation. When the &amp;quot;Translate YAML Metadata Header&amp;quot; option is enabled, the header will be parsed and the metadata values will be exposed for translation. Default: false&lt;br /&gt;
&lt;br /&gt;
; Translate Image Alt Text (translateImageAltText)&lt;br /&gt;
: The alt text for a graphic image in the form of &amp;lt;nowiki&amp;gt;![alt text](https://foo.com/images/bar.jpg)&amp;lt;/nowiki&amp;gt; or as the alt attribute of an img tag &amp;lt;nowiki&amp;gt;&amp;lt;img src=&amp;quot;https://foo.com/images/bar.jpg&amp;quot; alt=&amp;quot;alt text&amp;quot;&amp;gt;&amp;lt;/nowiki&amp;gt; will be extracted if this parameter is true. Default: true.&lt;br /&gt;
&lt;br /&gt;
; Generate anchors based on header text. (generateHeaderAnchors)&lt;br /&gt;
: Some markdown parsers support explicit named anchors in header markup, using the syntax &amp;lt;code&amp;gt;{#my-anchor}&amp;lt;/code&amp;gt;. When set, this option will automatically generate anchors for headings in the source document, for the purpose of providing a stable anchor for hyperlinks that reference a (translatable) header value. Default: false.&lt;br /&gt;
&lt;br /&gt;
; Parses out certain MDX expressions using regex. (parseMdx) [Experimental]&lt;br /&gt;
: When set, parses out multi-line &amp;lt;code&amp;gt;export&amp;lt;/code&amp;gt; blocks as skeleton. Default: false.&lt;br /&gt;
&lt;br /&gt;
; HTML Subfilter Configuration ID (htmlSubfilter)&lt;br /&gt;
: The custom configuration ID of the HTML filter that will be called to process HTML contents within Markdown documents. The configuration file must be saved in a known location with ''.fprm'' suffix. Specify nothing to use the default HTML filter configuration tailored for the Markdown filter. Default: (empty)&lt;br /&gt;
&lt;br /&gt;
; Enter non translatable block quotes (nonTranslateBlocks)&lt;br /&gt;
: This option excludes some block quotes from translation. Block quotes that start with one of the comma-separated strings entered here are not extracted. Default: (empty - contents in all block quotes will be extracted)&lt;br /&gt;
&lt;br /&gt;
; Use Code Finder (useCodeFinder)&lt;br /&gt;
: Determines whether to use the Inline Code Finder or not. Default: false&lt;br /&gt;
&lt;br /&gt;
; Number of Code Finder Rules (codeFinderRules.count)&lt;br /&gt;
: The number of rules, i.e. regular expression patterns. Default: 1&lt;br /&gt;
&lt;br /&gt;
; Code Finder Rule ''N'' (codeFinderRules.rule''N'') &lt;br /&gt;
: ''N''th matching pattern for codes where ''N''=0,1,2...&lt;br /&gt;
&lt;br /&gt;
; Sample Text (codeFinderRules.sample)&lt;br /&gt;
: Sample text to test the rules on UI. &lt;br /&gt;
&lt;br /&gt;
; Use All Rules (codeFinderRules.useAllRulesWhenTesting)&lt;br /&gt;
: Determines whether to apply all rules when testing on UI.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
=== Subflows are Not Supported ===&lt;br /&gt;
&lt;br /&gt;
When there is a subflow of text in the middle of the main text, the subflow will be inter-mixed with the main flow of text.  For example, for this run of Markdown text:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Please click ![The Information desk logo](images/circled-i.jpg) for help.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The extracted text in the XLIFF file will look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Please click &amp;lt;x id=&amp;quot;1&amp;quot;/&amp;gt;The Information desk logo&amp;lt;x id=&amp;quot;2&amp;quot;/&amp;gt; for help.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=949</id>
		<title>Markdown Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=949"/>
		<updated>2022-10-10T21:08:37Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Parameters */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
The Markdown Filter is an Okapi component for extracting translatable text from Markdown files. See https://en.wikipedia.org/wiki/Markdown for more information about the format.&lt;br /&gt;
Markdown is a family of formats, not all of them mutually compatible.  This filter is designed to work with markdown based on the [http://commonmark.org CommonMark] specification, with additional features to support [https://guides.github.com/features/mastering-markdown/ GitHub-flavored Markdown].&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
The filter decides which encoding to use for the input file using the following logic:&lt;br /&gt;
&lt;br /&gt;
* If the file has a Unicode Byte-Order-Mark:&lt;br /&gt;
** Then, the corresponding encoding (e.g. UTF-8, UTF-16, etc.) is used.&lt;br /&gt;
* Otherwise, the input encoding used is the default encoding that was specified when setting the filter options.&lt;br /&gt;
&lt;br /&gt;
===HTML Elements===&lt;br /&gt;
The HTML Inline Elements, i.e. the tags, and the HTML Block, a chunk of text sandwiched between a block-forming start tag and its corresponding end tag, are processed by the HTML filter. The HTML filter to use can be customized separately.&lt;br /&gt;
&lt;br /&gt;
===Inline Codes===&lt;br /&gt;
The [[HTML_Filter#Inline_Code_Finder|Inline Code Finder]] is supported by this filter. &lt;br /&gt;
&lt;br /&gt;
The Inline Code Finder applies only to the translatable text in the Markdown parts of the document. It does not apply to HTML inline tags or HTML blocks. To find inline codes there, enable the code finder and specify its patterns in a separate HTML filter configuration, name the configuration file okf_html@''arbitrary-name''.fprm, and specify that name for the htmlSubfilter parameter.&lt;br /&gt;
&lt;br /&gt;
Note, the support of the Inline Code Finder was temporarily unavailable in some snapshot builds of version 0.36, but it has been restored.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
; Translate Hyperlink URLs (translateUrls)&lt;br /&gt;
: By default, URLs in link and image statements are not exposed for translation.  If this option is enabled, they will be extracted as a subflow. Default: false&lt;br /&gt;
&lt;br /&gt;
; REGEX Pattern for Translatable URLs (urlToTranslatePattern)&lt;br /&gt;
: When translateUrls=true, only the URLs that match this REGEX will be extracted. Default: .+ (all URLs)&lt;br /&gt;
&lt;br /&gt;
; Translate Code Blocks (translateCodeBlocks)&lt;br /&gt;
: This option controls whether the contents of fenced code blocks are exposed for translation. Default: true&lt;br /&gt;
&lt;br /&gt;
; Translate YAML Metadata Header (translateHeaderMetadata)&lt;br /&gt;
: Some markdown formats support a [http://pandoc.org/MANUAL.html#extension-yaml_metadata_block YAML Metadata Header] that contains key/value data. By default, this header is not exposed for translation. When the &amp;quot;Translate YAML Metadata Header&amp;quot; option is enabled, the header will be parsed and the metadata values will be exposed for translation. Default: false&lt;br /&gt;
&lt;br /&gt;
; Translate Image Alt Text (translateImageAltText)&lt;br /&gt;
: The alt text for a graphic image in the form of &amp;lt;nowiki&amp;gt;![alt text](https://foo.com/images/bar.jpg)&amp;lt;/nowiki&amp;gt; or as the alt attribute of an img tag &amp;lt;nowiki&amp;gt;&amp;lt;img src=&amp;quot;https://foo.com/images/bar.jpg&amp;quot; alt=&amp;quot;alt text&amp;quot;&amp;gt;&amp;lt;/nowiki&amp;gt; will be extracted if this parameter is true. Default: true.&lt;br /&gt;
&lt;br /&gt;
; HTML Subfilter Configuration ID (htmlSubfilter)&lt;br /&gt;
: The custom configuration ID of the HTML filter that will be called to process HTML contents within Markdown documents. The configuration file must be saved in a known location with ''.fprm'' suffix. Specify nothing to use the default HTML filter configuration tailored for the Markdown filter. Default: (empty)&lt;br /&gt;
&lt;br /&gt;
; Enter non translatable block quotes (nonTranslateBlocks)&lt;br /&gt;
: This option excludes some block quotes from translation. Block quotes that start with one of the comma-separated strings entered here are not extracted. Default: (empty - contents in all block quotes will be extracted)&lt;br /&gt;
&lt;br /&gt;
; Use Code Finder (useCodeFinder)&lt;br /&gt;
: Determines whether to use the Inline Code Finder or not. Default: false&lt;br /&gt;
&lt;br /&gt;
; Number of Code Finder Rules (codeFinderRules.count)&lt;br /&gt;
: The number of rules, i.e. regular expression patterns. Default: 1&lt;br /&gt;
&lt;br /&gt;
; Code Finder Rule ''N'' (codeFinderRules.rule''N'') &lt;br /&gt;
: ''N''th matching pattern for codes where ''N''=0,1,2...&lt;br /&gt;
&lt;br /&gt;
; Sample Text (codeFinderRules.sample)&lt;br /&gt;
: Sample text to test the rules on UI. &lt;br /&gt;
&lt;br /&gt;
; Use All Rules (codeFinderRules.useAllRulesWhenTesting)&lt;br /&gt;
: Determines whether to apply all rules when testing on UI.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
=== Subflows are Not Supported ===&lt;br /&gt;
&lt;br /&gt;
When there is a subflow of text in the middle of the main text, the subflow will be inter-mixed with the main flow of text.  For example, for this run of Markdown text:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Please click ![The Information desk logo](images/circled-i.jpg) for help.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
The extracted text in the XLIFF file will look like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Please click &amp;lt;x id=&amp;quot;1&amp;quot;/&amp;gt;The Information desk logo&amp;lt;x id=&amp;quot;2&amp;quot;/&amp;gt; for help.&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=JSON_Filter&amp;diff=846</id>
		<title>JSON Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=JSON_Filter&amp;diff=846"/>
		<updated>2021-01-04T23:55:30Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: Update metadata param description&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
The JSON Filter is an Okapi component that implements the IFilter interface for JSON (JavaScript Object Notation).&lt;br /&gt;
&lt;br /&gt;
The implementation is based on the JSON specifications: http://www.json.org/&lt;br /&gt;
&lt;br /&gt;
The following is an example of a very simple JSON file. The translatable text is highlighted:&lt;br /&gt;
&lt;br /&gt;
 {&amp;quot;menu&amp;quot;: {&lt;br /&gt;
   &amp;quot;value&amp;quot;: &amp;quot;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;File&amp;lt;/span&amp;gt;&amp;quot;,&lt;br /&gt;
    &amp;quot;popup&amp;quot;: {&lt;br /&gt;
       &amp;quot;menuitem&amp;quot;: [&lt;br /&gt;
          {&amp;quot;value&amp;quot;: &amp;quot;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;New&amp;lt;/span&amp;gt;&amp;quot;},&lt;br /&gt;
          {&amp;quot;value&amp;quot;: &amp;quot;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;Open&amp;lt;/span&amp;gt;&amp;quot;},&lt;br /&gt;
          {&amp;quot;value&amp;quot;: &amp;quot;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;Close&amp;lt;/span&amp;gt;&amp;quot;}&lt;br /&gt;
       ]&lt;br /&gt;
    }&lt;br /&gt;
 }}&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
JSON files are normally in one of the Unicode encodings, but the filter supports any encoding. It decides which encoding to use for the input file using the following logic:&lt;br /&gt;
&lt;br /&gt;
* If the file has a Unicode Byte-Order-Mark:&lt;br /&gt;
** Then, the corresponding encoding (e.g. UTF-8, UTF-16, etc.) is used.&lt;br /&gt;
* Else, if a header entry with a &amp;lt;code&amp;gt;charset&amp;lt;/code&amp;gt; declaration exists in the first 1000 characters of the file:&lt;br /&gt;
** If the value of the &amp;lt;code&amp;gt;charset&amp;lt;/code&amp;gt; is &amp;quot;&amp;lt;code&amp;gt;charset&amp;lt;/code&amp;gt;&amp;quot; (case insensitive):&lt;br /&gt;
*** Then the file is likely to be a template with no encoding declared, so the current encoding (auto-detected or default) is used.&lt;br /&gt;
*** Else, the declared encoding is used. Note that if the encoding has been detected from a Byte-Order-Mark and the encoding declared in the header entry does not match, a warning is generated and the encoding of the Byte-Order-Mark is used.&lt;br /&gt;
* Otherwise, the input encoding used is the default encoding that was specified when setting the filter options.&lt;br /&gt;
&lt;br /&gt;
===Output Encoding===&lt;br /&gt;
&lt;br /&gt;
If the output encoding is UTF-8:&lt;br /&gt;
&lt;br /&gt;
* If the input encoding was also UTF-8, a Byte-Order-Mark is used for the output document only if one was detected in the input document.&lt;br /&gt;
* If the input encoding was not UTF-8, no Byte-Order-Mark is used in the output document.&lt;br /&gt;
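&lt;br /&gt;
The two rules above reduce to a single condition, sketched here in Python. The function name &amp;lt;code&amp;gt;write_utf8_bom&amp;lt;/code&amp;gt; and its parameters are hypothetical, and the sketch covers only the case where the output encoding is UTF-8, since that is the only case described.&lt;br /&gt;

```python
def write_utf8_bom(input_encoding, input_had_bom):
    """Decide whether a UTF-8 output document should start with a BOM.

    A BOM is written only when the input was also UTF-8 and carried one.
    """
    normalized = input_encoding.lower().replace("_", "-")
    is_utf8_input = normalized in ("utf-8", "utf8")
    return is_utf8_input and input_had_bom


print(write_utf8_bom("UTF-8", True))
print(write_utf8_bom("UTF-8", False))
print(write_utf8_bom("UTF-16", True))
```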
&lt;br /&gt;
===Line-Breaks===&lt;br /&gt;
&lt;br /&gt;
The output uses the same type of line-break as the original input.&lt;br /&gt;
&lt;br /&gt;
===Comments===&lt;br /&gt;
&lt;br /&gt;
Though not technically legal in JSON, these comment types are supported:&lt;br /&gt;
&amp;lt;code&amp;gt;&lt;br /&gt;
* // comment&lt;br /&gt;
* # comment&lt;br /&gt;
* /* comment */&lt;br /&gt;
* &amp;amp;lt;!-- comment --&amp;gt;&lt;br /&gt;
&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
=== Options Tab===&lt;br /&gt;
&lt;br /&gt;
====Stand-alone strings====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Extract strings without associated key&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to extract strings that are not directly associated with a key.&lt;br /&gt;
&lt;br /&gt;
====Strings with keys====&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Extract all key/strings pairs&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to extract all strings that have a key associated. If a regular expression for exceptions is defined, the strings that have a key matching the expression are not extracted.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Do not extract key/string pairs&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to not extract any string that has an associated key. If a regular expression for exceptions is defined, the strings that have a key matching the expression are extracted.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Excepted when the key matches the following regular expression&amp;lt;/cite&amp;gt; &amp;amp;mdash; Enter a regular expression matching the keys that should behave opposite to the default behavior you have selected for key/string pairs.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Use the key as the resname&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to use the value of the key as the value of the name of the extracted item (&amp;lt;code&amp;gt;resname&amp;lt;/code&amp;gt; in XLIFF).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Use the full key path&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to use the full key path in the &amp;lt;code&amp;gt;resname&amp;lt;/code&amp;gt;. For example: &amp;lt;code&amp;gt;/menu/value/popup/menuitem/value&amp;lt;/code&amp;gt;. The &amp;lt;cite&amp;gt;Use the key as the resname&amp;lt;/cite&amp;gt; option must be set for this option to take effect. If this option is enabled, exception regular expressions apply to the full path.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Include leading &amp;quot;/&amp;quot; on key path&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to have a leading character '/' in the full key path.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Regex matching keys that are notes, values of which to appear as &amp;lt;note&amp;gt; in XLIFF&amp;lt;/cite&amp;gt; &amp;amp;mdash; Specify a regular expression. The values of the matching keys will be transferred to &amp;amp;lt;note&amp;gt; elements in XLIFF.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Regex matching keys who's values are added as TextUnit Metadata&amp;lt;/cite&amp;gt; &amp;amp;mdash; Specify a regular expression. The values of the matching keys will be written out as &amp;amp;lt;context-group&amp;gt; elements in XLIFF.&lt;br /&gt;
&lt;br /&gt;
===New Extraction Rules &amp;gt;= version M39===&lt;br /&gt;
&amp;lt;b&amp;gt;If specified, these will override the corresponding rules above.&amp;lt;/b&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Regex matching keys which are ID's (resname in XLIFF), &amp;lt;b&amp;gt;overrides &amp;amp;quot;use the key as resname&amp;amp;quot;&amp;lt;/b&amp;gt;&amp;lt;/cite&amp;gt; &amp;amp;mdash; Specify a regular expression. The value of the matching key will be used as &amp;lt;code&amp;gt;resname&amp;lt;/code&amp;gt; in XLIFF.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Regex matching keys who's values are extracted (&amp;lt;b&amp;gt;overrides &amp;amp;quot;extraction exceptions&amp;amp;quot;&amp;lt;/b&amp;gt;)&amp;lt;/cite&amp;gt; &amp;amp;mdash; Specify a regular expression. The values of the matching keys will be extracted.&lt;br /&gt;
&lt;br /&gt;
===Content Processing Tab===&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Process text content with this sub-filter&amp;lt;/cite&amp;gt; &amp;amp;mdash; Specify an Okapi filter ID (e.g. &amp;lt;code&amp;gt;okf_html&amp;lt;/code&amp;gt;) to process the content of all translatable text with that filter. Leave this field blank for default behavior.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Find inline codes by patterns defined below&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to use the specified regular expressions on the text of the extracted items. Any match will be converted to an inline code.&lt;br /&gt;
&lt;br /&gt;
'''Note:''' This option cannot be used together with the sub-filtering option.&lt;br /&gt;
&lt;br /&gt;
By default the expression is:&lt;br /&gt;
&lt;br /&gt;
 ((%(([-0+#]?)[-0+#]?)((\d\$)?)(([\d\*]*)(\.[\d\*]*)?)[dioxXucsfeEgGpn])&lt;br /&gt;
 |((\\r\\n)|\\a|\\b|\\f|\\n|\\r|\\t|\\v)&lt;br /&gt;
 |(\{\d.*?\}))&lt;br /&gt;
&lt;br /&gt;
{{CodeFinder Help}}&lt;br /&gt;
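To see what the default expression catches, the following Python sketch joins the three lines shown above into a single pattern and lists the matches found in a sample string. It assumes that Python's re module treats these constructs the same way as the regular expression engine used by the filter; find_inline_codes is a name made up for the example.

```python
import re

# The default expression from above, joined into one pattern.
DEFAULT_CODE_FINDER = re.compile(
    r"((%(([-0+#]?)[-0+#]?)((\d\$)?)(([\d\*]*)(\.[\d\*]*)?)[dioxXucsfeEgGpn])"
    r"|((\\r\\n)|\\a|\\b|\\f|\\n|\\r|\\t|\\v)"
    r"|(\{\d.*?\}))"
)


def find_inline_codes(text):
    """Return every substring that the expression would turn into an inline code."""
    return [match.group(0) for match in DEFAULT_CODE_FINDER.finditer(text)]


# A raw string, so the final \n is a literal backslash followed by 'n',
# as it would appear inside a JSON string value.
print(find_inline_codes(r"Copied %d of {0} files\n"))
```

In this sample the printf-style specifier, the numbered placeholder and the escaped line break are each reported as a separate match.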
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
Comments within a JSON string are parsed as part of the string content, not as comments. A configured subfilter will then process these as true comments (they will become part of the skeleton or whatever the filter is configured to do).&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=FAQ&amp;diff=736</id>
		<title>FAQ</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=FAQ&amp;diff=736"/>
		<updated>2018-05-11T20:40:17Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Is there a users group or a support mailing list? */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;==Capabilities==&lt;br /&gt;
&lt;br /&gt;
====What formats are supported?====&lt;br /&gt;
&lt;br /&gt;
The framework offers filters for many file formats, including XML, XLIFF, TMX, HTML, DOCX, ODT, Properties, PO, and many more.&amp;lt;br&amp;gt;&lt;br /&gt;
For a more complete list of the supported formats, see the &amp;quot;[[Filters]]&amp;quot; page.&lt;br /&gt;
&lt;br /&gt;
Note that you can also create your own filter configurations to support some formats. You can also create your own filters and use them seamlessly with the Okapi tools.&lt;br /&gt;
&lt;br /&gt;
====How do I extract text for translation?====&lt;br /&gt;
&lt;br /&gt;
See the article &amp;quot;[[How to Extract Text for Translation]]&amp;quot; in the [[Knowledge Base]].&lt;br /&gt;
&lt;br /&gt;
====Does Okapi provide a translation editor?====&lt;br /&gt;
&lt;br /&gt;
Not at this time. The Okapi tools allow you to create translation packages in various formats that can be opened in different translation editors such as OmegaT, MemoQ, Trados Workbench, Swordfish, Wordfast, etc.&lt;br /&gt;
&lt;br /&gt;
For translating XLIFF files see: &amp;quot;[[How to Translate XLIFF Documents]]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
====Does Okapi provide a TM (Translation Memory)?====&lt;br /&gt;
&lt;br /&gt;
Yes. There are currently two TM engines implemented in the framework:&lt;br /&gt;
&lt;br /&gt;
* [[Pensieve TM]] is the main TM engine.&lt;br /&gt;
* [[SimpleTM TM]] is a limited and older engine that '''is being progressively phased out'''.&lt;br /&gt;
&lt;br /&gt;
You can also use third-party TM engines through the different [[Connectors|connectors]] that the framework provides. For example: the [[Translate Toolkit TM Connector|Translate Toolkit TM]], [[GlobalSight TM Connector|GlobalSight TM]], the [[OpenTran Translation Repository Connector|OpenTran Translation Repository]], [[MyMemory TM Connector|MyMemory]], etc. For a complete list and more details see the &amp;quot;[[Connectors]]&amp;quot; page.&lt;br /&gt;
&lt;br /&gt;
====Does Okapi provide a MT (Machine Translation) system?====&lt;br /&gt;
&lt;br /&gt;
Not at this time. However, you can use different third-party MT systems through the connectors distributed with the framework. For example, you can work with [[Google MT v2 Connector|Google MT]], [[Apertium MT Connector|Apertium MT]], [[Microsoft Translator Connector|Microsoft Translator]], etc. For a complete list, see the [[Connectors|Connectors page]].&lt;br /&gt;
&lt;br /&gt;
====Why are there several distributions? Isn't Java cross-platform?====&lt;br /&gt;
&lt;br /&gt;
Yes, Java is cross-platform, and most of the Okapi code runs anywhere Java runs.&lt;br /&gt;
However, for better internationalization support and more seamless integration with each platform, we have chosen Eclipse SWT (http://www.eclipse.org/swt) as the foundation for the UI of our applications. That library requires a different distribution for each platform and architecture.&lt;br /&gt;
&lt;br /&gt;
Okapi's source code has been carefully designed to separate UI-dependent code and non-UI code, so most of the components (such as the [[Filters]], the [[Steps]] and the [[Connectors]]) can be used on any platform.&lt;br /&gt;
&lt;br /&gt;
====Can I change the Java VM settings when running the tools?====&lt;br /&gt;
&lt;br /&gt;
Yes. See [[How to Change the Java Parameters for Rainbow]]. You can follow the same steps for all Okapi tools.&lt;br /&gt;
&lt;br /&gt;
==Simple Troubleshooting==&lt;br /&gt;
&lt;br /&gt;
====Is there a Getting Started guide?====&lt;br /&gt;
&lt;br /&gt;
Yes. See the &amp;quot;[[Getting Started]]&amp;quot; page.&lt;br /&gt;
&lt;br /&gt;
====When I try to start Rainbow/Ratel/CheckMate nothing happens. What is wrong?====&lt;br /&gt;
&lt;br /&gt;
* Check that you have the proper version of Java (1.7 or above).&lt;br /&gt;
* Make sure you have installed the correct distribution for your platform.&lt;br /&gt;
* If your machine is 32-bit make sure to have installed the 32-bit distribution.&lt;br /&gt;
* If your machine is 64-bit make sure to have installed the 64-bit distribution.&lt;br /&gt;
&lt;br /&gt;
==Licenses==&lt;br /&gt;
&lt;br /&gt;
====Under what license is the Okapi Framework developed?====&lt;br /&gt;
&lt;br /&gt;
* The source code is under the [https://www.apache.org/licenses/LICENSE-2.0 Apache License, Version 2.0].&lt;br /&gt;
* The documentation is under [http://creativecommons.org/licenses/by-sa/3.0/ Creative Commons Attribution-ShareAlike License (CC-BY-SA)].&lt;br /&gt;
&lt;br /&gt;
====Can I use Okapi's components in my applications?====&lt;br /&gt;
&lt;br /&gt;
Yes. The project uses the Apache license, which allows open-source and commercial products to use our applications and components. For more information about the license, see [https://www.apache.org/licenses/LICENSE-2.0].&lt;br /&gt;
&lt;br /&gt;
==Support==&lt;br /&gt;
&lt;br /&gt;
====Is there a users group or a support mailing list?====&lt;br /&gt;
&lt;br /&gt;
Yes. There are two main mailing lists. Both have public archives, and both require registration to post a message:&lt;br /&gt;
&lt;br /&gt;
* [http://tech.groups.yahoo.com/group/okapitools/ https://groups.yahoo.com/group/okapitools/] is the group and mailing list '''for the end users'''.&lt;br /&gt;
* [http://groups.google.com/group/okapi-devel https://groups.google.com/group/okapi-devel] is the group and mailing list '''for the developers''' working on the source code.&lt;br /&gt;
&lt;br /&gt;
====How do I report bugs or request an enhancement?====&lt;br /&gt;
&lt;br /&gt;
* You can post a bug report or an enhancement request in the issues tracking page: http://code.google.com/p/okapi/issues/entry if you have a Google account.&lt;br /&gt;
&lt;br /&gt;
* You can post a message to the [http://tech.groups.yahoo.com/group/okapitools/ Okapi Tools users group] if you are part of the group.&lt;br /&gt;
&lt;br /&gt;
* You can just [mailto:okapitools@opentag.com&amp;amp;subject=Feedback send feedback by email].&lt;br /&gt;
&lt;br /&gt;
==Miscellaneous==&lt;br /&gt;
&lt;br /&gt;
====What does 'Okapi' mean?====&lt;br /&gt;
&lt;br /&gt;
An okapi is an African animal looking somewhat like [http://en.wikipedia.org/wiki/Okapi a cross between a zebra and a giraffe]. Okapi is pronounced [http://en.wikipedia.org/wiki/Wikipedia:IPA_for_English /oʊˈkɑːpɪ/] ([http://www.m-w.com/cgi-bin/audio.pl?okapi001.wav=okapi hear it])&lt;br /&gt;
&lt;br /&gt;
The use of this name for the framework has its roots in much older projects. At some point it was an acronym for &amp;quot;Open Kit API&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
====What happened to the .NET Okapi?====&lt;br /&gt;
&lt;br /&gt;
The older version of the Okapi Framework for .NET is no longer developed. Its distribution and source code are still available here: http://sourceforge.net/projects/okapi/. All new development is now done in the Java branch.&lt;br /&gt;
&lt;br /&gt;
====Where is Olifant?====&lt;br /&gt;
&lt;br /&gt;
Olifant, the TMX editor, is currently only part of the .NET Okapi. It is still available [http://sourceforge.net/projects/okapi/files/ from the SourceForge project]. Note that Olifant is for Windows only.&lt;br /&gt;
&lt;br /&gt;
==For developers==&lt;br /&gt;
&lt;br /&gt;
====Getting set up====&lt;br /&gt;
&lt;br /&gt;
* Check out the source code from Bitbucket using git clone: https://bitbucket.org/okapiframework/okapi&lt;br /&gt;
* Or, if you want to submit pull requests, first create a fork of the Okapi project. &lt;br /&gt;
* Import into your IDE. For example, in Eclipse go to File &amp;gt; Import &amp;gt; Maven &amp;gt; Existing Maven project. &lt;br /&gt;
If you want to keep several distinct Okapi repositories in the same Eclipse workspace (for instance, your fork and the main Okapi project), you need to assign a name template under the &amp;quot;Advanced&amp;quot; section in the first step of the import wizard. &lt;br /&gt;
* The &amp;quot;master&amp;quot; branch contains the latest release version. The &amp;quot;dev&amp;quot; branch contains the current work (the &amp;quot;snapshot&amp;quot; in Maven terms). &lt;br /&gt;
* See also: https://bitbucket.org/okapiframework/okapi/wiki/How%20to%20Contribute&lt;br /&gt;
Happy coding!&lt;br /&gt;
&lt;br /&gt;
====How to build okapi-lib locally====&lt;br /&gt;
&lt;br /&gt;
The Okapi Framework consists of Maven projects. However, in order to build the apps and lib projects locally, you need to use the Ant build configurations. &lt;br /&gt;
&lt;br /&gt;
For instance, to create a local version of okapi-lib.jar, go to &amp;lt;OKAPI_HOME&amp;gt;/deployment/maven/ and run &amp;lt;code&amp;gt;ant -f build_okapi-lib.xml init okapiLib&amp;lt;/code&amp;gt;. The jar will be generated in &amp;lt;OKAPI_HOME&amp;gt;/deployment/maven/dist_common/lib/. &lt;br /&gt;
&lt;br /&gt;
If you use the default build.xml by running the above command without the -f option, platform-specific distributions of the apps will be created in addition to the platform-independent okapi-lib.jar.&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Longhorn&amp;diff=721</id>
		<title>Longhorn</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Longhorn&amp;diff=721"/>
		<updated>2018-01-23T19:54:19Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
Longhorn is a server application that allows you to execute Batch Configurations remotely on any set of input files. Batch Configurations, which include pre-defined pipelines and filter configurations, can be exported from [[Rainbow]].&lt;br /&gt;
&lt;br /&gt;
The distribution also includes a client library to access the Longhorn Web services.&lt;br /&gt;
&lt;br /&gt;
==Download and Installation==&lt;br /&gt;
&lt;br /&gt;
* '''Stable release:''' http://bintray.com/okapi/Distribution/Longhorn&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;del&amp;gt;Development release (snapshot): http://okapiframework.org/snapshots&amp;lt;/del&amp;gt;  Development snapshots are not currently available.&lt;br /&gt;
&lt;br /&gt;
To install Longhorn:&lt;br /&gt;
&lt;br /&gt;
* Unzip the distribution file on your server.&lt;br /&gt;
* Follow the instructions provided with the &amp;lt;code&amp;gt;readme&amp;lt;/code&amp;gt; file of the distribution.&lt;br /&gt;
* Starting with m24, Longhorn requires Java 1.7.&lt;br /&gt;
&lt;br /&gt;
==Functionality==&lt;br /&gt;
&lt;br /&gt;
To process files with Longhorn these steps are required:&lt;br /&gt;
# Create a temporary project&lt;br /&gt;
# Upload a Batch Configuration file into that project&lt;br /&gt;
# Upload the input files into that project&lt;br /&gt;
# Execute the project&lt;br /&gt;
# Download the output files&lt;br /&gt;
# Delete the project&lt;br /&gt;
&lt;br /&gt;
==Usage==&lt;br /&gt;
&lt;br /&gt;
There are three ways to access Longhorn's functionality:&lt;br /&gt;
* a REST interface,&lt;br /&gt;
* a Java API and&lt;br /&gt;
* an HTML client.&lt;br /&gt;
&lt;br /&gt;
They can be used as described below.&lt;br /&gt;
&lt;br /&gt;
===REST-Interface===&lt;br /&gt;
&lt;br /&gt;
Longhorn can be accessed directly via HTTP methods:&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/new : Creates a new temporary project and returns its URI (e.g. &amp;lt;code&amp;gt;http://localhost/okapi-longhorn/projects/1&amp;lt;/code&amp;gt;) in the &amp;lt;tt&amp;gt;Location&amp;lt;/tt&amp;gt; header of the response.&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/1/batchConfiguration : Uploads a Batch Configuration file&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/1/inputFiles.zip : Adds input files as a zip archive (the zip will be extracted and the included files will be used as input files)&lt;br /&gt;
;PUT http://{host}/okapi-longhorn/projects/1/inputFiles/help.html : Uploads a file that will have the name 'help.html'&lt;br /&gt;
;GET http://{host}/okapi-longhorn/projects/1/inputFiles/help.html : Retrieves an input file that was previously added with PUT or POST&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/1/tasks/execute : Executes the Batch Configuration on the uploaded input files&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/1/tasks/execute/en-US/de-DE : Executes the Batch Configuration on the uploaded input files with the source language set to 'en-US' and the target language set to 'de-DE'&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/1/tasks/execute/en-US?targets=de-DE&amp;amp;targets=fr-FR : Executes the Batch Configuration on the uploaded input files with the source language set to 'en-US' and multiple target languages, 'de-DE' and 'fr-FR'&lt;br /&gt;
;GET http://{host}/okapi-longhorn/projects/1/outputFiles : Returns a list of the output files generated&lt;br /&gt;
;GET http://{host}/okapi-longhorn/projects/1/outputFiles/help.out.html : Accesses the output file 'help.out.html' directly&lt;br /&gt;
;GET http://{host}/okapi-longhorn/projects/1/outputFiles.zip : Returns all output files in a zip archive&lt;br /&gt;
;DELETE http://{host}/okapi-longhorn/projects/1 : Deletes the project&lt;br /&gt;
;GET http://{host}/okapi-longhorn/projects : Returns a list of all projects on the server&lt;br /&gt;
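The three forms of the execute call listed above can be assembled with a small helper. This is only a sketch: execute_url is a hypothetical name, and the function covers just the URL patterns shown in the list.

```python
from urllib.parse import urlencode


def execute_url(base, project_id, source=None, targets=()):
    """Build the task-execution URL for a project, following the patterns listed above."""
    url = "{}/projects/{}/tasks/execute".format(base.rstrip("/"), project_id)
    targets = list(targets)
    if source is None:
        if targets:
            raise ValueError("target languages require a source language")
        return url  # no languages specified
    if not targets:
        raise ValueError("a source language requires at least one target language")
    if len(targets) == 1:
        return "{}/{}/{}".format(url, source, targets[0])  # single target in the path
    # Several targets are passed as repeated 'targets' query parameters.
    return "{}/{}?{}".format(url, source, urlencode([("targets", t) for t in targets]))


print(execute_url("http://localhost:8080/okapi-longhorn", 1, "en-US", ["de-DE", "fr-FR"]))
```

The resulting string can then be passed to any HTTP client as the target of a POST request.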
&lt;br /&gt;
===REST-Interface Sample code: Python===&lt;br /&gt;
&lt;br /&gt;
This example uses the requests package; minidom is used to parse the XML project list.&lt;br /&gt;
&lt;br /&gt;
    import requests&lt;br /&gt;
    from xml.dom import minidom&lt;br /&gt;
&lt;br /&gt;
    url = 'http://localhost:8080/okapi-longhorn/'&lt;br /&gt;
&lt;br /&gt;
Code to create a new project&lt;br /&gt;
&lt;br /&gt;
    r = requests.post(url+'projects/new')&lt;br /&gt;
    print(r.text)&lt;br /&gt;
&lt;br /&gt;
Code to '''list''' existing projects (for example, to check that the project was created and to get the ID of the last project)&lt;br /&gt;
&lt;br /&gt;
    r = requests.get(url+'projects/')&lt;br /&gt;
&lt;br /&gt;
    xmlstring = minidom.parseString(r.text)&lt;br /&gt;
    itemlist = xmlstring.getElementsByTagName('e')&lt;br /&gt;
    lastproject = len(itemlist)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Code to '''post''' a '''batch config file'''&lt;br /&gt;
&lt;br /&gt;
    batchfile = open('/home/user/batchconfig.bconf', 'rb')&lt;br /&gt;
    r = requests.post(url+'projects/'+str(lastproject)+'/batchConfiguration', files=dict(batchConfiguration=batchfile))&lt;br /&gt;
&lt;br /&gt;
Code to '''put''' a string as a '''file'''&lt;br /&gt;
&lt;br /&gt;
    payload = &amp;quot;hello world!&amp;quot;&lt;br /&gt;
    r = requests.put(url+'projects/'+str(lastproject)+'/inputFiles/test.txt', files=dict(inputFile=payload))&lt;br /&gt;
&lt;br /&gt;
Code to '''post''' a '''file'''&lt;br /&gt;
&lt;br /&gt;
    payload = open('/home/user/test.txt', 'rb')&lt;br /&gt;
    r = requests.post(url+'projects/'+str(lastproject)+'/inputFiles/test.txt', files=dict(inputFile=payload))&lt;br /&gt;
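The remaining steps (execute the project, download the output, delete the project) could be sketched as follows, using the endpoints listed in the REST interface section. The function name run_and_cleanup and the output path are made up for the example. The http argument is assumed to be the requests module or a requests.Session, which both provide post, get and delete.

```python
def run_and_cleanup(http, url, project_id, out_path):
    """Execute a project, save all output files as one zip archive, then delete the project."""
    project = url + 'projects/' + str(project_id)

    # Run the Batch Configuration on the uploaded input files.
    http.post(project + '/tasks/execute').raise_for_status()

    # Fetch every output file in a single zip archive.
    response = http.get(project + '/outputFiles.zip')
    response.raise_for_status()
    with open(out_path, 'wb') as out_file:
        out_file.write(response.content)

    # Remove the temporary project from the server.
    http.delete(project).raise_for_status()


# Hypothetical usage, continuing the sample above:
# run_and_cleanup(requests, url, lastproject, '/home/user/output.zip')
```

Passing the HTTP client in as an argument keeps the function usable with a session that carries its own settings.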
&lt;br /&gt;
&lt;br /&gt;
===Java API===&lt;br /&gt;
&lt;br /&gt;
The API is distributed as a &amp;lt;code&amp;gt;.jar&amp;lt;/code&amp;gt; file in the Longhorn distribution package. You can also build it from the Okapi source code via Maven from the project &amp;lt;code&amp;gt;lib-longhorn-api&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
====Maven====&lt;br /&gt;
The API is available as a Maven dependency. Add this repository to your &amp;lt;tt&amp;gt;pom.xml&amp;lt;/tt&amp;gt;:&lt;br /&gt;
    &amp;lt;repository&amp;gt;&lt;br /&gt;
        &amp;lt;id&amp;gt;okapi-longhorn-release&amp;lt;/id&amp;gt;&lt;br /&gt;
        &amp;lt;name&amp;gt;Okapi Longhorn Release&amp;lt;/name&amp;gt;&lt;br /&gt;
        &amp;lt;url&amp;gt;http://repository-opentag.forge.cloudbees.com/release/&amp;lt;/url&amp;gt;&lt;br /&gt;
    &amp;lt;/repository&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then add this dependency, replacing &amp;lt;tt&amp;gt;${okapi.version}&amp;lt;/tt&amp;gt; with a valid version number (e.g. &amp;lt;tt&amp;gt;0.27&amp;lt;/tt&amp;gt;):&lt;br /&gt;
    &amp;lt;dependency&amp;gt;&lt;br /&gt;
      &amp;lt;groupId&amp;gt;net.sf.okapi.lib&amp;lt;/groupId&amp;gt;&lt;br /&gt;
      &amp;lt;artifactId&amp;gt;okapi-lib-longhorn-api&amp;lt;/artifactId&amp;gt;&lt;br /&gt;
      &amp;lt;version&amp;gt;${okapi.version}&amp;lt;/version&amp;gt;&lt;br /&gt;
    &amp;lt;/dependency&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Sample Code====&lt;br /&gt;
&lt;br /&gt;
 LonghornService ws = new RESTService(new URI(&amp;quot;http://localhost:9095/okapi-longhorn&amp;quot;));&lt;br /&gt;
 &lt;br /&gt;
 // Create project&lt;br /&gt;
 LonghornProject proj = ws.createProject();&lt;br /&gt;
 &lt;br /&gt;
 // Post batch configuration&lt;br /&gt;
 File bconfFile = new File(&amp;quot;C:\\setup.bconf&amp;quot;);&lt;br /&gt;
 proj.addBatchConfiguration(bconfFile);&lt;br /&gt;
 &lt;br /&gt;
 // Send input files&lt;br /&gt;
 &lt;br /&gt;
 // First by single upload...&lt;br /&gt;
 File file1 = new File(&amp;quot;C:\\help.html&amp;quot;);&lt;br /&gt;
 // * in the root directory&lt;br /&gt;
 proj.addInputFile(file1, file1.getName());&lt;br /&gt;
 // * and in a sub-directory&lt;br /&gt;
 proj.addInputFile(file1, &amp;quot;samefile/&amp;quot; + file1.getName());&lt;br /&gt;
 &lt;br /&gt;
 // ...then by package upload&lt;br /&gt;
 File inputPackage = new File(&amp;quot;C:\\more_files.zip&amp;quot;);&lt;br /&gt;
 proj.addInputFilesFromZip(inputPackage);&lt;br /&gt;
 &lt;br /&gt;
 // Execute pipeline&lt;br /&gt;
 // Languages don't matter&lt;br /&gt;
 proj.executePipeline();&lt;br /&gt;
 // Languages matter&lt;br /&gt;
 proj.executePipeline(&amp;quot;en-US&amp;quot;, &amp;quot;de-DE&amp;quot;);&lt;br /&gt;
 &lt;br /&gt;
 // Get output files&lt;br /&gt;
 ArrayList&amp;lt;LonghornFile&amp;gt; outputFiles = proj.getOutputFiles();&lt;br /&gt;
 &lt;br /&gt;
 // Does the fetching of files work?&lt;br /&gt;
 for (LonghornFile of : outputFiles) {&lt;br /&gt;
 	InputStream is = of.openStream();&lt;br /&gt;
 	//TODO save InputStream to local file&lt;br /&gt;
 }&lt;br /&gt;
 &lt;br /&gt;
 // Delete project&lt;br /&gt;
 proj.delete();&lt;br /&gt;
&lt;br /&gt;
===HTML-Client===&lt;br /&gt;
&lt;br /&gt;
You can create projects and upload/download files via an integrated HTML client, too. Uploading input files (and downloading output files) as a zip archive is currently not implemented for the HTML client.&lt;br /&gt;
&lt;br /&gt;
[[File:longhorn_html_client.png]]&lt;br /&gt;
&lt;br /&gt;
===Configuration===&lt;br /&gt;
Since Okapi M22, Okapi Longhorn can be built so that multiple instances run on one server, for example in a single JBoss application server.&lt;br /&gt;
To do so, call the build with an additional parameter:&lt;br /&gt;
&lt;br /&gt;
 mvn clean verify -DuseUniqueContextRoot&lt;br /&gt;
&lt;br /&gt;
====Configure working directory path====&lt;br /&gt;
Longhorn offers two ways to configure its working directory, listed here in order of priority: &lt;br /&gt;
#system parameter &amp;quot;LONGHORN_WORKDIR&amp;quot;&lt;br /&gt;
#configuration file in user.home &amp;quot;/okapi-longhorn-configuration.xml&amp;quot;&lt;br /&gt;
If neither is defined, the working directory is the folder &amp;quot;Okapi-Longhorn-Files&amp;quot; in user.home.&lt;br /&gt;
Longhorn configuration file example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;longhorn-config&amp;gt;&lt;br /&gt;
     &amp;lt;use-unique-working-directory&amp;gt;True&amp;lt;/use-unique-working-directory&amp;gt;&lt;br /&gt;
     &amp;lt;working-directory&amp;gt;D:\testData\longhorn-files&amp;lt;/working-directory&amp;gt;&lt;br /&gt;
 &amp;lt;/longhorn-config&amp;gt;&lt;br /&gt;
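The lookup order described above can be sketched in Python as follows. This only illustrates the priority rules; it is not Longhorn's own code. The function name resolve_workdir is made up, system_params stands for whatever mapping holds the system parameters, and the use-unique-working-directory option is ignored here.

```python
import os
import xml.etree.ElementTree as ET


def resolve_workdir(system_params, user_home):
    """Return the working directory according to the documented priority order."""
    # 1. The LONGHORN_WORKDIR system parameter wins when it is set.
    if system_params.get("LONGHORN_WORKDIR"):
        return system_params["LONGHORN_WORKDIR"]

    # 2. Otherwise, look for the configuration file in user.home.
    config_path = os.path.join(user_home, "okapi-longhorn-configuration.xml")
    if os.path.isfile(config_path):
        configured = ET.parse(config_path).getroot().findtext("working-directory")
        if configured and configured.strip():
            return configured.strip()

    # 3. Fall back to the default folder in user.home.
    return os.path.join(user_home, "Okapi-Longhorn-Files")


print(resolve_workdir(dict(os.environ), os.path.expanduser("~")))
```

The example call reads the parameter from the environment, which is an assumption; the original text only speaks of a system parameter.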
&lt;br /&gt;
====Configuration Options====&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! option&lt;br /&gt;
! description&lt;br /&gt;
! data type&lt;br /&gt;
|-&lt;br /&gt;
| working-directory&lt;br /&gt;
| path of the working directory&lt;br /&gt;
| string&lt;br /&gt;
|-&lt;br /&gt;
| use-unique-working-directory&lt;br /&gt;
| if set to true, the Longhorn version is appended to the working directory name,&lt;br /&gt;
e.g. path/to/working/directory_M0.21&lt;br /&gt;
| boolean (True or False)&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[Category:Longhorn]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=XML_Filter&amp;diff=703</id>
		<title>XML Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=XML_Filter&amp;diff=703"/>
		<updated>2017-10-18T18:27:08Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* codeFinder */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
This filter allows you to process XML documents. It uses a DOM-based parser, which allows it to implement [[ITS]]. If you need to process very large XML documents and have no need for ITS, you may want to look at using the [[XML Stream Filter]].&lt;br /&gt;
&lt;br /&gt;
The following is an example of a simple XML document. The translatable text is highlighted. Because each format based on XML is different, you need information on what are the translatable parts, what are the inline elements, etc. The XML Filter [[#ITS Support|implements the ITS W3C Recommendation]] to address this issue.&lt;br /&gt;
&lt;br /&gt;
 &amp;amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;amp;lt;myDoc&amp;gt;&lt;br /&gt;
  &amp;amp;lt;prolog&amp;gt;&lt;br /&gt;
   &amp;amp;lt;author&amp;gt;Zebulon Fairfield&amp;lt;/author&amp;gt;&lt;br /&gt;
   &amp;amp;lt;version&amp;gt;version 12, revision 2 - 2006-08-14&amp;lt;/version&amp;gt;&lt;br /&gt;
   &amp;amp;lt;keywords&amp;gt;&amp;lt;kw&amp;gt;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;horse&amp;lt;/span&amp;gt;&amp;lt;/kw&amp;gt;&amp;lt;kw&amp;gt;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;appaloosa&amp;lt;/span&amp;gt;&amp;lt;/kw&amp;gt;&amp;lt;/keywords&amp;gt;&lt;br /&gt;
   &amp;amp;lt;storageKey&amp;gt;articles-6D272BA9-3B89CAD8&amp;lt;/storageKey&amp;gt;&lt;br /&gt;
  &amp;amp;lt;/prolog&amp;gt;&lt;br /&gt;
  &amp;amp;lt;body&amp;gt;&lt;br /&gt;
   &amp;amp;lt;title&amp;gt;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;Appaloosa&amp;lt;/span&amp;gt;&amp;amp;lt;/title&amp;gt;&lt;br /&gt;
   &amp;amp;lt;p&amp;gt;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;The Appaloosas are rugged horses originally bred by &lt;br /&gt;
 the &amp;lt;kw&amp;gt;Nez-Perce&amp;lt;/kw&amp;gt; tribe in the US Northwest.&amp;lt;/span&amp;gt;&amp;amp;lt;/p&amp;gt;&lt;br /&gt;
   &amp;amp;lt;p&amp;gt;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;They are often characterized by their spotted coats.&amp;lt;/span&amp;gt;&amp;amp;lt;/p&amp;gt;&lt;br /&gt;
  &amp;amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;amp;lt;/myDoc&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This filter is implemented in the class &amp;lt;code&amp;gt;net.sf.okapi.filters.xml.XMLFilter&amp;lt;/code&amp;gt; of the library.&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
The filter decides which encoding to use for the input document using the following logic:&lt;br /&gt;
&lt;br /&gt;
* If the document has an encoding declaration it is used. &lt;br /&gt;
* Otherwise, UTF-8 is used as the default encoding (regardless of the default encoding that was specified when opening the document). &lt;br /&gt;
&lt;br /&gt;
===Output Encoding===&lt;br /&gt;
&lt;br /&gt;
If the output encoding is UTF-8:&lt;br /&gt;
&lt;br /&gt;
* If the input encoding was also UTF-8, a Byte-Order-Mark is used for the output document only if one was detected in the input document. &lt;br /&gt;
* If the input encoding was not UTF-8, no Byte-Order-Mark is used in the output document. &lt;br /&gt;
&lt;br /&gt;
If the original document had an XML encoding declaration, it is updated; if it did not, one is added automatically.&lt;br /&gt;
&lt;br /&gt;
===Line-Breaks===&lt;br /&gt;
&lt;br /&gt;
The type of line-breaks of the output is the same as the one of the original input.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
This filter stores its parameters in an XML file and does not provide an editor to modify it. You can edit the file in a simple text editor, or with an XML editor. For an example, see the article &amp;quot;[[How to Create a Custom Configuration for the XML Filter]]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
===ITS Support===&lt;br /&gt;
&lt;br /&gt;
By default, the filter processes XML documents based on the '''ITS defaults'''. That is:&lt;br /&gt;
&lt;br /&gt;
* the content of all elements is translatable,&lt;br /&gt;
* and none of the attribute values are translatable.&lt;br /&gt;
&lt;br /&gt;
Different behavior can occur if the input document contains ITS markup, or if a filter parameters file is specified. The parameters file used by the XML Filter is [[ITS|an ITS document]].&lt;br /&gt;
&lt;br /&gt;
The '''Internationalization Tag Set (ITS)''' is a W3C recommendation that defines a set of elements and attributes you can use to specify different internationalization- and localization-related aspects of your XML document. For instance, ITS lets you define which attribute values are translatable, which element content should be protected, which elements should be treated as nested sub-flows of text, and much more.&lt;br /&gt;
&lt;br /&gt;
The filter supports ITS 1.0 and ITS 2.0 (2.0 is backward compatible with 1.0).&lt;br /&gt;
&lt;br /&gt;
* The ITS 1.0 specification is available at http://www.w3.org/TR/its/.&lt;br /&gt;
* The ITS 2.0 specification is available at http://www.w3.org/TR/its20/.&lt;br /&gt;
&lt;br /&gt;
See the &amp;quot;[[ITS]]&amp;quot; page for more details on the format.&lt;br /&gt;
&lt;br /&gt;
The filter supports global and local rules and most data categories. See the '''[[ITS Components]]''' page for a detailed list of how the data categories are supported and other information on the implementation.&lt;br /&gt;
&lt;br /&gt;
===ITS Extensions===&lt;br /&gt;
&lt;br /&gt;
The filter supports extensions to the ITS specification. These extensions use the namespace URI http://www.w3.org/2008/12/its-extensions.&lt;br /&gt;
&lt;br /&gt;
* [[#idValue and xml:id|idValue and xml:id]]&lt;br /&gt;
* [[#whiteSpaces|whiteSpaces]]&lt;br /&gt;
&lt;br /&gt;
====idValue and xml:id====&lt;br /&gt;
&lt;br /&gt;
{{NoteBox|This extension was defined for ITS 1.0. ITS 2.0 offers the new [http://www.w3.org/TR/its20/#idvalue Id Value] data category, which should be used instead of this extension.}}&lt;br /&gt;
&lt;br /&gt;
When the attribute &amp;lt;code&amp;gt;xml:id&amp;lt;/code&amp;gt; is found on a translatable element, it is used as the name of the text unit generated for that element.&lt;br /&gt;
&lt;br /&gt;
For example, in the example below, the resource name associated with the text unit for the &amp;lt;code&amp;gt;&amp;amp;lt;p&amp;gt;&amp;lt;/code&amp;gt; element is &amp;quot;&amp;lt;code&amp;gt;id1&amp;lt;/code&amp;gt;&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
 &amp;amp;lt;p xml:id=&amp;quot;id1&amp;quot;&amp;gt;Text&amp;amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The attribute &amp;lt;code&amp;gt;idValue&amp;lt;/code&amp;gt; used in the ITS &amp;lt;code&amp;gt;translateRule&amp;lt;/code&amp;gt; element allows you to define an XPath expression that corresponds to the identifier value for the given selection. The value of &amp;lt;code&amp;gt;idValue&amp;lt;/code&amp;gt; must be an expression that can return a string. A node location is a valid expression: it will return the value of the first node at the given location.&lt;br /&gt;
&lt;br /&gt;
For example, in the example below, the resource name associated with the text unit for the &amp;lt;code&amp;gt;&amp;amp;lt;p&amp;gt;&amp;lt;/code&amp;gt; element is &amp;quot;&amp;lt;code&amp;gt;id1&amp;lt;/code&amp;gt;&amp;quot;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc&amp;gt;&lt;br /&gt;
 &amp;lt;its:rules version=&amp;quot;1.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
  xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;its:translateRule selector=&amp;quot;//p&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:idValue=&amp;quot;@name&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;/its:rules&amp;gt;&lt;br /&gt;
 &amp;lt;p name=&amp;quot;id1&amp;quot;&amp;gt;text 1&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that &amp;lt;code&amp;gt;xml:id&amp;lt;/code&amp;gt; takes precedence over an &amp;lt;code&amp;gt;idValue&amp;lt;/code&amp;gt; declaration. In the example below, the resource name associated with the text unit for the &amp;lt;code&amp;gt;&amp;amp;lt;p&amp;gt;&amp;lt;/code&amp;gt; element is &amp;quot;&amp;lt;code&amp;gt;xid1&amp;lt;/code&amp;gt;&amp;quot;, not &amp;quot;&amp;lt;code&amp;gt;id1&amp;lt;/code&amp;gt;&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc&amp;gt;&lt;br /&gt;
 &amp;lt;its:rules version=&amp;quot;1.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
  xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;its:translateRule selector=&amp;quot;//p&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:idValue=&amp;quot;@name&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;/its:rules&amp;gt;&lt;br /&gt;
 &amp;lt;p xml:id=&amp;quot;xid1&amp;quot; name=&amp;quot;id1&amp;quot;&amp;gt;text 1&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can build complex IDs based on different attributes, elements or even hard-coded text. Any of the string functions offered by XPath can be used.&lt;br /&gt;
&lt;br /&gt;
For example, in the file below, the two elements &amp;lt;code&amp;gt;&amp;amp;lt;text&amp;gt;&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;&amp;amp;lt;desc&amp;gt;&amp;lt;/code&amp;gt; are translatable, but they have only one corresponding ID, the &amp;lt;code&amp;gt;name&amp;lt;/code&amp;gt; attribute in their parent element. To make sure you have a unique identifier for both the content of &amp;lt;code&amp;gt;&amp;amp;lt;text&amp;gt;&amp;lt;/code&amp;gt; and the content of &amp;lt;code&amp;gt;&amp;amp;lt;desc&amp;gt;&amp;lt;/code&amp;gt;, you can use the rules set in the example. The XPath expression &amp;quot;&amp;lt;code&amp;gt;concat(../@name, '_t')&amp;lt;/code&amp;gt;&amp;quot; will give the ID &amp;quot;&amp;lt;code&amp;gt;id1_t&amp;lt;/code&amp;gt;&amp;quot; and the expression &amp;quot;&amp;lt;code&amp;gt;concat(../@name, '_d')&amp;lt;/code&amp;gt;&amp;quot; will give the ID &amp;quot;&amp;lt;code&amp;gt;id1_d&amp;lt;/code&amp;gt;&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc&amp;gt;&lt;br /&gt;
 &amp;lt;its:rules version=&amp;quot;1.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
  xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;its:translateRule selector=&amp;quot;//text&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:idValue=&amp;quot;concat(../@name, '_t')&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;its:translateRule selector=&amp;quot;//desc&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:idValue=&amp;quot;concat(../@name, '_d')&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;/its:rules&amp;gt;&lt;br /&gt;
 &amp;lt;msg name=&amp;quot;id1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;text&amp;gt;Value of text&amp;lt;/text&amp;gt;&lt;br /&gt;
  &amp;lt;desc&amp;gt;Value of desc&amp;lt;/desc&amp;gt;&lt;br /&gt;
 &amp;lt;/msg&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====whiteSpaces====&lt;br /&gt;
&lt;br /&gt;
{{NoteBox|This extension was defined for ITS 1.0, ITS 2.0 offers the new [http://www.w3.org/TR/its20/#preservespace Preserve Space] data category that should be used instead of this extension.}}&lt;br /&gt;
&lt;br /&gt;
The extension attribute whiteSpaces allows you to apply globally the equivalent of a local &amp;lt;code&amp;gt;xml:space&amp;lt;/code&amp;gt; attribute.&lt;br /&gt;
&lt;br /&gt;
For example, if you have a format where all &amp;lt;code&amp;gt;&amp;amp;lt;pre&amp;gt;&amp;lt;/code&amp;gt; elements must have their spaces, tabs and line breaks preserved, you can specify the attribute &amp;lt;code&amp;gt;whiteSpaces=&amp;quot;preserve&amp;quot;&amp;lt;/code&amp;gt; in an &amp;lt;code&amp;gt;&amp;amp;lt;its:translateRule&amp;gt;&amp;lt;/code&amp;gt; element that selects the &amp;lt;code&amp;gt;&amp;amp;lt;pre&amp;gt;&amp;lt;/code&amp;gt; elements. In the example below, the spaces in the &amp;lt;code&amp;gt;&amp;amp;lt;pre&amp;gt;&amp;lt;/code&amp;gt; element will be preserved on extraction.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;doc&amp;gt;&lt;br /&gt;
  &amp;lt;nowiki&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
   xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot;&amp;gt;&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
   &amp;lt;its:translateRule selector=&amp;quot;//pre&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:whiteSpaces=&amp;quot;preserve&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/its:rules&amp;gt;&lt;br /&gt;
  &amp;amp;lt;pre&amp;gt;Some txt with    many spaces.  &amp;amp;lt;/pre&amp;gt;&lt;br /&gt;
 &amp;lt;/doc&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that the &amp;lt;code&amp;gt;xml:space&amp;lt;/code&amp;gt; attribute takes precedence over &amp;lt;code&amp;gt;whiteSpaces&amp;lt;/code&amp;gt;. In the following document, the white spaces in the content of &amp;lt;code&amp;gt;&amp;amp;lt;pre&amp;gt;&amp;lt;/code&amp;gt; may '''not''' be preserved because the attribute &amp;lt;code&amp;gt;xml:space&amp;lt;/code&amp;gt; has the value &amp;lt;code&amp;gt;default&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;doc&amp;gt;&lt;br /&gt;
  &amp;lt;nowiki&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
   xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot;&amp;gt;&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
   &amp;lt;its:translateRule selector=&amp;quot;//pre&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:whiteSpaces=&amp;quot;preserve&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/its:rules&amp;gt;&lt;br /&gt;
  &amp;amp;lt;pre xml:space=&amp;quot;default&amp;quot;&amp;gt;Some txt with    many spaces.  &amp;amp;lt;/pre&amp;gt;&lt;br /&gt;
 &amp;lt;/doc&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Filter Options===&lt;br /&gt;
&lt;br /&gt;
The filter also supports options in addition to ITS and the ITS extensions. These options use the namespace URI &amp;lt;code&amp;gt;okapi-framework:xmlfilter-options&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
{{NoteBox|The filter options must be placed in the parameters file (.fprm) used with the filter, not in embedded or linked ITS rules. Options placed in embedded or linked ITS rules have no effect.}}&lt;br /&gt;
&lt;br /&gt;
When you use several options, they must be set in a single &amp;lt;code&amp;gt;&amp;amp;lt;okp:options&amp;gt;&amp;lt;/code&amp;gt; element, as shown below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options lineBreakAsCode=&amp;quot;yes&amp;quot;&lt;br /&gt;
              escapeQuotes=&amp;quot;no&amp;quot;&lt;br /&gt;
              escapeGT=&amp;quot;yes&amp;quot;&lt;br /&gt;
 /&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The following options are available:&lt;br /&gt;
&lt;br /&gt;
* [[#lineBreakAsCode|lineBreakAsCode]]&lt;br /&gt;
* [[#codeFinder|codeFinder]]&lt;br /&gt;
* [[#omitXMLDeclaration|omitXMLDeclaration]]&lt;br /&gt;
* [[#escapeQuotes|escapeQuotes]]&lt;br /&gt;
* [[#escapeGT|escapeGT]]&lt;br /&gt;
* [[#escapeNbsp|escapeNbsp]]&lt;br /&gt;
* [[#extractIfOnlyCodes|extractIfOnlyCodes]]&lt;br /&gt;
* [[#inlineCdata|inlineCdata]]&lt;br /&gt;
&lt;br /&gt;
====lineBreakAsCode====&lt;br /&gt;
&lt;br /&gt;
In some cases the content of an element includes line-breaks that need to be included as part of the content but without using an actual line-break in the extracted text. For example, in some XML documents generated by Excel, the formatting of the cells is marked up with &amp;lt;code&amp;gt;&amp;amp;amp;#10;&amp;lt;/code&amp;gt; character references. They need to be passed as inline codes.&lt;br /&gt;
&lt;br /&gt;
By default this option is set to false.&lt;br /&gt;
&lt;br /&gt;
To enable this behavior, set the &amp;lt;code&amp;gt;lineBreakAsCode&amp;lt;/code&amp;gt; option attribute. This affects all the extracted content.&lt;br /&gt;
&lt;br /&gt;
For example, the following ITS document sets the option to treat line-breaks as codes. It can be used with the sample XML document listed after it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options lineBreakAsCode=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;doc&amp;gt;&lt;br /&gt;
  &amp;lt;data&amp;gt;line 1&amp;amp;amp;#10;line 2.&amp;lt;/data&amp;gt;&lt;br /&gt;
 &amp;lt;/doc&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====codeFinder====&lt;br /&gt;
&lt;br /&gt;
You can define a set of regular expressions to capture spans of extracted text that should be treated as inline codes. For example, some element content may have variables or HTML tags that need to be protected from modification and treated as codes. Use the &amp;lt;code&amp;gt;codeFinder&amp;lt;/code&amp;gt; element for this.&lt;br /&gt;
&lt;br /&gt;
In the following parameters file, the &amp;lt;code&amp;gt;codeFinder&amp;lt;/code&amp;gt; element defines two rules:&lt;br /&gt;
&lt;br /&gt;
* The first one (rule0) is &amp;quot;&amp;lt;code&amp;gt;&amp;amp;lt;(/?)\w+[^&amp;gt;]*?&amp;gt;&amp;lt;/code&amp;gt;&amp;quot; and matches any XML-type tags (e.g. &amp;quot;&amp;lt;code&amp;gt;&amp;amp;lt;b&amp;gt;&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;&amp;amp;lt;/b&amp;gt;&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;&amp;amp;lt;br/&amp;gt;&amp;lt;/code&amp;gt;&amp;quot;).&lt;br /&gt;
* The second one (rule1) is &amp;quot;&amp;lt;code&amp;gt;(#\w+?\#)|(%\d+?%)&amp;lt;/code&amp;gt;&amp;quot; and matches any word enclosed in &amp;lt;code&amp;gt;#&amp;lt;/code&amp;gt; (e.g. &amp;quot;&amp;lt;code&amp;gt;#VAR#&amp;lt;/code&amp;gt;&amp;quot;) or number enclosed in &amp;lt;code&amp;gt;%&amp;lt;/code&amp;gt; (e.g. &amp;quot;&amp;lt;code&amp;gt;%1%&amp;lt;/code&amp;gt;&amp;quot;).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:codeFinder useCodeFinder=&amp;quot;yes&amp;quot;&amp;gt;#v1&lt;br /&gt;
count.i=2&lt;br /&gt;
rule0=&amp;amp;amp;lt;(/?)\w+[^&amp;amp;amp;gt;]*?&amp;amp;amp;gt;&lt;br /&gt;
rule1=(#\w+?\#)|(%\d+?%)&lt;br /&gt;
 &amp;lt;/okp:codeFinder&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some important details:&lt;br /&gt;
&lt;br /&gt;
* Set &amp;lt;code&amp;gt;useCodeFinder&amp;lt;/code&amp;gt; to &amp;quot;yes&amp;quot; to have the rules used. If the attribute is missing, its value is assumed to be &amp;quot;no&amp;quot;.&lt;br /&gt;
* Make sure the first line of the &amp;lt;code&amp;gt;&amp;amp;lt;codeFinder&amp;gt;&amp;lt;/code&amp;gt; element content is &amp;lt;code&amp;gt;#v1&amp;lt;/code&amp;gt;. &lt;br /&gt;
* Each entry in the content must be on a separate line. &lt;br /&gt;
* &amp;lt;code&amp;gt;count.i=N&amp;lt;/code&amp;gt; must be before any rules and &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; must be the number of rules. &lt;br /&gt;
* &amp;lt;code&amp;gt;ruleN&amp;lt;/code&amp;gt; must be incremented starting at 0. &lt;br /&gt;
* The pattern for a rule must be escaped for XML, for example: &amp;quot;&amp;lt;code&amp;gt;&amp;amp;lt;(/?)\w+[^&amp;gt;]*?&amp;gt;&amp;lt;/code&amp;gt;&amp;quot; must be entered as &amp;quot;&amp;lt;code&amp;gt;&amp;amp;amp;lt;(/?)\w+[^&amp;amp;amp;gt;]*?&amp;amp;amp;gt;&amp;lt;/code&amp;gt;&amp;quot; in the parameters file. &lt;br /&gt;
* Do not put spaces before &amp;lt;code&amp;gt;count.i&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;ruleN&amp;lt;/code&amp;gt;, or after your expressions.&lt;br /&gt;
&lt;br /&gt;
To facilitate the creation of code finder rules [[Rainbow - Code Finder Editor|Rainbow provides the Code Finder Editor]].&lt;br /&gt;
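&lt;br /&gt;
As a minimal illustration, the hypothetical document below (its element name and text are assumptions made for this sketch) contains two spans that rule1 above is expected to match, &amp;lt;code&amp;gt;#NAME#&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;%1%&amp;lt;/code&amp;gt;, so they should be treated as inline codes rather than as translatable text:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc&amp;gt;&lt;br /&gt;
 &amp;lt;data&amp;gt;Hello #NAME#, you have %1% new messages.&amp;lt;/data&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;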
&lt;br /&gt;
====omitXMLDeclaration====&lt;br /&gt;
&lt;br /&gt;
By default an XML declaration is always set at the top of the output document (regardless of whether the original document has one or not). It is an important part of the XML document and it is especially needed when the encoding of the output document is not UTF-8, UTF-16 or UTF-32, as the encoding name must then be specified in the XML declaration. However, there are a few special cases when the declaration is better left off. To handle those rare cases, you can use &amp;lt;code&amp;gt;omitXMLDeclaration&amp;lt;/code&amp;gt; to tell the filter not to output the XML declaration.&lt;br /&gt;
&lt;br /&gt;
For example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options omitXMLDeclaration=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Remember that XML documents without an XML declaration may be read incorrectly if the encoding of the document is not UTF-8, UTF-16 or UTF-32.&lt;br /&gt;
&lt;br /&gt;
====escapeQuotes====&lt;br /&gt;
&lt;br /&gt;
By default, when processing the document, the filter uses double-quotes to enclose all attributes (translatable or not) and uses the following rules for escaping or not escaping the literal quotes:&lt;br /&gt;
&lt;br /&gt;
* Inside the attribute values:&lt;br /&gt;
** Single-quotes (=apostrophes) are never escaped&lt;br /&gt;
** Double-quotes are always escaped&lt;br /&gt;
* In element content:&lt;br /&gt;
** Single-quotes (=apostrophes) are not escaped&lt;br /&gt;
** Double-quotes are escaped by default&lt;br /&gt;
&lt;br /&gt;
You cannot change the escaping rules for attributes.&lt;br /&gt;
&lt;br /&gt;
For element content: If the document is processed without triggering any rule that allows the translation of an attribute, then (and only then) the filter takes into account the &amp;lt;code&amp;gt;escapeQuotes&amp;lt;/code&amp;gt; option to escape or not double-quotes in the translatable content.&lt;br /&gt;
&lt;br /&gt;
For example, the following parameters file tells the filter not to escape double-quotes in element content (for documents where no rule for translatable attributes is triggered):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options escapeQuotes=&amp;quot;no&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
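&lt;br /&gt;
As a sketch of where this option matters, consider the hypothetical document below (the element names are assumptions). It has no translatable attributes, so the &amp;lt;code&amp;gt;escapeQuotes&amp;lt;/code&amp;gt; setting should apply to the double-quotes in the content of the paragraph: based on the rules above, they are escaped on output by default and left as literal double-quotes when the option is set to &amp;quot;no&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc&amp;gt;&lt;br /&gt;
 &amp;lt;p&amp;gt;Click &amp;quot;OK&amp;quot; to continue.&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;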
&lt;br /&gt;
====escapeGT====&lt;br /&gt;
&lt;br /&gt;
By default the character '&amp;lt;code&amp;gt;&amp;gt;&amp;lt;/code&amp;gt;' is escaped. You can tell the filter not to escape it using the &amp;lt;code&amp;gt;escapeGT&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
For example, the following parameters file tells the filter not to escape greater-than characters:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options escapeGT=&amp;quot;no&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====escapeNbsp====&lt;br /&gt;
&lt;br /&gt;
By default the non-breaking space character is escaped (in the form &amp;lt;code&amp;gt;&amp;amp;amp;#x00a0;&amp;lt;/code&amp;gt;). You can tell the filter not to escape it using the &amp;lt;code&amp;gt;escapeNbsp&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
For example, the following parameters file tells the filter not to escape non-breaking space characters:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options escapeNbsp=&amp;quot;no&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====extractIfOnlyCodes====&lt;br /&gt;
&lt;br /&gt;
By default all extractable entries are extracted even when they contain only white-spaces and/or inline codes. You can tell the filter not to extract such entries using the &amp;lt;code&amp;gt;extractIfOnlyCodes&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
For example, the following parameters file tells the filter not to extract entries with only white-spaces and/or inline codes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options extractIfOnlyCodes=&amp;quot;no&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
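&lt;br /&gt;
The hypothetical document below (the element names are assumptions) shows the kind of entry affected. With the default setting both paragraphs should be extracted; with &amp;lt;code&amp;gt;extractIfOnlyCodes=&amp;quot;no&amp;quot;&amp;lt;/code&amp;gt; the second one, which contains only white-spaces, should be skipped.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc&amp;gt;&lt;br /&gt;
 &amp;lt;p&amp;gt;Text to translate.&amp;lt;/p&amp;gt;&lt;br /&gt;
 &amp;lt;p&amp;gt;   &amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;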
&lt;br /&gt;
====inlineCdata====&lt;br /&gt;
&lt;br /&gt;
By default, CDATA sections will be exposed as regular content, and the CDATA markers themselves will be discarded.  When the &amp;lt;code&amp;gt;inlineCdata&amp;lt;/code&amp;gt; option is set,&lt;br /&gt;
the CDATA markers will be exposed as inline codes.&lt;br /&gt;
&lt;br /&gt;
For example, the following parameters file will expose CDATA markers as inline codes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options inlineCdata=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
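&lt;br /&gt;
For illustration, in the hypothetical document below (the element names are assumptions) the text inside the CDATA section is exposed as content in both cases. With &amp;lt;code&amp;gt;inlineCdata=&amp;quot;yes&amp;quot;&amp;lt;/code&amp;gt;, the CDATA start and end markers should additionally appear as inline codes around that text.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc&amp;gt;&lt;br /&gt;
 &amp;lt;p&amp;gt;&amp;lt;![CDATA[Some text that is not escaped: 1 &amp;lt; 2]]&amp;gt;&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;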
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* Currently, in some cases, the ITS rule &amp;lt;code&amp;gt;withinTextRule&amp;lt;/code&amp;gt; with the value &amp;lt;code&amp;gt;nested&amp;lt;/code&amp;gt; may act like it has a value &amp;lt;code&amp;gt;yes&amp;lt;/code&amp;gt; instead.&lt;br /&gt;
* In output, the values of the &amp;lt;code&amp;gt;xml:lang&amp;lt;/code&amp;gt; attributes are not updated to reflect the target language.&lt;br /&gt;
* When doing the extraction, the whole input file is loaded into memory. You may run into memory limitations if the document is very large.&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]] [[Category:ITS]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=IDML_Filter&amp;diff=702</id>
		<title>IDML Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=IDML_Filter&amp;diff=702"/>
		<updated>2017-10-03T18:51:51Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: Update options/limitations for new (post-M34) rewrite of IDMLFilter&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
This filter allows you to process IDML documents. IDML (InDesign Markup Language) is an XML-based format, introduced in Adobe InDesign CS4, for representing InDesign content. IDML is used in several InDesign and InCopy file types. The specification can be found [http://www.adobe.com/content/dam/Adobe/en/devnet/indesign/cs5_docs/idml/idml-specification.pdf on the Adobe Web site].&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
When processing an IDML document, the filter looks at all the spreads in the document and, for each of them, gathers the list of the stories used in &amp;lt;code&amp;gt;&amp;amp;lt;TextFrame&amp;gt;&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;&amp;amp;lt;TextPath&amp;gt;&amp;lt;/code&amp;gt;. The text is extracted by spread, and for each spread by story in the order they appear in the spread.&lt;br /&gt;
&lt;br /&gt;
Stories embedded inside other stories and not declared at a spread level are extracted in a special group.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Untag XML Structures&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to skip embedded XML structural information when extracting translatable content.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Extract notes&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to extract the content of notes (&amp;lt;code&amp;gt;&amp;amp;lt;Note&amp;gt;&amp;lt;/code&amp;gt; elements).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Extract master spreads&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to extract the content of the master spreads if they exist. If this option is not set only the normal spreads are extracted.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Extract hidden layers&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to extract also the hidden layers.&lt;br /&gt;
&lt;br /&gt;
==Deprecated Parameters==&lt;br /&gt;
&lt;br /&gt;
Prior to release M34, the filter supported several additional parameters.  The behavior of these has been subsumed by the more intelligent content processing performed by the updated version of the filter in versions M34 and later.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Simplify inline codes when possible&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to reduce the number of inline codes by re-grouping adjacent codes when it is possible.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Create new text units on hard returns&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to create separate text units when a hard return element (&amp;lt;code&amp;gt;&amp;amp;lt;Br/&amp;gt;&amp;lt;/code&amp;gt;) is found.&amp;lt;br/&amp;gt; '''IMPORTANT: This option is not completed yet. Setting it may create extracted documents you will not be able to merge back. Always test the merge before using this option in production.'''&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Maximum spread size&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set the maximum size for the spread files (in KBytes). Any spread file above the given value will either generate an error or be skipped from extraction, depending on the specified option. This allows you to skip over large spread files that may contain only graphics and require too much memory to be opened. Note that the skipped files are not checked for translatable text.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Generate an error when a spread is larger than the specified value&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to generate an error if a spread size is above the specified &amp;lt;cite&amp;gt;Maximum spread size&amp;lt;/cite&amp;gt;. If this option is not set, the spread is skipped with a warning message.&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=696</id>
		<title>Markdown Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=696"/>
		<updated>2017-09-08T16:30:35Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Inline Codes */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
The Markdown Filter is an Okapi component for extracting translatable text from Markdown files. See https://en.wikipedia.org/wiki/Markdown for more information about the format.&lt;br /&gt;
Markdown is a family of formats, not all of them mutually compatible.  This filter is designed to work with markdown based on the [http://commonmark.org CommonMark] specification, with additional features to support [https://guides.github.com/features/mastering-markdown/ GitHub-flavored Markdown].&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
The filter decides which encoding to use for the input file using the following logic:&lt;br /&gt;
&lt;br /&gt;
* If the file has a Unicode Byte-Order-Mark, the corresponding encoding (e.g. UTF-8, UTF-16, etc.) is used.&lt;br /&gt;
* Otherwise, the input encoding used is the default encoding that was specified when setting the filter options.&lt;br /&gt;
&lt;br /&gt;
===Inline Codes===&lt;br /&gt;
&lt;br /&gt;
The filter attempts to convert inline formatting to codes and block-level formatting to document part/skeleton content.&lt;br /&gt;
&lt;br /&gt;
The [[HTML_Filter#Inline_Code_Finder|Inline Code Finder]] can be used to convert additional types of formatting to inline codes, as necessary.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
; Translate URLs&lt;br /&gt;
: By default, URLs in link and image statements are not exposed for translation.  If this option is enabled, they will be extracted.  &amp;lt;i&amp;gt;Note&amp;lt;/i&amp;gt;: URLs are currently extracted inline in their containing segment, rather than as a subflow.&lt;br /&gt;
&lt;br /&gt;
; Translate Code Blocks&lt;br /&gt;
: This option, enabled by default, controls whether the contents of fenced code blocks are exposed for translation.&lt;br /&gt;
&lt;br /&gt;
; Translate YAML Metadata Header&lt;br /&gt;
: Some markdown formats support a [http://pandoc.org/MANUAL.html#extension-yaml_metadata_block YAML Metadata Header] that contains key/value data. By default, this header is not exposed for translation. When the &amp;quot;Translate YAML Metadata Header&amp;quot; option is enabled, the header will be parsed and the metadata values will be exposed for translation.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* None known&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=695</id>
		<title>Markdown Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=695"/>
		<updated>2017-08-24T18:42:52Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Parameters */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
The Markdown Filter is an Okapi component for extracting translatable text from Markdown files. See https://en.wikipedia.org/wiki/Markdown for more information about the format.&lt;br /&gt;
Markdown is a family of formats, not all of them mutually compatible.  This filter is designed to work with markdown based on the [http://commonmark.org CommonMark] specification, with additional features to support [https://guides.github.com/features/mastering-markdown/ GitHub-flavored Markdown].&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
The filter decides which encoding to use for the input file using the following logic:&lt;br /&gt;
&lt;br /&gt;
* If the file has a Unicode Byte-Order-Mark, the corresponding encoding (e.g. UTF-8, UTF-16, etc.) is used.&lt;br /&gt;
* Otherwise, the input encoding used is the default encoding that was specified when setting the filter options.&lt;br /&gt;
&lt;br /&gt;
===Inline Codes===&lt;br /&gt;
&lt;br /&gt;
The filter attempts to convert inline formatting to codes and block-level formatting to document part/skeleton content.&lt;br /&gt;
&lt;br /&gt;
The Inline Code Finder can be used to convert additional types of formatting to inline codes, as necessary.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
; Translate URLs&lt;br /&gt;
: By default, URLs in link and image statements are not exposed for translation.  If this option is enabled, they will be extracted.  &amp;lt;i&amp;gt;Note&amp;lt;/i&amp;gt;: URLs are currently extracted inline in their containing segment, rather than as a subflow.&lt;br /&gt;
&lt;br /&gt;
; Translate Code Blocks&lt;br /&gt;
: This option, enabled by default, controls whether the contents of fenced code blocks are exposed for translation.&lt;br /&gt;
&lt;br /&gt;
; Translate YAML Metadata Header&lt;br /&gt;
: Some markdown formats support a [http://pandoc.org/MANUAL.html#extension-yaml_metadata_block YAML Metadata Header] that contains key/value data. By default, this header is not exposed for translation. When the &amp;quot;Translate YAML Metadata Header&amp;quot; option is enabled, the header will be parsed and the metadata values will be exposed for translation.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* None known&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=XML_Filter&amp;diff=663</id>
		<title>XML Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=XML_Filter&amp;diff=663"/>
		<updated>2017-04-25T01:32:10Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Filter Options */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
This filter allows you to process XML documents. It uses a DOM-based parser, which allows it to implement [[ITS]]. If you need to process very large XML documents and have no need for ITS, you may want to look at using the [[XML Stream Filter]].&lt;br /&gt;
&lt;br /&gt;
The following is an example of a simple XML document. The translatable text is highlighted. Because each format based on XML is different, you need information on what are the translatable parts, what are the inline elements, etc. The XML Filter [[#ITS Support|implements the ITS W3C Recommendation]] to address this issue.&lt;br /&gt;
&lt;br /&gt;
 &amp;amp;lt;?xml version=&amp;quot;1.0&amp;quot; encoding=&amp;quot;utf-8&amp;quot;?&amp;gt;&lt;br /&gt;
 &amp;amp;lt;myDoc&amp;gt;&lt;br /&gt;
  &amp;amp;lt;prolog&amp;gt;&lt;br /&gt;
   &amp;amp;lt;author&amp;gt;Zebulon Fairfield&amp;lt;/author&amp;gt;&lt;br /&gt;
   &amp;amp;lt;version&amp;gt;version 12, revision 2 - 2006-08-14&amp;lt;/version&amp;gt;&lt;br /&gt;
   &amp;amp;lt;keywords&amp;gt;&amp;lt;kw&amp;gt;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;horse&amp;lt;/span&amp;gt;&amp;lt;/kw&amp;gt;&amp;lt;kw&amp;gt;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;appaloosa&amp;lt;/span&amp;gt;&amp;lt;/kw&amp;gt;&amp;lt;/keywords&amp;gt;&lt;br /&gt;
   &amp;amp;lt;storageKey&amp;gt;articles-6D272BA9-3B89CAD8&amp;lt;/storageKey&amp;gt;&lt;br /&gt;
  &amp;amp;lt;/prolog&amp;gt;&lt;br /&gt;
  &amp;amp;lt;body&amp;gt;&lt;br /&gt;
   &amp;amp;lt;title&amp;gt;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;Appaloosa&amp;lt;/span&amp;gt;&amp;amp;lt;/title&amp;gt;&lt;br /&gt;
   &amp;amp;lt;p&amp;gt;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;The Appaloosas are rugged horses originally bred by &lt;br /&gt;
 the &amp;lt;kw&amp;gt;Nez-Perce&amp;lt;/kw&amp;gt; tribe in the US Northwest.&amp;lt;/span&amp;gt;&amp;amp;lt;/p&amp;gt;&lt;br /&gt;
   &amp;amp;lt;p&amp;gt;&amp;lt;span class=&amp;quot;hi&amp;quot;&amp;gt;They are often characterized by their spotted coats.&amp;lt;/span&amp;gt;&amp;amp;lt;/p&amp;gt;&lt;br /&gt;
  &amp;amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;amp;lt;/myDoc&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This filter is implemented in the class &amp;lt;code&amp;gt;net.sf.okapi.filters.xml.XMLFilter&amp;lt;/code&amp;gt; of the library.&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
The filter decides which encoding to use for the input document using the following logic:&lt;br /&gt;
&lt;br /&gt;
* If the document has an encoding declaration it is used. &lt;br /&gt;
* Otherwise, UTF-8 is used as the default encoding (regardless of the actual default encoding that was specified when opening the document). &lt;br /&gt;
&lt;br /&gt;
===Output Encoding===&lt;br /&gt;
&lt;br /&gt;
If the output encoding is UTF-8:&lt;br /&gt;
&lt;br /&gt;
* If the input encoding was also UTF-8, a Byte-Order-Mark is used for the output document only if one was detected in the input document. &lt;br /&gt;
* If the input encoding was not UTF-8, no Byte-Order-Mark is used in the output document. &lt;br /&gt;
&lt;br /&gt;
If the original document had an XML encoding declaration, it is updated. If it did not, one is automatically added.&lt;br /&gt;
&lt;br /&gt;
===Line-Breaks===&lt;br /&gt;
&lt;br /&gt;
The type of line-breaks of the output is the same as the one of the original input.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
This filter stores its parameters in an XML file and does not provide an editor to modify it. You can edit the file in a simple text editor, or with an XML editor. For an example, see the article &amp;quot;[[How to Create a Custom Configuration for the XML Filter]]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
===ITS Support===&lt;br /&gt;
&lt;br /&gt;
By default the filter processes XML documents based on the '''ITS defaults'''. That is:&lt;br /&gt;
&lt;br /&gt;
* the content of all elements is translatable,&lt;br /&gt;
* and none of the attribute values are translatable.&lt;br /&gt;
&lt;br /&gt;
Different behavior can occur if the input document contains ITS markup, or if a filter parameters file is specified. The parameters file used by the XML Filter is [[ITS|an ITS document]].&lt;br /&gt;
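&lt;br /&gt;
As a minimal sketch of such a parameters file (the names &amp;lt;code&amp;gt;img&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;alt&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;code&amp;lt;/code&amp;gt; are assumptions chosen for illustration, not part of any specific format), the following ITS rules override the defaults by making the &amp;lt;code&amp;gt;alt&amp;lt;/code&amp;gt; attribute of &amp;lt;code&amp;gt;img&amp;lt;/code&amp;gt; elements translatable and by protecting the content of &amp;lt;code&amp;gt;code&amp;lt;/code&amp;gt; elements:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;its:translateRule selector=&amp;quot;//img/@alt&amp;quot; translate=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;its:translateRule selector=&amp;quot;//code&amp;quot; translate=&amp;quot;no&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;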
&lt;br /&gt;
The '''Internationalization Tag Set (ITS)''' is a W3C recommendation that defines a set of elements and attributes you can use to specify different internationalization- and localization-related aspects of your XML document. For instance, ITS defines which attribute values are translatable, which element content should be protected, which elements should be treated as a nested sub-flow of text, and much more.&lt;br /&gt;
&lt;br /&gt;
The filter supports ITS 1.0 and ITS 2.0 (2.0 is backward compatible with 1.0).&lt;br /&gt;
&lt;br /&gt;
* The ITS 1.0 specification is available at http://www.w3.org/TR/its/.&lt;br /&gt;
* The ITS 2.0 specification is available at http://www.w3.org/TR/its20/.&lt;br /&gt;
&lt;br /&gt;
See the &amp;quot;[[ITS]]&amp;quot; page for more details on the format.&lt;br /&gt;
&lt;br /&gt;
The filter supports global and local rules and most data categories. See the '''[[ITS Components]]''' page for a detailed list of how the data categories are supported and other information on the implementation.&lt;br /&gt;
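&lt;br /&gt;
As a sketch of a local rule (the document structure is hypothetical), the standard ITS attribute &amp;lt;code&amp;gt;its:translate&amp;lt;/code&amp;gt; marks the content of a single element as not translatable directly in the document:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot; its:version=&amp;quot;1.0&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;p&amp;gt;Translatable text&amp;lt;/p&amp;gt;&lt;br /&gt;
 &amp;lt;p its:translate=&amp;quot;no&amp;quot;&amp;gt;Protected text&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;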
&lt;br /&gt;
===ITS Extensions===&lt;br /&gt;
&lt;br /&gt;
The filter supports extensions to the ITS specification. These extensions use the namespace URI http://www.w3.org/2008/12/its-extensions.&lt;br /&gt;
&lt;br /&gt;
* [[#idValue and xml:id|idValue and xml:id]]&lt;br /&gt;
* [[#whiteSpaces|whiteSpaces]]&lt;br /&gt;
&lt;br /&gt;
====idValue and xml:id====&lt;br /&gt;
&lt;br /&gt;
{{NoteBox|This extension was defined for ITS 1.0. ITS 2.0 offers the new [http://www.w3.org/TR/its20/#idvalue Id Value] data category, which should be used instead of this extension.}}&lt;br /&gt;
&lt;br /&gt;
When the attribute &amp;lt;code&amp;gt;xml:id&amp;lt;/code&amp;gt; is found on a translatable element, it is used as the name of the text unit generated for that element.&lt;br /&gt;
&lt;br /&gt;
For example, in the snippet below, the resource name associated with the text unit for the &amp;lt;code&amp;gt;&amp;amp;lt;p&amp;gt;&amp;lt;/code&amp;gt; element is &amp;quot;&amp;lt;code&amp;gt;id1&amp;lt;/code&amp;gt;&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
 &amp;amp;lt;p xml:id=&amp;quot;id1&amp;quot;&amp;gt;Text&amp;amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The attribute &amp;lt;code&amp;gt;idValue&amp;lt;/code&amp;gt; used in the ITS &amp;lt;code&amp;gt;translateRule&amp;lt;/code&amp;gt; element allows you to define an XPath expression that corresponds to the identifier value for the given selection. The value of &amp;lt;code&amp;gt;idValue&amp;lt;/code&amp;gt; must be an expression that can return a string. A node location is a valid expression: it will return the value of the first node at the given location.&lt;br /&gt;
&lt;br /&gt;
In the example below, the resource name associated with the text unit for the &amp;lt;code&amp;gt;&amp;amp;lt;p&amp;gt;&amp;lt;/code&amp;gt; element is &amp;quot;&amp;lt;code&amp;gt;id1&amp;lt;/code&amp;gt;&amp;quot;:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc&amp;gt;&lt;br /&gt;
 &amp;lt;its:rules version=&amp;quot;1.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
  xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;its:translateRule selector=&amp;quot;//p&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:idValue=&amp;quot;@name&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;/its:rules&amp;gt;&lt;br /&gt;
 &amp;lt;p name=&amp;quot;id1&amp;quot;&amp;gt;text 1&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that &amp;lt;code&amp;gt;xml:id&amp;lt;/code&amp;gt; has precedence over an &amp;lt;code&amp;gt;idValue&amp;lt;/code&amp;gt; declaration. In the example below, the resource name associated with the text unit for the &amp;lt;code&amp;gt;&amp;amp;lt;p&amp;gt;&amp;lt;/code&amp;gt; element is &amp;quot;&amp;lt;code&amp;gt;xid1&amp;lt;/code&amp;gt;&amp;quot;, not &amp;quot;&amp;lt;code&amp;gt;id1&amp;lt;/code&amp;gt;&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc&amp;gt;&lt;br /&gt;
 &amp;lt;its:rules version=&amp;quot;1.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
  xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;its:translateRule selector=&amp;quot;//p&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:idValue=&amp;quot;@name&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;/its:rules&amp;gt;&lt;br /&gt;
 &amp;lt;p xml:id=&amp;quot;xid1&amp;quot; name=&amp;quot;id1&amp;quot;&amp;gt;text 1&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You can build complex IDs based on different attributes, elements, or even hard-coded text. Any of the string functions offered by XPath can be used.&lt;br /&gt;
&lt;br /&gt;
For example, in the file below, the two elements &amp;lt;code&amp;gt;&amp;amp;lt;text&amp;gt;&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;&amp;amp;lt;desc&amp;gt;&amp;lt;/code&amp;gt; are translatable, but they have only one corresponding ID, the &amp;lt;code&amp;gt;name&amp;lt;/code&amp;gt; attribute in their parent element. To make sure you have a unique identifier for both the content of &amp;lt;code&amp;gt;&amp;amp;lt;text&amp;gt;&amp;lt;/code&amp;gt; and the content of &amp;lt;code&amp;gt;&amp;amp;lt;desc&amp;gt;&amp;lt;/code&amp;gt;, you can use the rules set in the example. The XPath expression &amp;quot;&amp;lt;code&amp;gt;concat(../@name, '_t')&amp;lt;/code&amp;gt;&amp;quot; will give the ID &amp;quot;&amp;lt;code&amp;gt;id1_t&amp;lt;/code&amp;gt;&amp;quot; and the expression &amp;quot;&amp;lt;code&amp;gt;concat(../@name, '_d')&amp;lt;/code&amp;gt;&amp;quot; will give the ID &amp;quot;&amp;lt;code&amp;gt;id1_d&amp;lt;/code&amp;gt;&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc&amp;gt;&lt;br /&gt;
 &amp;lt;its:rules version=&amp;quot;1.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
  xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;its:translateRule selector=&amp;quot;//text&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:idValue=&amp;quot;concat(../@name, '_t')&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;its:translateRule selector=&amp;quot;//desc&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:idValue=&amp;quot;concat(../@name, '_d')&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;/its:rules&amp;gt;&lt;br /&gt;
 &amp;lt;msg name=&amp;quot;id1&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;text&amp;gt;Value of text&amp;lt;/text&amp;gt;&lt;br /&gt;
  &amp;lt;desc&amp;gt;Value of desc&amp;lt;/desc&amp;gt;&lt;br /&gt;
 &amp;lt;/msg&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
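&lt;br /&gt;
For comparison, the ITS 2.0 Id Value data category mentioned in the note above uses a dedicated &amp;lt;code&amp;gt;idValueRule&amp;lt;/code&amp;gt; element. The sketch below is based on the ITS 2.0 specification rather than on this filter's documentation, and reuses the &amp;lt;code&amp;gt;name&amp;lt;/code&amp;gt; attribute from the earlier examples:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;doc&amp;gt;&lt;br /&gt;
 &amp;lt;its:rules version=&amp;quot;2.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&amp;gt;&lt;br /&gt;
  &amp;lt;its:idValueRule selector=&amp;quot;//p&amp;quot; idValue=&amp;quot;@name&amp;quot;/&amp;gt;&lt;br /&gt;
 &amp;lt;/its:rules&amp;gt;&lt;br /&gt;
 &amp;lt;p name=&amp;quot;id1&amp;quot;&amp;gt;text 1&amp;lt;/p&amp;gt;&lt;br /&gt;
&amp;lt;/doc&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;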
&lt;br /&gt;
====whiteSpaces====&lt;br /&gt;
&lt;br /&gt;
{{NoteBox|This extension was defined for ITS 1.0. ITS 2.0 offers the new [http://www.w3.org/TR/its20/#preservespace Preserve Space] data category, which should be used instead of this extension.}}&lt;br /&gt;
&lt;br /&gt;
The extension attribute &amp;lt;code&amp;gt;whiteSpaces&amp;lt;/code&amp;gt; allows you to apply globally the equivalent of a local &amp;lt;code&amp;gt;xml:space&amp;lt;/code&amp;gt; attribute.&lt;br /&gt;
&lt;br /&gt;
For example, if you have a format where all &amp;lt;code&amp;gt;&amp;amp;lt;pre&amp;gt;&amp;lt;/code&amp;gt; elements must have their spaces, tabs and line breaks preserved, you can specify the attribute &amp;lt;code&amp;gt;whiteSpaces=&amp;quot;preserve&amp;quot;&amp;lt;/code&amp;gt; in a &amp;lt;code&amp;gt;&amp;amp;lt;its:translateRule&amp;gt;&amp;lt;/code&amp;gt; element for the &amp;lt;code&amp;gt;&amp;amp;lt;pre&amp;gt;&amp;lt;/code&amp;gt; elements. In the example below, the spaces in the &amp;lt;code&amp;gt;&amp;amp;lt;pre&amp;gt;&amp;lt;/code&amp;gt; element will be preserved on extraction.&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;doc&amp;gt;&lt;br /&gt;
  &amp;lt;nowiki&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
   xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot;&amp;gt;&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
   &amp;lt;its:translateRule selector=&amp;quot;//pre&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:whiteSpaces=&amp;quot;preserve&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/its:rules&amp;gt;&lt;br /&gt;
  &amp;amp;lt;pre&amp;gt;Some txt with    many spaces.  &amp;amp;lt;/pre&amp;gt;&lt;br /&gt;
 &amp;lt;/doc&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that the &amp;lt;code&amp;gt;xml:space&amp;lt;/code&amp;gt; attribute has precedence over &amp;lt;code&amp;gt;whiteSpaces&amp;lt;/code&amp;gt;. In the following example, the white spaces in the content of &amp;lt;code&amp;gt;&amp;amp;lt;pre&amp;gt;&amp;lt;/code&amp;gt; may '''not''' be preserved because the attribute &amp;lt;code&amp;gt;xml:space&amp;lt;/code&amp;gt; has the value &amp;lt;code&amp;gt;default&amp;lt;/code&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;doc&amp;gt;&lt;br /&gt;
  &amp;lt;nowiki&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
   xmlns:itsx=&amp;quot;http://www.w3.org/2008/12/its-extensions&amp;quot;&amp;gt;&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
   &amp;lt;its:translateRule selector=&amp;quot;//pre&amp;quot; translate=&amp;quot;yes&amp;quot; itsx:whiteSpaces=&amp;quot;preserve&amp;quot;/&amp;gt;&lt;br /&gt;
  &amp;lt;/its:rules&amp;gt;&lt;br /&gt;
  &amp;amp;lt;pre xml:space=&amp;quot;default&amp;quot;&amp;gt;Some txt with    many spaces.  &amp;amp;lt;/pre&amp;gt;&lt;br /&gt;
 &amp;lt;/doc&amp;gt;&lt;br /&gt;
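&lt;br /&gt;
For comparison, the ITS 2.0 Preserve Space data category mentioned in the note above uses a dedicated &amp;lt;code&amp;gt;preserveSpaceRule&amp;lt;/code&amp;gt; element. The sketch below is based on the ITS 2.0 specification rather than on this filter's documentation, and shows only the rules for the same &amp;lt;code&amp;gt;&amp;amp;lt;pre&amp;gt;&amp;lt;/code&amp;gt; elements:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;2.0&amp;quot; xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;its:preserveSpaceRule selector=&amp;quot;//pre&amp;quot; space=&amp;quot;preserve&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;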
&lt;br /&gt;
===Filter Options===&lt;br /&gt;
&lt;br /&gt;
The filter also supports options in addition to ITS and the ITS extensions. These options use the namespace URI &amp;lt;code&amp;gt;okapi-framework:xmlfilter-options&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
{{NoteBox|The filter options must be placed in the parameters file (.fprm) used with the filter, not in embedded or linked ITS rules. Options placed in embedded or linked ITS rules have no effect.}}&lt;br /&gt;
&lt;br /&gt;
When you use several options, they must be set in a single &amp;lt;code&amp;gt;&amp;amp;lt;okp:options&amp;gt;&amp;lt;/code&amp;gt; element, as shown below:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options lineBreakAsCode=&amp;quot;yes&amp;quot;&lt;br /&gt;
              escapeQuotes=&amp;quot;no&amp;quot;&lt;br /&gt;
              escapeGT=&amp;quot;yes&amp;quot;&lt;br /&gt;
 /&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The following options are available:&lt;br /&gt;
&lt;br /&gt;
* [[#lineBreakAsCode|lineBreakAsCode]]&lt;br /&gt;
* [[#codeFinder|codeFinder]]&lt;br /&gt;
* [[#omitXMLDeclaration|omitXMLDeclaration]]&lt;br /&gt;
* [[#escapeQuotes|escapeQuotes]]&lt;br /&gt;
* [[#escapeGT|escapeGT]]&lt;br /&gt;
* [[#escapeNbsp|escapeNbsp]]&lt;br /&gt;
* [[#extractIfOnlyCodes|extractIfOnlyCodes]]&lt;br /&gt;
* [[#inlineCdata|inlineCdata]]&lt;br /&gt;
&lt;br /&gt;
====lineBreakAsCode====&lt;br /&gt;
&lt;br /&gt;
In some cases the content of an element includes line-breaks that need to be included as part of the content but without using an actual line-break in the extracted text. For example, in some XML documents generated by Excel, the formatting of the cells is marked up with &amp;lt;code&amp;gt;&amp;amp;amp;#10;&amp;lt;/code&amp;gt; entity references. They need to be passed as inline codes.&lt;br /&gt;
&lt;br /&gt;
By default this option is set to false.&lt;br /&gt;
&lt;br /&gt;
To enable this behavior, use the &amp;lt;code&amp;gt;lineBreakAsCode&amp;lt;/code&amp;gt; extension attribute. This affects all the extracted content.&lt;br /&gt;
&lt;br /&gt;
For example, the following code is an ITS document with the option to treat line-breaks as codes. It can be used with the example XML document listed after it.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options lineBreakAsCode=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;doc&amp;gt;&lt;br /&gt;
  &amp;lt;data&amp;gt;line 1&amp;amp;amp;#10;line 2.&amp;lt;/data&amp;gt;&lt;br /&gt;
 &amp;lt;/doc&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====codeFinder====&lt;br /&gt;
&lt;br /&gt;
You can define a set of regular expressions to capture spans of extracted text that should be treated as inline codes. For example, some element content may have variables or HTML tags that need to be protected from modification and treated as codes. Use the &amp;lt;code&amp;gt;codeFinder&amp;lt;/code&amp;gt; element for this.&lt;br /&gt;
&lt;br /&gt;
In the following parameters file, the &amp;lt;code&amp;gt;codeFinder&amp;lt;/code&amp;gt; element defines two rules:&lt;br /&gt;
&lt;br /&gt;
* The first one (rule0) is &amp;quot;&amp;lt;code&amp;gt;&amp;amp;lt;(/?)\w[^&amp;amp;lt;]*?&amp;gt;&amp;lt;/code&amp;gt;&amp;quot; and matches any XML-type tag (e.g. &amp;quot;&amp;lt;code&amp;gt;&amp;amp;lt;b&amp;gt;&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;&amp;amp;lt;/b&amp;gt;&amp;lt;/code&amp;gt;&amp;quot;, &amp;quot;&amp;lt;code&amp;gt;&amp;amp;lt;br/&amp;gt;&amp;lt;/code&amp;gt;&amp;quot;).&lt;br /&gt;
* The second one (rule1) is &amp;quot;&amp;lt;code&amp;gt;(#\w+?\#)|(%\d+?%)&amp;lt;/code&amp;gt;&amp;quot; and matches any word enclosed in &amp;lt;code&amp;gt;#&amp;lt;/code&amp;gt; (e.g. &amp;quot;&amp;lt;code&amp;gt;#VAR#&amp;lt;/code&amp;gt;&amp;quot;) or number enclosed in &amp;lt;code&amp;gt;%&amp;lt;/code&amp;gt; (e.g. &amp;quot;&amp;lt;code&amp;gt;%1%&amp;lt;/code&amp;gt;&amp;quot;).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:codeFinder useCodeFinder=&amp;quot;yes&amp;quot;&amp;gt;#v1&lt;br /&gt;
count.i=2&lt;br /&gt;
rule0=&amp;amp;amp;lt;(/?)\w[^&amp;amp;amp;lt;]*?&amp;amp;amp;gt;&lt;br /&gt;
rule1=(#\w+?\#)|(%\d+?%)&lt;br /&gt;
 &amp;lt;/okp:codeFinder&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Some important details:&lt;br /&gt;
&lt;br /&gt;
* Set &amp;lt;code&amp;gt;useCodeFinder&amp;lt;/code&amp;gt; to &amp;quot;yes&amp;quot; to have the rules used; if the attribute is missing, its value is assumed to be &amp;quot;no&amp;quot;.&lt;br /&gt;
* Make sure the first line of the &amp;lt;code&amp;gt;&amp;amp;lt;codeFinder&amp;gt;&amp;lt;/code&amp;gt; element content is &amp;lt;code&amp;gt;#v1&amp;lt;/code&amp;gt;. &lt;br /&gt;
* Each entry in the content must be on a separate line. &lt;br /&gt;
* &amp;lt;code&amp;gt;count.i=N&amp;lt;/code&amp;gt; must be before any rules and &amp;lt;code&amp;gt;N&amp;lt;/code&amp;gt; must be the number of rules. &lt;br /&gt;
* &amp;lt;code&amp;gt;ruleN&amp;lt;/code&amp;gt; must be incremented starting at 0. &lt;br /&gt;
* The pattern for a rule must be escaped for XML, for example: &amp;quot;&amp;lt;code&amp;gt;&amp;amp;lt;(/?)\w[^&amp;amp;lt;]*?&amp;gt;&amp;lt;/code&amp;gt;&amp;quot; must be entered &amp;quot;&amp;lt;code&amp;gt;&amp;amp;amp;lt;(/?)\w[^&amp;amp;amp;lt;]*?&amp;amp;amp;gt;&amp;lt;/code&amp;gt;&amp;quot; in the parameters file. &lt;br /&gt;
* Do not put spaces before &amp;lt;code&amp;gt;count.i&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;ruleN&amp;lt;/code&amp;gt;, or after your expressions.&lt;br /&gt;
&lt;br /&gt;
To facilitate the creation of code finder rules, [[Rainbow - Code Finder Editor|Rainbow provides the Code Finder Editor]].&lt;br /&gt;
&lt;br /&gt;
====omitXMLDeclaration====&lt;br /&gt;
&lt;br /&gt;
By default an XML declaration is always set at the top of the output document (regardless of whether the original document has one or not). It is an important part of the XML document and it is especially needed when the encoding of the output document is not UTF-8, UTF-16 or UTF-32, as the encoding name must be specified in the XML declaration. However, there are a few special cases when the declaration is better left off. To handle those rare cases, you can use &amp;lt;code&amp;gt;omitXMLDeclaration&amp;lt;/code&amp;gt; to tell the filter not to output the XML declaration.&lt;br /&gt;
&lt;br /&gt;
For example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options omitXMLDeclaration=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Remember that XML documents without an XML declaration may be read incorrectly if the encoding of the document is not UTF-8, UTF-16 or UTF-32.&lt;br /&gt;
&lt;br /&gt;
====escapeQuotes====&lt;br /&gt;
&lt;br /&gt;
By default, when processing the document, the filter uses double-quotes to enclose all attributes (translatable or not) and uses the following rules for escaping/not-escaping the literal quotes:&lt;br /&gt;
&lt;br /&gt;
* Inside the attribute values:&lt;br /&gt;
** Single-quotes (=apostrophes) are never escaped&lt;br /&gt;
** Double-quotes are always escaped&lt;br /&gt;
* In element content:&lt;br /&gt;
** Single-quotes (=apostrophes) are not escaped&lt;br /&gt;
** Double-quotes are escaped by default&lt;br /&gt;
&lt;br /&gt;
You cannot change the escaping rules for attributes.&lt;br /&gt;
&lt;br /&gt;
For element content: If the document is processed without triggering any rule that allows the translation of an attribute, then (and only then) the filter takes into account the &amp;lt;code&amp;gt;escapeQuotes&amp;lt;/code&amp;gt; option to decide whether to escape double-quotes in the translatable content.&lt;br /&gt;
&lt;br /&gt;
For example, the following parameters file tells the filter not to escape double-quotes in element content (for documents where no rule for translatable attributes is triggered):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options escapeQuotes=&amp;quot;no&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====escapeGT====&lt;br /&gt;
&lt;br /&gt;
By default the character '&amp;lt;code&amp;gt;&amp;gt;&amp;lt;/code&amp;gt;' is escaped. You can tell the filter not to escape it using the &amp;lt;code&amp;gt;escapeGT&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
For example, the following parameters file tells the filter not to escape greater-than characters:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options escapeGT=&amp;quot;no&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====escapeNbsp====&lt;br /&gt;
&lt;br /&gt;
By default the non-breaking space character is escaped (in the form &amp;lt;code&amp;gt;&amp;amp;amp;#x00a0;&amp;lt;/code&amp;gt;). You can tell the filter not to escape it using the &amp;lt;code&amp;gt;escapeNbsp&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
For example, the following parameters file tells the filter not to escape non-breaking space characters:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options escapeNbsp=&amp;quot;no&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====extractIfOnlyCodes====&lt;br /&gt;
&lt;br /&gt;
By default all extractable entries are extracted even when they contain only white-spaces and/or inline codes. You can tell the filter not to extract such entries using the &amp;lt;code&amp;gt;extractIfOnlyCodes&amp;lt;/code&amp;gt; option.&lt;br /&gt;
&lt;br /&gt;
For example, the following parameters file tells the filter not to extract entries with only white-spaces and/or inline codes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options extractIfOnlyCodes=&amp;quot;no&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====inlineCdata====&lt;br /&gt;
&lt;br /&gt;
By default, CDATA sections will be exposed as regular content, and the CDATA markers themselves will be discarded. When the &amp;lt;code&amp;gt;inlineCdata&amp;lt;/code&amp;gt; option is set, the CDATA markers will be exposed as inline codes.&lt;br /&gt;
&lt;br /&gt;
For example, the following parameters file will expose CDATA markers as inline codes:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&lt;br /&gt;
 xmlns:okp=&amp;quot;okapi-framework:xmlfilter-options&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;okp:options inlineCdata=&amp;quot;yes&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* Currently, in some cases, the ITS rule &amp;lt;code&amp;gt;withinTextRule&amp;lt;/code&amp;gt; with the value &amp;lt;code&amp;gt;nested&amp;lt;/code&amp;gt; may act like it has a value &amp;lt;code&amp;gt;yes&amp;lt;/code&amp;gt; instead.&lt;br /&gt;
* In output, the values of the &amp;lt;code&amp;gt;xml:lang&amp;lt;/code&amp;gt; attributes are not updated to reflect the target language.&lt;br /&gt;
* When doing the extraction, the whole input file is loaded into memory. You may run into memory limitations if the document is very large.&lt;br /&gt;
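&lt;br /&gt;
For reference, the first limitation concerns rules like the one in the following sketch, written with the standard ITS &amp;lt;code&amp;gt;withinText&amp;lt;/code&amp;gt; attribute. The &amp;lt;code&amp;gt;&amp;amp;lt;fn&amp;gt;&amp;lt;/code&amp;gt; element is a hypothetical name chosen for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&amp;lt;its:rules version=&amp;quot;1.0&amp;quot;&lt;br /&gt;
 xmlns:its=&amp;quot;http://www.w3.org/2005/11/its&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;its:withinTextRule selector=&amp;quot;//fn&amp;quot; withinText=&amp;quot;nested&amp;quot;/&amp;gt;&lt;br /&gt;
&amp;lt;/its:rules&amp;gt;&amp;lt;/pre&amp;gt;&lt;br /&gt;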
&lt;br /&gt;
[[Category:Filters]] [[Category:ITS]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=658</id>
		<title>Markdown Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=658"/>
		<updated>2017-03-27T23:08:23Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
The Markdown Filter is an Okapi component for extracting translatable text from Markdown files. See https://en.wikipedia.org/wiki/Markdown for more information about the format.&lt;br /&gt;
Markdown is a family of formats, not all of them mutually compatible.  This filter is designed to work with markdown based on the [http://commonmark.org CommonMark] specification, with additional features to support [https://guides.github.com/features/mastering-markdown/ GitHub-flavored Markdown].&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
The filter decides which encoding to use for the input file using the following logic:&lt;br /&gt;
&lt;br /&gt;
* If the file has a Unicode Byte-Order-Mark, the corresponding encoding (e.g. UTF-8, UTF-16, etc.) is used.&lt;br /&gt;
* Otherwise, the input encoding used is the default encoding that was specified when setting the filter options.&lt;br /&gt;
&lt;br /&gt;
===Inline Codes===&lt;br /&gt;
&lt;br /&gt;
The filter attempts to convert inline formatting to codes and block-level formatting to document part/skeleton content.&lt;br /&gt;
&lt;br /&gt;
The Inline Code Finder can be used to convert additional types of formatting to inline codes, as necessary.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
; Translate URLs&lt;br /&gt;
: By default, URLs in link and image statements are not exposed for translation.  If this option is enabled, they will be extracted.  &amp;lt;i&amp;gt;Note&amp;lt;/i&amp;gt;: URLs are currently extracted inline in their containing segment, rather than as a subflow.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* The filter does not currently have a configuration UI in Rainbow.&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=657</id>
		<title>Markdown Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Markdown_Filter&amp;diff=657"/>
		<updated>2017-03-27T23:07:29Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
The Markdown Filter is an Okapi component for extracting translatable text from Markdown files. See https://en.wikipedia.org/wiki/Markdown for more information about the format.&lt;br /&gt;
Markdown is a family of formats, not all of them mutually compatible.  This filter is designed to work with markdown based on the [http://commonmark.org CommonMark] specification, with additional features to support [https://guides.github.com/features/mastering-markdown/ GitHub-flavored Markdown].&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
The filter decides which encoding to use for the input file using the following logic:&lt;br /&gt;
&lt;br /&gt;
* If the file has a Unicode Byte-Order-Mark, the corresponding encoding (e.g. UTF-8, UTF-16, etc.) is used.&lt;br /&gt;
* Otherwise, the input encoding used is the default encoding that was specified when setting the filter options.&lt;br /&gt;
&lt;br /&gt;
===Inline Codes===&lt;br /&gt;
&lt;br /&gt;
The filter attempts to convert inline formatting to codes and block-level formatting to document part/skeleton content.&lt;br /&gt;
&lt;br /&gt;
The Inline Code Finder can be used to convert additional types of formatting to inline codes, as necessary.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
; Translate URLs&lt;br /&gt;
: By default, URLs in link and image statements are not exposed for translation.  If this option is enabled, they will be extracted.  &amp;lt;i&amp;gt;Note&amp;lt;/i&amp;gt;: URLs are currently extracted inline in their containing segment, rather than as a subflow.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=HTML_Filter&amp;diff=634</id>
		<title>HTML Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=HTML_Filter&amp;diff=634"/>
		<updated>2016-11-05T00:11:40Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Inline Code Finder */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
The HTML Filter is an Okapi component that implements the IFilter interface for HTML and XHTML documents.&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
===Input Encoding===&lt;br /&gt;
&lt;br /&gt;
The filter decides which encoding to use for the input document using the following logic:&lt;br /&gt;
&lt;br /&gt;
* If the document has an encoding declaration it is used. &lt;br /&gt;
* Otherwise, the input encoding used is the default encoding that was specified when setting the filter options. &lt;br /&gt;
&lt;br /&gt;
===Output Encoding===&lt;br /&gt;
&lt;br /&gt;
If the output encoding is UTF-8:&lt;br /&gt;
&lt;br /&gt;
* If the input encoding was also UTF-8, a Byte-Order-Mark is used for the output document only if one was detected in the input document. &lt;br /&gt;
* If the input encoding was not UTF-8, no Byte-Order-Mark is used in the output document. &lt;br /&gt;
&lt;br /&gt;
If the input file has no declared encoding, the filter tries to add one in the output: a &amp;lt;code&amp;gt;&amp;amp;lt;meta&amp;gt;&amp;lt;/code&amp;gt; tag for HTML files, or a &amp;lt;code&amp;gt;&amp;amp;lt;meta /&amp;gt;&amp;lt;/code&amp;gt; tag for XHTML files. The addition is done only if there is a &amp;lt;code&amp;gt;&amp;amp;lt;head&amp;gt;&amp;lt;/code&amp;gt; element in the file.&lt;br /&gt;
&lt;br /&gt;
===Line-Breaks===&lt;br /&gt;
&lt;br /&gt;
The type of line-breaks of the output is the same as the one of the original input.&lt;br /&gt;
&lt;br /&gt;
===Entities===&lt;br /&gt;
&lt;br /&gt;
Character and numeric entities are converted to Unicode. Entities defined in a DTD or schema are passed through without change.&lt;br /&gt;
&lt;br /&gt;
Note that text entity declarations can be processed by the [[DTD Filter]].&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
===Built-in Configuration===&lt;br /&gt;
&lt;br /&gt;
The HTML filter does not currently have a user interface to modify its configuration files. By default the HTML filter uses a minimalist configuration file that does not create structural groupings. For example, a table group or list group will never be created.&lt;br /&gt;
&lt;br /&gt;
There is a predefined maximalist configuration (&amp;lt;code&amp;gt;okf_html-wellFormed&amp;lt;/code&amp;gt;) that can be used if structural groupings are needed. The caveat is that any structural tags that map to groups must be well formed, that is, they must have a start and end tag. Otherwise the filter returns an error.&lt;br /&gt;
&lt;br /&gt;
===HTML Configuration Syntax===&lt;br /&gt;
&lt;br /&gt;
For the truly brave, you can create your own HTML configuration files. These configurations are written in [http://www.yaml.org/ YAML].  See the &amp;lt;code&amp;gt;[https://bitbucket.org/okapiframework/okapi/src/master/okapi/filters/html/src/main/resources/net/sf/okapi/filters/html/wellformedConfiguration.yml wellformedConfiguration.yml]&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;[https://bitbucket.org/okapiframework/okapi/src/master/okapi/filters/html/src/main/resources/net/sf/okapi/filters/html/nonwellformedConfiguration.yml nonwellformedConfiguration.yml]&amp;lt;/code&amp;gt; for examples. &lt;br /&gt;
&lt;br /&gt;
HTML tags are associated with rules. These rules are used by the filter to process the input document.&lt;br /&gt;
&lt;br /&gt;
Notes:&lt;br /&gt;
&lt;br /&gt;
* All attributes and elements names should be in '''lowercase''' in the configuration file, regardless of their casing in the document.&lt;br /&gt;
* Elements or attributes with a prefix should be declared with the prefix (and between single quotes) in the configuration (e.g. &amp;lt;code&amp;gt;'xml:lang'&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
==== Configuring Element Rules ====&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;elements&amp;lt;/code&amp;gt; section of the configuration consists of a set of key-value pairs.  Each key is an element name, and the value is the set of rules for that element, represented as another set of key-value pairs.  An element declaration should include one or more of the available element rules:&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;ruleTypes&amp;lt;/code&amp;gt;&lt;br /&gt;
| Basic description of how the filter treats this tag.  See [[#Rule Types]].&lt;br /&gt;
|-  &lt;br /&gt;
| &amp;lt;code&amp;gt;idAttributes&amp;lt;/code&amp;gt;&lt;br /&gt;
| A list containing attributes which may provide the segment ID for text contained within this element. &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;conditions&amp;lt;/code&amp;gt;&lt;br /&gt;
| A condition that further restricts this rule.  For example, to indicate that the element should only be handled if it contains an attribute with a certain value.  See [[#Condition Syntax]].&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;translatableAttributes&amp;lt;/code&amp;gt;&lt;br /&gt;
| Contains information about translatable attributes in this element. See [[#Configuring Translatable Attributes]].&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;elementType&amp;lt;/code&amp;gt;&lt;br /&gt;
| Indicates the corresponding XLIFF 1.2 &amp;lt;code&amp;gt;type&amp;lt;/code&amp;gt; value for this element.&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;writableLocalizationAttributes&amp;lt;/code&amp;gt;&lt;br /&gt;
| Specifies attributes which are writable, but not translatable. (TODO) &lt;br /&gt;
|}&lt;br /&gt;
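&lt;br /&gt;
As a minimal sketch of how these rules combine (a hypothetical fragment written for illustration, not copied from the built-in configurations), a paragraph element whose &amp;lt;code&amp;gt;id&amp;lt;/code&amp;gt; attribute is assumed to supply the segment ID might be declared as:&lt;br /&gt;
&lt;br /&gt;
 elements:&lt;br /&gt;
   p:&lt;br /&gt;
     ruleTypes: [TEXTUNIT]&lt;br /&gt;
     idAttributes: [id]&lt;br /&gt;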
&lt;br /&gt;
==== Rule Types ====&lt;br /&gt;
The rule types are the following:&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;INLINE&amp;lt;/code&amp;gt;&lt;br /&gt;
| A tag which may occur inside a text run. For example &amp;lt;code&amp;gt;&amp;amp;lt;b&amp;gt;&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;&amp;amp;lt;i&amp;gt;&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;&amp;amp;lt;u&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
|- &lt;br /&gt;
| &amp;lt;code&amp;gt;GROUP&amp;lt;/code&amp;gt;&lt;br /&gt;
| Defines a group of elements that are structurally bound. For example &amp;lt;code&amp;gt;&amp;amp;lt;table&amp;gt;&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;&amp;amp;lt;div&amp;gt;&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;&amp;amp;lt;menu&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;EXCLUDE&amp;lt;/code&amp;gt;&lt;br /&gt;
| Prevents extraction of any text until the end tag of the same element is found. For example, if the content of a &amp;lt;code&amp;gt;&amp;amp;lt;script&amp;gt;&amp;lt;/code&amp;gt; element should not be extracted, define &amp;lt;code&amp;gt;&amp;amp;lt;script&amp;gt;&amp;lt;/code&amp;gt; as &amp;lt;code&amp;gt;EXCLUDE&amp;lt;/code&amp;gt;.&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;INCLUDE&amp;lt;/code&amp;gt;&lt;br /&gt;
| Overrides any current exclusions. This allows exceptions for children of &amp;lt;code&amp;gt;EXCLUDE&amp;lt;/code&amp;gt;d elements.&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;TEXTUNIT&amp;lt;/code&amp;gt;&lt;br /&gt;
| A tag that starts a complex text unit. Examples include &amp;lt;code&amp;gt;&amp;amp;lt;p&amp;gt;&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;&amp;amp;lt;title&amp;gt;&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;&amp;amp;lt;h1&amp;gt;&amp;lt;/code&amp;gt;. Complex text units carry their surrounding tags along with any extracted text.&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;PRESERVE_WHITESPACE&amp;lt;/code&amp;gt;&lt;br /&gt;
| A tag that must preserve its white spaces as-is. For example &amp;lt;code&amp;gt;&amp;amp;lt;pre&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;ATTRIBUTES_ONLY&amp;lt;/code&amp;gt;&lt;br /&gt;
| A tag that has localizable or translatable attributes but does not have translatable content. &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;ATTRIBUTE_TRANS&amp;lt;/code&amp;gt;&lt;br /&gt;
| A translatable attribute. &lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;ATTRIBUTE_WRITABLE&amp;lt;/code&amp;gt;&lt;br /&gt;
| A writable or modifiable attribute, but not translatable.&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;ATTRIBUTE_READONLY&amp;lt;/code&amp;gt;&lt;br /&gt;
| A read-only attribute that is extracted but cannot be modified.&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==== Configuring Translatable Attributes ====&lt;br /&gt;
Translatable attributes may be specified in two ways, depending on the level of complexity needed.  &lt;br /&gt;
&lt;br /&gt;
If all the specified attributes should always be translated, they can be exposed as a simple list.  For example, the definition for the &amp;lt;code&amp;gt;&amp;amp;lt;area&amp;amp;gt;&amp;lt;/code&amp;gt; element specifies that &amp;lt;code&amp;gt;accesskey&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;area&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;alt&amp;lt;/code&amp;gt; attributes are translatable:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;  area:&lt;br /&gt;
    ruleTypes: [ATTRIBUTES_ONLY]&lt;br /&gt;
    translatableAttributes: [accesskey, area, alt]&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, if additional restrictions on translatable attributes are present, the &amp;lt;code&amp;gt;translatableAttributes&amp;lt;/code&amp;gt; rule may be specified as a set of key-value pairs, with each key being a translatable attribute and each value being an (optional) list of conditions, using the [[#Condition Syntax]].  For example, this snippet defines the handling of the &amp;lt;code&amp;gt;&amp;amp;lt;input&amp;amp;gt;&amp;lt;/code&amp;gt; element in the built-in configurations:&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;&lt;br /&gt;
  input:&lt;br /&gt;
    ruleTypes: [INLINE]&lt;br /&gt;
    translatableAttributes:&lt;br /&gt;
      alt: [type, NOT_EQUALS, [file, hidden, image, password]]&lt;br /&gt;
      value: [type, NOT_EQUALS, [file, hidden, image, password]]&lt;br /&gt;
      accesskey: [type, NOT_EQUALS, [file, hidden, image, password]]&lt;br /&gt;
      title: [type, NOT_EQUALS, [file, hidden, image, password]]&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This specifies that there are four attributes (&amp;lt;code&amp;gt;alt&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;value&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;accesskey&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;title&amp;lt;/code&amp;gt;) that are translatable.  The translatability of each of these attributes is conditional on the &amp;lt;code&amp;gt;&amp;amp;lt;input&amp;amp;gt;&amp;lt;/code&amp;gt; element not having particular &amp;lt;code&amp;gt;type&amp;lt;/code&amp;gt; values.&lt;br /&gt;
&lt;br /&gt;
==== Condition Syntax ====&lt;br /&gt;
&lt;br /&gt;
Rule conditions are expressed as a list of the form&lt;br /&gt;
 &amp;lt;nowiki&amp;gt;[attribute, operation, value]&amp;lt;/nowiki&amp;gt;&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;attribute&amp;lt;/code&amp;gt;&lt;br /&gt;
| The name of the attribute which the condition applies to.&lt;br /&gt;
|- &lt;br /&gt;
| &amp;lt;code&amp;gt;operation&amp;lt;/code&amp;gt;&lt;br /&gt;
| Available operations are &amp;lt;code&amp;gt;EQUALS&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;NOT_EQUALS&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;MATCHES&amp;lt;/code&amp;gt;.  &amp;lt;code&amp;gt;EQUALS&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;NOT_EQUALS&amp;lt;/code&amp;gt; test for (case-insensitive) string matches, while &amp;lt;code&amp;gt;MATCHES&amp;lt;/code&amp;gt; uses a regular expression.&lt;br /&gt;
|-&lt;br /&gt;
| &amp;lt;code&amp;gt;value&amp;lt;/code&amp;gt;&lt;br /&gt;
| The value of the attribute to be compared using the operation.&lt;br /&gt;
|}&lt;br /&gt;
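&lt;br /&gt;
As an illustrative sketch (a hypothetical fragment that follows the pattern of the &amp;lt;code&amp;gt;&amp;amp;lt;input&amp;amp;gt;&amp;lt;/code&amp;gt; example, not a quotation of the built-in configurations), the following would make the &amp;lt;code&amp;gt;content&amp;lt;/code&amp;gt; attribute of a &amp;lt;code&amp;gt;&amp;amp;lt;meta&amp;amp;gt;&amp;lt;/code&amp;gt; element translatable only when its &amp;lt;code&amp;gt;name&amp;lt;/code&amp;gt; attribute equals one of the listed values:&lt;br /&gt;
&lt;br /&gt;
 elements:&lt;br /&gt;
   meta:&lt;br /&gt;
     ruleTypes: [ATTRIBUTES_ONLY]&lt;br /&gt;
     translatableAttributes:&lt;br /&gt;
       content: [name, EQUALS, [keywords, description]]&lt;br /&gt;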
&lt;br /&gt;
===Inline Code Finder===&lt;br /&gt;
&lt;br /&gt;
You can define a set of regular expressions to capture spans of extracted text that should be treated as inline codes. For example, some element content may have variables that need to be protected from modification and treated as codes. Use the &amp;lt;code&amp;gt;useCodeFinder&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;codeFinderRules&amp;lt;/code&amp;gt; options for this.&lt;br /&gt;
&lt;br /&gt;
 useCodeFinder: true&lt;br /&gt;
 codeFinderRules: &amp;quot;#v1\ncount.i=1\nrule0=\\bVAR\\d\\b&amp;quot;&lt;br /&gt;
&lt;br /&gt;
You can also use this alternate syntax, which is slightly easier to read:&lt;br /&gt;
&lt;br /&gt;
 useCodeFinder: true&lt;br /&gt;
 codeFinderRules: |-&lt;br /&gt;
    #v1&lt;br /&gt;
    count.i=1&lt;br /&gt;
    rule0=\\bVAR\\d\\b&lt;br /&gt;
&lt;br /&gt;
The options above will set the text &amp;quot;&amp;lt;code&amp;gt;VAR1&amp;lt;/code&amp;gt;&amp;quot; as in-line code in the following HTML:&lt;br /&gt;
&lt;br /&gt;
 &amp;amp;lt;p&amp;gt;Number of files = VAR1&amp;amp;lt;/p&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that the regular expression is &amp;quot;&amp;lt;code&amp;gt;\bVAR\d\b&amp;lt;/code&amp;gt;&amp;quot; but you must escape the backslash in the YAML notation as well.&lt;br /&gt;
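&lt;br /&gt;
As a sketch of how several rules might be declared together, the fragment below adds a second, purely illustrative rule that protects placeholders such as &amp;lt;code&amp;gt;{0}&amp;lt;/code&amp;gt;. It assumes that &amp;lt;code&amp;gt;count.i&amp;lt;/code&amp;gt; gives the number of rules and that rules are numbered from &amp;lt;code&amp;gt;rule0&amp;lt;/code&amp;gt; upward, following the pattern of the example above:&lt;br /&gt;
&lt;br /&gt;
 useCodeFinder: true&lt;br /&gt;
 codeFinderRules: &amp;quot;#v1\ncount.i=2\nrule0=\\bVAR\\d\\b\nrule1=\\{\\d+\\}&amp;quot;&lt;br /&gt;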
&lt;br /&gt;
To facilitate the creation of code finder rules [[Rainbow - Code Finder Editor|Rainbow provides the Code Finder Editor]].&lt;br /&gt;
&lt;br /&gt;
===Character Entity References in Output===&lt;br /&gt;
&lt;br /&gt;
By default, extended characters are not written as character entity references in the output (e.g. &amp;lt;code&amp;gt;&amp;amp;copy;&amp;lt;/code&amp;gt; for the character '&amp;copy;').&lt;br /&gt;
&lt;br /&gt;
You can change this by specifying the &amp;lt;code&amp;gt;escapeCharacters&amp;lt;/code&amp;gt; rule with a string of all the characters you wish to see output as character entity references. Any specified character that is not extended or has no HTML character entity defined is processed like a normal character.&lt;br /&gt;
&lt;br /&gt;
For example, given the following rule:&lt;br /&gt;
&lt;br /&gt;
 escapeCharacters: &amp;quot;© €µÆĄ&amp;quot;&lt;br /&gt;
&lt;br /&gt;
The output of &amp;lt;code&amp;gt;&amp;amp;lt;p&amp;gt;© €µÆĄ&amp;amp;lt;/p&amp;gt;&amp;lt;/code&amp;gt; (assuming the output encoding is UTF-8) will be:&lt;br /&gt;
&lt;br /&gt;
 &amp;amp;lt;p&amp;gt;&amp;amp;amp;copy;&amp;amp;amp;nbsp;&amp;amp;amp;euro;&amp;amp;amp;micro;&amp;amp;amp;AElig;Ą&amp;amp;lt;/p&amp;gt; &lt;br /&gt;
&lt;br /&gt;
Only the character &amp;lt;code&amp;gt;Ą&amp;lt;/code&amp;gt; (U+0104) is not represented as an entity reference because there is no HTML character entity defined for it.&lt;br /&gt;
&lt;br /&gt;
===Inline CDATA===&lt;br /&gt;
&lt;br /&gt;
For formats that use CDATA in ways that undesirably break the flow of text, you can set the filter to treat CDATA as if it were an inline element, like so:&lt;br /&gt;
&lt;br /&gt;
  inlineCdata: true&lt;br /&gt;
&lt;br /&gt;
Then markup such as &amp;lt;code&amp;gt;&amp;amp;lt;p&amp;gt;Text with &amp;amp;lt;![CDATA[inline]]&amp;gt; CDATA&amp;amp;lt;/p&amp;gt;&amp;lt;/code&amp;gt; will be extracted as if &amp;lt;code&amp;gt;&amp;amp;lt;![CDATA[&amp;lt;/code&amp;gt; was a regular inline opening tag and &amp;lt;code&amp;gt;]]&amp;gt;&amp;lt;/code&amp;gt; was a regular inline closing tag.&lt;br /&gt;
&lt;br /&gt;
===Excluding By Default===&lt;br /&gt;
&lt;br /&gt;
Normally, there is an implicit &amp;quot;default rule&amp;quot; to include elements.  If the filter configuration contained no tag information at all, the default behavior of the filter would be to expose all PCDATA for translation.  Sometimes it is useful to change this behavior in order to make your configuration more concise.  This can be done by setting the &amp;lt;code&amp;gt;exclude_by_default&amp;lt;/code&amp;gt; option in your config.&lt;br /&gt;
&lt;br /&gt;
For example, if you wanted a custom configuration that exposes the &amp;lt;code&amp;gt;&amp;amp;lt;title&amp;amp;gt;&amp;lt;/code&amp;gt; element for translation but nothing else, you could specify this as:&lt;br /&gt;
&lt;br /&gt;
 exclude_by_default: true&lt;br /&gt;
 # .... other configuration&lt;br /&gt;
 elements:&lt;br /&gt;
    title:&lt;br /&gt;
      ruleTypes: [TEXTUNIT]&lt;br /&gt;
&lt;br /&gt;
===Quote Mode===&lt;br /&gt;
Escaping of quote and apostrophe (single quote) characters can be changed by adding these lines to the config file:&lt;br /&gt;
&lt;br /&gt;
 quoteModeDefined: true&lt;br /&gt;
 quoteMode: 3&lt;br /&gt;
&lt;br /&gt;
'''Current quote modes:'''&lt;br /&gt;
&lt;br /&gt;
* Do not escape single or double quotes: '''UNESCAPED = 0'''&lt;br /&gt;
* Escape single and double quotes to a named entity: '''ALL = 1'''&lt;br /&gt;
* Escape double quotes to a named entity, and single quotes to a numeric entity: '''NUMERIC_SINGLE_QUOTES = 2'''&lt;br /&gt;
* Escape double quotes only: '''DOUBLE_QUOTES_ONLY = 3'''&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* In the current version of the filter the content of &amp;lt;code&amp;gt;&amp;amp;lt;style&amp;gt;&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;&amp;amp;lt;script&amp;gt;&amp;lt;/code&amp;gt; elements is not extracted.&lt;br /&gt;
* Tags from server-side scripts such as PHP, ASPX, JSP, etc. are not formally supported and will be treated as non-translatable.&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Whitespace_Correction_Step&amp;diff=618</id>
		<title>Whitespace Correction Step</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Whitespace_Correction_Step&amp;diff=618"/>
		<updated>2016-09-22T17:38:26Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Parameters */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Steps Header}}&lt;br /&gt;
__TOC__&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
This step is intended to simplify the addition or removal of inter-segment whitespace when translating to or from Chinese or Japanese scripts that do not typically use it.  The step will perform two separate tasks, depending on the source and target locales:&lt;br /&gt;
&lt;br /&gt;
* When translating from a space-delimited language to a non-space-delimited language, whitespace following segment-ending punctuation will be removed.&lt;br /&gt;
* When translating from a non-space-delimited language to a space-delimited language, whitespace will be added following segment-ending punctuation.&lt;br /&gt;
&lt;br /&gt;
This step will perform no action when translating from one space-delimited language to another space-delimited language (for example, from English to French), or when translating between Chinese and Japanese.&lt;br /&gt;
&lt;br /&gt;
Takes: Filter events. Sends: Filter events.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
The step can be configured to apply its space adjustment to each of the following classes of punctuation:&lt;br /&gt;
&lt;br /&gt;
* Full Stop - Converts Ideographic Full Stop (U+3002) and Full-width Full Stop (U+FF0E) to/from a period.&lt;br /&gt;
* Comma - Converts Ideographic Comma (U+3001) and Full-width Comma (U+FF0C) to/from a comma.&lt;br /&gt;
* Exclamation Point - Converts Full-width Exclamation Mark (U+FF01) to/from an exclamation point.&lt;br /&gt;
* Question Mark - Converts Full-width Question Mark (U+FF1F) to/from a question mark.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
This process is not foolproof, as it relies on the assumption that each source segment contains a single sentence, and has also been translated to a single sentence in the target language.&lt;br /&gt;
&lt;br /&gt;
[[Category:Steps]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Whitespace_Correction_Step&amp;diff=617</id>
		<title>Whitespace Correction Step</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Whitespace_Correction_Step&amp;diff=617"/>
		<updated>2016-09-22T17:33:31Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: Created page with &amp;quot;{{Steps Header}} __TOC__ ==Overview==  This step is intended to simplify the addition or removal of inter-segment whitespace when translating to or from Chinese or Japanese sc...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Steps Header}}&lt;br /&gt;
__TOC__&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
This step is intended to simplify the addition or removal of inter-segment whitespace when translating to or from Chinese or Japanese scripts that do not typically use it.  The step will perform two separate tasks, depending on the source and target locales:&lt;br /&gt;
&lt;br /&gt;
* When translating from a space-delimited language to a non-space-delimited language, whitespace following segment-ending punctuation will be removed.&lt;br /&gt;
* When translating from a non-space-delimited language to a space-delimited language, whitespace will be added following segment-ending punctuation.&lt;br /&gt;
&lt;br /&gt;
This step will perform no action when translating from one space-delimited language to another space-delimited language (for example, from English to French), or when translating between Chinese and Japanese.&lt;br /&gt;
&lt;br /&gt;
Takes: Filter events. Sends: Filter events.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
The step can be configured to apply its space adjustment to each of the following classes of punctuation:&lt;br /&gt;
&lt;br /&gt;
* Full Stop - Converts Ideographic Full Stop (U+3002) and Full-width Full Stop (U+FF0E) to/from a period.&lt;br /&gt;
* Comma - Converts Ideographic Comma (U+3001) and Full-width Comma (U+FF0C) to/from a comma.&lt;br /&gt;
* Exclamation Mark - Converts Full-width Exclamation Mark (U+FF01) to/from an exclamation point.&lt;br /&gt;
* Question Mark - Converts Full-width Question Mark (U+FF1F) to/from a question mark.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
This process is not foolproof, as it relies on the assumption that each source segment contains a single sentence, and has also been translated to a single sentence in the target language.&lt;br /&gt;
&lt;br /&gt;
[[Category:Steps]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Steps&amp;diff=615</id>
		<title>Steps</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Steps&amp;diff=615"/>
		<updated>2016-09-22T17:17:44Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: Replace Add/remove whitespace steps with whitespace correction step&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Steps are components that execute one specific task. You use them by chaining them into '''pipelines'''. See for example &amp;quot;[[How to Create a Pipeline in Rainbow]]&amp;quot; to see how steps can be used. Rainbow also [[Rainbow - Utilities|comes with several pre-defined pipelines]] using some of these steps.&lt;br /&gt;
&lt;br /&gt;
The Okapi Framework comes with several ready-to-use steps:&lt;br /&gt;
&lt;br /&gt;
{| cellpadding=&amp;quot;8&amp;quot; width=&amp;quot;100%&amp;quot;&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
|&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Batch Translation Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[BOM Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Character Count Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Cleanup Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Copy Or Move Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Combined Xliff Merger Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Create Target Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Desegmentation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Diff Leverage Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Encoding Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Enrycher Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[External Command Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Extraction Verification Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Filter Events to Raw Document Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Format Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Full-Width Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Generate SimpleTM Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[GTT Batch Translation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Id-Based Copy Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Id-Based Aligner Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Image Modification Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Inconsistency Check Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Inline Codes Removal Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Inline Codes Simplifier Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[LanguageTool Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Leveraging Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Line-Break Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Localizables Check Step]]&lt;br /&gt;
|&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Microsoft Batch Translation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Microsoft Batch Submission Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Moses InlineText Extraction Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Moses InlineText Leveraging Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[MS Word Resaver Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[MS Word Search and Replace Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Original Document Xliff Merger Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Paragraph Alignment Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Post-segmentation Inline Codes Removal Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Properties Setting Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[QuEst Quality Estimation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[QuEst SVM Model Builder Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Quality Check Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Quality Check Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Raw Document to Filter Events Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='hi'&amp;gt;RD/FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Rainbow Translation Kit Creation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='hi'&amp;gt;RD/FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Rainbow Translation Kit Merging Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[RTF Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Remove Target Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Repetition Analysis Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Resource Simplifier Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Scoping Report Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Search and Replace Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Search and Replace Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Segmentation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Segments to Text Units Converter Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Sentence Alignment Step]]&lt;br /&gt;
|&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[SimpleTM to TMX Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Space Check Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Skeleton Xliff Merger Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Term Extraction Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Text Modification Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[ThreadedWorkQueue Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[TM Import Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Tokenization Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Translation Comparison Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[URI Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Used Characters Listing Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Word Count Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Simple Word Count Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Whitespace Correction Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Analysis Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Cleanup Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Export Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Import Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Translation Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[TTX Joiner Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[TTX Splitter Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XLIFF Joiner Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XLIFF Splitter Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XML Analysis Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XML Characters Fixing Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XML Validation Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XSL Transformation Step]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Legend:&lt;br /&gt;
: &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt; = the step takes raw document and sends raw document &lt;br /&gt;
: &amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt; = the step takes raw document and sends filter events&lt;br /&gt;
: &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt; = the step takes filter events and sends filter events&lt;br /&gt;
: &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt; = the step takes filter events and sends raw document&lt;br /&gt;
: &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='hi'&amp;gt;RD/FE&amp;lt;/span&amp;gt; = the step takes filter events and sends either raw document or filter events&lt;br /&gt;
&lt;br /&gt;
[[Category:Steps]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Steps&amp;diff=614</id>
		<title>Steps</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Steps&amp;diff=614"/>
		<updated>2016-08-30T23:38:44Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Steps are components that execute one specific task. You use them by chaining them into '''pipelines'''. See for example &amp;quot;[[How to Create a Pipeline in Rainbow]]&amp;quot; to see how steps can be used. Rainbow also [[Rainbow - Utilities|comes with several pre-defined pipelines]] using some of these steps.&lt;br /&gt;
&lt;br /&gt;
The Okapi Framework comes with several ready-to-use steps:&lt;br /&gt;
&lt;br /&gt;
{| cellpadding=&amp;quot;8&amp;quot; width=&amp;quot;100%&amp;quot;&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
|&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Batch Translation Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[BOM Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Character Count Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Cleanup Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Copy Or Move Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Combined Xliff Merger Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Create Target Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Desegmentation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Diff Leverage Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Encoding Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Enrycher Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[External Command Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Extraction Verification Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Filter Events to Raw Document Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Format Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Full-Width Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Generate SimpleTM Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[GTT Batch Translation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Id-Based Copy Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Id-Based Aligner Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Image Modification Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Inconsistency Check Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Inline Codes Removal Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Inline Codes Simplifier Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[LanguageTool Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Leveraging Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Line-Break Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Localizables Check Step]]&lt;br /&gt;
|&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Microsoft Batch Translation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Microsoft Batch Submission Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Moses InlineText Extraction Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Moses InlineText Leveraging Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[MS Word Resaver Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[MS Word Search and Replace Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Original Document Xliff Merger Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Paragraph Alignment Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Post-segmentation Inline Codes Removal Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Properties Setting Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[QuEst Quality Estimation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[QuEst SVM Model Builder Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Quality Check Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Quality Check Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Raw Document to Filter Events Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='hi'&amp;gt;RD/FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Rainbow Translation Kit Creation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='hi'&amp;gt;RD/FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Rainbow Translation Kit Merging Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[RTF Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Remove Target Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Repetition Analysis Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Resource Simplifier Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Scoping Report Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Search and Replace Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Search and Replace Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Segmentation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Segments to Text Units Converter Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Sentence Alignment Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[SimpleTM to TMX Step]]&lt;br /&gt;
|&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Space Check Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Skeleton Xliff Merger Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Term Extraction Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Text Modification Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[ThreadedWorkQueue Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[TM Import Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Tokenization Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Translation Comparison Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[URI Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Used Characters Listing Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Word Count Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Simple Word Count Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Add Whitespace After Kuten Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Remove Whitespace After Kuten Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Analysis Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Cleanup Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Export Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Import Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Translation Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[TTX Joiner Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[TTX Splitter Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XLIFF Joiner Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XLIFF Splitter Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XML Analysis Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XML Characters Fixing Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XML Validation Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XSL Transformation Step]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Legend:&lt;br /&gt;
: &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt; = the step takes raw document and sends raw document &lt;br /&gt;
: &amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt; = the step takes raw document and sends filter events&lt;br /&gt;
: &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt; = the step takes filter events and sends filter events&lt;br /&gt;
: &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt; = the step takes filter events and sends raw document&lt;br /&gt;
: &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='hi'&amp;gt;RD/FE&amp;lt;/span&amp;gt; = the step takes filter events and sends either raw document or filter events&lt;br /&gt;
&lt;br /&gt;
[[Category:Steps]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Steps&amp;diff=610</id>
		<title>Steps</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Steps&amp;diff=610"/>
		<updated>2016-08-30T23:07:04Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Steps are components that execute one specific task. You use them by chaining them into '''pipelines'''. For an example of how steps can be used, see &amp;quot;[[How to Create a Pipeline in Rainbow]]&amp;quot;. Rainbow also [[Rainbow - Utilities|comes with several pre-defined pipelines]] using some of these steps.&lt;br /&gt;
&lt;br /&gt;
The Okapi Framework comes with several ready-to-use steps:&lt;br /&gt;
&lt;br /&gt;
{| cellpadding=&amp;quot;8&amp;quot; width=&amp;quot;100%&amp;quot;&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
|&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Batch Translation Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[BOM Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Character Count Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Cleanup Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Copy Or Move Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Combined Xliff Merger Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Create Target Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Desegmentation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Diff Leverage Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Encoding Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Enrycher Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[External Command Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Extraction Verification Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Filter Events to Raw Document Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Format Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Full-Width Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Generate SimpleTM Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[GTT Batch Translation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Id-Based Copy Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Id-Based Aligner Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Image Modification Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Inconsistency Check Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Inline Codes Removal Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Inline Codes Simplifier Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[LanguageTool Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Leveraging Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Line-Break Conversion Step]]&lt;br /&gt;
|&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Localizables Check Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Microsoft Batch Translation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Microsoft Batch Submission Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Moses InlineText Extraction Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Moses InlineText Leveraging Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[MS Word Resaver Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[MS Word Search and Replace Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Original Document Xliff Merger Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Paragraph Alignment Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Post-segmentation Inline Codes Removal Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Properties Setting Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[QuEst Quality Estimation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[QuEst SVM Model Builder Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Quality Check Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Quality Check Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Raw Document to Filter Events Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='hi'&amp;gt;RD/FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Rainbow Translation Kit Creation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='hi'&amp;gt;RD/FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Rainbow Translation Kit Merging Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[RTF Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Remove Target Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Repetition Analysis Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Resource Simplifier Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Scoping Report Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Search and Replace Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Search and Replace Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Segmentation Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Segments to Text Units Converter Step]]&lt;br /&gt;
|&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Sentence Alignment Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[SimpleTM to TMX Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Space Check Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Skeleton Xliff Merger Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Term Extraction Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Text Modification Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[ThreadedWorkQueue Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[TM Import Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Tokenization Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Translation Comparison Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[URI Conversion Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Used Characters Listing Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Word Count Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Simple Word Count Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Add Whitespace After Kuten Step]]&lt;br /&gt;
* &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Remove Whitespace After Kuten Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Analysis Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Cleanup Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Export Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Import Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[Trados Translation Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[TTX Joiner Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[TTX Splitter Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XLIFF Joiner Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XLIFF Splitter Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XML Analysis Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XML Characters Fixing Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XML Validation Step]]&lt;br /&gt;
* &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt;&amp;amp;nbsp; [[XSL Transformation Step]]&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Legend:&lt;br /&gt;
: &amp;lt;span class='blue'&amp;gt;RD-&amp;gt;RD&amp;lt;/span&amp;gt; = the step takes raw document and sends raw document &lt;br /&gt;
: &amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt; = the step takes raw document and sends filter events&lt;br /&gt;
: &amp;lt;span class='green'&amp;gt;FE-&amp;gt;FE&amp;lt;/span&amp;gt; = the step takes filter events and sends filter events&lt;br /&gt;
: &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='blue'&amp;gt;RD&amp;lt;/span&amp;gt; = the step takes filter events and sends raw document&lt;br /&gt;
: &amp;lt;span class='green'&amp;gt;FE&amp;lt;/span&amp;gt;-&amp;gt;&amp;lt;span class='hi'&amp;gt;RD/FE&amp;lt;/span&amp;gt; = the step takes filter events and sends either raw document or filter events&lt;br /&gt;
&lt;br /&gt;
[[Category:Steps]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Rainbow_-_Command_Line&amp;diff=240</id>
		<title>Rainbow - Command Line</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Rainbow_-_Command_Line&amp;diff=240"/>
		<updated>2016-04-23T23:56:36Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Rainbow Common Menu}}&lt;br /&gt;
&lt;br /&gt;
Rainbow behaves differently depending on the arguments it is started with:&lt;br /&gt;
&lt;br /&gt;
* If Rainbow is started with just one argument: it starts in normal mode and takes the argument as a project file to be loaded.&lt;br /&gt;
&lt;br /&gt;
* If Rainbow is started with more than one argument: it starts in command-line mode and interprets the arguments as described in the table below.&lt;br /&gt;
&lt;br /&gt;
When running in batch mode, the log is saved into a file named &amp;lt;code&amp;gt;rainbowBatchLog.txt&amp;lt;/code&amp;gt; in the home directory of the user.&lt;br /&gt;
&lt;br /&gt;
Note that you can also use [[Tikal]] to execute various functions from a command line.&lt;br /&gt;
&lt;br /&gt;
The arguments of the command-line can be the following:&lt;br /&gt;
&lt;br /&gt;
{| border=&amp;quot;1&amp;quot; cellpadding=&amp;quot;5&amp;quot; cellspacing=&amp;quot;0&amp;quot;&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;&amp;amp;lt;inputFile&amp;gt;[ -fc &amp;amp;lt;filterConfiguration&amp;gt;]&amp;lt;/code&amp;gt;&lt;br /&gt;
| Sets the input file, and optionally sets the filter configuration to assign to it. You can specify an absolute or a local filename. The input file root is reset to the folder of the given input file. If a project was loaded, all input files in that project are removed and the input file root is reset.&lt;br /&gt;
&lt;br /&gt;
If you specify several input files (and their filter configurations) the first one will be assigned to the &amp;lt;cite&amp;gt;Input List 1&amp;lt;/cite&amp;gt;, the second to the &amp;lt;cite&amp;gt;Input List 2&amp;lt;/cite&amp;gt;, etc.&lt;br /&gt;
&lt;br /&gt;
If the filter configuration is not specified in the command line, the default filter (if one can be found) is used.&lt;br /&gt;
Input files must be specified prior to an output location being specified (via &amp;lt;code&amp;gt;-o&amp;lt;/code&amp;gt;), and the &amp;lt;code&amp;gt;-fc&amp;lt;/code&amp;gt; option must always follow an input file.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-p &amp;amp;lt;projectFilename&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Loads an existing project file &amp;lt;code&amp;gt;&amp;amp;lt;projectFilename&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-x &amp;amp;lt;Id&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Executes the [[Rainbow - Utilities|utility or the predefined pipeline]] with the ID &amp;lt;code&amp;gt;&amp;amp;lt;Id&amp;gt;&amp;lt;/code&amp;gt;. This is done after all arguments of the command line have been processed.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-pln &amp;amp;lt;pipelineFilename&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Loads and executes the specified pipeline stored in &amp;lt;code&amp;gt;&amp;amp;lt;pipelineFilename&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-se &amp;amp;lt;encoding&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Sets the default source encoding to &amp;lt;code&amp;gt;&amp;amp;lt;encoding&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-te &amp;amp;lt;encoding&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Sets the default target encoding to &amp;lt;code&amp;gt;&amp;amp;lt;encoding&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-sl &amp;amp;lt;langCode&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Sets the source language using &amp;lt;code&amp;gt;&amp;amp;lt;langCode&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-tl &amp;amp;lt;langCode&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Sets the target language using &amp;lt;code&amp;gt;&amp;amp;lt;langCode&amp;gt;&amp;lt;/code&amp;gt;.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-opt &amp;amp;lt;optionFilename&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Sets the options file to use for the utility to execute. Use the &amp;lt;code&amp;gt;-np&amp;lt;/code&amp;gt; flag to suppress the prompt to modify the options when the command line is executed. The options file must be for the utility defined with &amp;lt;code&amp;gt;-x&amp;lt;/code&amp;gt;. Note that options files are only for utilities, not predefined pipelines.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-log &amp;amp;lt;logFile&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Sets the path to the log file. If not specified &amp;lt;code&amp;gt;{user.home}/rainbowBatchLog.txt&amp;lt;/code&amp;gt; is used.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-np&amp;lt;/code&amp;gt;&lt;br /&gt;
| No prompt for utility's options.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-o &amp;amp;lt;outputFile&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Sets the output file. If this option is not used and an input file is specified, the output file path and name are built based on the output options of the project (loaded or default). If this option is specified before an input file is provided, an error will be reported in the log.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-pd &amp;amp;lt;directory&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Sets the parameters directory (the location where the filter parameters files are stored). You can use &amp;lt;code&amp;gt;.&amp;lt;/code&amp;gt; (dot) to specify the current directory. By default, if no project is loaded, the default parameters directory is the user home directory.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-ir &amp;amp;lt;directory&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Sets the input root directory for the first input list. You can use &amp;lt;code&amp;gt;.&amp;lt;/code&amp;gt; (dot) to specify the current directory. This value is also used to set the &amp;lt;code&amp;gt;${inputRootDir}&amp;lt;/code&amp;gt; variable that can be used in some path parameters.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-rd &amp;amp;lt;directory&amp;gt;&amp;lt;/code&amp;gt;&lt;br /&gt;
| Sets the root directory. You can use &amp;lt;code&amp;gt;.&amp;lt;/code&amp;gt; (dot) to specify the current directory. This value is also used to set the &amp;lt;code&amp;gt;${rootDir}&amp;lt;/code&amp;gt; variable that can be used in some path parameters.&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
| &amp;lt;code&amp;gt;-? or -h&amp;lt;/code&amp;gt;&lt;br /&gt;
| Opens this help page.&lt;br /&gt;
|}&lt;br /&gt;
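&lt;br /&gt;
The ordering rules in the table above (input files are assigned to the input lists in the order given, &amp;lt;code&amp;gt;-fc&amp;lt;/code&amp;gt; applies to the input file it follows, and &amp;lt;code&amp;gt;-o&amp;lt;/code&amp;gt; is only valid once an input file has been given) can be illustrated with a minimal Python sketch. It is not Rainbow's actual parser: the function name &amp;lt;code&amp;gt;assign_inputs&amp;lt;/code&amp;gt;, the returned structure and the error messages are hypothetical. The sketch also assumes that &amp;lt;code&amp;gt;-fc&amp;lt;/code&amp;gt; must come immediately after its input file, and that every option other than &amp;lt;code&amp;gt;-np&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;-?&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;-h&amp;lt;/code&amp;gt; takes exactly one value.&lt;br /&gt;
&lt;br /&gt;

```python
# Hypothetical illustration of the argument-ordering rules; not Rainbow's real parser.
FLAGS_WITHOUT_VALUE = {"-np", "-?", "-h"}


def assign_inputs(args):
    """Return (input_lists, output_file, errors) for a Rainbow-style argument list.

    input_lists[0] stands for Input List 1, input_lists[1] for Input List 2, etc.
    A filter_config of None means "use the default filter, if one can be found".
    """
    input_lists = []
    output_file = None
    errors = []
    last_was_input = False  # True only right after an input file was read

    it = iter(args)
    for arg in it:
        if arg == "-fc":
            value = next(it, None)
            if last_was_input and value is not None:
                input_lists[-1]["filter_config"] = value
            else:
                errors.append("-fc must follow an input file")
            last_was_input = False
        elif arg == "-o":
            value = next(it, None)
            if input_lists and value is not None:
                output_file = value
            else:
                errors.append("-o must come after an input file")
            last_was_input = False
        elif arg in FLAGS_WITHOUT_VALUE:
            last_was_input = False
        elif arg.startswith("-"):
            next(it, None)  # assumed: skip the single value of -x, -sl, -tl, -pd, ...
            last_was_input = False
        else:
            # Anything that is not an option is treated as the next input file.
            input_lists.append({"file": arg, "filter_config": None})
            last_was_input = True
    return input_lists, output_file, errors


if __name__ == "__main__":
    demo = ["-x", "TranslationComparison", "-sl", "EN", "-tl", "FR", "-pd", ".",
            "myHumanTrans.xlf", "myMachineTrans.txt", "-fc", "okf_regex@myText"]
    print(assign_inputs(demo))
```

In this sketch an input file without a following &amp;lt;code&amp;gt;-fc&amp;lt;/code&amp;gt; keeps a filter configuration of &amp;lt;code&amp;gt;None&amp;lt;/code&amp;gt;, which stands for the default filter described in the table.&lt;br /&gt;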
&lt;br /&gt;
&lt;br /&gt;
Here are some examples of command lines in Windows. They assume Rainbow is installed in the &amp;lt;code&amp;gt;C:\rnb&amp;lt;/code&amp;gt; directory.&lt;br /&gt;
&lt;br /&gt;
 C:\&amp;gt;java -jar \rnb\lib\rainbow.jar -x TextRewriting -sl EN -tl FR myInput.xlf -o myOutput.xlf&lt;br /&gt;
&lt;br /&gt;
The command-line above executes the Text Rewriting predefined pipeline with the source language set to EN and the target language set to FR. The input document is the XLIFF file &amp;lt;code&amp;gt;myInput.xlf&amp;lt;/code&amp;gt;, and the modified file is saved as &amp;lt;code&amp;gt;myOutput.xlf&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
 C:\&amp;gt;java -jar \rnb\lib\rainbow.jar -x TranslationComparison -sl EN -tl FR -pd . myHumanTrans.xlf myMachineTrans.txt -fc okf_regex@myText&lt;br /&gt;
&lt;br /&gt;
The command-line above executes the Translation Comparison predefined pipeline with the source language set to EN and the target language set to FR. The current folder (&amp;lt;code&amp;gt;.&amp;lt;/code&amp;gt;) is specified as the parameters directory. The input file &amp;lt;code&amp;gt;myHumanTrans.xlf&amp;lt;/code&amp;gt; is the input document for the &amp;lt;cite&amp;gt;Input List 1&amp;lt;/cite&amp;gt;, and the default XLIFF filter configuration is assigned to it. The input file &amp;lt;code&amp;gt;myMachineTrans.txt&amp;lt;/code&amp;gt; is the input document for the &amp;lt;cite&amp;gt;Input List 2&amp;lt;/cite&amp;gt;, and the custom filter parameters file &amp;lt;code&amp;gt;okf_regex@myText.fprm&amp;lt;/code&amp;gt; is associated with it. No utility options are specified, so the user will be prompted to set the options.&lt;br /&gt;
&lt;br /&gt;
 C:\&amp;gt;java -jar \rnb\lib\rainbow.jar -h&lt;br /&gt;
&lt;br /&gt;
The command-line above opens this help page.&lt;br /&gt;
&lt;br /&gt;
[[Category:Rainbow]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Id-Based_Aligner_Step&amp;diff=372</id>
		<title>Id-Based Aligner Step</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Id-Based_Aligner_Step&amp;diff=372"/>
		<updated>2016-03-18T19:41:01Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Steps Header}}&lt;br /&gt;
__TOC__&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
This step aligns the text units of two input files based on matching ids. The ids are taken from the name (&amp;lt;code&amp;gt;TextUnit.getName()&amp;lt;/code&amp;gt;) of each text unit. Any filter that produces unique names (i.e., ids) for its text units will work with this aligner, for example the [[Properties Filter]].&lt;br /&gt;
&lt;br /&gt;
Takes: filter events. Sends: filter events.&lt;br /&gt;
&lt;br /&gt;
If the option &amp;lt;cite&amp;gt;Generate a TMX file&amp;lt;/cite&amp;gt; is set, the events returned are unchanged. If the option is not set, each text unit in the returned events is a new (aligned) bilingual text unit. The text units that are in the source but not in the target generate a warning. The text units that are in the target but not in the source are ignored.&lt;br /&gt;
&lt;br /&gt;
The process expects both inputs to be non-segmented.&lt;br /&gt;
&lt;br /&gt;
The 'source' file is the first input file and provides the source content. The 'target' file is the second input file and provides the target content. If the 'target' file is in a monolingual format (like a Java properties file), the source extracted from that file is used as target content. If the 'target' file is a multilingual file (like an XLIFF document), the target extracted from that file is used as target content.&lt;br /&gt;
&lt;br /&gt;
If the 'target' file is multilingual, in addition to matching the ids, the step checks that the two text units with the same name also have the same source text. If they do not, no alignment is made.&lt;br /&gt;
&lt;br /&gt;
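The matching rule described above can be sketched with two plain &amp;lt;code&amp;gt;java.util.Properties&amp;lt;/code&amp;gt; objects standing in for the extracted text units. This is only an illustration of the id-matching logic, not the step's actual implementation: the class &amp;lt;code&amp;gt;IdAlignSketch&amp;lt;/code&amp;gt; and its &amp;lt;code&amp;gt;align&amp;lt;/code&amp;gt; method are hypothetical names, and the real step works on filter events rather than on property maps.&lt;br /&gt;
&lt;br /&gt;
```java
import java.util.Properties;

// Hypothetical illustration of id-based alignment; not part of the Okapi API.
final class IdAlignSketch {

    /**
     * For each id present in both inputs, stores "sourceText = targetText"
     * under that id in 'aligned'. Returns the number of source ids that have
     * no counterpart in the target (the cases where the step would warn).
     */
    static int align(Properties source, Properties target, Properties aligned) {
        int missing = 0;
        for (String id : source.stringPropertyNames()) {
            String targetText = target.getProperty(id);
            if (targetText == null) {
                // In the source but not in the target: counted as a warning.
                missing++;
            } else {
                aligned.setProperty(id, source.getProperty(id) + " = " + targetText);
            }
        }
        // Ids that exist only in the target are never visited, so they are ignored.
        return missing;
    }

    public static void main(String[] args) {
        Properties source = new Properties();
        source.setProperty("greeting", "Hello");
        source.setProperty("farewell", "Goodbye");

        Properties target = new Properties();
        target.setProperty("greeting", "Bonjour");
        target.setProperty("extra", "Ignored");

        Properties aligned = new Properties();
        int warnings = align(source, target, aligned);
        System.out.println(aligned.getProperty("greeting"));
        System.out.println("warnings: " + warnings);
    }
}
```
&lt;br /&gt;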
==Parameters==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Generate a TMX file&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to produce a TMX file. When this option is set, the events returned by the step are unchanged. When this option is not set, each text unit in the returned events is a new text unit with the aligned source and target content. &amp;lt;b&amp;gt;Note&amp;lt;/b&amp;gt;: for target &amp;lt;code&amp;gt;&amp;amp;lt;tuv&amp;amp;gt;&amp;lt;/code&amp;gt; data to be generated in the TMX, the &amp;quot;Copy to/over the target&amp;quot; option must also be checked.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;TMX output path&amp;lt;/cite&amp;gt; &amp;amp;mdash; Output path of the TMX file.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Fall back to source text&amp;lt;/cite&amp;gt; &amp;amp;mdash; If no target text is available, use the source text.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Copy to/over the target&amp;lt;/cite&amp;gt; &amp;amp;mdash; Copy the target. Existing target will be lost, and the target will not be segmented. If the entry from the 'target' file is set to ''approved'', the property is passed along too.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Create an alternate translation annotation&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to attach an alternate translation annotation to the processed entry.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;Suppress TUs with no target&amp;lt;/cite&amp;gt; &amp;amp;mdash; Set this option to prevent the step from passing on any text units that lack a target.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* Assumes that each text unit has a unique name value. Make sure the filter being used is one that produces unique names (&amp;lt;code&amp;gt;TextUnit.getName()&amp;lt;/code&amp;gt;) for all text units in the documents.&lt;br /&gt;
* This step aligns the text units, not the possible segments inside the text units.&lt;br /&gt;
&lt;br /&gt;
[[Category:Steps]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=OpenXML_Filter&amp;diff=20</id>
		<title>OpenXML Filter</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=OpenXML_Filter&amp;diff=20"/>
		<updated>2015-11-28T06:15:35Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Filters Header}}&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
This filter allows you to process the different types of documents of the Microsoft Office suite from 2007 and later, such as DOCX (text documents), XLSX (spreadsheets) and PPTX (presentations).  These documents are based on the OpenXML format, as opposed to the binary formats used by pre-2007 versions of Office.&lt;br /&gt;
&lt;br /&gt;
==Processing Details==&lt;br /&gt;
&lt;br /&gt;
TODO&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
The filter parameters are divided into '''General Options''', which apply to all formats, and format-specific options.&lt;br /&gt;
&lt;br /&gt;
===General Options===&lt;br /&gt;
; Translate Document Properties&lt;br /&gt;
: When checked, exposes the following document properties for translation: title, subject, creator, description, category, keywords, content status. Default: on.&lt;br /&gt;
; Translate Comments&lt;br /&gt;
: When checked, exposes document comments for translation.  Default: on.&lt;br /&gt;
; Clean Tags Aggressively&lt;br /&gt;
: When checked, strips additional formatting tags related to text spacing.  This is meant to improve filtering in cases where Office documents were converted from other formats (in particular, PDF), and imperfect conversion added a lot of extra formatting noise.  Default: off.&lt;br /&gt;
&lt;br /&gt;
=== Word Options ===&lt;br /&gt;
; Translated Headers and Footers&lt;br /&gt;
: When checked, exposes header and footer content for translation.  Default: on.&lt;br /&gt;
; Translated Hidden Text&lt;br /&gt;
: When checked, exposes hidden text for translation.  Default: on.&lt;br /&gt;
; Exclude Graphical Metadata&lt;br /&gt;
: When not checked, labels associated with drawings and word art are exposed for translation.  When checked, these labels (which are frequently not displayed in the document) are suppressed. Default: off.&lt;br /&gt;
; Styles to Exclude&lt;br /&gt;
: Text using any of the selected styles will not be exposed for translation. Default: none.&lt;br /&gt;
&lt;br /&gt;
=== Excel Options ===&lt;br /&gt;
; Translate Hidden Rows and Columns&lt;br /&gt;
: When checked, hidden rows and columns are exposed for translation.  Default: off.&lt;br /&gt;
; Exclude Marked Columns in Each Sheet&lt;br /&gt;
: When checked, columns selected in the &amp;quot;Sheet # Columns to Exclude&amp;quot; lists will be excluded from translation.  The filter allows for sheets 1 and 2 to be configured individually.  Sheets 3 and higher must be configured as a single group.  Default: off.&lt;br /&gt;
; Colors to Exclude&lt;br /&gt;
: Text with a foreground color matching any of the selected colors in this option will be excluded from translation.  These colors correspond to the standard color palette of Excel 2010.  The configuration itself stores these values as RGB, so specific colors not explicitly listed here may be excluded by modifying the .fprm file by hand.  Default: none.&lt;br /&gt;
&lt;br /&gt;
=== PowerPoint Options ===&lt;br /&gt;
; Translate Notes&lt;br /&gt;
: When checked, exposes slide notes for translation.  Default: off.&lt;br /&gt;
; Translate Masters&lt;br /&gt;
: When checked, exposes master slides for translation.  This also exposes for translation the content of layouts that are currently in use by at least one slide.  Default: off.&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* Various, see [https://bitbucket.org/okapiframework/okapi/issues?status=new&amp;amp;title=~OpenXML the issues list].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Filters]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=KantanMT_Connector&amp;diff=154</id>
		<title>KantanMT Connector</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=KantanMT_Connector&amp;diff=154"/>
		<updated>2015-11-13T01:28:03Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: Created page with &amp;quot;{{Connectors Header}} __TOC__ ==Overview==  The commercial [https://kantanmt.com KantanMT] service can be accessed via API, which is documented at http://docs.kantanmt.apiary.io....&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;{{Connectors Header}}&lt;br /&gt;
__TOC__&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
The commercial [https://kantanmt.com KantanMT] service can be accessed via API, which is documented at http://docs.kantanmt.apiary.io.&lt;br /&gt;
&lt;br /&gt;
The connector assumes that the specified KantanMT engine is running, which is not always the case.  Users must start the appropriate engine prior to using the connector, either via the KantanMT dashboard or using the API (for example, with &amp;lt;tt&amp;gt;curl&amp;lt;/tt&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
==Using the Connector==&lt;br /&gt;
&lt;br /&gt;
In [[Rainbow]], the connector can be accessed through the [[Leveraging Step]].  It can also be called programmatically.&lt;br /&gt;
&lt;br /&gt;
==Parameters==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;KantanMT Client Profile&amp;lt;/cite&amp;gt; (internal name: &amp;lt;tt&amp;gt;profileName&amp;lt;/tt&amp;gt;) &amp;amp;mdash; the client profile to use.  (Sample value: &amp;quot;Test-EN-DE&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;cite&amp;gt;KantanMT Authorization Token&amp;lt;/cite&amp;gt; (internal name: &amp;lt;tt&amp;gt;apiToken&amp;lt;/tt&amp;gt;) &amp;amp;mdash; the authorization token.  (Sample value: &amp;quot;ABCdef123467&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
==Limitations==&lt;br /&gt;
&lt;br /&gt;
* The connector assumes that the specified KantanMT engine is running, which is not always the case.  Users must start the appropriate engine prior to using the connector, either via the KantanMT dashboard or using the API (for example, with &amp;lt;tt&amp;gt;curl&amp;lt;/tt&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
[[Category:Connectors]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Longhorn&amp;diff=206</id>
		<title>Longhorn</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Longhorn&amp;diff=206"/>
		<updated>2015-06-16T21:19:49Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* Download and Installation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__TOC__&lt;br /&gt;
==Overview==&lt;br /&gt;
&lt;br /&gt;
Longhorn is a server application that allows you to execute Batch Configurations remotely on any set of input files. Batch Configurations, which include pre-defined pipelines and filter configurations, can be exported from [[Rainbow]].&lt;br /&gt;
&lt;br /&gt;
The distribution also includes a client library to access the Longhorn Web services.&lt;br /&gt;
&lt;br /&gt;
==Download and Installation==&lt;br /&gt;
&lt;br /&gt;
* '''Stable release''': http://bintray.com/okapi/Distribution/Longhorn&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;del&amp;gt;Development release (snapshot): http://okapi.opentag.com/snapshots&amp;lt;/del&amp;gt;  Development snapshots are not currently available.&lt;br /&gt;
&lt;br /&gt;
To install Longhorn:&lt;br /&gt;
&lt;br /&gt;
* Unzip the distribution file on your server.&lt;br /&gt;
* Follow the instructions provided with the &amp;lt;code&amp;gt;readme&amp;lt;/code&amp;gt; file of the distribution.&lt;br /&gt;
* Starting with m24, Longhorn requires Java 1.7.&lt;br /&gt;
&lt;br /&gt;
==Functionality==&lt;br /&gt;
&lt;br /&gt;
To process files with Longhorn these steps are required:&lt;br /&gt;
# Create a temporary project&lt;br /&gt;
# Upload a Batch Configuration file into that project&lt;br /&gt;
# Upload the input files into that project&lt;br /&gt;
# Execute the project&lt;br /&gt;
# Download the output files&lt;br /&gt;
# Delete the project&lt;br /&gt;
&lt;br /&gt;
==Usage==&lt;br /&gt;
&lt;br /&gt;
There are three ways to access Longhorn's functionality:&lt;br /&gt;
* a REST interface,&lt;br /&gt;
* a Java API and&lt;br /&gt;
* an HTML client.&lt;br /&gt;
&lt;br /&gt;
They can be used as described below.&lt;br /&gt;
&lt;br /&gt;
===REST-Interface===&lt;br /&gt;
&lt;br /&gt;
Longhorn can be accessed directly via HTTP methods:&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/new : Creates a new temporary project and returns its URI (e.g. &amp;lt;code&amp;gt;http://localhost/okapi-longhorn/projects/1&amp;lt;/code&amp;gt;)&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/1/batchConfiguration : Uploads a Batch Configuration file&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/1/inputFiles.zip : Adds input files as a zip archive (the zip will be extracted and the included files will be used as input files)&lt;br /&gt;
;PUT http://{host}/okapi-longhorn/projects/1/inputFiles/help.html : Uploads a file that will have the name 'help.html'&lt;br /&gt;
;GET http://{host}/okapi-longhorn/projects/1/inputFiles/help.html : Retrieves an input file that was previously added with PUT or POST&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/1/tasks/execute : Executes the Batch Configuration on the uploaded input files&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/1/tasks/execute/en-US/de-DE : Executes the Batch Configuration on the uploaded input files with the source language set to 'en-US' and the target language set to 'de-DE'&lt;br /&gt;
;POST http://{host}/okapi-longhorn/projects/1/tasks/execute/en-US?targets=de-DE&amp;amp;targets=fr-FR : Executes the Batch Configuration on the uploaded input files with the source language set to 'en-US' and multiple target languages, 'de-DE' and 'fr-FR'&lt;br /&gt;
;GET http://{host}/okapi-longhorn/projects/1/outputFiles : Returns a list of the output files generated&lt;br /&gt;
;GET http://{host}/okapi-longhorn/projects/1/outputFiles/help.out.html : Accesses the output file 'help.out.html' directly&lt;br /&gt;
;GET http://{host}/okapi-longhorn/projects/1/outputFiles.zip : Returns all output files in a zip archive&lt;br /&gt;
;DELETE http://{host}/okapi-longhorn/projects/1 : Deletes the project&lt;br /&gt;
;GET http://{host}/okapi-longhorn/projects : Returns a list of all projects on the server&lt;br /&gt;
&lt;br /&gt;
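The endpoint layout listed above can be captured in a small helper that only builds the URIs; it does not send any request. This is a sketch under stated assumptions: the class &amp;lt;code&amp;gt;LonghornEndpoints&amp;lt;/code&amp;gt; and its method names are hypothetical and not part of the Longhorn distribution, the host is a placeholder, and file names are assumed to need no URL encoding.&lt;br /&gt;
&lt;br /&gt;
```java
import java.net.URI;

// Hypothetical helper that mirrors the REST paths listed above.
final class LonghornEndpoints {
    private final String base;

    LonghornEndpoints(String host) {
        // The host is a placeholder; the context path comes from the list above.
        this.base = "http://" + host + "/okapi-longhorn";
    }

    // POST: creates a new temporary project.
    URI newProject() {
        return URI.create(base + "/projects/new");
    }

    // DELETE: removes the project.
    URI project(int id) {
        return URI.create(base + "/projects/" + id);
    }

    // POST: uploads a Batch Configuration file.
    URI batchConfiguration(int id) {
        return URI.create(base + "/projects/" + id + "/batchConfiguration");
    }

    // PUT or GET: a single input file (name assumed to be URL-safe).
    URI inputFile(int id, String name) {
        return URI.create(base + "/projects/" + id + "/inputFiles/" + name);
    }

    // POST: executes the Batch Configuration for one language pair.
    URI execute(int id, String sourceLang, String targetLang) {
        return URI.create(base + "/projects/" + id + "/tasks/execute/"
                + sourceLang + "/" + targetLang);
    }

    // GET: all output files as a zip archive.
    URI outputFilesZip(int id) {
        return URI.create(base + "/projects/" + id + "/outputFiles.zip");
    }

    public static void main(String[] args) {
        LonghornEndpoints endpoints = new LonghornEndpoints("localhost");
        System.out.println("POST   " + endpoints.newProject());
        System.out.println("POST   " + endpoints.batchConfiguration(1));
        System.out.println("PUT    " + endpoints.inputFile(1, "help.html"));
        System.out.println("POST   " + endpoints.execute(1, "en-US", "de-DE"));
        System.out.println("GET    " + endpoints.outputFilesZip(1));
        System.out.println("DELETE " + endpoints.project(1));
    }
}
```
&lt;br /&gt;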
===Java API===&lt;br /&gt;
&lt;br /&gt;
The API is distributed as a &amp;lt;code&amp;gt;.jar&amp;lt;/code&amp;gt; file in the Longhorn distribution package. You can also build it from the Okapi source code via Maven from the project &amp;lt;code&amp;gt;lib-longhorn-api&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
====Maven====&lt;br /&gt;
The API is available as a Maven dependency.  Add this repository to your &amp;lt;tt&amp;gt;pom.xml&amp;lt;/tt&amp;gt;:&lt;br /&gt;
    &amp;lt;repository&amp;gt;&lt;br /&gt;
        &amp;lt;id&amp;gt;okapi-longhorn-release&amp;lt;/id&amp;gt;&lt;br /&gt;
        &amp;lt;name&amp;gt;Okapi Longhorn Release&amp;lt;/name&amp;gt;&lt;br /&gt;
        &amp;lt;url&amp;gt;http://repository-opentag.forge.cloudbees.com/release/&amp;lt;/url&amp;gt;&lt;br /&gt;
    &amp;lt;/repository&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then add this dependency, substituting a valid version number (e.g., &amp;lt;tt&amp;gt;0.27&amp;lt;/tt&amp;gt;) for &amp;lt;tt&amp;gt;${okapi.version}&amp;lt;/tt&amp;gt;:&lt;br /&gt;
    &amp;lt;dependency&amp;gt;&lt;br /&gt;
      &amp;lt;groupId&amp;gt;net.sf.okapi.lib&amp;lt;/groupId&amp;gt;&lt;br /&gt;
      &amp;lt;artifactId&amp;gt;okapi-lib-longhorn-api&amp;lt;/artifactId&amp;gt;&lt;br /&gt;
      &amp;lt;version&amp;gt;${okapi.version}&amp;lt;/version&amp;gt;&lt;br /&gt;
    &amp;lt;/dependency&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Sample Code====&lt;br /&gt;
&lt;br /&gt;
 LonghornService ws = new RESTService(new URI(&amp;quot;http://localhost:9095/okapi-longhorn&amp;quot;));&lt;br /&gt;
 &lt;br /&gt;
 // Create project&lt;br /&gt;
 LonghornProject proj = ws.createProject();&lt;br /&gt;
 &lt;br /&gt;
 // Post batch configuration&lt;br /&gt;
 File bconfFile = new File(&amp;quot;C:\\setup.bconf&amp;quot;);&lt;br /&gt;
 proj.addBatchConfiguration(bconfFile);&lt;br /&gt;
 &lt;br /&gt;
 // Send input files&lt;br /&gt;
 &lt;br /&gt;
 // First by single upload...&lt;br /&gt;
 File file1 = new File(&amp;quot;C:\\help.html&amp;quot;);&lt;br /&gt;
 // * in the root directory&lt;br /&gt;
 proj.addInputFile(file1, file1.getName());&lt;br /&gt;
 // * and in a sub-directory&lt;br /&gt;
 proj.addInputFile(file1, &amp;quot;samefile/&amp;quot; + file1.getName());&lt;br /&gt;
 &lt;br /&gt;
 // ...then by package upload&lt;br /&gt;
 File inputPackage = new File(&amp;quot;C:\\more_files.zip&amp;quot;);&lt;br /&gt;
 proj.addInputFilesFromZip(inputPackage);&lt;br /&gt;
 &lt;br /&gt;
 // Execute pipeline&lt;br /&gt;
 // Languages don't matter&lt;br /&gt;
 proj.executePipeline();&lt;br /&gt;
 // Languages matter&lt;br /&gt;
 proj.executePipeline(&amp;quot;en-US&amp;quot;, &amp;quot;de-DE&amp;quot;);&lt;br /&gt;
 &lt;br /&gt;
 // Get output files&lt;br /&gt;
 ArrayList&amp;lt;LonghornFile&amp;gt; outputFiles = proj.getOutputFiles();&lt;br /&gt;
 &lt;br /&gt;
 // Fetch each output file&lt;br /&gt;
 for (LonghornFile of : outputFiles) {&lt;br /&gt;
 	InputStream is = of.openStream();&lt;br /&gt;
 	//TODO save InputStream to local file&lt;br /&gt;
 }&lt;br /&gt;
 &lt;br /&gt;
 // Delete project&lt;br /&gt;
 proj.delete();&lt;br /&gt;
&lt;br /&gt;
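The sample above leaves saving each output stream as a TODO. One way to fill it in, using only the Java standard library, is sketched here. The class &amp;lt;code&amp;gt;OutputSaver&amp;lt;/code&amp;gt; and its &amp;lt;code&amp;gt;save&amp;lt;/code&amp;gt; method are hypothetical names, not part of the Longhorn API, and the relative path of each file is assumed to be supplied by the caller.&lt;br /&gt;
&lt;br /&gt;
```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; not part of the Longhorn client library.
final class OutputSaver {

    /**
     * Copies the stream to targetDir/relativePath, creating sub-directories
     * as needed and overwriting any existing file. The stream is closed
     * when the copy is done.
     */
    static Path save(InputStream in, Path targetDir, String relativePath) throws IOException {
        Path target = targetDir.resolve(relativePath).normalize();
        Files.createDirectories(target.getParent());
        try (InputStream stream = in) {
            Files.copy(stream, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by a Longhorn output file.
        InputStream demo = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        Path dir = Files.createTempDirectory("longhorn-out");
        Path saved = save(demo, dir, "samefile/help.out.html");
        System.out.println("Saved to " + saved);
    }
}
```
&lt;br /&gt;
Inside the loop of the sample, the call would take the stream returned by &amp;lt;code&amp;gt;of.openStream()&amp;lt;/code&amp;gt; as its first argument.&lt;br /&gt;
&lt;br /&gt;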
===HTML-Client===&lt;br /&gt;
&lt;br /&gt;
You can create projects and upload/download files via an integrated HTML client, too. Uploading input files (and downloading output files) as a zip archive is currently not implemented for the HTML client.&lt;br /&gt;
&lt;br /&gt;
[[File:longhorn_html_client.png]]&lt;br /&gt;
&lt;br /&gt;
===Configuration===&lt;br /&gt;
Since Okapi M22, Longhorn can be built so that multiple instances run in one JBoss application server. To do so, call the build with an additional parameter:&lt;br /&gt;
&lt;br /&gt;
 mvn clean verify -DuseUniqueContextRoot&lt;br /&gt;
&lt;br /&gt;
====Configure working directory path====&lt;br /&gt;
Longhorn offers two ways to configure its working directory (sorted by priority):&lt;br /&gt;
#system parameter &amp;quot;LONGHORN_WORKDIR&amp;quot;&lt;br /&gt;
#configuration file in user.home &amp;quot;/okapi-longhorn-configuration.xml&amp;quot;&lt;br /&gt;
If neither is defined, the working directory is the folder &amp;quot;Okapi-Longhorn-Files&amp;quot; in user.home.&lt;br /&gt;
Longhorn configuration file example:&lt;br /&gt;
&lt;br /&gt;
 &amp;lt;longhorn-config&amp;gt;&lt;br /&gt;
     &amp;lt;use-unique-working-directory&amp;gt;True&amp;lt;/use-unique-working-directory&amp;gt;&lt;br /&gt;
     &amp;lt;working-directory&amp;gt;D:\testData\longhorn-files&amp;lt;/working-directory&amp;gt;&lt;br /&gt;
 &amp;lt;/longhorn-config&amp;gt;&lt;br /&gt;
&lt;br /&gt;
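The priority order described above can be sketched as a small resolver. This is an illustration, not Longhorn's own code: the class &amp;lt;code&amp;gt;WorkDirSketch&amp;lt;/code&amp;gt; is a hypothetical name, the &amp;quot;system parameter&amp;quot; is read here as a Java system property (an assumption; it could equally be an environment variable), and the value from the configuration file is passed in rather than parsed.&lt;br /&gt;
&lt;br /&gt;
```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of the documented priority order.
final class WorkDirSketch {

    private static boolean isSet(String value) {
        return value != null ? !value.trim().isEmpty() : false;
    }

    /**
     * Picks the working directory: the system parameter first, then the
     * value from the configuration file, then the default folder in user.home.
     */
    static Path resolve(String systemParam, String configFileValue, String userHome) {
        if (isSet(systemParam)) {
            return Paths.get(systemParam);
        }
        if (isSet(configFileValue)) {
            return Paths.get(configFileValue);
        }
        return Paths.get(userHome, "Okapi-Longhorn-Files");
    }

    public static void main(String[] args) {
        // Assumption: LONGHORN_WORKDIR is looked up as a Java system property.
        String systemParam = System.getProperty("LONGHORN_WORKDIR");
        Path dir = resolve(systemParam, null, System.getProperty("user.home"));
        System.out.println(dir);
    }
}
```
&lt;br /&gt;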
====Configuration Options====&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! option&lt;br /&gt;
! description&lt;br /&gt;
! data type&lt;br /&gt;
|-&lt;br /&gt;
| working-directory&lt;br /&gt;
| path of the working directory&lt;br /&gt;
| string&lt;br /&gt;
|-&lt;br /&gt;
| use-unique-working-directory&lt;br /&gt;
| If set to true, the Longhorn version is appended to the working directory name, e.g. path/to/working/directory_M0.21&lt;br /&gt;
| boolean(True or False)&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
[[Category:Longhorn]]&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=Knowledge_Base&amp;diff=506</id>
		<title>Knowledge Base</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=Knowledge_Base&amp;diff=506"/>
		<updated>2014-07-28T21:18:08Z</updated>

		<summary type="html">&lt;p&gt;Ctingley: /* For Developers */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
For larger tutorials see the [[Tutorials|Tutorials page]].&lt;br /&gt;
{| border=&amp;quot;0&amp;quot; cellspacing=&amp;quot;0&amp;quot; cellpadding=&amp;quot;8&amp;quot; width=&amp;quot;100%&amp;quot;&lt;br /&gt;
|- valign=&amp;quot;top&amp;quot;&lt;br /&gt;
|&lt;br /&gt;
==Overview==&lt;br /&gt;
* [[Getting Started|Installing the tools]]&lt;br /&gt;
* [[Filters|List of the file formats supported]]&lt;br /&gt;
* [[Steps|List of the functions available]]&lt;br /&gt;
* [[Connectors|List of the connectors to TM and MT systems]]&lt;br /&gt;
&lt;br /&gt;
==Filters==&lt;br /&gt;
* [[Understanding Filter Configurations]]&lt;br /&gt;
* [[How to Create a Custom Configuration for the XML Filter]]&lt;br /&gt;
* [[How to Extract Text for Translation]]&lt;br /&gt;
* [[How to Translate XLIFF Documents]]&lt;br /&gt;
* [[How to Post-Process Extracted Text]]&lt;br /&gt;
* [[Okapi Filters Plugin for OmegaT]]&lt;br /&gt;
* [[How to Translate Transifex Projects with OmegaT]]&lt;br /&gt;
* [[How to create an XLIFF file from Excel]]&lt;br /&gt;
&lt;br /&gt;
==Pipelines and Steps==&lt;br /&gt;
* [[How to Create a Pipeline in Rainbow]]&lt;br /&gt;
&lt;br /&gt;
==Standards==&lt;br /&gt;
* [[Open Standards|Open Standards used in translation and localization]]&lt;br /&gt;
* [[SRX and Java]]&lt;br /&gt;
|&lt;br /&gt;
==Translation Resource Connectors==&lt;br /&gt;
* [[How to Machine-Translate a TMX File]]&lt;br /&gt;
* [[Match Types|List of the types of match]]&lt;br /&gt;
* [[Trying out the Microsoft Translator Connector]]&lt;br /&gt;
&lt;br /&gt;
==Translation Memories==&lt;br /&gt;
* [[How to Create a Pensieve TM]]&lt;br /&gt;
** [[How to Create a Pensieve TM#Using Rainbow|Using Rainbow]]&lt;br /&gt;
** [[How to Create a Pensieve TM#Using Tikal|Using Tikal]]&lt;br /&gt;
* [[How to Query a Pensieve TM]]&lt;br /&gt;
* [[How to Create a TMX File from a Transifex Project]]&lt;br /&gt;
&lt;br /&gt;
==Miscellaneous==&lt;br /&gt;
* [[How to Change the Java Parameters for Rainbow]]&lt;br /&gt;
* [[How to Add Languages to Rainbow]]&lt;br /&gt;
* [[How to Use CheckMate with OmegaT]]&lt;br /&gt;
&lt;br /&gt;
==For Developers==&lt;br /&gt;
* [[Maven Basics]]&lt;br /&gt;
* [http://okapi.opentag.com/devguide/ Okapi Developer's Guide]&lt;br /&gt;
* [http://okapi.opentag.com/javadoc/ Okapi Javadoc]&lt;br /&gt;
* [[Okapi Java Persistence API]]&lt;br /&gt;
* [[Okapi Subfilters]]&lt;br /&gt;
* [[Creating UI with the net.sf.okapi.common.ui.abstracteditor Package]]&lt;br /&gt;
|}&lt;/div&gt;</summary>
		<author><name>Ctingley</name></author>
	</entry>
</feed>