<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://okapiframework.org/wiki/index.php?action=history&amp;feed=atom&amp;title=How_to_Machine-Translate_a_TMX_File</id>
	<title>How to Machine-Translate a TMX File - Revision history</title>
	<link rel="self" type="application/atom+xml" href="http://okapiframework.org/wiki/index.php?action=history&amp;feed=atom&amp;title=How_to_Machine-Translate_a_TMX_File"/>
	<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=How_to_Machine-Translate_a_TMX_File&amp;action=history"/>
	<updated>2026-04-22T21:41:47Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.38.2</generator>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=How_to_Machine-Translate_a_TMX_File&amp;diff=459&amp;oldid=prev</id>
		<title>Ysavourel: 1 revision imported</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=How_to_Machine-Translate_a_TMX_File&amp;diff=459&amp;oldid=prev"/>
		<updated>2016-06-04T23:20:03Z</updated>

		<summary type="html">&lt;p&gt;1 revision imported&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;1&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;1&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 19:20, 4 June 2016&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-notice&quot; lang=&quot;en&quot;&gt;&lt;div class=&quot;mw-diff-empty&quot;&gt;(No difference)&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;</summary>
		<author><name>Ysavourel</name></author>
	</entry>
	<entry>
		<id>http://okapiframework.org/wiki/index.php?title=How_to_Machine-Translate_a_TMX_File&amp;diff=458&amp;oldid=prev</id>
		<title>Ysavourel at 12:19, 19 November 2015</title>
		<link rel="alternate" type="text/html" href="http://okapiframework.org/wiki/index.php?title=How_to_Machine-Translate_a_TMX_File&amp;diff=458&amp;oldid=prev"/>
		<updated>2015-11-19T12:19:04Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;Imagine that you have a TMX file of segments to be translated, and you need to fill it with machine-translation entries so you can use the file as a fall-back TM in a tool where you do not have access to machine translation.&lt;br /&gt;
&lt;br /&gt;
{{WarningBox|You must be careful with the resulting file: It will be a TMX file with raw (un-edited) machine translation in it, without indication that the content is MT rather than a final translation. Using MT on TMX files is usually done within a specific process, where ultimately the MT candidates are post-edited in a controlled environment.}}&lt;br /&gt;
&lt;br /&gt;
There are several ways to do this with the Okapi tools:&lt;br /&gt;
&lt;br /&gt;
==Using the Leveraging Step==&lt;br /&gt;
&lt;br /&gt;
If you want to use a machine translation system for which you have a [[Connectors|connector]], you can easily create a simple pipeline that uses the [[Leveraging Step]].&lt;br /&gt;
&lt;br /&gt;
1. Start [[Rainbow]].&lt;br /&gt;
&lt;br /&gt;
2. Drop your TMX document in the &amp;lt;cite&amp;gt;Input List 1&amp;lt;/cite&amp;gt; tab.&lt;br /&gt;
&lt;br /&gt;
3. In the &amp;lt;cite&amp;gt;Languages and Encoding&amp;lt;/cite&amp;gt; tab: select the proper languages and encoding. For a TMX document, only the target (output) encoding is used, because the input encoding is detected automatically.&lt;br /&gt;
&lt;br /&gt;
4. In the &amp;lt;cite&amp;gt;Other Settings&amp;lt;/cite&amp;gt; tab: if needed, change the name or location of the output file. We will keep the default, which is the same name as the input file with an extra &amp;lt;code&amp;gt;.out&amp;lt;/code&amp;gt; inserted before the &amp;lt;code&amp;gt;.tmx&amp;lt;/code&amp;gt; extension.&lt;br /&gt;
&lt;br /&gt;
5. Select &amp;lt;cite&amp;gt;Utilities&amp;lt;/cite&amp;gt; &amp;gt; &amp;lt;cite&amp;gt;Edit / Execute Pipeline&amp;lt;/cite&amp;gt;. This opens the &amp;lt;cite&amp;gt;[[Rainbow - Edit / Execute Pipeline|Edit / Execute Pipeline]]&amp;lt;/cite&amp;gt; dialog box where you create the new pipeline.&lt;br /&gt;
&lt;br /&gt;
We need three steps:&lt;br /&gt;
&lt;br /&gt;
* [[Raw Document to Filter Events Step]] to extract the translatable text from the TMX input.&lt;br /&gt;
* [[Leveraging Step]] to perform the machine translation.&lt;br /&gt;
* [[Filter Events to Raw Document Step]] to re-write the document back into its original TMX format.&lt;br /&gt;
&lt;br /&gt;
6. Use the &amp;lt;cite&amp;gt;Add Step&amp;lt;/cite&amp;gt; button to add those three steps in that order.&lt;br /&gt;
&lt;br /&gt;
The first and last steps have no parameters as they take their information from Rainbow's main tabs.&lt;br /&gt;
&lt;br /&gt;
7. Select the [[Leveraging Step]] to set up your machine translation option. First, make sure the option &amp;lt;cite&amp;gt;Leverage the text units with existing translations&amp;lt;/cite&amp;gt; is set. Those &amp;quot;existing translations&amp;quot; come from the connector you select. In this example we want to use a machine translation system, but you could also use translation memories. One MT system accessible to everyone is Google Translate: select &amp;lt;cite&amp;gt;Google Translate Services&amp;lt;/cite&amp;gt;. For more information on other systems, see the &amp;quot;[[Connectors]]&amp;quot; page.&lt;br /&gt;
&lt;br /&gt;
8. Make sure the option &amp;lt;cite&amp;gt;Leverage only if the match is equal or above this score&amp;lt;/cite&amp;gt; has its value set to 95 or lower. Translation proposals coming from the [[Google MT Connector]] have a score of 95. If you set a higher value, no translation will be retained.&lt;br /&gt;
&lt;br /&gt;
9. Make sure the option &amp;lt;cite&amp;gt;Fill the target with the leveraged translation&amp;lt;/cite&amp;gt; is set. This tells the tool to copy the translation coming from the connector into the target.&lt;br /&gt;
&lt;br /&gt;
Note that if there is already a target entry (empty or with text), the machine translation is copied over the existing one. However, the original target content is not overwritten by the machine translation in the following cases:&lt;br /&gt;
&lt;br /&gt;
* If the text unit is marked as non-translatable.&lt;br /&gt;
* If the target has an ''approved'' property set to &amp;quot;yes&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Neither of those conditions is likely to exist in text units coming directly from a TMX file.&lt;br /&gt;
&lt;br /&gt;
Notice that you could generate a TMX document with the translation directly from this step, instead of re-writing our original TMX. But in this case we want to translate the original TMX file, keeping all its attributes, comments, etc. So the best way to do this is to re-write the original file with the modified text units.&lt;br /&gt;
&lt;br /&gt;
10. At this point you are ready to process the input file. Click &amp;lt;cite&amp;gt;Execute&amp;lt;/cite&amp;gt; to run the pipeline.&lt;br /&gt;
&lt;br /&gt;
Depending on the number and size of the files you process, this may take some time. The translations are fetched over the Internet, which may slow down the process further.&lt;br /&gt;
&lt;br /&gt;
When it is done you should have an output TMX document in the same directory as the input one, and that file should have the machine translation for each source entry.&lt;br /&gt;
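&lt;br /&gt;
As a quick sanity check on that last point, a short script can list the translation units that still lack a target. The following is a minimal sketch in Python, not part of Okapi: the function names &amp;lt;code&amp;gt;untranslated_units&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;make_tu&amp;lt;/code&amp;gt; and the sample data are hypothetical, and it assumes each &amp;lt;code&amp;gt;tuv&amp;lt;/code&amp;gt; carries its language in &amp;lt;code&amp;gt;xml:lang&amp;lt;/code&amp;gt; (or the older &amp;lt;code&amp;gt;lang&amp;lt;/code&amp;gt; attribute) and that any other language in the unit is the source.&lt;br /&gt;

```python
import xml.etree.ElementTree as ET

# TMX 1.4 puts the language on each tuv as xml:lang (XML namespace).
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def untranslated_units(root, target_lang):
    """Return the source text of each tu that has no non-empty target tuv."""
    wanted = target_lang.lower()
    missing = []
    for tu in root.iter("tu"):
        source_text = None
        has_target = False
        for tuv in tu.findall("tuv"):
            # Older TMX files may use a plain 'lang' attribute instead.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            text = "".join(seg.itertext()).strip() if seg is not None else ""
            if lang == wanted or lang.startswith(wanted + "-"):
                has_target = has_target or bool(text)
            elif source_text is None:
                source_text = text
        if not has_target:
            missing.append(source_text)
    return missing


def make_tu(body, segments):
    """Build one tu in memory; 'segments' maps language code to text."""
    tu = ET.SubElement(body, "tu")
    for lang, text in segments.items():
        tuv = ET.SubElement(tu, "tuv")
        tuv.set(XML_LANG, lang)
        ET.SubElement(tuv, "seg").text = text
    return tu


# Tiny in-memory stand-in for an output file such as myfile.out.tmx.
tmx = ET.Element("tmx")
body = ET.SubElement(tmx, "body")
make_tu(body, {"en": "Hello", "fr": "Bonjour"})
make_tu(body, {"en": "Goodbye", "fr": ""})
print(untranslated_units(tmx, "fr"))
```

To check a real file, parse it with &amp;lt;code&amp;gt;ET.parse(path).getroot()&amp;lt;/code&amp;gt; and pass the result to &amp;lt;code&amp;gt;untranslated_units&amp;lt;/code&amp;gt;; an empty list means every unit has a non-empty target.&lt;br /&gt;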
&lt;br /&gt;
==Using the Batch Translation Step==&lt;br /&gt;
&lt;br /&gt;
In some cases you may have an MT system for which there is no connector in Okapi. You can still use it, as long as a few requirements are fulfilled:&lt;br /&gt;
&lt;br /&gt;
* the MT system must be able to translate HTML files&lt;br /&gt;
* the MT system must have a command-line mode&lt;br /&gt;
&lt;br /&gt;
For example, a system that fulfills those requirements is ProMT. It can translate HTML documents, and can be run from the command line. Note that some versions of ProMT are capable of taking the TMX file directly as input, but for the purpose of this example we assume you cannot do that.&lt;br /&gt;
&lt;br /&gt;
1. Start [[Rainbow]].&lt;br /&gt;
&lt;br /&gt;
2. Drop your TMX document in the &amp;lt;cite&amp;gt;Input List 1&amp;lt;/cite&amp;gt; tab.&lt;br /&gt;
&lt;br /&gt;
3. In the &amp;lt;cite&amp;gt;Languages and Encoding&amp;lt;/cite&amp;gt; tab: select the proper languages and encoding. For a TMX document, only the target (output) encoding is used, because the input encoding is detected automatically.&lt;br /&gt;
&lt;br /&gt;
4. Select &amp;lt;cite&amp;gt;Utilities&amp;lt;/cite&amp;gt; &amp;gt; &amp;lt;cite&amp;gt;Batch Translation&amp;lt;/cite&amp;gt;. This is a pre-defined pipeline, with a single step: the [[Batch Translation Step]].&lt;br /&gt;
&lt;br /&gt;
5. In the &amp;lt;cite&amp;gt;Command line&amp;lt;/cite&amp;gt; field, enter the DOS command that calls ProMT to translate an HTML document. For the input file use the variable &amp;lt;code&amp;gt;${inputPath}&amp;lt;/code&amp;gt;, and for the output use the variable &amp;lt;code&amp;gt;${outputPath}&amp;lt;/code&amp;gt;. You also need to specify the language pair with the &amp;lt;code&amp;gt;/d&amp;lt;/code&amp;gt; parameter. You can use the two variables &amp;lt;code&amp;gt;${srcLangName}&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;${trgLangName}&amp;lt;/code&amp;gt; for this.&lt;br /&gt;
&lt;br /&gt;
 &amp;quot;C:\Program Files\PRMT9\FILETRANS\FileTranslator.exe&amp;quot; ${inputPath} /as /ac /d:${srcLangName}-${trgLangName} /o:${outputPath}&lt;br /&gt;
&lt;br /&gt;
6. Make sure the option &amp;lt;cite&amp;gt;Create the following TMX document&amp;lt;/cite&amp;gt; is set and enter the full path of the TMX document to create.&lt;br /&gt;
&lt;br /&gt;
7. At this point you are ready to execute the process: click &amp;lt;cite&amp;gt;Execute&amp;lt;/cite&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
This will take the input TMX, convert chunks of its content into a temporary HTML file, run the command line on that HTML document, get the translation back from the translated HTML, and place it into the TMX output.&lt;br /&gt;
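&lt;br /&gt;
The placeholder expansion in step 5 can be illustrated outside Rainbow. The sketch below uses Python's &amp;lt;code&amp;gt;string.Template&amp;lt;/code&amp;gt;, which happens to share the &amp;lt;code&amp;gt;${name}&amp;lt;/code&amp;gt; syntax; it is not how the Batch Translation Step is implemented. The paths and language names are hypothetical values chosen for illustration, and the assumption that the language variables expand to names such as English and French is not confirmed by this page.&lt;br /&gt;

```python
from string import Template

# The command line from step 5, with the placeholders left unexpanded.
COMMAND = Template(
    '"C:\\Program Files\\PRMT9\\FILETRANS\\FileTranslator.exe" '
    "${inputPath} /as /ac /d:${srcLangName}-${trgLangName} /o:${outputPath}"
)

# Hypothetical values, for illustration only (assumed, not taken from Okapi).
values = {
    "inputPath": "C:\\Temp\\chunk1.html",
    "outputPath": "C:\\Temp\\chunk1.out.html",
    "srcLangName": "English",
    "trgLangName": "French",
}

# substitute() raises KeyError if a placeholder has no value,
# which catches a misspelled variable name early.
expanded = COMMAND.substitute(values)
print(expanded)
```

Printing the expanded command before running the step is a cheap way to spot a misplaced quote or a missing &amp;lt;code&amp;gt;/o:&amp;lt;/code&amp;gt; argument.&lt;br /&gt;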
&lt;br /&gt;
Note that because the [[Batch Translation Step]] is a step, you can also use it in your own pipelines, along with other steps, to perform a set of customized tasks that correspond to your specific needs. See &amp;quot;[[How to Create a Pipeline in Rainbow]]&amp;quot; for more details.&lt;br /&gt;
&lt;br /&gt;
[[Category:TMX]]&lt;/div&gt;</summary>
		<author><name>Ysavourel</name></author>
	</entry>
</feed>