Trying out the Microsoft Translator Connector
The Microsoft Translator Connector is an Okapi component that connects to Microsoft Translator Text Service (referred to as Translator Service hereafter), which is part of the Microsoft Cognitive Services.
This wiki page explains how to try out the Translator Service using the Tikal command line utility.
Retirement of version 2 API
Microsoft retired its version 2 API on 2019-04-30, as described in this page. As a result, the Microsoft Connector in the latest stable release, M37, has not worked since 2019-05-01.
Support for the version 3 API was added to Okapi in mid-April 2019, after the M37 release. To use Microsoft's machine translation service, please pick up the M38 snapshot version from here.
The rest of this page assumes that you are using an M38 snapshot built after mid-April 2019, the M38 stable release (not yet released as of this writing in mid-August 2019), or a later version.
Obtaining Azure Key
To use the Microsoft Translator Connector, you need an Azure Key. If you already have a key for the version 2 API, the same key should work. Otherwise, please read this page. Microsoft issues a key free of charge with certain limitations, which is enough to try out the connector as described on this page.
Tikal provides a way to try out the connector easily.
First, use a text editor to create a configuration file that looks like this:
#v1
azureKey=your-azure-key
baseURL=the-base-url
Here, your-azure-key is the Azure Key obtained from Microsoft, and the-base-url is one of the URLs listed in the Base URLs section of the API Reference.
For example (warning: the Azure Key here is not valid):
#v1
azureKey=4f4cfe47becf471a0123456789abcdef
baseURL=https://api-nam.cognitive.microsofttranslator.com
We assume you have saved this file as config.cfg, the name used in the commands below.
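As an alternative to a text editor, the same file can be generated from the shell. The following is a minimal sketch, not part of Okapi: the environment variable names AZURE_KEY and BASE_URL are assumptions made for this example, and the placeholder defaults must be replaced with your own key and one of the documented base URLs. The file name config.cfg matches the one used by the Tikal commands on this page.

```shell
# Hypothetical variables for this sketch; Okapi itself does not read them.
AZURE_KEY="${AZURE_KEY:-your-azure-key}"
BASE_URL="${BASE_URL:-the-base-url}"

# Write the three-line connector configuration.
# Inside the here-document, "#v1" is literal file content, not a shell comment.
cat > config.cfg <<EOF
#v1
azureKey=${AZURE_KEY}
baseURL=${BASE_URL}
EOF

# Optional precaution: the file holds a secret, so restrict it to the owner.
chmod 600 config.cfg
```

Keeping the key in an environment variable rather than typing it into the script avoids storing it in shell scripts that might be shared.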
Now you can use the connector with Tikal. Try for instance:
tikal.sh -q "This is a test" -sl en -tl fr -ms config.cfg
(On a Windows system, type "tikal" instead of "tikal.sh".)
(On a Linux/Unix/macOS system where PATH doesn't include ".", type "./tikal.sh" instead.)
This command line uses the following parameters:
-q "This is a test" indicates that we want to search for a translation (i.e. run a query) and that the source text to search for is "This is a test".
-sl en indicates that the source language is English.
-tl fr indicates that the target language is French.
-ms config.cfg specifies that the Microsoft Translator Connector should be used, with config.cfg as the connector's configuration.
This should give you back something like:
= From net.sf.okapi.connectors.microsoft.MicrosoftMTConnector (en->fr)
Threshold=-10, Maximum hits=1
Engine: 'general'
score: 95, origin: 'Microsoft-Translator' (from MT)
Source: "This is a test"
Target: "C'est un test"
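If you run such queries often, the command can be wrapped in a small shell function. This is only a sketch: the function name translate_text and the TIKAL variable are hypothetical names introduced for this example, and only the -q, -sl, -tl and -ms options described above are used.

```shell
# Hypothetical variable naming the launcher: tikal.sh, ./tikal.sh, or tikal on Windows.
TIKAL="${TIKAL:-tikal.sh}"

# translate_text TEXT SOURCE_LANG TARGET_LANG
# Queries the connector configured in config.cfg for a single piece of text.
translate_text() {
    "$TIKAL" -q "$1" -sl "$2" -tl "$3" -ms config.cfg
}
```

For example, translate_text "This is a test" en fr runs the same query as the command shown earlier.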
With the Leveraging Step
The connector is available in the Leveraging Step, so you can use it in any pipeline that needs it.
You can also use Tikal's Translate Files command to directly process a file in any format supported by Okapi. For example, the following command creates an output file, myFile.out.docx, translated into Japanese, provided the file is small enough to be processed within the limitations of your license.
tikal.sh -t myFile.docx -sl en -tl ja -ms config.cfg
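To translate several files in one go, the same command can be run in a loop. The sketch below makes a few assumptions: translate_all and TIKAL are hypothetical names introduced here, the source language is fixed to English as in the example above, and each file is sent separately, so the limitations of your license still apply to each one.

```shell
# Hypothetical variable naming the launcher: tikal.sh, ./tikal.sh, or tikal on Windows.
TIKAL="${TIKAL:-tikal.sh}"

# translate_all TARGET_LANG FILE...
# Runs Tikal's Translate Files command on each file, stopping at the first failure.
translate_all() {
    target="$1"
    shift
    for f in "$@"; do
        "$TIKAL" -t "$f" -sl en -tl "$target" -ms config.cfg || {
            echo "Translation failed for $f" >&2
            return 1
        }
    done
}
```

For example, translate_all ja report.docx notes.docx would translate both files into Japanese, one after the other.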
With the Microsoft Batch Translation Step
The Microsoft Batch Translation Step can also be used to generate the target text using the Translator Service.
For example, to translate any document for which Okapi has a filter you can use the following pipeline:
- = Raw Document to Filter Events Step
- + Microsoft Batch Translation Step
- + Filter Events to Raw Document Step
The Microsoft Batch Translation Step is preferred over the Leveraging Step because it sends many pieces (paragraphs) of text in one batch, which is more efficient. However, a batch may contain more pieces of text, or larger ones, than the Translator Service's limits allow. If that happens, a possible workaround is to use the Leveraging Step instead.
The following features are no longer supported because the Translator Service no longer supports the underlying features:
- The Translator Service no longer has a built-in translation memory feature.
- Microsoft Batch Submission Step
- The threshold and the maximum number of hits, which could be specified with the -opt command-line flag for Tikal or in the Microsoft Batch Translation Step UI, have no effect.