Microsoft Translator Connector

From Okapi Framework

Retirement of version 2 API

The MICROSOFT CONNECTOR in the Okapi stable releases will STOP WORKING at the end of April 2019.

Microsoft will retire its version 2 API on 2019-04-30, as described in this page. Because of this, the Microsoft Connector found in the latest stable release, M37, will no longer work from 2019-05-01 onward.

Support for the version 3 API was added to Okapi in mid-April, after the M37 release. If you need to use Microsoft's machine translation service, please pick up the M38 snapshot version from here. Please note that this is a minimal implementation and does not support any of the new features, such as profanity filtering.

Because the version 3 API no longer supports translation memory, that functionality is not available even if you use the latest Okapi M38 snapshot version.

You will need an "Azure key" to use the version 3 API. If you already have a key for version 2, the same key should work. For information on how to obtain an Azure key, please see this page.
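For orientation, here is a minimal sketch of what a raw version 3 request looks like when it carries an Azure key. It is not the connector's own code: the function name `build_translate_request` and its parameters are hypothetical, and the sketch assumes the global endpoint of the Translator Text API v3. It only builds the request and does not send anything.

```python
import json
import urllib.parse
import urllib.request

# Assumed: the global endpoint of the Translator Text API v3.
ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(azure_key, text, source_lang, target_lang, category=None):
    """Build (but do not send) a version 3 translate request.

    Hypothetical helper for illustration; not part of Okapi.
    """
    params = {"api-version": "3.0", "from": source_lang, "to": target_lang}
    if category:
        # Optional category, used when working with trained engines.
        params["category"] = category
    url = ENDPOINT + "?" + urllib.parse.urlencode(params)

    # The v3 API expects a JSON array of objects, each with a "Text" field.
    body = json.dumps([{"Text": text}]).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": azure_key,  # the Azure key
        "Content-Type": "application/json; charset=UTF-8",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = build_translate_request("myAzureKey", "Hello", "en", "fr")
print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the text to Microsoft's servers, so the sketch stops at building the request.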

The information below is mostly out of date. It is kept for reference until this page is fully updated.

Overview

The Microsoft MT engine is freely available from Microsoft at http://www.microsofttranslator.com. Volume limitations apply. The engine supports a large number of language pairs, both common and less common. The list is available at http://www.microsofttranslator.com/help.

This connector uses the HTTP v2 API. You can get more information about the API and its terms here: http://sdk.microsofttranslator.com.

To use this connector you need an "Azure Key" from Microsoft. See the Microsoft pages for more information.

You must also respect Microsoft's Terms of Service. If you intend to use the Microsoft Translator API for commercial or high-volume purposes, you need to sign a commercial license agreement and provide your AppID to the Microsoft Translator team. For more details contact mtlic@microsoft.com.

The engine supports inline codes.

When using the query functions of this connector, you are accessing a remote server and making your source text available to Microsoft. No corresponding translation is sent to Microsoft when doing queries.

For more examples of how to use this connector, see the article "Trying out the Microsoft Translator Connector" in the Knowledge Base. See also the Microsoft Batch Translation Step.

Calculation of the combined score

The original score of the query is preserved in the score field of the query result.

The combinedScore of the query result holds a re-calculated value that takes into account both the MatchDegree and Rating values returned by the engine.

For results with a MatchDegree of 90 or above, the combined score is the MatchDegree plus the Rating value minus 10, that is, MatchDegree + (Rating - 10). For results with a MatchDegree below 90, the combined score is simply the MatchDegree.

MatchDegree   Rating   Combined Score
100            5       95 (i.e. 100+(5-10))
100            6       96 (i.e. 100+(6-10))
100            0       90 (i.e. 100+(0-10))
100           -3       87 (i.e. 100+(-3-10))
98             9       97 (i.e. 98+(9-10))
95             5       90 (i.e. 95+(5-10))

This calculation is far from perfect, especially when ranking highly rated high fuzzy matches against low-rated exact matches. But such entries are difficult to rank even manually. We will try to improve this scoring and welcome any feedback you may have.

If a result has no Rating, the default is set to 5. An unverified MT translation will generally return a MatchDegree of 100 and a Rating of 5, which yields a combined score of 95 in the Okapi interface.
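The rule described in this section can be sketched in a few lines of Python. This only illustrates the arithmetic and is not the connector's actual code; the function name `combined_score` and its parameter names are hypothetical.

```python
DEFAULT_RATING = 5      # used when the engine returns no Rating
HIGH_MATCH_DEGREE = 90  # at or above this, the Rating adjusts the score


def combined_score(match_degree, rating=None):
    """Return the combined score for one query result (illustrative only)."""
    if rating is None:
        rating = DEFAULT_RATING
    if match_degree >= HIGH_MATCH_DEGREE:
        # MatchDegree plus (Rating minus 10)
        return match_degree + (rating - 10)
    # Below the threshold the Rating is ignored.
    return match_degree


print(combined_score(100, 5))   # 95
print(combined_score(100, -3))  # 87
print(combined_score(98, 9))    # 97
print(combined_score(80, 9))    # 80
```

The third and second calls show the ranking problem mentioned above: a well-rated fuzzy match (98 with Rating 9) scores higher than a poorly rated exact match (100 with Rating -3).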

Parameters

Starting with M32:

Azure Key — The Microsoft Azure key to connect to the MT server. See the Microsoft pages for more information.

Category — An optional category to use when working with trained engines.

Example of a configuration file:

#v1
azureKey=myAzureKey
category=

Prior to M32:

Client ID — The Client ID to use to connect to the MT server. See the MSDN pages for more information.

Secret — The secret corresponding to the Client ID.

Category — An optional category to use when working with trained engines.

Example of a configuration file:

#v1
clientId=myPersonalClientID
secret=theSecretForThatClientID
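To tell the two configuration styles apart, a small script can read the key/value lines and check which credentials are present. The sketch below is an illustration only: it assumes the file is plain `key=value` lines with `#`-prefixed header or comment lines, as in the examples above, and the function names `read_parameters` and `credential_style` are hypothetical, not part of Okapi.

```python
def read_parameters(text):
    """Parse simple key=value lines, skipping blank lines and '#' lines."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # header such as '#v1', or an empty line
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip()
    return params


def credential_style(params):
    """Report which kind of credentials a parameter set carries."""
    if params.get("azureKey"):
        return "azure-key"   # M32 and later
    if params.get("clientId") and params.get("secret"):
        return "client-id"   # prior to M32
    return "none"


new_style = "#v1\nazureKey=myAzureKey\ncategory=\n"
old_style = "#v1\nclientId=myPersonalClientID\nsecret=theSecretForThatClientID\n"

print(credential_style(read_parameters(new_style)))  # azure-key
print(credential_style(read_parameters(old_style)))  # client-id
```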

Limitations

  • The engine may occasionally fail to restore the proper spaces in the translation. This happens especially when inline codes are present in the source.