Microsoft Translator Connector

From Okapi Framework

Overview

The Microsoft Translator Text Service provides machine translation over a REST API. The service supports a large number of language pairs, both common and less common. The list of supported languages is available at [1] (see the list under V3 Translator API).

This connector uses the V3 API. To use it, you need an Azure key from Microsoft. See the Microsoft pages for more information.
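To illustrate what a V3 call looks like at the REST level, here is a minimal Python sketch that only builds the request pieces and does not send anything. It is not the connector's own code (the connector is part of Okapi and written in Java). The endpoint URL, the query parameter names, and the `Ocp-Apim-Subscription-Key` header are taken from Microsoft's public V3 documentation rather than from this page, so check them against the current Microsoft reference. The function name `build_translate_request` is hypothetical. Resources tied to a specific Azure region may require an additional region header, which this sketch leaves out.

```python
import json
from urllib.parse import urlencode

# Assumption: the global V3 endpoint as documented by Microsoft.
ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(azure_key, texts, source, target, category=None):
    """Return (url, headers, body) for one V3 translate call.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    params = [("api-version", "3.0"), ("from", source), ("to", target)]
    if category:
        # Optional category, used with trained engines.
        params.append(("category", category))
    url = ENDPOINT + "?" + urlencode(params)
    headers = {
        "Ocp-Apim-Subscription-Key": azure_key,
        "Content-Type": "application/json; charset=UTF-8",
    }
    # The request body is a JSON array with one {"Text": ...} object per segment.
    body = json.dumps([{"Text": t} for t in texts]).encode("utf-8")
    return url, headers, body
```

The returned triple can be handed to any HTTP client as a POST request.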

For more examples on how to use this connector see the article "Trying out the Microsoft Translator Connector" in the Knowledge Base. See also the Microsoft Batch Translation Step.

Parameters

Azure Key — The Microsoft Azure key used to access the Translator Text service and other Microsoft Cognitive Services.

Category — An optional category to use when working with trained engines. The service defaults to "general" if none is supplied.

Example of a configuration file:

#v1
azureKey=myAzureKey
category=general
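For scripts that need to read such a file outside of Okapi, the format shown above can be parsed in a few lines. This is a minimal sketch that assumes only what the example shows: a `#v1` header line followed by `key=value` lines. It is not Okapi's own parameter parser, and the function name `parse_config` is hypothetical.

```python
def parse_config(text):
    """Parse '#v1' style key=value parameters into a dict (illustrative only)."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and header/comment lines such as "#v1".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("Expected key=value, got: " + line)
        params[key.strip()] = value
    return params


EXAMPLE = "#v1\nazureKey=myAzureKey\ncategory=general\n"
config = parse_config(EXAMPLE)
```

Splitting on the first `=` only keeps any later `=` characters as part of the value, which matters for keys that may contain them.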

Details

Calculation of the combined score

The original score of the query is preserved in the score field of the query result.

The combinedScore of the query result holds a re-calculated value that takes into account both the MatchDegree and Rating values returned by the engine.

For results with a MatchDegree of 90 or above, the combined score is the MatchDegree plus the Rating value minus 10, that is, MatchDegree + (Rating - 10). For results with a MatchDegree below 90, the combined score is simply the MatchDegree.

MatchDegree   Rating   Combined Score
100           5        95 (i.e. 100+(5-10))
100           6        96 (i.e. 100+(6-10))
100           0        90 (i.e. 100+(0-10))
100           -3       87 (i.e. 100+(-3-10))
98            9        97 (i.e. 98+(9-10))
95            5        90 (i.e. 95+(5-10))

This calculation is far from perfect, especially when ranking highly rated high fuzzy matches against low-rated exact matches. But such entries are difficult to rank even manually. We will try to improve this scoring and welcome any feedback you may have.

If a result has no Rating, the default is set to 5. Unverified MT translations generally return a MatchDegree of 100 and a Rating of 5, which yields a combined score of 95 in the Okapi interface.
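The rule described in this section can be written as a short function. This is a Python sketch of the calculation as stated on this page, not the connector's actual Java code; the names `combined_score` and `DEFAULT_RATING` are hypothetical.

```python
DEFAULT_RATING = 5  # Used when the engine returns no Rating.


def combined_score(match_degree, rating=None):
    """Combine MatchDegree and Rating as described in the text above."""
    if rating is None:
        rating = DEFAULT_RATING
    if match_degree >= 90:
        # High matches are adjusted by the rating: MatchDegree + (Rating - 10).
        return match_degree + (rating - 10)
    # Below 90, the rating is ignored.
    return match_degree
```

For instance, `combined_score(100)` uses the default rating and gives 95, the value mentioned for unverified MT translations.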


Limitations

  • According to the API documentation, a request can contain at most 100 JSON array elements, and the entire text cannot exceed 5000 characters.
  • The service may, on occasion, not return the proper spaces. This happens especially when inline codes are present in the source.
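One way to stay within the request limits listed above is to split the segments into batches before sending them. The following Python sketch is an illustration under assumptions: it measures length with Python's `len()`, which may not match exactly how the service counts characters, and the function and constant names are hypothetical. It is not how the connector itself handles batching.

```python
MAX_ELEMENTS = 100  # Maximum JSON array elements per request.
MAX_CHARS = 5000    # Maximum total characters per request.


def batch_texts(texts, max_elements=MAX_ELEMENTS, max_chars=MAX_CHARS):
    """Group texts, in order, into batches that respect both limits."""
    batches = []
    current = []
    current_chars = 0
    for text in texts:
        size = len(text)
        if size > max_chars:
            # A single oversized segment cannot fit in any request.
            raise ValueError("Segment exceeds the per-request character limit")
        # Start a new batch when adding this text would break either limit.
        if current and (len(current) >= max_elements
                        or current_chars + size > max_chars):
            batches.append(current)
            current = []
            current_chars = 0
        current.append(text)
        current_chars += size
    if current:
        batches.append(current)
    return batches
```

A segment that is by itself longer than the character limit is rejected rather than cut, since splitting text mid-sentence would likely change the translation.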

History

Retirement of version 2 API

Microsoft retired their version 2 API on 2019-04-30, as described on this page. Because of this, the Microsoft Connector found in the latest stable release, M37, no longer works as of 2019-05-01.

Support for the version 3 API was added in April 2019 to the M38 snapshot version, available from here.

Please note that this is a minimal implementation and it does not support any of the new features, such as profanity filtering.

Because the version 3 API no longer supports translation memory, that functionality is not available even if you use the latest Okapi M38 snapshot version.

You will need an Azure key to use the version 3 API. If you already have a key for version 2, the same key should work. For information on how to obtain an Azure key, please see this page.

Old Parameters Prior To M32

Client ID — The Client ID to use to connect to the MT server. See the MSDN pages for more information.

Secret — The secret corresponding to the Client ID.

Category — An optional category to use when working with trained engines.

Example of a configuration file:

#v1
clientId=myPersonalClientID
secret=theSecretForThatClientID