Microsoft Translator Connector

From Okapi Framework
Revision as of 07:28, 25 August 2012 by Ysavourel (talk | contribs) (Overview)


The Microsoft MT engine is freely available from Microsoft. Volume limitations apply. The engine supports a large number of language pairs, both common and less common. Microsoft publishes the list of supported pairs.

This connector uses the HTTP v2 API. You can get more information about the API and its terms here:

To use this connector you need a "Client ID" and a "Client Secret" from Microsoft. You get those by obtaining a Windows Live ID, and then registering an application in your Live account. See the MSDN pages for more information.

You must also respect Microsoft's Terms of Service. If you intend to use the Microsoft Translator API for commercial or high-volume purposes, you need to sign a commercial license agreement and provide your AppID to the Microsoft Translator team. For more details, contact the Microsoft Translator team.

The engine supports inline codes.

When using the query functions of this connector, you are accessing a remote server and making your source text available to Microsoft. No corresponding translation is sent to Microsoft when doing queries.

For more examples on how to use this connector see the article "Trying out the Microsoft Translator Connector" in the Knowledge Base. See also the Microsoft Batch Translation Step.

Calculation of the combined score

The original score of the query is preserved in the score field of the query result.

The combinedScore of the query result holds a re-calculated value that takes into account both the MatchDegree and Rating values returned by the engine.

For results with a MatchDegree of 90 or above, the combined score is the MatchDegree plus the Rating value minus 10. For results with a MatchDegree below 90, the combined score is simply the MatchDegree.

MatchDegree   Rating   Combined Score
100            5       95 (i.e. 100+(5-10))
100            6       96 (i.e. 100+(6-10))
100            0       90 (i.e. 100+(0-10))
100           -3       87 (i.e. 100+(-3-10))
98             9       97 (i.e. 98+(9-10))
95             5       90 (i.e. 95+(5-10))

This calculation is far from perfect, especially when comparing highly rated high fuzzy matches with low-rated exact matches. Such entries are difficult to rank even manually. We will try to improve this scoring and welcome any feedback you may have.

If a result has no Rating, the default is set to 5. Unverified MT translations generally return a MatchDegree of 100 and a Rating of 5, which gives a combined score of 95 in the Okapi interface.
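
The scoring rules described in this section can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the connector's actual code: the function name combined_score and the two constants are hypothetical, and the sketch assumes no further clamping or rounding beyond what this page describes.

```python
from typing import Optional

# Hypothetical constants taken from the rules described on this page.
DEFAULT_RATING = 5        # used when the engine returns no Rating
RATING_THRESHOLD = 90     # MatchDegree at or above which the Rating is applied


def combined_score(match_degree: int, rating: Optional[int] = None) -> int:
    """Illustrative re-calculation of the combined score (not Okapi's API)."""
    if rating is None:
        rating = DEFAULT_RATING
    if match_degree >= RATING_THRESHOLD:
        # High matches: add the Rating value minus 10 to the MatchDegree.
        return match_degree + (rating - 10)
    # Lower matches: the combined score is simply the MatchDegree.
    return match_degree


# Rows of the table in this section: (MatchDegree, Rating, combined score).
TABLE_ROWS = [
    (100, 5, 95),
    (100, 6, 96),
    (100, 0, 90),
    (100, -3, 87),
    (98, 9, 97),
    (95, 5, 90),
]

for degree, rating, expected in TABLE_ROWS:
    assert combined_score(degree, rating) == expected
```

Calling combined_score(100) with no Rating falls back to the default of 5 and gives 95, the value an unverified MT translation typically gets.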


Client ID — The Client ID to use to connect to the MT server. See the MSDN pages for more information.

Secret — The secret corresponding to the Client ID.

Category — An optional category to use when working with trained engines.

Example of a configuration file:



  • The engine may occasionally fail to return the proper spaces. This happens especially when inline codes are present in the source.