Microsoft Translator Connector


Latest revision as of 04:44, 23 August 2019

Overview

The Microsoft Translator Text Service provides machine translation over a REST API. The service supports a large number of language pairs, both common and less common. The list is available at https://docs.microsoft.com/en-us/azure/cognitive-services/Translator/language-support#translation (see the list under V3 Translator API).

This connector uses the V3 API. To use this connector you need an Azure Key from Microsoft. See the Microsoft pages for more information.

For more examples of how to use this connector, see the article "Trying out the Microsoft Translator Connector" in the Knowledge Base. See also the Microsoft Batch Translation Step.

Parameters

Azure Key — The Microsoft Azure key used to access the Translator Text service and other Microsoft Cognitive Services.

Category — An optional category to use when working with trained engines. The service defaults to "general" if none is supplied.

Example of a configuration file:

#v1
azureKey=myAzureKey
category=general
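
As an illustration of how these two parameters map onto a V3 request, here is a minimal Python sketch. It is not the connector's own code (the connector is written in Java). The helper names `parse_params` and `build_translate_request` are hypothetical. The endpoint, the `api-version`, `from`, `to` and `category` query parameters, and the `Ocp-Apim-Subscription-Key` header are taken from Microsoft's V3 documentation. The request is only built, not sent, and some Azure resource types may require additional headers that are omitted here.

```python
import json
import urllib.parse
import urllib.request

# Global endpoint of the Translator Text V3 API, per Microsoft's documentation.
ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def parse_params(text):
    """Parse key=value lines; blank lines and '#' lines (such as '#v1') are skipped."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip()
    return params


def build_translate_request(params, texts, source, target):
    """Build (but do not send) a V3 translate request for the given texts."""
    query = {"api-version": "3.0", "from": source, "to": target}
    category = params.get("category")
    if category:  # optional: the service falls back to "general" when absent
        query["category"] = category
    url = ENDPOINT + "?" + urllib.parse.urlencode(query)
    body = json.dumps([{"Text": t} for t in texts]).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": params["azureKey"],
        "Content-Type": "application/json; charset=UTF-8",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


config = "#v1\nazureKey=myAzureKey\ncategory=general\n"
request = build_translate_request(parse_params(config), ["Hello"], "en", "fr")
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send the request; a real key is needed for the service to accept it.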

Details

Calculation of the combined score

The original score of the query is preserved in the score field of the query result.

The combinedScore of the query result holds a re-calculated value that takes into account both the MatchDegree and Rating values returned by the engine.

For results with a MatchDegree of 90 or above, the combined score is the MatchDegree plus the Rating minus 10, that is, MatchDegree + (Rating - 10). For results with a MatchDegree below 90, the combined score is simply the MatchDegree.

MatchDegree   Rating   Combined Score
100           5        95 (i.e. 100+(5-10))
100           6        96 (i.e. 100+(6-10))
100           0        90 (i.e. 100+(0-10))
100           -3       87 (i.e. 100+(-3-10))
98            9        97 (i.e. 98+(9-10))
95            5        90 (i.e. 95+(5-10))

This calculation is far from perfect, especially when ranking highly rated high fuzzy matches against low-rated exact matches. Such entries are difficult to rank even manually. We will try to improve this scoring and welcome any feedback you may have.

If a result has no Rating, the default is set to 5. Unverified MT translations generally return a MatchDegree of 100 and a Rating of 5, which yields a combined score of 95 in the Okapi interface.
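
The rule described in this section can be sketched in a few lines of Python. This is an illustration of the calculation as stated here, not the connector's actual (Java) implementation; the function name `combined_score` and the constant `DEFAULT_RATING` are hypothetical.

```python
DEFAULT_RATING = 5  # value assumed when the engine returns no Rating


def combined_score(match_degree, rating=None):
    """Return the combined score for a MatchDegree and an optional Rating."""
    if rating is None:
        rating = DEFAULT_RATING
    if match_degree >= 90:
        # High matches are adjusted by the rating, offset by 10.
        return match_degree + (rating - 10)
    # Below 90 the rating is ignored.
    return match_degree


# Recompute the rows of the table in this section.
rows = [(100, 5), (100, 6), (100, 0), (100, -3), (98, 9), (95, 5)]
for match_degree, rating in rows:
    print(match_degree, rating, combined_score(match_degree, rating))
```

Calling `combined_score(100)` without a rating applies the default of 5 and gives 95, the score of an unverified MT translation.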


Limitations

  • According to the API documentation, at most 100 JSON array elements can be supplied per request, and the entire text cannot exceed 5000 characters.
  • The service may occasionally fail to restore spaces correctly in its output. This happens especially when inline codes are present in the source.
  • Only the translation feature of the Translator Text Service is supported by the connector. Obtaining a list of supported languages, transliteration, or language identification (detection) is not supported.
  • Only the category parameter can be specified. Profanity detection and deletion, script conversion, and other features are not supported.
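
A caller that has more text than one request allows can split its input into batches that respect the two request limits listed above. The following Python sketch shows one simple greedy way to do that. The function `batch_texts` is hypothetical and is not part of the connector. It counts characters with Python's `len`, which may not match exactly how the service counts them.

```python
MAX_ELEMENTS = 100  # maximum JSON array elements per request
MAX_CHARS = 5000    # maximum total characters per request


def batch_texts(texts, max_elements=MAX_ELEMENTS, max_chars=MAX_CHARS):
    """Group texts, in order, into batches that stay within both limits."""
    batches = []
    current = []
    current_chars = 0
    for text in texts:
        size = len(text)
        if size > max_chars:
            # A single oversized text cannot fit in any request.
            raise ValueError(f"text of {size} characters exceeds the limit of {max_chars}")
        # Start a new batch when adding this text would break either limit.
        if current and (len(current) >= max_elements or current_chars + size > max_chars):
            batches.append(current)
            current = []
            current_chars = 0
        current.append(text)
        current_chars += size
    if current:
        batches.append(current)
    return batches


batches = batch_texts(["x" * 10] * 250)
print([len(batch) for batch in batches])
```

Each batch can then be sent as one request body. Texts longer than the character limit are rejected rather than split, because splitting a segment safely depends on the content.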

History

Retirement of version 2 API

Microsoft retired its version 2 API on 2019-04-30, as described at https://docs.microsoft.com/en-us/azure/cognitive-services/translator/migrate-to-v3. Because of this, the Microsoft Connector found in the latest stable release, M37, no longer works as of 2019-05-01.

Support for the version 3 API was added in April 2019 to the M38 snapshot version, available from http://okapiframework.org/snapshots/.

Please note that this is a minimal implementation: it does not support newer features such as profanity filtering.

Because the version 3 API no longer supports translation memory, that functionality is not available even with the latest Okapi M38 snapshot version.

You will need an Azure key to use the version 3 API. If you already have a key for version 2, the same key should work. For information on how to obtain an Azure key, please see https://azure.microsoft.com/en-us/pricing/details/cognitive-services/.

Old Parameters Prior To M32

Client ID — The Client ID to use to connect to the MT server. See the MSDN pages for more information.

Secret — The secret corresponding to the Client ID.

Category — An optional category to use when working with trained engines.

Example of a configuration file:

#v1
clientId=myPersonalClientID
secret=theSecretForThatClientID