Difference between revisions of "Microsoft Translator Connector"

From Okapi Framework
(Added minimum info about MS API v3)
(Update to match v3 API, first attempt)
Line 1: Line 1:
 
{{Connectors Header}}
 
{{Connectors Header}}
 
__TOC__
 
__TOC__
==Retirement of version 2 API==
+
==Overview==
<span class="red">MICROSOFT CONNECTOR of the Okapi stable releases will STOP WORKING</span> at the end of April, 2019.
+
The Microsoft [https://docs.microsoft.com/en-us/azure/cognitive-services/translator/ Translator Text Service] provides machine translation over a REST API. The service supports a large number of language pairs, both common and less common. The list is available at [https://docs.microsoft.com/en-us/azure/cognitive-services/Translator/language-support#translation]. (Please see the list under '''V3 Translator API'''.)
  
Microsoft will retire their version 2 API on 2019-4-30 as described in [https://docs.microsoft.com/en-us/azure/cognitive-services/translator/migrate-to-v3 this page].  
+
This connector uses [https://docs.microsoft.com/en-us/azure/cognitive-services/translator/reference/v3-0-reference the V3 API]. To use this connector you need an '''Azure Key''' from Microsoft. See [https://translatorbusiness.uservoice.com/knowledgebase/articles/1078534-microsoft-translator-on-azure#signup the Microsoft pages] for more information.
Because of this, the Microsoft Connector found in the latest stable release, M37, will no longer work on and after 2019-5-01.
 
  
The support of the version 3 API has been added to Okapi in mid April after the M37 release. If you need to use Microsoft's machine translation service, please pick up the M38 snapshot version from [http://okapiframework.org/snapshots/ here].  
+
For more examples on how to use this connector see the article "[[Trying out the Microsoft Translator Connector]]" in the [[Knowledge Base]]. See also the [[Microsoft Batch Translation Step]].
Please note this is a minimal implementation and it does not support any new features such as profanity filtering.
 
  
Because the version 3 API no longer supports the translation memory, that aspect of function is not available even if you use the latest Okapi M38 snapshot version.
+
==Parameters==
  
You will need an "azure key" to use the version 3 API. If you already have a key for version 2, the same key should work.  
+
<cite>Azure Key</cite> &mdash; The Microsoft Azure key used to access Translator Text and other Microsoft Cognitive Services.
For information on how to obtain an azure key, please see [https://azure.microsoft.com/en-us/pricing/details/cognitive-services/ this page].  
+
<cite>Category</cite> &mdash; An optional category to use when working with trained engines. The service defaults to "general" if none is supplied.
  
'''Information below is mostly out of date. It is kept as reference until full update of this page is done.'''
+
Example of a configuration file:
  
==Overview==
+
#v1
The Microsoft MT engine is freely available from Microsoft at [http://www.microsofttranslator.com http://www.microsofttranslator.com]. Volume limitations apply. The engine supports a large number of language pairs, both common and less common. The list is available at [http://www.microsofttranslator.com/help http://www.microsofttranslator.com/help].
+
azureKey=myAzureKey
 
+
category=general
This connector uses the HTTP v2 API. You can get more information about the API and its terms here: [http://sdk.microsofttranslator.com http://sdk.microsofttranslator.com].
 
 
 
To use this connector you need an "Azure Key" from Microsoft. See [https://translatorbusiness.uservoice.com/knowledgebase/articles/1078534-microsoft-translator-on-azure#signup the Microsoft pages] for more information.
 
 
 
You must also respect Microsoft's Terms of Service. If you intend to use the Microsoft Translator API for commercial or high volume purposes, you would need to sign a commercial license agreement and provide your AppID to the Microsoft Translator team. For more details contact [mailto:mtlic@microsoft.com mtlic@microsoft.com].
 
 
 
The engine supports inline codes.
 
  
When using the query functions of this connector, you are accessing a remote server and making your '''source text''' available to Microsoft, but no corresponding translation is sent to Microsoft when doing queries.
 
 
For more examples on how to use this connector see the article "[[Trying out the Microsoft Translator Connector]]" in the [[Knowledge Base]]. See also the [[Microsoft Batch Translation Step]].
 
  
 +
==Details==
 
==== Calculation of the combined score ====
 
==== Calculation of the combined score ====
  
Line 61: Line 50:
 
If a result has no <code>Rating</code> the default is set to 5. Unverified MT translation will generally return a <code>MatchDegree</code> of 100 and a <code>Rating</code> of 5, which will compute into a combined score of 95 in the Okapi interface.
 
If a result has no <code>Rating</code> the default is set to 5. Unverified MT translation will generally return a <code>MatchDegree</code> of 100 and a <code>Rating</code> of 5, which will compute into a combined score of 95 in the Okapi interface.
  
==Parameters==
 
  
===Starting with M32:===
+
==Limitations==
 +
 
 +
* According to the [https://docs.microsoft.com/en-us/azure/cognitive-services/translator/reference/v3-0-translate?tabs=curl#request-body API document], at most 100 JSON array elements can be supplied and the entire text cannot exceed 5000 characters.
 +
* The service may, on occasion, not generate back the proper spaces. This happens especially when there are inline codes present in the source.
 +
 
 +
==History==
 +
===Retirement of version 2 API===
 +
Microsoft retired their version 2 API on 2019-04-30 as described in [https://docs.microsoft.com/en-us/azure/cognitive-services/translator/migrate-to-v3 this page].
 +
Because of this, the Microsoft Connector found in the latest stable release, M37, no longer works as of 2019-05-01.
  
<cite>Azure Key</cite> &mdash; The Microsoft Azure key to connect to the MT server. See [https://translatorbusiness.uservoice.com/knowledgebase/articles/1078534-microsoft-translator-on-azure#signup the Microsoft pages] for more information.
+
Support for the version 3 API was added in April 2019 to the M38 snapshot version, available [http://okapiframework.org/snapshots/ here].
  
<cite>Category</cite> &mdash; An optional category to use when working with trained engines.
+
Please note this is a minimal implementation and it does not support any new features such as profanity filtering.
  
Example of a configuration file:
+
Because the version 3 API no longer supports translation memory, that functionality is not available even with the latest Okapi M38 snapshot version.
  
#v1
+
You will need an "Azure Key" to use the version 3 API. If you already have a key for version 2, the same key should work.
azureKey=myAzureKey
+
For information on how to obtain an Azure Key, please see [https://azure.microsoft.com/en-us/pricing/details/cognitive-services/ this page].
category=
 
  
===Prior M32:===
+
===Old Parameters Prior To M32===
  
 
<cite>Client ID</cite> &mdash; The Client ID to use to connect to the MT server. See [http://msdn.microsoft.com/en-us/library/hh454950.aspx the MSDN pages] for more information.
 
<cite>Client ID</cite> &mdash; The Client ID to use to connect to the MT server. See [http://msdn.microsoft.com/en-us/library/hh454950.aspx the MSDN pages] for more information.
Line 89: Line 84:
 
  secret=theSecretForThatClientID
 
  secret=theSecretForThatClientID
  
==Limitations==
 
 
* The engine may, on occasion, not generate back the proper spaces. This happens especially when there are inline codes present in the source.
 
  
 
[[Category:Connectors]]
 
[[Category:Connectors]]

Revision as of 20:06, 14 August 2019

Overview

The Microsoft Translator Text Service provides machine translation over a REST API. The service supports a large number of language pairs, both common and less common. The list is available at [1]. (Please see the list under V3 Translator API.)

This connector uses the V3 API. To use this connector you need an Azure Key from Microsoft. See the Microsoft pages for more information.

For more examples on how to use this connector see the article "Trying out the Microsoft Translator Connector" in the Knowledge Base. See also the Microsoft Batch Translation Step.
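
As a rough illustration of what a V3 translation request looks like on the wire, the sketch below builds (but does not send) a request with Java's built-in HTTP client types. It is not the connector's actual code: the class and method names are hypothetical, and it assumes the global endpoint with key-based authentication through the `Ocp-Apim-Subscription-Key` header. Some Azure resources may also require a region header, which is not shown.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (requires Java 11 or later).
final class TranslateRequestSketch {
    // Assumption: the global Translator Text V3 endpoint.
    static final String ENDPOINT =
        "https://api.cognitive.microsofttranslator.com/translate";

    // Minimal JSON string escaping: quotes, backslashes and control characters.
    static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Builds a POST request that asks for one text to be translated.
    static HttpRequest build(String azureKey, String from, String to,
                             String category, String text) {
        String query = "api-version=3.0"
            + "&from=" + URLEncoder.encode(from, StandardCharsets.UTF_8)
            + "&to=" + URLEncoder.encode(to, StandardCharsets.UTF_8)
            + "&category=" + URLEncoder.encode(category, StandardCharsets.UTF_8);
        // The request body is a JSON array with one object per text.
        String body = "[{\"Text\":" + jsonString(text) + "}]";
        return HttpRequest.newBuilder(URI.create(ENDPOINT + "?" + query))
            .header("Ocp-Apim-Subscription-Key", azureKey)
            .header("Content-Type", "application/json; charset=UTF-8")
            .POST(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
            .build();
    }
}
```

Sending the request would be a matter of passing it to `java.net.http.HttpClient.send`; that step is left out so the sketch stays free of network access.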

Parameters

Azure Key — The Microsoft Azure key used to access Translator Text and other Microsoft Cognitive Services.

Category — An optional category to use when working with trained engines. The service defaults to "general" if none is supplied.

Example of a configuration file:

#v1
azureKey=myAzureKey
category=general
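
The file above is a plain key/value list, so it can be read with standard Java tooling. The following is a minimal sketch, not Okapi's own parameter loader: the `ConnectorConfig` class is hypothetical, and it relies on the fact that `java.util.Properties` treats the `#v1` line as a comment. The fallback to "general" here only mirrors the service-side default described above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for the two parameters described above.
final class ConnectorConfig {
    final String azureKey;
    final String category;

    private ConnectorConfig(String azureKey, String category) {
        this.azureKey = azureKey;
        this.category = category;
    }

    // Parses the text of a configuration file such as the example above.
    static ConnectorConfig parse(String text) {
        Properties props = new Properties();
        try {
            // Lines starting with '#' (including "#v1") are skipped as comments.
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String key = props.getProperty("azureKey", "").trim();
        if (key.isEmpty()) {
            throw new IllegalArgumentException("azureKey is required");
        }
        // Assumption: an empty or missing category falls back to "general".
        String category = props.getProperty("category", "").trim();
        return new ConnectorConfig(key, category.isEmpty() ? "general" : category);
    }
}
```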


Details

Calculation of the combined score

The original score of the query is preserved in the score field of the query result.

The combinedScore of the query result holds a re-calculated value that takes into account both the MatchDegree and Rating values returned by the engine.

For results with a MatchDegree of 90 or above, the combined score is the MatchDegree plus the Rating value minus 10, that is, MatchDegree + (Rating - 10). For results with a MatchDegree below 90, the combined score is simply the MatchDegree.

MatchDegree   Rating   Combined Score
100           5        95 (i.e. 100+(5-10))
100           6        96 (i.e. 100+(6-10))
100           0        90 (i.e. 100+(0-10))
100           -3       87 (i.e. 100+(-3-10))
98            9        97 (i.e. 98+(9-10))
95            5        90 (i.e. 95+(5-10))

This calculation is far from perfect, especially when ranking highly rated high fuzzy matches against low-rated exact matches. Such entries are difficult to rank even manually. We will try to improve this scoring and welcome any feedback you may have.

If a result has no Rating the default is set to 5. Unverified MT translation will generally return a MatchDegree of 100 and a Rating of 5, which will compute into a combined score of 95 in the Okapi interface.
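
The rule described in this section can be written down in a few lines. This is a sketch based only on the description and table above, not the connector's source code; the class, method and constant names are made up for the example.

```java
// Hypothetical helper reproducing the combined-score rule described above.
final class CombinedScore {
    static final int DEFAULT_RATING = 5;        // used when a result has no Rating
    static final int HIGH_MATCH_THRESHOLD = 90; // MatchDegree at or above this is adjusted

    // rating may be null when the engine returned no Rating.
    static int combine(int matchDegree, Integer rating) {
        int r = (rating == null) ? DEFAULT_RATING : rating;
        if (matchDegree >= HIGH_MATCH_THRESHOLD) {
            return matchDegree + (r - 10);
        }
        return matchDegree; // below the threshold the MatchDegree is used as is
    }

    public static void main(String[] args) {
        System.out.println(combine(100, 5));    // 95
        System.out.println(combine(98, 9));     // 97
        System.out.println(combine(100, null)); // 95, default Rating applied
        System.out.println(combine(85, 9));     // 85, Rating ignored
    }
}
```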


Limitations

  • According to the API document, at most 100 JSON array elements can be supplied and the entire text cannot exceed 5000 characters.
  • The service may, on occasion, not generate back the proper spaces. This happens especially when there are inline codes present in the source.
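
One way to stay within the first limitation is to split the segments into several requests before sending them. The sketch below is a simple greedy approach under stated assumptions: the `RequestBatcher` class is hypothetical, and it counts characters with `String.length()` (UTF-16 code units), which may not match exactly how the service counts them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that groups segments into request-sized batches.
final class RequestBatcher {
    static final int MAX_ELEMENTS = 100; // array elements per request
    static final int MAX_CHARS = 5000;   // total characters per request

    static List<List<String>> batch(List<String> segments) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int chars = 0;
        for (String s : segments) {
            if (s.length() > MAX_CHARS) {
                // A single oversized segment cannot fit in any request.
                throw new IllegalArgumentException("Segment exceeds " + MAX_CHARS + " characters");
            }
            boolean full = current.size() == MAX_ELEMENTS || chars + s.length() > MAX_CHARS;
            if (!current.isEmpty() && full) {
                // Close the current batch and start a new one.
                batches.add(current);
                current = new ArrayList<>();
                chars = 0;
            }
            current.add(s);
            chars += s.length();
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Segments keep their original order, so translations can be matched back to their sources by position within each batch.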

History

Retirement of version 2 API

Microsoft retired their version 2 API on 2019-04-30 as described in this page. Because of this, the Microsoft Connector found in the latest stable release, M37, no longer works as of 2019-05-01.

Support for the version 3 API was added in April 2019 to the M38 snapshot version, available here.

Please note this is a minimal implementation and it does not support any new features such as profanity filtering.

Because the version 3 API no longer supports translation memory, that functionality is not available even with the latest Okapi M38 snapshot version.

You will need an "Azure Key" to use the version 3 API. If you already have a key for version 2, the same key should work. For information on how to obtain an Azure Key, please see this page.

Old Parameters Prior To M32

Client ID — The Client ID to use to connect to the MT server. See the MSDN pages for more information.

Secret — The secret corresponding to the Client ID.

Category — An optional category to use when working with trained engines.

Example of a configuration file:

#v1
clientId=myPersonalClientID
secret=theSecretForThatClientID