synapse.ml.cognitive.translate package

Submodules

synapse.ml.cognitive.translate.BreakSentence module

class synapse.ml.cognitive.translate.BreakSentence.BreakSentence(java_obj=None, AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_ac32cd123263_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_ac32cd123263_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters
  • AADToken (object) – AAD Token used for authentication

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • errorCol (str) – column to hold http errors

  • handler (object) – Which strategy to use when handling requests

  • language (object) – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.

  • outputCol (str) – The name of the output column

  • script (object) – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.

  • subscriptionKey (object) – the API key to use

  • subscriptionRegion (object) – the API region to use

  • text (object) – the string to translate

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
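
A minimal sketch of building a BreakSentence transformer from the keyword arguments listed above. The import path follows the module name documented in this section. The helper names (break_sentence_kwargs, make_break_sentence), the column names, and the placeholder key and region are assumptions made for illustration. The import is deferred so the settings can be inspected without SynapseML or a Spark session.

```python
def break_sentence_kwargs(key, region, text_col="text"):
    """Collect constructor keyword arguments for BreakSentence.

    Only names that appear in the documented signature are used.
    """
    return {
        "subscriptionKey": key,
        "subscriptionRegion": region,
        "textCol": text_col,            # read the input text from this column
        "outputCol": "sentences",       # hypothetical output column name
        "errorCol": "sentences_error",  # hypothetical error column name
        "concurrency": 1,
        "timeout": 60.0,
    }


def make_break_sentence(kwargs):
    """Build the transformer; needs SynapseML and an active Spark session."""
    # Deferred import: the module path is the one documented in this section.
    from synapse.ml.cognitive.translate.BreakSentence import BreakSentence
    return BreakSentence(**kwargs)


settings = break_sentence_kwargs("<your-key>", "<your-region>")
print(sorted(settings))
```

Given a DataFrame df with a text column, make_break_sentence(settings).transform(df) would be expected to add the sentences and sentences_error columns.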

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns

AAD Token used for authentication

Return type

AADToken

getConcurrency()[source]
Returns

max number of concurrent calls

Return type

concurrency

getConcurrentTimeout()[source]
Returns

max number of seconds to wait on futures if concurrency >= 1

Return type

concurrentTimeout
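
The interaction between concurrency and concurrentTimeout can be pictured with a small Python analogy. The transformer itself runs on the JVM, so this is not its implementation; call_concurrently is a hypothetical helper that only illustrates "at most N calls in flight, wait up to T seconds on each future".

```python
from concurrent.futures import ThreadPoolExecutor


def call_concurrently(func, items, concurrency=1, concurrent_timeout=None):
    """Apply func to each item with at most `concurrency` calls in flight.

    concurrent_timeout is the number of seconds to wait on each future;
    None waits indefinitely.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(func, item) for item in items]
        # result() raises concurrent.futures.TimeoutError if the wait expires.
        return [future.result(timeout=concurrent_timeout) for future in futures]


results = call_concurrently(str.upper, ["a", "b", "c"],
                            concurrency=2, concurrent_timeout=5.0)
print(results)
```

Results come back in input order because the futures are collected in the order they were submitted.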

getErrorCol()[source]
Returns

column to hold http errors

Return type

errorCol

getHandler()[source]
Returns

Which strategy to use when handling requests

Return type

handler

static getJavaPackage()[source]

Returns package name String.

getLanguage()[source]
Returns

Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.

Return type

language

getOutputCol()[source]
Returns

The name of the output column

Return type

outputCol

getScript()[source]
Returns

Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.

Return type

script

getSubscriptionKey()[source]
Returns

the API key to use

Return type

subscriptionKey

getSubscriptionRegion()[source]
Returns

the API region to use

Return type

subscriptionRegion

getText()[source]
Returns

the string to translate

Return type

text

getTimeout()[source]
Returns

number of seconds to wait before closing the connection

Return type

timeout

getUrl()[source]
Returns

Url of the service

Return type

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
language = Param(parent='undefined', name='language', doc='ServiceParam: Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

script = Param(parent='undefined', name='script', doc='ServiceParam: Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.')
setAADToken(value)[source]
Parameters

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters

AADToken – AAD Token used for authentication

setConcurrency(value)[source]
Parameters

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters

errorCol – column to hold http errors

setHandler(value)[source]
Parameters

handler – Which strategy to use when handling requests

setLanguage(value)[source]
Parameters

language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.

setLanguageCol(value)[source]
Parameters

language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.

setLinkedService(value)[source]
setLocation(value)[source]
setOutputCol(value)[source]
Parameters

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_ac32cd123263_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_ac32cd123263_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters
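
Several parameters in the signature above come in pairs, such as text and textCol or language and languageCol. The sketch below is a pre-flight check that flags keyword arguments supplying both forms of the same parameter. The helper name find_conflicts is hypothetical, and treating the two forms as mutually exclusive is an assumption, not documented behavior.

```python
def find_conflicts(kwargs):
    """Return parameter names supplied both directly and through a ...Col twin."""
    conflicts = []
    for name, value in kwargs.items():
        if name.endswith("Col") and value is not None:
            base = name[: -len("Col")]
            # outputCol and errorCol have no twin, so kwargs.get returns None.
            if kwargs.get(base) is not None:
                conflicts.append(base)
    return sorted(conflicts)


ok = find_conflicts({"textCol": "text", "language": "en", "outputCol": "out"})
bad = find_conflicts({"text": "Hello. Bye.", "textCol": "text"})
print(ok, bad)
```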

setScript(value)[source]
Parameters

script – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.

setScriptCol(value)[source]
Parameters

script – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.

setSubscriptionKey(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionRegion(value)[source]
Parameters

subscriptionRegion – the API region to use

setSubscriptionRegionCol(value)[source]
Parameters

subscriptionRegion – the API region to use

setText(value)[source]
Parameters

text – the string to translate

setTextCol(value)[source]
Parameters

text – the string to translate

setTimeout(value)[source]
Parameters

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters

url – Url of the service

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.translate.Detect module

class synapse.ml.cognitive.translate.Detect.Detect(java_obj=None, AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='Detect_3ef2f4a0b5af_error', handler=None, outputCol='Detect_3ef2f4a0b5af_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters
  • AADToken (object) – AAD Token used for authentication

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • errorCol (str) – column to hold http errors

  • handler (object) – Which strategy to use when handling requests

  • outputCol (str) – The name of the output column

  • subscriptionKey (object) – the API key to use

  • subscriptionRegion (object) – the API region to use

  • text (object) – the string to translate

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
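
Each parameter above has a matching setter on the class (for example textCol and setTextCol, outputCol and setOutputCol). The sketch below applies a dictionary of settings through those setters. The helpers setter_name and apply_settings, the RecordingStage stand-in, and the column name "detected" are hypothetical; the stand-in exists only so the example runs without Spark.

```python
def setter_name(param):
    """Map a parameter name to its setter, e.g. 'textCol' -> 'setTextCol'."""
    return "set" + param[0].upper() + param[1:]


def apply_settings(stage, settings):
    """Call the matching setter on `stage` for every entry in `settings`."""
    for param, value in settings.items():
        getattr(stage, setter_name(param))(value)
    return stage


class RecordingStage:
    """Stand-in for a transformer; records setter calls instead of using Spark."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that are not defined normally.
        if not name.startswith("set"):
            raise AttributeError(name)
        return lambda value: self.calls.append((name, value))


stage = apply_settings(
    RecordingStage(),
    {"textCol": "text", "outputCol": "detected", "timeout": 30.0},
)
print(stage.calls)
```

With SynapseML installed, a Detect() instance could be passed in place of RecordingStage(); that substitution is untested here.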

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns

AAD Token used for authentication

Return type

AADToken

getConcurrency()[source]
Returns

max number of concurrent calls

Return type

concurrency

getConcurrentTimeout()[source]
Returns

max number of seconds to wait on futures if concurrency >= 1

Return type

concurrentTimeout

getErrorCol()[source]
Returns

column to hold http errors

Return type

errorCol

getHandler()[source]
Returns

Which strategy to use when handling requests

Return type

handler

static getJavaPackage()[source]

Returns package name String.

getOutputCol()[source]
Returns

The name of the output column

Return type

outputCol

getSubscriptionKey()[source]
Returns

the API key to use

Return type

subscriptionKey

getSubscriptionRegion()[source]
Returns

the API region to use

Return type

subscriptionRegion

getText()[source]
Returns

the string to translate

Return type

text

getTimeout()[source]
Returns

number of seconds to wait before closing the connection

Return type

timeout

getUrl()[source]
Returns

Url of the service

Return type

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters

AADToken – AAD Token used for authentication

setConcurrency(value)[source]
Parameters

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters

errorCol – column to hold http errors

setHandler(value)[source]
Parameters

handler – Which strategy to use when handling requests

setLinkedService(value)[source]
setLocation(value)[source]
setOutputCol(value)[source]
Parameters

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='Detect_3ef2f4a0b5af_error', handler=None, outputCol='Detect_3ef2f4a0b5af_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters

setSubscriptionKey(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionRegion(value)[source]
Parameters

subscriptionRegion – the API region to use

setSubscriptionRegionCol(value)[source]
Parameters

subscriptionRegion – the API region to use

setText(value)[source]
Parameters

text – the string to translate

setTextCol(value)[source]
Parameters

text – the string to translate

setTimeout(value)[source]
Parameters

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters

url – Url of the service

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.translate.DictionaryExamples module

class synapse.ml.cognitive.translate.DictionaryExamples.DictionaryExamples(java_obj=None, AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryExamples_40b01f5db1cc_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryExamples_40b01f5db1cc_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, textAndTranslation=None, textAndTranslationCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters
  • AADToken (object) – AAD Token used for authentication

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • errorCol (str) – column to hold http errors

  • fromLanguage (object) – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.

  • handler (object) – Which strategy to use when handling requests

  • outputCol (str) – The name of the output column

  • subscriptionKey (object) – the API key to use

  • subscriptionRegion (object) – the API region to use

  • textAndTranslation (object) – A string specifying the translated text previously returned by the Dictionary lookup operation.

  • timeout (float) – number of seconds to wait before closing the connection

  • toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.

  • url (str) – Url of the service
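
A hedged sketch of preparing input for DictionaryExamples. The parameter name textAndTranslation suggests that each entry carries both a source term and one of its translations; representing entries as (text, translation) tuples is an assumption, as are the helper names, the column names, and the placeholder key and region. Only keyword names from the documented signature are used.

```python
def pair_text_and_translation(texts, translations):
    """Zip source terms with their translations into (text, translation) tuples."""
    if len(texts) != len(translations):
        raise ValueError("texts and translations must have the same length")
    return list(zip(texts, translations))


def dictionary_examples_kwargs(key, region, from_language, to_language):
    """Keyword arguments taken from the documented DictionaryExamples signature."""
    return {
        "subscriptionKey": key,
        "subscriptionRegion": region,
        "fromLanguage": from_language,
        "toLanguage": to_language,
        "textAndTranslationCol": "textAndTranslation",  # hypothetical column name
        "outputCol": "examples",                        # hypothetical column name
    }


pairs = pair_text_and_translation(["fly"], ["volar"])
kwargs = dictionary_examples_kwargs("<your-key>", "<your-region>", "en", "es")
print(pairs, kwargs["fromLanguage"], kwargs["toLanguage"])
```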

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.')
getAADToken()[source]
Returns

AAD Token used for authentication

Return type

AADToken

getConcurrency()[source]
Returns

max number of concurrent calls

Return type

concurrency

getConcurrentTimeout()[source]
Returns

max number of seconds to wait on futures if concurrency >= 1

Return type

concurrentTimeout

getErrorCol()[source]
Returns

column to hold http errors

Return type

errorCol

getFromLanguage()[source]
Returns

Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.

Return type

fromLanguage

getHandler()[source]
Returns

Which strategy to use when handling requests

Return type

handler

static getJavaPackage()[source]

Returns package name String.

getOutputCol()[source]
Returns

The name of the output column

Return type

outputCol

getSubscriptionKey()[source]
Returns

the API key to use

Return type

subscriptionKey

getSubscriptionRegion()[source]
Returns

the API region to use

Return type

subscriptionRegion

getTextAndTranslation()[source]
Returns

A string specifying the translated text previously returned by the Dictionary lookup operation.

Return type

textAndTranslation

getTimeout()[source]
Returns

number of seconds to wait before closing the connection

Return type

timeout

getToLanguage()[source]
Returns

Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.

Return type

toLanguage

getUrl()[source]
Returns

Url of the service

Return type

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters

AADToken – AAD Token used for authentication

setConcurrency(value)[source]
Parameters

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters

errorCol – column to hold http errors

setFromLanguage(value)[source]
Parameters

fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.

setFromLanguageCol(value)[source]
Parameters

fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.

setHandler(value)[source]
Parameters

handler – Which strategy to use when handling requests

setLinkedService(value)[source]
setLocation(value)[source]
setOutputCol(value)[source]
Parameters

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryExamples_40b01f5db1cc_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryExamples_40b01f5db1cc_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, textAndTranslation=None, textAndTranslationCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]

Set the (keyword only) parameters

setSubscriptionKey(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionRegion(value)[source]
Parameters

subscriptionRegion – the API region to use

setSubscriptionRegionCol(value)[source]
Parameters

subscriptionRegion – the API region to use

setTextAndTranslation(value)[source]
Parameters

textAndTranslation – A string specifying the translated text previously returned by the Dictionary lookup operation.

setTextAndTranslationCol(value)[source]
Parameters

textAndTranslation – A string specifying the translated text previously returned by the Dictionary lookup operation.

setTimeout(value)[source]
Parameters

timeout – number of seconds to wait before closing the connection

setToLanguage(value)[source]
Parameters

toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.

setToLanguageCol(value)[source]
Parameters

toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.

setUrl(value)[source]
Parameters

url – Url of the service

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
textAndTranslation = Param(parent='undefined', name='textAndTranslation', doc='ServiceParam:  A string specifying the translated text previously returned by the Dictionary lookup operation.')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
toLanguage = Param(parent='undefined', name='toLanguage', doc='ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.translate.DictionaryLookup module

class synapse.ml.cognitive.translate.DictionaryLookup.DictionaryLookup(java_obj=None, AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryLookup_330fdc843191_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryLookup_330fdc843191_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters
  • AADToken (object) – AAD Token used for authentication

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • errorCol (str) – column to hold http errors

  • fromLanguage (object) – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.

  • handler (object) – Which strategy to use when handling requests

  • outputCol (str) – The name of the output column

  • subscriptionKey (object) – the API key to use

  • subscriptionRegion (object) – the API region to use

  • text (object) – the string to translate

  • timeout (float) – number of seconds to wait before closing the connection

  • toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.

  • url (str) – Url of the service

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.')
getAADToken()[source]
Returns

AAD Token used for authentication

Return type

AADToken

getConcurrency()[source]
Returns

max number of concurrent calls

Return type

concurrency

getConcurrentTimeout()[source]
Returns

max number of seconds to wait on futures if concurrency >= 1

Return type

concurrentTimeout

getErrorCol()[source]
Returns

column to hold http errors

Return type

errorCol

getFromLanguage()[source]
Returns

Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.

Return type

fromLanguage

getHandler()[source]
Returns

Which strategy to use when handling requests

Return type

handler

static getJavaPackage()[source]

Returns package name String.

getOutputCol()[source]
Returns

The name of the output column

Return type

outputCol

getSubscriptionKey()[source]
Returns

the API key to use

Return type

subscriptionKey

getSubscriptionRegion()[source]
Returns

the API region to use

Return type

subscriptionRegion

getText()[source]
Returns

the string to translate

Return type

text

getTimeout()[source]
Returns

number of seconds to wait before closing the connection

Return type

timeout

getToLanguage()[source]
Returns

Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.

Return type

toLanguage

getUrl()[source]
Returns

Url of the service

Return type

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters

AADToken – AAD Token used for authentication

setConcurrency(value)[source]
Parameters

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters

errorCol – column to hold http errors

setFromLanguage(value)[source]
Parameters

fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.

setFromLanguageCol(value)[source]
Parameters

fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.

setHandler(value)[source]
Parameters

handler – Which strategy to use when handling requests

setLinkedService(value)[source]
setLocation(value)[source]
setOutputCol(value)[source]
Parameters

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryLookup_330fdc843191_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryLookup_330fdc843191_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]

Set the (keyword only) parameters

setSubscriptionKey(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionRegion(value)[source]
Parameters

subscriptionRegion – the API region to use

setSubscriptionRegionCol(value)[source]
Parameters

subscriptionRegion – the API region to use

setText(value)[source]
Parameters

text – the string to translate

setTextCol(value)[source]
Parameters

text – the string to translate

setTimeout(value)[source]
Parameters

timeout – number of seconds to wait before closing the connection

setToLanguage(value)[source]
Parameters

toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.

setToLanguageCol(value)[source]
Parameters

toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.

setUrl(value)[source]
Parameters

url – Url of the service

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
toLanguage = Param(parent='undefined', name='toLanguage', doc='ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.translate.DocumentTranslator module

class synapse.ml.cognitive.translate.DocumentTranslator.DocumentTranslator(java_obj=None, AADToken=None, AADTokenCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='DocumentTranslator_a6f875f3eb65_error', filterPrefix=None, filterPrefixCol=None, filterSuffix=None, filterSuffixCol=None, initialPollingDelay=300, maxPollingRetries=1000, outputCol='DocumentTranslator_a6f875f3eb65_output', pollingDelay=300, serviceName=None, sourceLanguage=None, sourceLanguageCol=None, sourceStorageSource=None, sourceStorageSourceCol=None, sourceUrl=None, sourceUrlCol=None, storageType=None, storageTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, targets=None, targetsCol=None, timeout=60.0, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters
  • AADToken (object) – AAD Token used for authentication

  • backoffs (list) – array of backoffs to use in the handler

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • errorCol (str) – column to hold http errors

  • filterPrefix (object) – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.

  • filterSuffix (object) – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.

  • initialPollingDelay (int) – number of milliseconds to wait before first poll for result

  • maxPollingRetries (int) – number of times to poll

  • outputCol (str) – The name of the output column

  • pollingDelay (int) – number of milliseconds to wait between polling

  • serviceName (str) –

  • sourceLanguage (object) – Language code. If none is specified, we will perform auto detect on the document.

  • sourceStorageSource (object) – Storage source of source input.

  • sourceUrl (object) – Location of the folder / container or single file with your documents.

  • storageType (object) – Storage type of the input documents source string. Required for single document translation only.

  • subscriptionKey (object) – the API key to use

  • suppressMaxRetriesException (bool) – set to true to suppress the maximum retries exception and report it in the error column

  • targets (object) – Destination for the finished translated documents.

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
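
The three polling parameters together bound how long a translation job is waited on. The sketch below estimates that bound from the documented defaults. It assumes one initial wait followed by maxPollingRetries waits of pollingDelay each and ignores request latency; the exact polling loop is not described in this section, and polling_budget_ms is a hypothetical helper.

```python
def polling_budget_ms(initial_polling_delay=300, polling_delay=300,
                      max_polling_retries=1000):
    """Upper bound on time spent waiting between polls, in milliseconds.

    Assumes one initial wait, then max_polling_retries waits of
    polling_delay each. Time spent inside each request is not counted.
    """
    return initial_polling_delay + max_polling_retries * polling_delay


budget = polling_budget_ms()
print(budget / 1000.0, "seconds")
```

Under these assumptions the defaults allow roughly five minutes of polling, so large batches may need a higher maxPollingRetries or pollingDelay.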

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
filterPrefix = Param(parent='undefined', name='filterPrefix', doc='ServiceParam: A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.')
filterSuffix = Param(parent='undefined', name='filterSuffix', doc='ServiceParam: A case-sensitive suffix string to filter documents in the source path for translation. This is most often use for file extensions.')
getAADToken()[source]
Returns

AAD Token used for authentication

Return type

AADToken

getBackoffs()[source]
Returns

array of backoffs to use in the handler

Return type

backoffs

getConcurrency()[source]
Returns

max number of concurrent calls

Return type

concurrency

getConcurrentTimeout()[source]
Returns

max number of seconds to wait on futures if concurrency >= 1

Return type

concurrentTimeout

getErrorCol()[source]
Returns

column to hold http errors

Return type

errorCol

getFilterPrefix()[source]
Returns

A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.

Return type

filterPrefix

getFilterSuffix()[source]
Returns

A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.

Return type

filterSuffix
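
To illustrate what case-sensitive prefix and suffix filtering means in practice, the sketch below applies both filters to a list of document names. The helper `select_documents` and the sample paths are hypothetical. The real matching is done by the translation service rather than by this class, so treat this only as a model of the documented behavior.

```python
from typing import Iterable, List, Optional


def select_documents(
    names: Iterable[str],
    filter_prefix: Optional[str] = None,
    filter_suffix: Optional[str] = None,
) -> List[str]:
    """Keep the names that match both filters; comparisons are case-sensitive."""
    selected = []
    for name in names:
        if filter_prefix is not None and not name.startswith(filter_prefix):
            continue
        if filter_suffix is not None and not name.endswith(filter_suffix):
            continue
        selected.append(name)
    return selected


# Hypothetical blob names inside a source container.
blobs = ["reports/q1.docx", "reports/q2.DOCX", "Reports/q3.docx", "notes/a.txt"]
print(select_documents(blobs, filter_prefix="reports/", filter_suffix=".docx"))
```

Only `reports/q1.docx` passes both filters here, because `.DOCX` and `Reports/` differ in case from the filter strings.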

getInitialPollingDelay()[source]
Returns

number of milliseconds to wait before first poll for result

Return type

initialPollingDelay

static getJavaPackage()[source]

Returns package name String.

getMaxPollingRetries()[source]
Returns

number of times to poll

Return type

maxPollingRetries

getOutputCol()[source]
Returns

The name of the output column

Return type

outputCol

getPollingDelay()[source]
Returns

number of milliseconds to wait between polling

Return type

pollingDelay

getServiceName()[source]
Returns

Return type

serviceName

getSourceLanguage()[source]
Returns

Language code. If none is specified, the language of the document is detected automatically.

Return type

sourceLanguage

getSourceStorageSource()[source]
Returns

Storage source of source input.

Return type

sourceStorageSource

getSourceUrl()[source]
Returns

Location of the folder / container or single file with your documents.

Return type

sourceUrl

getStorageType()[source]
Returns

Storage type of the input documents source string. Required for single document translation only.

Return type

storageType

getSubscriptionKey()[source]
Returns

the API key to use

Return type

subscriptionKey

getSuppressMaxRetriesException()[source]
Returns

set to true to suppress the maximum retries exception and report it in the error column

Return type

suppressMaxRetriesException

getTargets()[source]
Returns

Destination for the finished translated documents.

Return type

targets

getTimeout()[source]
Returns

number of seconds to wait before closing the connection

Return type

timeout

getUrl()[source]
Returns

Url of the service

Return type

url

initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
classmethod read()[source]

Returns an MLReader instance for this class.

serviceName = Param(parent='undefined', name='serviceName', doc='')
setAADToken(value)[source]
Parameters

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters

AADToken – AAD Token used for authentication

setBackoffs(value)[source]
Parameters

backoffs – array of backoffs to use in the handler

setConcurrency(value)[source]
Parameters

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters

errorCol – column to hold http errors

setFilterPrefix(value)[source]
Parameters

filterPrefix – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.

setFilterPrefixCol(value)[source]
Parameters

filterPrefix – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.

setFilterSuffix(value)[source]
Parameters

filterSuffix – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.

setFilterSuffixCol(value)[source]
Parameters

filterSuffix – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.

setInitialPollingDelay(value)[source]
Parameters

initialPollingDelay – number of milliseconds to wait before first poll for result

setLinkedService(value)[source]
setMaxPollingRetries(value)[source]
Parameters

maxPollingRetries – number of times to poll

setOutputCol(value)[source]
Parameters

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='DocumentTranslator_a6f875f3eb65_error', filterPrefix=None, filterPrefixCol=None, filterSuffix=None, filterSuffixCol=None, initialPollingDelay=300, maxPollingRetries=1000, outputCol='DocumentTranslator_a6f875f3eb65_output', pollingDelay=300, serviceName=None, sourceLanguage=None, sourceLanguageCol=None, sourceStorageSource=None, sourceStorageSourceCol=None, sourceUrl=None, sourceUrlCol=None, storageType=None, storageTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, targets=None, targetsCol=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters
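
The defaults in the signature above (initialPollingDelay=300, pollingDelay=300, maxPollingRetries=1000) suggest a rough upper bound on how long a single document translation request is polled before the maximum retries exception applies. The sketch below computes that bound. It assumes one initial wait followed by one pollingDelay wait per retry and ignores network time. This reading of the parameters is an assumption, not a description of the actual implementation, and `max_polling_wait_ms` is a hypothetical helper.

```python
def max_polling_wait_ms(
    initial_polling_delay: int = 300,
    polling_delay: int = 300,
    max_polling_retries: int = 1000,
) -> int:
    """Approximate worst-case time spent waiting between polls, in milliseconds."""
    return initial_polling_delay + polling_delay * max_polling_retries


budget_ms = max_polling_wait_ms()
print(f"about {budget_ms / 1000:.1f} seconds of polling delay")
```

If long-running jobs are expected, raising maxPollingRetries or pollingDelay increases this budget; setting suppressMaxRetriesException reports an exhausted budget in the error column instead of raising.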

setPollingDelay(value)[source]
Parameters

pollingDelay – number of milliseconds to wait between polling

setServiceName(value)[source]
Parameters

serviceName

setSourceLanguage(value)[source]
Parameters

sourceLanguage – Language code. If none is specified, the language of the document is detected automatically.

setSourceLanguageCol(value)[source]
Parameters

sourceLanguage – Language code. If none is specified, the language of the document is detected automatically.

setSourceStorageSource(value)[source]
Parameters

sourceStorageSource – Storage source of source input.

setSourceStorageSourceCol(value)[source]
Parameters

sourceStorageSource – Storage source of source input.

setSourceUrl(value)[source]
Parameters

sourceUrl – Location of the folder / container or single file with your documents.

setSourceUrlCol(value)[source]
Parameters

sourceUrl – Location of the folder / container or single file with your documents.

setStorageType(value)[source]
Parameters

storageType – Storage type of the input documents source string. Required for single document translation only.

setStorageTypeCol(value)[source]
Parameters

storageType – Storage type of the input documents source string. Required for single document translation only.

setSubscriptionKey(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters

subscriptionKey – the API key to use

setSuppressMaxRetriesException(value)[source]
Parameters

suppressMaxRetriesException – set to true to suppress the maximum retries exception and report it in the error column

setTargets(value)[source]
Parameters

targets – Destination for the finished translated documents.

setTargetsCol(value)[source]
Parameters

targets – Destination for the finished translated documents.

setTimeout(value)[source]
Parameters

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters

url – Url of the service

sourceLanguage = Param(parent='undefined', name='sourceLanguage', doc='ServiceParam: Language code. If none is specified, we will perform auto detect on the document.')
sourceStorageSource = Param(parent='undefined', name='sourceStorageSource', doc='ServiceParam: Storage source of source input.')
sourceUrl = Param(parent='undefined', name='sourceUrl', doc='ServiceParam: Location of the folder / container or single file with your documents.')
storageType = Param(parent='undefined', name='storageType', doc='ServiceParam: Storage type of the input documents source string. Required for single document translation only.')
subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
targets = Param(parent='undefined', name='targets', doc='ServiceParam: Destination for the finished translated documents.')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.translate.Translate module

class synapse.ml.cognitive.translate.Translate.Translate(java_obj=None, AADToken=None, AADTokenCol=None, allowFallback=None, allowFallbackCol=None, category=None, categoryCol=None, concurrency=1, concurrentTimeout=None, errorCol='Translate_746836f4c9cb_error', fromLanguage=None, fromLanguageCol=None, fromScript=None, fromScriptCol=None, handler=None, includeAlignment=None, includeAlignmentCol=None, includeSentenceLength=None, includeSentenceLengthCol=None, outputCol='Translate_746836f4c9cb_output', profanityAction=None, profanityActionCol=None, profanityMarker=None, profanityMarkerCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, suggestedFrom=None, suggestedFromCol=None, text=None, textCol=None, textType=None, textTypeCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, toScript=None, toScriptCol=None, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters
  • AADToken (object) – AAD Token used for authentication

  • allowFallback (object) – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.

  • category (object) – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • errorCol (str) – column to hold http errors

  • fromLanguage (object) – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.

  • fromScript (object) – Specifies the script of the input text.

  • handler (object) – Which strategy to use when handling requests

  • includeAlignment (object) – Specifies whether to include alignment projection from source text to translated text.

  • includeSentenceLength (object) – Specifies whether to include sentence boundaries for the input text and the translated text.

  • outputCol (str) – The name of the output column

  • profanityAction (object) – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.

  • profanityMarker (object) – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.

  • subscriptionKey (object) – the API key to use

  • subscriptionRegion (object) – the API region to use

  • suggestedFrom (object) – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.

  • text (object) – the string to translate

  • textType (object) – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.

  • timeout (float) – number of seconds to wait before closing the connection

  • toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.

  • toScript (object) – Specifies the script of the translated text.

  • url (str) – Url of the service
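
The toLanguage description mentions repeating the `to` parameter in the query string to request several target languages at once. The sketch below shows what such a query string looks like when built with the Python standard library. The helper `translate_query` is hypothetical, and the `api-version=3.0` entry is an assumption about the underlying REST service; the transformer builds its own requests, so this is only an illustration of the repeated-parameter form.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode


def translate_query(to_languages: Iterable[str], from_language: Optional[str] = None) -> str:
    """Build a query string with one `to` entry per target language."""
    params = [("api-version", "3.0")]  # assumed service version, not taken from this page
    if from_language is not None:
        params.append(("from", from_language))
    # Repeating the key is what allows several target languages in one request.
    params.extend(("to", language) for language in to_languages)
    return urlencode(params)


print(translate_query(["de", "it"]))
```

With German and Italian as targets, the query string contains both `to=de` and `to=it`.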

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
allowFallback = Param(parent='undefined', name='allowFallback', doc='ServiceParam: Specifies that the service is allowed to fall back to a general system when a custom system does not exist. ')
category = Param(parent='undefined', name='category', doc='ServiceParam: A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.')
fromScript = Param(parent='undefined', name='fromScript', doc='ServiceParam: Specifies the script of the input text.')
getAADToken()[source]
Returns

AAD Token used for authentication

Return type

AADToken

getAllowFallback()[source]
Returns

Specifies that the service is allowed to fall back to a general system when a custom system does not exist.

Return type

allowFallback

getCategory()[source]
Returns

A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.

Return type

category

getConcurrency()[source]
Returns

max number of concurrent calls

Return type

concurrency

getConcurrentTimeout()[source]
Returns

max number of seconds to wait on futures if concurrency >= 1

Return type

concurrentTimeout
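
As a loose analogy for how concurrency and concurrentTimeout interact, the sketch below issues several calls in parallel with a bounded worker pool and waits a limited time for each result. It uses Python's `concurrent.futures` purely for illustration; the transformer itself runs on the JVM, and `call_concurrently` is a hypothetical helper, not part of this package.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def call_concurrently(
    func: Callable[[T], R],
    items: Iterable[T],
    concurrency: int = 1,
    concurrent_timeout: Optional[float] = None,
) -> List[R]:
    """Run func over items with at most `concurrency` calls in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(func, item) for item in items]
        # result() raises concurrent.futures.TimeoutError if the result is
        # not ready within `concurrent_timeout` seconds (None waits forever).
        return [future.result(timeout=concurrent_timeout) for future in futures]


# str.upper stands in for a call to the remote service.
print(call_concurrently(str.upper, ["hallo", "ciao"], concurrency=2, concurrent_timeout=5.0))
```

Results come back in input order regardless of which call finishes first.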

getErrorCol()[source]
Returns

column to hold http errors

Return type

errorCol

getFromLanguage()[source]
Returns

Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.

Return type

fromLanguage

getFromScript()[source]
Returns

Specifies the script of the input text.

Return type

fromScript

getHandler()[source]
Returns

Which strategy to use when handling requests

Return type

handler

getIncludeAlignment()[source]
Returns

Specifies whether to include alignment projection from source text to translated text.

Return type

includeAlignment

getIncludeSentenceLength()[source]
Returns

Specifies whether to include sentence boundaries for the input text and the translated text.

Return type

includeSentenceLength

static getJavaPackage()[source]

Returns package name String.

getOutputCol()[source]
Returns

The name of the output column

Return type

outputCol

getProfanityAction()[source]
Returns

Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.

Return type

profanityAction

getProfanityMarker()[source]
Returns

Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.

Return type

profanityMarker

getSubscriptionKey()[source]
Returns

the API key to use

Return type

subscriptionKey

getSubscriptionRegion()[source]
Returns

the API region to use

Return type

subscriptionRegion

getSuggestedFrom()[source]
Returns

Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.

Return type

suggestedFrom

getText()[source]
Returns

the string to translate

Return type

text

getTextType()[source]
Returns

Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.

Return type

textType

getTimeout()[source]
Returns

number of seconds to wait before closing the connection

Return type

timeout

getToLanguage()[source]
Returns

Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.

Return type

toLanguage

getToScript()[source]
Returns

Specifies the script of the translated text.

Return type

toScript

getUrl()[source]
Returns

Url of the service

Return type

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
includeAlignment = Param(parent='undefined', name='includeAlignment', doc='ServiceParam: Specifies whether to include alignment projection from source text to translated text.')
includeSentenceLength = Param(parent='undefined', name='includeSentenceLength', doc='ServiceParam: Specifies whether to include sentence boundaries for the input text and the translated text. ')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
profanityAction = Param(parent='undefined', name='profanityAction', doc='ServiceParam: Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted. ')
profanityMarker = Param(parent='undefined', name='profanityMarker', doc='ServiceParam: Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters

AADToken – AAD Token used for authentication

setAllowFallback(value)[source]
Parameters

allowFallback – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.

setAllowFallbackCol(value)[source]
Parameters

allowFallback – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.

setCategory(value)[source]
Parameters

category – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.

setCategoryCol(value)[source]
Parameters

category – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.

setConcurrency(value)[source]
Parameters

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters

errorCol – column to hold http errors

setFromLanguage(value)[source]
Parameters

fromLanguage – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.

setFromLanguageCol(value)[source]
Parameters

fromLanguage – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.

setFromScript(value)[source]
Parameters

fromScript – Specifies the script of the input text.

setFromScriptCol(value)[source]
Parameters

fromScript – Specifies the script of the input text.

setHandler(value)[source]
Parameters

handler – Which strategy to use when handling requests

setIncludeAlignment(value)[source]
Parameters

includeAlignment – Specifies whether to include alignment projection from source text to translated text.

setIncludeAlignmentCol(value)[source]
Parameters

includeAlignment – Specifies whether to include alignment projection from source text to translated text.

setIncludeSentenceLength(value)[source]
Parameters

includeSentenceLength – Specifies whether to include sentence boundaries for the input text and the translated text.

setIncludeSentenceLengthCol(value)[source]
Parameters

includeSentenceLength – Specifies whether to include sentence boundaries for the input text and the translated text.

setLinkedService(value)[source]
setLocation(value)[source]
setOutputCol(value)[source]
Parameters

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, allowFallback=None, allowFallbackCol=None, category=None, categoryCol=None, concurrency=1, concurrentTimeout=None, errorCol='Translate_746836f4c9cb_error', fromLanguage=None, fromLanguageCol=None, fromScript=None, fromScriptCol=None, handler=None, includeAlignment=None, includeAlignmentCol=None, includeSentenceLength=None, includeSentenceLengthCol=None, outputCol='Translate_746836f4c9cb_output', profanityAction=None, profanityActionCol=None, profanityMarker=None, profanityMarkerCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, suggestedFrom=None, suggestedFromCol=None, text=None, textCol=None, textType=None, textTypeCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, toScript=None, toScriptCol=None, url=None)[source]

Set the (keyword only) parameters
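
The setters listed for this class can be chained to configure a stage step by step instead of passing everything to setParams. The sketch below wraps such a chain in a helper so it can be read and exercised without a Spark session. `configure_translate`, the column names, and the sample key and location are hypothetical, and the chaining assumes that each setter returns the stage itself, as is conventional for Spark ML setters.

```python
from typing import Any, Sequence


def configure_translate(
    stage: Any,
    subscription_key: str,
    location: str,
    text_col: str = "text",
    to_languages: Sequence[str] = ("de", "it"),
    output_col: str = "translation",
) -> Any:
    """Apply a typical set of options to a Translate stage and return it."""
    return (
        stage
        .setSubscriptionKey(subscription_key)
        .setLocation(location)
        .setTextCol(text_col)               # column holding the input strings
        .setToLanguage(list(to_languages))  # one entry per target language
        .setOutputCol(output_col)
    )


# Intended use inside a Spark session with SynapseML installed (not run here):
#   from synapse.ml.cognitive.translate.Translate import Translate
#   translated = configure_translate(Translate(), key, "eastus").transform(df)
```

Rows whose request fails are expected to carry the failure in the column named by errorCol rather than in the output column.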

setProfanityAction(value)[source]
Parameters

profanityAction – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.

setProfanityActionCol(value)[source]
Parameters

profanityAction – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.

setProfanityMarker(value)[source]
Parameters

profanityMarker – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.

setProfanityMarkerCol(value)[source]
Parameters

profanityMarker – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
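
Because profanityAction and profanityMarker accept only a few documented values, it can help to check them before building a pipeline. The sketch below does that with a hypothetical helper, `check_profanity_options`. The value sets come from the descriptions above; treating them as case-sensitive is an assumption.

```python
from typing import Dict

PROFANITY_ACTIONS = {"NoAction", "Marked", "Deleted"}
PROFANITY_MARKERS = {"Asterisk", "Tag"}


def check_profanity_options(action: str = "NoAction", marker: str = "Asterisk") -> Dict[str, str]:
    """Validate the two options and return them keyed by parameter name."""
    if action not in PROFANITY_ACTIONS:
        raise ValueError(f"profanityAction must be one of {sorted(PROFANITY_ACTIONS)}, got {action!r}")
    if marker not in PROFANITY_MARKERS:
        raise ValueError(f"profanityMarker must be one of {sorted(PROFANITY_MARKERS)}, got {marker!r}")
    return {"profanityAction": action, "profanityMarker": marker}


print(check_profanity_options(action="Marked", marker="Tag"))
```

The returned dictionary uses the same names as the keyword arguments of setParams.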

setSubscriptionKey(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionRegion(value)[source]
Parameters

subscriptionRegion – the API region to use

setSubscriptionRegionCol(value)[source]
Parameters

subscriptionRegion – the API region to use

setSuggestedFrom(value)[source]
Parameters

suggestedFrom – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.

setSuggestedFromCol(value)[source]
Parameters

suggestedFrom – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
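
The order of precedence between the from parameter, automatic detection, and suggestedFrom can be summarized in a few lines. The function below is a hypothetical model of that documented order, not code from the service; `detected` stands for whatever automatic detection produced, with `None` meaning that detection failed.

```python
from typing import Optional


def resolve_source_language(
    from_language: Optional[str] = None,
    detected: Optional[str] = None,
    suggested_from: Optional[str] = None,
) -> Optional[str]:
    """Pick the source language following the documented order of precedence."""
    if from_language is not None:
        return from_language   # explicit from parameter: no detection is needed
    if detected is not None:
        return detected        # from omitted and detection succeeded
    return suggested_from      # detection failed: fall back to suggestedFrom


print(resolve_source_language(detected=None, suggested_from="fr"))
```

suggestedFrom only matters when the from parameter is omitted and detection fails.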

setText(value)[source]
Parameters

text – the string to translate

setTextCol(value)[source]
Parameters

text – the string to translate

setTextType(value)[source]
Parameters

textType – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.

setTextTypeCol(value)[source]
Parameters

textType – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.

setTimeout(value)[source]
Parameters

timeout – number of seconds to wait before closing the connection

setToLanguage(value)[source]
Parameters

toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.

setToLanguageCol(value)[source]
Parameters

toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.

setToScript(value)[source]
Parameters

toScript – Specifies the script of the translated text.

setToScriptCol(value)[source]
Parameters

toScript – Specifies the script of the translated text.

setUrl(value)[source]
Parameters

url – Url of the service

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
suggestedFrom = Param(parent='undefined', name='suggestedFrom', doc="ServiceParam: Specifies a fallback language if the language of the input text can't be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.")
text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
textType = Param(parent='undefined', name='textType', doc='ServiceParam: Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
toLanguage = Param(parent='undefined', name='toLanguage', doc="ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It's possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.")
toScript = Param(parent='undefined', name='toScript', doc='ServiceParam: Specifies the script of the translated text.')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.translate.Transliterate module

class synapse.ml.cognitive.translate.Transliterate.Transliterate(java_obj=None, AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='Transliterate_0bd5762da30f_error', fromScript=None, fromScriptCol=None, handler=None, language=None, languageCol=None, outputCol='Transliterate_0bd5762da30f_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toScript=None, toScriptCol=None, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters
  • AADToken (object) – AAD Token used for authentication

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • errorCol (str) – column to hold http errors

  • fromScript (object) – Specifies the script of the input text.

  • handler (object) – Which strategy to use when handling requests

  • language (object) – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.

  • outputCol (str) – The name of the output column

  • subscriptionKey (object) – the API key to use

  • subscriptionRegion (object) – the API region to use

  • text (object) – the string to translate

  • timeout (float) – number of seconds to wait before closing the connection

  • toScript (object) – Specifies the script of the translated text.

  • url (str) – Url of the service

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
fromScript = Param(parent='undefined', name='fromScript', doc='ServiceParam: Specifies the script of the input text.')
getAADToken()[source]
Returns

AAD Token used for authentication

Return type

AADToken

getConcurrency()[source]
Returns

max number of concurrent calls

Return type

concurrency

getConcurrentTimeout()[source]
Returns

max number of seconds to wait on futures if concurrency >= 1

Return type

concurrentTimeout

getErrorCol()[source]
Returns

column to hold http errors

Return type

errorCol

getFromScript()[source]
Returns

Specifies the script of the input text.

Return type

fromScript

getHandler()[source]
Returns

Which strategy to use when handling requests

Return type

handler

static getJavaPackage()[source]

Returns package name String.

getLanguage()[source]
Returns

Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.

Return type

language

getOutputCol()[source]
Returns

The name of the output column

Return type

outputCol

getSubscriptionKey()[source]
Returns

the API key to use

Return type

subscriptionKey

getSubscriptionRegion()[source]
Returns

the API region to use

Return type

subscriptionRegion

getText()[source]
Returns

the string to translate

Return type

text

getTimeout()[source]
Returns

number of seconds to wait before closing the connection

Return type

timeout

getToScript()[source]
Returns

Specifies the script of the translated text.

Return type

toScript

getUrl()[source]
Returns

URL of the service

Return type

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
language = Param(parent='undefined', name='language', doc='ServiceParam: Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters

AADToken – AAD Token used for authentication

setConcurrency(value)[source]
Parameters

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters

errorCol – column to hold HTTP errors

setFromScript(value)[source]
Parameters

fromScript – Specifies the script of the input text.

setFromScriptCol(value)[source]
Parameters

fromScript – Specifies the script of the input text.

setHandler(value)[source]
Parameters

handler – Which strategy to use when handling requests

setLanguage(value)[source]
Parameters

language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.

setLanguageCol(value)[source]
Parameters

language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.

setLinkedService(value)[source]
setLocation(value)[source]
setOutputCol(value)[source]
Parameters

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='Transliterate_0bd5762da30f_error', fromScript=None, fromScriptCol=None, handler=None, language=None, languageCol=None, outputCol='Transliterate_0bd5762da30f_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toScript=None, toScriptCol=None, url=None)[source]

Set the (keyword only) parameters
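
Because setParams is keyword-only, every argument has to be passed by name. A small sketch of one way to call it, assuming the caller holds optional settings in a dict and wants to skip the ones left unset; the helper name `set_plain_params` is hypothetical, and the usage comment restricts itself to the plain (non-ServiceParam) parameters since those are the safest to pass this way.

```python
def set_plain_params(stage, **params):
    """Call stage.setParams with keyword arguments only.

    Entries whose value is None are dropped so that the stage keeps its
    existing setting for them.
    """
    provided = {name: value for name, value in params.items() if value is not None}
    return stage.setParams(**provided)


# Usage with a real stage (not executed here):
#
#   set_plain_params(stage, concurrency=4, timeout=30.0,
#                    outputCol="transliteration", errorCol=None)
#   # errorCol is skipped, so the generated default name is kept.
```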

setSubscriptionKey(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters

subscriptionKey – the API key to use

setSubscriptionRegion(value)[source]
Parameters

subscriptionRegion – the API region to use

setSubscriptionRegionCol(value)[source]
Parameters

subscriptionRegion – the API region to use

setText(value)[source]
Parameters

text – the string to translate

setTextCol(value)[source]
Parameters

text – the string to translate
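
Several parameters on this class come as a pair of setters, such as setText and setTextCol. The plain setter supplies one constant value for every row, while the Col variant names a DataFrame column that supplies a value per row. The sketch below shows that naming pattern with a hypothetical helper, `set_service_param`; it relies only on the setter names listed in this reference.

```python
def set_service_param(stage, name, value=None, column=None):
    """Set a paired parameter either from a constant or from a column.

    `name` is the parameter name as documented (e.g. "text", "toScript").
    Exactly one of `value` or `column` must be given.
    """
    if (value is None) == (column is None):
        raise ValueError("pass exactly one of value or column")
    # setText / setTextCol, setToScript / setToScriptCol, ...
    setter = "set" + name[0].upper() + name[1:]
    if column is not None:
        return getattr(stage, setter + "Col")(column)
    return getattr(stage, setter)(value)


# Usage with a real stage (not executed here; "body" is an assumed column name):
#
#   set_service_param(stage, "text", column="body")    # per-row input text
#   set_service_param(stage, "toScript", value="Latn") # same script for all rows
```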

setTimeout(value)[source]
Parameters

timeout – number of seconds to wait before closing the connection

setToScript(value)[source]
Parameters

toScript – Specifies the script of the translated text.

setToScriptCol(value)[source]
Parameters

toScript – Specifies the script of the translated text.

setUrl(value)[source]
Parameters

url – URL of the service

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
toScript = Param(parent='undefined', name='toScript', doc='ServiceParam: Specifies the script of the translated text.')
url = Param(parent='undefined', name='url', doc='Url of the service')

Module contents

SynapseML is an ecosystem of tools aimed at expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.

SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond latency web services, backed by your Spark cluster.

SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.
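
The version requirements above can be checked before a job starts. This is a rough sketch, not part of SynapseML: the helper names are hypothetical, and the caller is assumed to supply the Spark and Scala version strings (for example from `spark.version` on an active SparkSession).

```python
import re


def version_tuple(text):
    """Extract the numeric components of a version string.

    '3.3.1' -> (3, 3, 1); a suffix after '-' (e.g. '-preview2') is ignored.
    """
    return tuple(int(part) for part in re.findall(r"\d+", text.split("-")[0]))


def environment_ok(python, spark, scala):
    """Return True if the versions satisfy Scala 2.12, Spark 3.0+, Python 3.6+.

    `python` is a (major, minor) tuple; `spark` and `scala` are version strings.
    """
    return (tuple(python) >= (3, 6)
            and version_tuple(spark)[:2] >= (3, 0)
            and version_tuple(scala)[:2] == (2, 12))


# Usage (not executed here; `spark` is an active SparkSession):
#
#   import sys
#   environment_ok(sys.version_info[:2], spark.version, "2.12")
```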