synapse.ml.cognitive.translate package
Submodules
synapse.ml.cognitive.translate.BreakSentence module
- class synapse.ml.cognitive.translate.BreakSentence.BreakSentence(java_obj=None, AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_ac32cd123263_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_ac32cd123263_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
handler (object) – Which strategy to use when handling requests
language (object) – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
script (object) – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
timeout (float) – number of seconds to wait before closing the connection
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- Return type
language
- getScript()[source]
- Returns
Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- Return type
script
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- script = Param(parent='undefined', name='script', doc='ServiceParam: Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setLanguageCol(value)[source]
- Parameters
language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setParams(AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_ac32cd123263_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_ac32cd123263_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setScript(value)[source]
- Parameters
script – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- setScriptCol(value)[source]
- Parameters
script – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
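The constructor signature above takes every setting as a keyword argument. A minimal sketch of collecting those arguments before building the transformer might look like the following. The helper names `break_sentence_kwargs` and `make_break_sentence`, and the column names `sentences` and `errors`, are illustrative assumptions and not part of the package; only the keyword names and the import path are taken from the listing above.

```python
from typing import Any, Dict, Optional


def break_sentence_kwargs(
    subscription_key: str,
    subscription_region: str,
    text_col: str = "text",
    language: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments named as in the BreakSentence signature."""
    kwargs: Dict[str, Any] = {
        "subscriptionKey": subscription_key,
        "subscriptionRegion": subscription_region,
        "textCol": text_col,        # read the input text from this column
        "outputCol": "sentences",   # hypothetical output column name
        "errorCol": "errors",       # hypothetical error column name
        "concurrency": 1,           # documented default
        "timeout": 60.0,            # documented default, in seconds
    }
    if language is not None:
        # When language is omitted, the docs say detection is automatic.
        kwargs["language"] = language
    return kwargs


def make_break_sentence(**kwargs: Any):
    """Build the transformer; needs synapse.ml and a Spark session at call time."""
    # Imported lazily so the kwargs helper works without Spark installed.
    from synapse.ml.cognitive.translate.BreakSentence import BreakSentence

    return BreakSentence(**kwargs)
```

With SynapseML installed, `make_break_sentence(**break_sentence_kwargs(key, region))` would be expected to return a transformer ready for `transform(df)`, where `df` has a `text` column.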
synapse.ml.cognitive.translate.Detect module
- class synapse.ml.cognitive.translate.Detect.Detect(java_obj=None, AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='Detect_3ef2f4a0b5af_error', handler=None, outputCol='Detect_3ef2f4a0b5af_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='Detect_3ef2f4a0b5af_error', handler=None, outputCol='Detect_3ef2f4a0b5af_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
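Several arguments in the signatures above come in pairs, such as `text` / `textCol` and `subscriptionKey` / `subscriptionKeyCol`: one form for a constant value, one for a column name. The sketch below shows one way to keep the two forms from being mixed up when assembling keyword arguments. It assumes that exactly one of each pair should be supplied, which the listing does not state explicitly, and `service_param` is a hypothetical helper, not a SynapseML function.

```python
from typing import Any, Dict, Optional


def service_param(name: str, value: Any = None, col: Optional[str] = None) -> Dict[str, Any]:
    """Return {name: value} for a constant or {nameCol: col} for a column."""
    # Assumption: a constant and a column should not both be given.
    if (value is None) == (col is None):
        raise ValueError(f"set exactly one of {name} or {name}Col")
    return {name: value} if col is None else {f"{name}Col": col}


# Keyword arguments for Detect: one key for all rows, text read per row.
detect_kwargs: Dict[str, Any] = {}
detect_kwargs.update(service_param("subscriptionKey", value="<your-key>"))
detect_kwargs.update(service_param("text", col="text"))
```

The resulting dictionary uses only names that appear in the `Detect` signature, so it could be passed on as `Detect(**detect_kwargs)` in an environment where SynapseML is available.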
synapse.ml.cognitive.translate.DictionaryExamples module
- class synapse.ml.cognitive.translate.DictionaryExamples.DictionaryExamples(java_obj=None, AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryExamples_40b01f5db1cc_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryExamples_40b01f5db1cc_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, textAndTranslation=None, textAndTranslationCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
fromLanguage (object) – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
handler (object) – Which strategy to use when handling requests
textAndTranslation (object) – A string specifying the translated text previously returned by the Dictionary lookup operation.
timeout (float) – number of seconds to wait before closing the connection
toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- Return type
fromLanguage
- getTextAndTranslation()[source]
- Returns
A string specifying the translated text previously returned by the Dictionary lookup operation.
- Return type
textAndTranslation
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setParams(AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryExamples_40b01f5db1cc_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryExamples_40b01f5db1cc_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, textAndTranslation=None, textAndTranslationCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Set the (keyword only) parameters
- setTextAndTranslation(value)[source]
- Parameters
textAndTranslation – A string specifying the translated text previously returned by the Dictionary lookup operation.
- setTextAndTranslationCol(value)[source]
- Parameters
textAndTranslation – A string specifying the translated text previously returned by the Dictionary lookup operation.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- setToLanguageCol(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- textAndTranslation = Param(parent='undefined', name='textAndTranslation', doc='ServiceParam: A string specifying the translated text previously returned by the Dictionary lookup operation.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc='ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.translate.DictionaryLookup module
- class synapse.ml.cognitive.translate.DictionaryLookup.DictionaryLookup(java_obj=None, AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryLookup_330fdc843191_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryLookup_330fdc843191_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
fromLanguage (object) – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
handler (object) – Which strategy to use when handling requests
timeout (float) – number of seconds to wait before closing the connection
toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- Return type
fromLanguage
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setParams(AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryLookup_330fdc843191_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryLookup_330fdc843191_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- setToLanguageCol(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc='ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
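As an alternative to constructor keywords, the setter methods listed above can be called on an existing stage. The sketch below applies `setFromLanguage`, `setToLanguage`, and `setTimeout` to any object that exposes them. `RecordingStage` is a hypothetical stand-in used only so the example runs without Spark; it is not part of SynapseML, and a real `DictionaryLookup` instance would take its place.

```python
class RecordingStage:
    """Hypothetical stand-in that records setter calls; not a SynapseML class."""

    def __init__(self):
        self.calls = []

    def setFromLanguage(self, value):
        self.calls.append(("fromLanguage", value))

    def setToLanguage(self, value):
        self.calls.append(("toLanguage", value))

    def setTimeout(self, value):
        self.calls.append(("timeout", value))


def configure_lookup(stage, from_language, to_language, timeout=60.0):
    """Apply the documented setters one by one and hand the stage back."""
    # Setters are called separately because this listing does not show
    # their return values, so chaining is not assumed here.
    stage.setFromLanguage(from_language)
    stage.setToLanguage(to_language)
    stage.setTimeout(timeout)
    return stage


stage = configure_lookup(RecordingStage(), "en", "es", timeout=30.0)
```

Both language codes must be within the dictionary scope described in the parameter documentation above; `en` and `es` are used here purely as sample values.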
synapse.ml.cognitive.translate.DocumentTranslator module
- class synapse.ml.cognitive.translate.DocumentTranslator.DocumentTranslator(java_obj=None, AADToken=None, AADTokenCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='DocumentTranslator_a6f875f3eb65_error', filterPrefix=None, filterPrefixCol=None, filterSuffix=None, filterSuffixCol=None, initialPollingDelay=300, maxPollingRetries=1000, outputCol='DocumentTranslator_a6f875f3eb65_output', pollingDelay=300, serviceName=None, sourceLanguage=None, sourceLanguageCol=None, sourceStorageSource=None, sourceStorageSourceCol=None, sourceUrl=None, sourceUrlCol=None, storageType=None, storageTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, targets=None, targetsCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
filterPrefix (object) – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict subfolders for translation.
filterSuffix (object) – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
pollingDelay (int) – number of milliseconds to wait between polling
sourceLanguage (object) – Language code. If none is specified, we will perform auto detect on the document.
sourceStorageSource (object) – Storage source of source input.
sourceUrl (object) – Location of the folder / container or single file with your documents.
storageType (object) – Storage type of the input documents source string. Required for single document translation only.
suppressMaxRetriesException (bool) – set to true to suppress the maximum retries exception and report it in the error column
targets (object) – Destination for the finished translated documents.
timeout (float) – number of seconds to wait before closing the connection
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- filterPrefix = Param(parent='undefined', name='filterPrefix', doc='ServiceParam: A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.')
- filterSuffix = Param(parent='undefined', name='filterSuffix', doc='ServiceParam: A case-sensitive suffix string to filter documents in the source path for translation. This is most often use for file extensions.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFilterPrefix()[source]
- Returns
A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict subfolders for translation.
- Return type
filterPrefix
- getFilterSuffix()[source]
- Returns
A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- Return type
filterSuffix
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSourceLanguage()[source]
- Returns
Language code. If none is specified, we will perform auto detect on the document.
- Return type
sourceLanguage
- getSourceStorageSource()[source]
- Returns
Storage source of source input.
- Return type
sourceStorageSource
- getSourceUrl()[source]
- Returns
Location of the folder / container or single file with your documents.
- Return type
sourceUrl
- getStorageType()[source]
- Returns
Storage type of the input documents source string. Required for single document translation only.
- Return type
storageType
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTargets()[source]
- Returns
Destination for the finished translated documents.
- Return type
targets
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- serviceName = Param(parent='undefined', name='serviceName', doc='')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFilterPrefix(value)[source]
- Parameters
filterPrefix – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict subfolders for translation.
- setFilterPrefixCol(value)[source]
- Parameters
filterPrefix – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict subfolders for translation.
- setFilterSuffix(value)[source]
- Parameters
filterSuffix – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- setFilterSuffixCol(value)[source]
- Parameters
filterSuffix – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setParams(AADToken=None, AADTokenCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='DocumentTranslator_a6f875f3eb65_error', filterPrefix=None, filterPrefixCol=None, filterSuffix=None, filterSuffixCol=None, initialPollingDelay=300, maxPollingRetries=1000, outputCol='DocumentTranslator_a6f875f3eb65_output', pollingDelay=300, serviceName=None, sourceLanguage=None, sourceLanguageCol=None, sourceStorageSource=None, sourceStorageSourceCol=None, sourceUrl=None, sourceUrlCol=None, storageType=None, storageTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, targets=None, targetsCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSourceLanguage(value)[source]
- Parameters
sourceLanguage – Language code. If none is specified, we will perform auto detect on the document.
- setSourceLanguageCol(value)[source]
- Parameters
sourceLanguage – Language code. If none is specified, we will perform auto detect on the document.
- setSourceStorageSource(value)[source]
- Parameters
sourceStorageSource – Storage source of source input.
- setSourceStorageSourceCol(value)[source]
- Parameters
sourceStorageSource – Storage source of source input.
- setSourceUrl(value)[source]
- Parameters
sourceUrl – Location of the folder / container or single file with your documents.
- setSourceUrlCol(value)[source]
- Parameters
sourceUrl – Location of the folder / container or single file with your documents.
- setStorageType(value)[source]
- Parameters
storageType – Storage type of the input documents source string. Required for single document translation only.
- setStorageTypeCol(value)[source]
- Parameters
storageType – Storage type of the input documents source string. Required for single document translation only.
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException – set to true to suppress the maximum retries exception and report it in the error column
- setTargetsCol(value)[source]
- Parameters
targets – Destination for the finished translated documents.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- sourceLanguage = Param(parent='undefined', name='sourceLanguage', doc='ServiceParam: Language code. If none is specified, we will perform auto detect on the document.')
- sourceStorageSource = Param(parent='undefined', name='sourceStorageSource', doc='ServiceParam: Storage source of source input.')
- sourceUrl = Param(parent='undefined', name='sourceUrl', doc='ServiceParam: Location of the folder / container or single file with your documents.')
- storageType = Param(parent='undefined', name='storageType', doc='ServiceParam: Storage type of the input documents source string. Required for single document translation only.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- targets = Param(parent='undefined', name='targets', doc='ServiceParam: Destination for the finished translated documents.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
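The polling parameters above (`initialPollingDelay`, `pollingDelay`, `maxPollingRetries`) together bound how long a stage waits for a translation job. A rough estimate can be worked out as below, assuming one initial delay followed by up to `maxPollingRetries` polls spaced `pollingDelay` apart. That schedule is an assumption about the implementation, the figure ignores the time each HTTP request takes, and `max_polling_wait_ms` is a hypothetical helper.

```python
def max_polling_wait_ms(
    initial_polling_delay: int = 300,   # documented default, milliseconds
    polling_delay: int = 300,           # documented default, milliseconds
    max_polling_retries: int = 1000,    # documented default, number of polls
) -> int:
    """Upper bound on time spent sleeping between polls, in milliseconds."""
    # Assumed schedule: wait once up front, then once per poll.
    return initial_polling_delay + max_polling_retries * polling_delay


default_wait_ms = max_polling_wait_ms()
print(f"{default_wait_ms} ms (about {default_wait_ms / 1000:.1f} s)")
```

Under these assumptions the defaults allow roughly five minutes of polling, so large document batches may call for a larger `pollingDelay` or `maxPollingRetries`, or for `suppressMaxRetriesException=True` so that exhausted retries are reported in the error column.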
synapse.ml.cognitive.translate.Translate module
- class synapse.ml.cognitive.translate.Translate.Translate(java_obj=None, AADToken=None, AADTokenCol=None, allowFallback=None, allowFallbackCol=None, category=None, categoryCol=None, concurrency=1, concurrentTimeout=None, errorCol='Translate_746836f4c9cb_error', fromLanguage=None, fromLanguageCol=None, fromScript=None, fromScriptCol=None, handler=None, includeAlignment=None, includeAlignmentCol=None, includeSentenceLength=None, includeSentenceLengthCol=None, outputCol='Translate_746836f4c9cb_output', profanityAction=None, profanityActionCol=None, profanityMarker=None, profanityMarkerCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, suggestedFrom=None, suggestedFromCol=None, text=None, textCol=None, textType=None, textTypeCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, toScript=None, toScriptCol=None, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
allowFallback¶ (object) – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
category¶ (object) – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
concurrentTimeout (float) – maximum number of seconds to wait on futures if concurrency >= 1
fromLanguage (object) – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
fromScript (object) – Specifies the script of the input text.
handler (object) – Which strategy to use when handling requests
includeAlignment (object) – Specifies whether to include alignment projection from source text to translated text.
includeSentenceLength (object) – Specifies whether to include sentence boundaries for the input text and the translated text.
profanityAction (object) – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
profanityMarker (object) – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
suggestedFrom (object) – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
textType (object) – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
timeout (float) – number of seconds to wait before closing the connection
toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
toScript (object) – Specifies the script of the translated text.
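The toLanguage description refers to repeating the to parameter in the query string. As a rough illustration of what that looks like at the HTTP level, the sketch below builds such a query string with Python's standard library. The helper name build_translate_query is hypothetical and is not part of SynapseML; the transformer assembles the actual request itself.

```python
from urllib.parse import urlencode


def build_translate_query(to_languages, from_language=None):
    """Return a query string with one `to` entry per target language.

    Hypothetical helper for illustration only; not a SynapseML function.
    """
    params = {}
    if from_language is not None:
        # Omitting `from` leaves source-language detection to the service.
        params["from"] = from_language
    params["to"] = list(to_languages)
    # doseq=True expands the list into repeated keys, e.g. to=de&to=it
    return urlencode(params, doseq=True)


print(build_translate_query(["de", "it"], from_language="en"))  # from=en&to=de&to=it
print(build_translate_query(["de"]))                            # to=de
```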
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- allowFallback = Param(parent='undefined', name='allowFallback', doc='ServiceParam: Specifies that the service is allowed to fall back to a general system when a custom system does not exist. ')
- category = Param(parent='undefined', name='category', doc='ServiceParam: A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.')
- fromScript = Param(parent='undefined', name='fromScript', doc='ServiceParam: Specifies the script of the input text.')
- getAllowFallback()[source]
- Returns
Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- Return type
allowFallback
- getCategory()[source]
- Returns
A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- Return type
category
- getConcurrentTimeout()[source]
- Returns
maximum number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- Return type
fromLanguage
- getIncludeAlignment()[source]
- Returns
Specifies whether to include alignment projection from source text to translated text.
- Return type
includeAlignment
- getIncludeSentenceLength()[source]
- Returns
Specifies whether to include sentence boundaries for the input text and the translated text.
- Return type
includeSentenceLength
- getProfanityAction()[source]
- Returns
Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- Return type
profanityAction
- getProfanityMarker()[source]
- Returns
Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- Return type
profanityMarker
- getSuggestedFrom()[source]
- Returns
Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- Return type
suggestedFrom
- getTextType()[source]
- Returns
Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- Return type
textType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- includeAlignment = Param(parent='undefined', name='includeAlignment', doc='ServiceParam: Specifies whether to include alignment projection from source text to translated text.')
- includeSentenceLength = Param(parent='undefined', name='includeSentenceLength', doc='ServiceParam: Specifies whether to include sentence boundaries for the input text and the translated text. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- profanityAction = Param(parent='undefined', name='profanityAction', doc='ServiceParam: Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted. ')
- profanityMarker = Param(parent='undefined', name='profanityMarker', doc='ServiceParam: Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.')
- setAllowFallback(value)[source]
- Parameters
allowFallback¶ – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- setAllowFallbackCol(value)[source]
- Parameters
allowFallback¶ – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- setCategory(value)[source]
- Parameters
category¶ – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- setCategoryCol(value)[source]
- Parameters
category¶ – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – maximum number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage¶ – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguage¶ – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- setIncludeAlignment(value)[source]
- Parameters
includeAlignment¶ – Specifies whether to include alignment projection from source text to translated text.
- setIncludeAlignmentCol(value)[source]
- Parameters
includeAlignment¶ – Specifies whether to include alignment projection from source text to translated text.
- setIncludeSentenceLength(value)[source]
- Parameters
includeSentenceLength¶ – Specifies whether to include sentence boundaries for the input text and the translated text.
- setIncludeSentenceLengthCol(value)[source]
- Parameters
includeSentenceLength¶ – Specifies whether to include sentence boundaries for the input text and the translated text.
- setParams(AADToken=None, AADTokenCol=None, allowFallback=None, allowFallbackCol=None, category=None, categoryCol=None, concurrency=1, concurrentTimeout=None, errorCol='Translate_746836f4c9cb_error', fromLanguage=None, fromLanguageCol=None, fromScript=None, fromScriptCol=None, handler=None, includeAlignment=None, includeAlignmentCol=None, includeSentenceLength=None, includeSentenceLengthCol=None, outputCol='Translate_746836f4c9cb_output', profanityAction=None, profanityActionCol=None, profanityMarker=None, profanityMarkerCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, suggestedFrom=None, suggestedFromCol=None, text=None, textCol=None, textType=None, textTypeCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, toScript=None, toScriptCol=None, url=None)[source]
Set the (keyword only) parameters
- setProfanityAction(value)[source]
- Parameters
profanityAction¶ – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- setProfanityActionCol(value)[source]
- Parameters
profanityAction¶ – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- setProfanityMarker(value)[source]
- Parameters
profanityMarker¶ – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- setProfanityMarkerCol(value)[source]
- Parameters
profanityMarker¶ – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- setSuggestedFrom(value)[source]
- Parameters
suggestedFrom¶ – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- setSuggestedFromCol(value)[source]
- Parameters
suggestedFrom¶ – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- setTextType(value)[source]
- Parameters
textType¶ – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- setTextTypeCol(value)[source]
- Parameters
textType¶ – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage¶ – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
- setToLanguageCol(value)[source]
- Parameters
toLanguage¶ – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- suggestedFrom = Param(parent='undefined', name='suggestedFrom', doc="ServiceParam: Specifies a fallback language if the language of the input text can't be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.")
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- textType = Param(parent='undefined', name='textType', doc='ServiceParam: Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc="ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It's possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.")
- toScript = Param(parent='undefined', name='toScript', doc='ServiceParam: Specifies the script of the translated text.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
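Putting the parameters above together, a Translate transformer might be configured roughly as follows. This is a minimal sketch under stated assumptions: the keyword names come from the setParams signature documented above, and the constructor is assumed to accept the same keywords. The helpers translate_kwargs and build_translate, the column names text and translation, and the import path are hypothetical or assumed. The allowed-value sets mirror the "Possible values" listed in the parameter descriptions; checking them client-side is a convenience of this sketch, not something the library is known to do.

```python
# Allowed values copied from the parameter descriptions above.
PROFANITY_ACTIONS = {"NoAction", "Marked", "Deleted"}
TEXT_TYPES = {"plain", "html"}


def translate_kwargs(key, region, to_languages,
                     text_type="plain", profanity_action="NoAction"):
    """Build keyword arguments for Translate (hypothetical helper)."""
    if profanity_action not in PROFANITY_ACTIONS:
        raise ValueError(f"unsupported profanityAction: {profanity_action!r}")
    if text_type not in TEXT_TYPES:
        raise ValueError(f"unsupported textType: {text_type!r}")
    return {
        "subscriptionKey": key,
        "subscriptionRegion": region,
        "textCol": "text",                 # assumed input column name
        "toLanguage": list(to_languages),  # one entry per target language
        "textType": text_type,
        "profanityAction": profanity_action,
        "outputCol": "translation",        # assumed output column name
    }


def build_translate(kwargs):
    """Create the transformer; requires pyspark and SynapseML at call time."""
    # Imported lazily so translate_kwargs can be used without Spark installed.
    # The module path is assumed to follow the pattern of the other classes.
    from synapse.ml.cognitive.translate.Translate import Translate
    return Translate(**kwargs)
```

With Spark and SynapseML available, something like build_translate(translate_kwargs(key, region, ["de", "it"])).transform(df) would then be applied to a DataFrame that has a text column.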
synapse.ml.cognitive.translate.Transliterate module
- class synapse.ml.cognitive.translate.Transliterate.Transliterate(java_obj=None, AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='Transliterate_0bd5762da30f_error', fromScript=None, fromScriptCol=None, handler=None, language=None, languageCol=None, outputCol='Transliterate_0bd5762da30f_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toScript=None, toScriptCol=None, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout (float) – maximum number of seconds to wait on futures if concurrency >= 1
fromScript (object) – Specifies the script of the input text.
handler (object) – Which strategy to use when handling requests
language (object) – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
timeout (float) – number of seconds to wait before closing the connection
toScript (object) – Specifies the script of the translated text.
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromScript = Param(parent='undefined', name='fromScript', doc='ServiceParam: Specifies the script of the input text.')
- getConcurrentTimeout()[source]
- Returns
maximum number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – maximum number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language¶ – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setLanguageCol(value)[source]
- Parameters
language¶ – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setParams(AADToken=None, AADTokenCol=None, concurrency=1, concurrentTimeout=None, errorCol='Transliterate_0bd5762da30f_error', fromScript=None, fromScriptCol=None, handler=None, language=None, languageCol=None, outputCol='Transliterate_0bd5762da30f_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toScript=None, toScriptCol=None, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toScript = Param(parent='undefined', name='toScript', doc='ServiceParam: Specifies the script of the translated text.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
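A Transliterate transformer could be configured in a similar way. The sketch below takes its keyword names from the constructor signature documented above; the helper names, the column names text and transliteration, and the example values ja, Jpan and Latn (Japanese text converted to Latin script) are illustrative assumptions rather than library defaults.

```python
def transliterate_kwargs(key, region, language, from_script, to_script):
    """Build keyword arguments for Transliterate (hypothetical helper)."""
    return {
        "subscriptionKey": key,
        "subscriptionRegion": region,
        "textCol": "text",               # assumed input column name
        "language": language,            # language tag of the input text
        "fromScript": from_script,       # script the input is written in
        "toScript": to_script,           # script to convert the text into
        "outputCol": "transliteration",  # assumed output column name
    }


def transliterate_column(df, kwargs):
    """Apply the transformer to df; requires pyspark and SynapseML."""
    # Imported lazily so the configuration helper works without Spark installed.
    from synapse.ml.cognitive.translate.Transliterate import Transliterate
    return Transliterate(**kwargs).transform(df)


example = transliterate_kwargs("<key>", "<region>", "ja", "Jpan", "Latn")
print(sorted(example))
```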
Module contents
SynapseML is an ecosystem of tools aimed at expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.
SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services backed by your Spark cluster.
SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.