synapse.ml.cognitive.text package

Submodules

synapse.ml.cognitive.text.AnalyzeHealthText module

class synapse.ml.cognitive.text.AnalyzeHealthText.AnalyzeHealthText(java_obj=None, AADToken=None, AADTokenCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='AnalyzeHealthText_97554c0bafdf_error', initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeHealthText_97554c0bafdf_output', pollingDelay=300, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters

AADToken (object) – AAD Token used for authentication
backoffs (list) – array of backoffs to use in the handler
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs (object) – disableServiceLogs option
errorCol (str) – column to hold http errors
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
language (object) – the language code of the text (optional for some services)
maxPollingRetries (int) – number of times to poll
modelVersion (object) – Version of the model
outputCol (str) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
showStats (object) – Whether to include detailed statistics in the response
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
suppressMaxRetriesException (bool) – set true to suppress the maximum retries exception and report it in the error column
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
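As a rough illustration of how the three polling parameters interact, the sketch below estimates how long a single request may be polled before the retries run out. It assumes (the reference does not state this) one wait of initialPollingDelay before the first poll and one wait of pollingDelay before each later poll. The helper name max_polling_wait_ms is hypothetical and not part of the package.

```python
def max_polling_wait_ms(initial_polling_delay=300, polling_delay=300, max_polling_retries=1000):
    """Estimate the total time spent waiting between polls, in milliseconds.

    Assumption: one initial wait, then one wait before each further poll.
    """
    if max_polling_retries < 1:
        return 0
    return initial_polling_delay + (max_polling_retries - 1) * polling_delay


# With the defaults from the signature above: 300 + 999 * 300 milliseconds.
print(max_polling_wait_ms() / 1000.0, "seconds")
```

Under these assumptions the default settings allow roughly five minutes of polling per request, not counting the time each poll itself takes.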

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')

backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')

batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')

concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')

concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')

disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')

errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')

getAADToken()[source]

Returns: AAD Token used for authentication
Return type: AADToken

getBackoffs()[source]

Returns: array of backoffs to use in the handler
Return type: backoffs
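The reference does not show how the handler consumes the backoffs array. A common reading is that each entry is a wait, in milliseconds, before the next retry. The sketch below illustrates that reading only; call_with_backoffs and the injected sleep function are hypothetical and are not the library's handler.

```python
import time


def call_with_backoffs(request, backoffs_ms=(100, 500, 1000), sleep=time.sleep):
    """Call request() and retry after each backoff (milliseconds) on failure.

    Hypothetical helper: one initial attempt plus one retry per backoff entry.
    """
    last_error = None
    for wait_ms in [0, *backoffs_ms]:
        if wait_ms:
            sleep(wait_ms / 1000.0)
        try:
            return request()
        except Exception as err:  # a real handler would inspect the HTTP status
            last_error = err
    raise last_error


# Usage: a fake request that fails twice, then succeeds.
attempts = []
waits = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("temporary failure")
    return "ok"


result = call_with_backoffs(flaky, sleep=waits.append)
print(result, waits)  # ok [0.1, 0.5]
```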

getBatchSize()[source]

Returns: The max size of the buffer
Return type: batchSize

getConcurrency()[source]

Returns: max number of concurrent calls
Return type: concurrency

getConcurrentTimeout()[source]

Returns: max number of seconds to wait on futures if concurrency >= 1
Return type: concurrentTimeout

getDisableServiceLogs()[source]

Returns: disableServiceLogs option
Return type: disableServiceLogs

getErrorCol()[source]

Returns: column to hold http errors
Return type: errorCol

getInitialPollingDelay()[source]

Returns: number of milliseconds to wait before first poll for result
Return type: initialPollingDelay

static getJavaPackage()[source]: Returns package name String.

getLanguage()[source]

Returns: the language code of the text (optional for some services)
Return type: language

getMaxPollingRetries()[source]

Returns: number of times to poll
Return type: maxPollingRetries

getModelVersion()[source]

Returns: Version of the model
Return type: modelVersion

getOutputCol()[source]

Returns: The name of the output column
Return type: outputCol

getPollingDelay()[source]

Returns: number of milliseconds to wait between polling
Return type: pollingDelay

getShowStats()[source]

Returns: Whether to include detailed statistics in the response
Return type: showStats

getStringIndexType()[source]

Returns: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
Return type: stringIndexType
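The choice of string index type matters because the same substring can sit at different offsets depending on the counting unit. The sketch below compares Unicode code points with UTF-16 code units using only the standard library; grapheme counting is not covered, and both helper names are hypothetical.

```python
def offset_in_code_points(text, needle):
    """Offset of needle counted in Unicode code points (Python's native str index)."""
    return text.index(needle)


def offset_in_utf16_units(text, needle):
    """Offset of needle counted in UTF-16 code units."""
    prefix = text[: text.index(needle)]
    return len(prefix.encode("utf-16-le")) // 2


sample = "\U0001F600 fever"  # one emoji outside the Basic Multilingual Plane, then text
print(offset_in_code_points(sample, "fever"))  # 2
print(offset_in_utf16_units(sample, "fever"))  # 3
```

If offsets returned by the service are used to slice Python strings, an index type that counts code points is the one that lines up with Python's own indexing.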

getSubscriptionKey()[source]

Returns: the API key to use
Return type: subscriptionKey

getSuppressMaxRetriesException()[source]

Returns: set true to suppress the maximum retries exception and report it in the error column
Return type: suppressMaxRetriesException
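The effect of this flag can be pictured with a small polling loop. This is a sketch under stated assumptions, not the library's implementation: poll_until_done and its check callback are hypothetical, delays between polls are left out, and the returned pair stands in for the output and error columns.

```python
def poll_until_done(check, max_polling_retries, suppress_max_retries_exception=False):
    """Poll check() up to max_polling_retries times.

    Returns an (output, error) pair, mirroring separate output and error columns.
    Hypothetical sketch: check() returns a result, or None while still running.
    """
    for _ in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
    message = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exception:
        return None, message
    raise TimeoutError(message)


def never_ready():
    return None


print(poll_until_done(never_ready, 3, suppress_max_retries_exception=True))
# (None, 'no result after 3 polls')
```

With the flag left at False, the same call raises instead of returning, which stops the whole job rather than marking a single row as failed.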

getText()[source]

Returns: the text in the request body
Return type: text

getTimeout()[source]

Returns: number of seconds to wait before closing the connection
Return type: timeout

getUrl()[source]

Returns: Url of the service
Return type: url

initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')

language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')

maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')

modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')

outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')

pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')

classmethod read()[source]: Returns an MLReader instance for this class.

setAADToken(value)[source]

Parameters: AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]

Parameters: AADToken – AAD Token used for authentication

setBackoffs(value)[source]

Parameters: backoffs – array of backoffs to use in the handler

setBatchSize(value)[source]

Parameters: batchSize – The max size of the buffer

setConcurrency(value)[source]

Parameters: concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]

Parameters: concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]

setDefaultInternalEndpoint(value)[source]

setDisableServiceLogs(value)[source]

Parameters: disableServiceLogs – disableServiceLogs option

setDisableServiceLogsCol(value)[source]

Parameters: disableServiceLogs – disableServiceLogs option

setEndpoint(value)[source]

setErrorCol(value)[source]

Parameters: errorCol – column to hold http errors

setInitialPollingDelay(value)[source]

Parameters: initialPollingDelay – number of milliseconds to wait before first poll for result

setLanguage(value)[source]

Parameters: language – the language code of the text (optional for some services)

setLanguageCol(value)[source]

Parameters: language – the language code of the text (optional for some services)

setLinkedService(value)[source]

setLocation(value)[source]

setMaxPollingRetries(value)[source]

Parameters: maxPollingRetries – number of times to poll

setModelVersion(value)[source]

Parameters: modelVersion – Version of the model

setModelVersionCol(value)[source]

Parameters: modelVersion – Version of the model

setOutputCol(value)[source]

Parameters: outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='AnalyzeHealthText_97554c0bafdf_error', initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeHealthText_97554c0bafdf_output', pollingDelay=300, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]: Set the (keyword only) parameters
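Apart from java_obj, every keyword accepted here has a matching set<Name> method listed on this page. The sketch below shows that naming convention with a stand-in object. FakeStage, setter_name and apply_params are hypothetical illustrations that record calls only; they are not part of synapse.ml.

```python
def setter_name(param_name):
    """Map a parameter name to its setter, e.g. 'textCol' -> 'setTextCol'."""
    return "set" + param_name[0].upper() + param_name[1:]


def apply_params(stage, **params):
    """Call the matching setter for each keyword argument."""
    for name, value in params.items():
        getattr(stage, setter_name(name))(value)
    return stage


class FakeStage:
    """Hypothetical stand-in for a transformer; it only records setter calls."""

    def __init__(self):
        self.values = {}

    def setTextCol(self, value):
        self.values["textCol"] = value

    def setPollingDelay(self, value):
        self.values["pollingDelay"] = value


stage = apply_params(FakeStage(), textCol="notes", pollingDelay=500)
print(stage.values)  # {'textCol': 'notes', 'pollingDelay': 500}
```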

setPollingDelay(value)[source]

Parameters: pollingDelay – number of milliseconds to wait between polling

setShowStats(value)[source]

Parameters: showStats – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]

Parameters: showStats – Whether to include detailed statistics in the response

setStringIndexType(value)[source]

Parameters: stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]

Parameters: stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]

Parameters: subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]

Parameters: subscriptionKey – the API key to use

setSuppressMaxRetriesException(value)[source]

Parameters: suppressMaxRetriesException – set true to suppress the maximum retries exception and report it in the error column

setText(value)[source]

Parameters: text – the text in the request body

setTextCol(value)[source]

Parameters: text – the text in the request body

setTimeout(value)[source]

Parameters: timeout – number of seconds to wait before closing the connection

setUrl(value)[source]

Parameters: url – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')

stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')

suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')

text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')

timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')

url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.text.EntityDetector module

class synapse.ml.cognitive.text.EntityDetector.EntityDetector(java_obj=None, AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='EntityDetector_e03c3cd46e91_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_e03c3cd46e91_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters

AADToken (object) – AAD Token used for authentication
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs (object) – disableServiceLogs option
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – Version of the model
outputCol (str) – The name of the output column
showStats (object) – Whether to include detailed statistics in the response
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')

batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')

concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')

concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')

disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')

errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')

getAADToken()[source]

Returns: AAD Token used for authentication
Return type: AADToken

getBatchSize()[source]

Returns: The max size of the buffer
Return type: batchSize
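"The max size of the buffer" is terse. One plausible reading is that input rows are grouped into buffers of at most batchSize before requests are issued. The sketch below illustrates that grouping in plain Python. The helper batches is hypothetical and says nothing about how the library actually buffers rows.

```python
from itertools import islice


def batches(rows, batch_size=10):
    """Yield lists of at most batch_size rows, preserving order."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


docs = [f"doc {i}" for i in range(23)]
print([len(chunk) for chunk in batches(docs)])  # [10, 10, 3]
```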

getConcurrency()[source]

Returns: max number of concurrent calls
Return type: concurrency

getConcurrentTimeout()[source]

Returns: max number of seconds to wait on futures if concurrency >= 1
Return type: concurrentTimeout
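The relationship between concurrency and concurrentTimeout can be sketched with the standard library's thread pool: the first caps how many calls are in flight, the second bounds how long to wait on each pending result. This is an analogy under that assumption, not the library's code, and run_concurrently is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(calls, concurrency=1, concurrent_timeout=None):
    """Run zero-argument callables with at most `concurrency` in flight.

    concurrent_timeout is the number of seconds to wait on each future
    (None waits indefinitely).
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(call) for call in calls]
        return [future.result(timeout=concurrent_timeout) for future in futures]


results = run_concurrently(
    [lambda i=i: i * i for i in range(5)], concurrency=2, concurrent_timeout=5.0
)
print(results)  # [0, 1, 4, 9, 16]
```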

getDisableServiceLogs()[source]

Returns: disableServiceLogs option
Return type: disableServiceLogs

getErrorCol()[source]

Returns: column to hold http errors
Return type: errorCol

getHandler()[source]

Returns: Which strategy to use when handling requests
Return type: handler

static getJavaPackage()[source]: Returns package name String.

getLanguage()[source]

Returns: the language code of the text (optional for some services)
Return type: language

getModelVersion()[source]

Returns: Version of the model
Return type: modelVersion

getOutputCol()[source]

Returns: The name of the output column
Return type: outputCol

getShowStats()[source]

Returns: Whether to include detailed statistics in the response
Return type: showStats

getStringIndexType()[source]

Returns: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
Return type: stringIndexType

getSubscriptionKey()[source]

Returns: the API key to use
Return type: subscriptionKey

getText()[source]

Returns: the text in the request body
Return type: text

getTimeout()[source]

Returns: number of seconds to wait before closing the connection
Return type: timeout

getUrl()[source]

Returns: Url of the service
Return type: url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')

language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')

modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')

outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')

classmethod read()[source]: Returns an MLReader instance for this class.

setAADToken(value)[source]

Parameters: AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]

Parameters: AADToken – AAD Token used for authentication

setBatchSize(value)[source]

Parameters: batchSize – The max size of the buffer

setConcurrency(value)[source]

Parameters: concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]

Parameters: concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]

setDefaultInternalEndpoint(value)[source]

setDisableServiceLogs(value)[source]

Parameters: disableServiceLogs – disableServiceLogs option

setDisableServiceLogsCol(value)[source]

Parameters: disableServiceLogs – disableServiceLogs option

setEndpoint(value)[source]

setErrorCol(value)[source]

Parameters: errorCol – column to hold http errors

setHandler(value)[source]

Parameters: handler – Which strategy to use when handling requests

setLanguage(value)[source]

Parameters: language – the language code of the text (optional for some services)

setLanguageCol(value)[source]

Parameters: language – the language code of the text (optional for some services)

setLinkedService(value)[source]

setLocation(value)[source]

setModelVersion(value)[source]

Parameters: modelVersion – Version of the model

setModelVersionCol(value)[source]

Parameters: modelVersion – Version of the model

setOutputCol(value)[source]

Parameters: outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='EntityDetector_e03c3cd46e91_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_e03c3cd46e91_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]: Set the (keyword only) parameters

setShowStats(value)[source]

Parameters: showStats – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]

Parameters: showStats – Whether to include detailed statistics in the response

setStringIndexType(value)[source]

Parameters: stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]

Parameters: stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]

Parameters: subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]

Parameters: subscriptionKey – the API key to use

setText(value)[source]

Parameters: text – the text in the request body

setTextCol(value)[source]

Parameters: text – the text in the request body
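Several parameters on this page come in pairs, such as setText / setTextCol and setLanguage / setLanguageCol. The usual reading is that the plain setter supplies one constant for every row, while the Col setter names a DataFrame column that supplies a value per row. The sketch below models that choice on plain dictionaries. resolve_service_param is a hypothetical helper, and the rows stand in for DataFrame rows.

```python
def resolve_service_param(row, constant=None, column=None):
    """Return the per-row value from `column` if given, else the shared constant."""
    if column is not None:
        return row[column]
    return constant


rows = [{"text": "Bonjour", "lang": "fr"}, {"text": "Hello", "lang": "en"}]

# Like setLanguage("en"): one value for every row.
print([resolve_service_param(r, constant="en") for r in rows])  # ['en', 'en']
# Like setLanguageCol("lang"): the value is read from each row.
print([resolve_service_param(r, column="lang") for r in rows])  # ['fr', 'en']
```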

setTimeout(value)[source]

Parameters: timeout – number of seconds to wait before closing the connection

setUrl(value)[source]

Parameters: url – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')

stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')

text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')

timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')

url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.text.KeyPhraseExtractor module

class synapse.ml.cognitive.text.KeyPhraseExtractor.KeyPhraseExtractor(java_obj=None, AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='KeyPhraseExtractor_198226088f4b_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_198226088f4b_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
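The default errorCol and outputCol in the signature above embed an identifier that looks like the class name followed by 12 hexadecimal digits. Assuming that identifier is generated per instance, the default column names will differ from run to run, so setting outputCol and errorCol explicitly is the safer choice when later code refers to them. The sketch below reproduces the naming pattern only. make_uid and default_columns are hypothetical helpers, not library functions.

```python
import uuid


def make_uid(class_name):
    """Build an identifier like 'KeyPhraseExtractor_198226088f4b' (12 random hex digits)."""
    return f"{class_name}_{uuid.uuid4().hex[-12:]}"


def default_columns(uid):
    """Derive the default output and error column names from a stage identifier."""
    return {"outputCol": f"{uid}_output", "errorCol": f"{uid}_error"}


print(default_columns("KeyPhraseExtractor_198226088f4b"))
print(default_columns(make_uid("KeyPhraseExtractor")))  # differs on every run
```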

Parameters

AADToken (object) – AAD Token used for authentication
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs (object) – disableServiceLogs option
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – Version of the model
outputCol (str) – The name of the output column
showStats (object) – Whether to include detailed statistics in the response
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')

batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')

concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')

concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')

disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')

errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')

getAADToken()[source]

Returns: AAD Token used for authentication
Return type: AADToken

getBatchSize()[source]

Returns: The max size of the buffer
Return type: batchSize

getConcurrency()[source]

Returns: max number of concurrent calls
Return type: concurrency

getConcurrentTimeout()[source]

Returns: max number of seconds to wait on futures if concurrency >= 1
Return type: concurrentTimeout

getDisableServiceLogs()[source]

Returns: disableServiceLogs option
Return type: disableServiceLogs

getErrorCol()[source]

Returns: column to hold http errors
Return type: errorCol

getHandler()[source]

Returns: Which strategy to use when handling requests
Return type: handler

static getJavaPackage()[source]: Returns package name String.

getLanguage()[source]

Returns: the language code of the text (optional for some services)
Return type: language

getModelVersion()[source]

Returns: Version of the model
Return type: modelVersion

getOutputCol()[source]

Returns: The name of the output column
Return type: outputCol

getShowStats()[source]

Returns: Whether to include detailed statistics in the response
Return type: showStats

getStringIndexType()[source]

Returns: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
Return type: stringIndexType

getSubscriptionKey()[source]

Returns: the API key to use
Return type: subscriptionKey

getText()[source]

Returns: the text in the request body
Return type: text

getTimeout()[source]

Returns: number of seconds to wait before closing the connection
Return type: timeout

getUrl()[source]

Returns: Url of the service
Return type: url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')

language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')

modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')

outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')

classmethod read()[source]: Returns an MLReader instance for this class.

setAADToken(value)[source]

Parameters: AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]

Parameters: AADToken – AAD Token used for authentication

setBatchSize(value)[source]

Parameters: batchSize – The max size of the buffer

setConcurrency(value)[source]

Parameters: concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]

Parameters: concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]

setDefaultInternalEndpoint(value)[source]

setDisableServiceLogs(value)[source]

Parameters: disableServiceLogs – disableServiceLogs option

setDisableServiceLogsCol(value)[source]

Parameters: disableServiceLogs – disableServiceLogs option

setEndpoint(value)[source]

setErrorCol(value)[source]

Parameters: errorCol – column to hold http errors

setHandler(value)[source]

Parameters: handler – Which strategy to use when handling requests

setLanguage(value)[source]

Parameters: language – the language code of the text (optional for some services)

setLanguageCol(value)[source]

Parameters: language – the language code of the text (optional for some services)

setLinkedService(value)[source]

setLocation(value)[source]

setModelVersion(value)[source]

Parameters: modelVersion – Version of the model

setModelVersionCol(value)[source]

Parameters: modelVersion – Version of the model

setOutputCol(value)[source]

Parameters: outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='KeyPhraseExtractor_198226088f4b_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_198226088f4b_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]: Set the (keyword only) parameters

setShowStats(value)[source]

Parameters: showStats – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]

Parameters: showStats – Whether to include detailed statistics in the response

setStringIndexType(value)[source]

Parameters: stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]

Parameters: stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]

Parameters: subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]

Parameters: subscriptionKey – the API key to use

setText(value)[source]

Parameters: text – the text in the request body

setTextCol(value)[source]

Parameters: text – the text in the request body

setTimeout(value)[source]

Parameters: timeout – number of seconds to wait before closing the connection

setUrl(value)[source]

Parameters: url – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')

stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')

text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')

timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')

url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.text.LanguageDetector module

class synapse.ml.cognitive.text.LanguageDetector.LanguageDetector(java_obj=None, AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='LanguageDetector_d673d9a511d7_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_d673d9a511d7_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters

AADToken (object) – AAD Token used for authentication
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs (object) – disableServiceLogs option
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – Version of the model
outputCol (str) – The name of the output column
showStats (object) – Whether to include detailed statistics in the response
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')

batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')

concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')

concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')

disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')

errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')

getAADToken()[source]

Returns: AAD Token used for authentication
Return type: AADToken

getBatchSize()[source]

Returns: The max size of the buffer
Return type: batchSize

getConcurrency()[source]

Returns: max number of concurrent calls
Return type: concurrency

getConcurrentTimeout()[source]

Returns: max number of seconds to wait on futures if concurrency >= 1
Return type: concurrentTimeout

getDisableServiceLogs()[source]

Returns: disableServiceLogs option
Return type: disableServiceLogs

getErrorCol()[source]

Returns: column to hold http errors
Return type: errorCol

getHandler()[source]

Returns: Which strategy to use when handling requests
Return type: handler

static getJavaPackage()[source]: Returns package name String.

getLanguage()[source]

Returns: the language code of the text (optional for some services)
Return type: language

getModelVersion()[source]

Returns: Version of the model
Return type: modelVersion

getOutputCol()[source]

Returns: The name of the output column
Return type: outputCol

getShowStats()[source]

Returns: Whether to include detailed statistics in the response
Return type: showStats

getStringIndexType()[source]

Returns: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
Return type: stringIndexType

getSubscriptionKey()[source]

Returns: the API key to use
Return type: subscriptionKey

getText()[source]

Returns: the text in the request body
Return type: text

getTimeout()[source]

Returns: number of seconds to wait before closing the connection
Return type: timeout

getUrl()[source]

Returns: Url of the service
Return type: url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')

language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')

modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')

outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')

classmethod read()[source]: Returns an MLReader instance for this class.

setAADToken(value)[source]

Parameters: AADToken¶ – AAD Token used for authentication

setAADTokenCol(value)[source]

Parameters: AADToken¶ – AAD Token used for authentication

setBatchSize(value)[source]

Parameters: batchSize¶ – The max size of the buffer

setConcurrency(value)[source]

Parameters: concurrency¶ – max number of concurrent calls

setConcurrentTimeout(value)[source]

Parameters: concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]

setDefaultInternalEndpoint(value)[source]

setDisableServiceLogs(value)[source]

Parameters: disableServiceLogs¶ – disableServiceLogs option

setDisableServiceLogsCol(value)[source]

Parameters: disableServiceLogs¶ – disableServiceLogs option

setEndpoint(value)[source]

setErrorCol(value)[source]

Parameters: errorCol¶ – column to hold http errors

setHandler(value)[source]

Parameters: handler¶ – Which strategy to use when handling requests

setLanguage(value)[source]

Parameters: language¶ – the language code of the text (optional for some services)

setLanguageCol(value)[source]

Parameters: language¶ – the language code of the text (optional for some services)

setLinkedService(value)[source]

setLocation(value)[source]

setModelVersion(value)[source]

Parameters: modelVersion¶ – Version of the model

setModelVersionCol(value)[source]

Parameters: modelVersion¶ – Version of the model

setOutputCol(value)[source]

Parameters: outputCol¶ – The name of the output column

setParams(AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='LanguageDetector_d673d9a511d7_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_d673d9a511d7_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]: Set the (keyword only) parameters

setShowStats(value)[source]

Parameters: showStats¶ – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]

Parameters: showStats¶ – Whether to include detailed statistics in the response

setStringIndexType(value)[source]

Parameters: stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]

Parameters: stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]

Parameters: subscriptionKey¶ – the API key to use

setSubscriptionKeyCol(value)[source]

Parameters: subscriptionKey¶ – the API key to use

setText(value)[source]

Parameters: text¶ – the text in the request body

setTextCol(value)[source]

Parameters: text¶ – the text in the request body

setTimeout(value)[source]

Parameters: timeout¶ – number of seconds to wait before closing the connection

setUrl(value)[source]

Parameters: url¶ – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')

stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')

text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')

timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')

url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.text.NER module

class synapse.ml.cognitive.text.NER.NER(java_obj=None, AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='NER_5faec32c2188_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_5faec32c2188_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters

AADToken¶ (object) – AAD Token used for authentication
batchSize¶ (int) – The max size of the buffer
concurrency¶ (int) – max number of concurrent calls
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs¶ (object) – disableServiceLogs option
errorCol¶ (str) – column to hold http errors
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language code of the text (optional for some services)
modelVersion¶ (object) – Version of the model
outputCol¶ (str) – The name of the output column
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey¶ (object) – the API key to use
text¶ (object) – the text in the request body
timeout¶ (float) – number of seconds to wait before closing the connection
url¶ (str) – Url of the service
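
A minimal configuration sketch for this transformer, using only setter names that appear in this reference (setSubscriptionKey, setLocation, setTextCol, setLanguage, setOutputCol, setErrorCol). The helpers setter_name, apply_settings and build_ner are hypothetical, and the key, region and column names are placeholders. The NER import is deferred into build_ner because it needs SynapseML and a running Spark session.

```python
from typing import Any, Dict


def setter_name(param: str) -> str:
    """Map a parameter name such as 'textCol' to its setter name, 'setTextCol'."""
    return "set" + param[0].upper() + param[1:]


def apply_settings(stage: Any, settings: Dict[str, Any]) -> Any:
    """Call stage.set<Param>(value) for every entry in `settings`."""
    for param, value in settings.items():
        getattr(stage, setter_name(param))(value)
    return stage


# Placeholder values; replace them with your own key, region and column names.
ner_settings = {
    "subscriptionKey": "<your-subscription-key>",
    "location": "eastus",
    "textCol": "text",
    "language": "en",
    "outputCol": "entities",
    "errorCol": "ner_error",
}


def build_ner(settings: Dict[str, Any] = ner_settings) -> Any:
    """Create and configure the transformer (requires SynapseML and Spark)."""
    from synapse.ml.cognitive.text.NER import NER  # imported lazily on purpose
    return apply_settings(NER(), settings)
```

With a DataFrame df that has a text column, build_ner().transform(df) would then be expected to add the entities and ner_error columns.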

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')

batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')

concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')

concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')

disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')

errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')

getAADToken()[source]

Returns: AAD Token used for authentication
Return type: AADToken

getBatchSize()[source]

Returns: The max size of the buffer
Return type: batchSize

getConcurrency()[source]

Returns: max number of concurrent calls
Return type: concurrency

getConcurrentTimeout()[source]

Returns: max number of seconds to wait on futures if concurrency >= 1
Return type: concurrentTimeout

getDisableServiceLogs()[source]

Returns: disableServiceLogs option
Return type: disableServiceLogs

getErrorCol()[source]

Returns: column to hold http errors
Return type: errorCol

getHandler()[source]

Returns: Which strategy to use when handling requests
Return type: handler

static getJavaPackage()[source]: Returns package name String.

getLanguage()[source]

Returns: the language code of the text (optional for some services)
Return type: language

getModelVersion()[source]

Returns: Version of the model
Return type: modelVersion

getOutputCol()[source]

Returns: The name of the output column
Return type: outputCol

getShowStats()[source]

Returns: Whether to include detailed statistics in the response
Return type: showStats

getStringIndexType()[source]

Returns: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
Return type: stringIndexType

getSubscriptionKey()[source]

Returns: the API key to use
Return type: subscriptionKey

getText()[source]

Returns: the text in the request body
Return type: text

getTimeout()[source]

Returns: number of seconds to wait before closing the connection
Return type: timeout

getUrl()[source]

Returns: Url of the service
Return type: url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')

language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')

modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')

outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')

classmethod read()[source]: Returns an MLReader instance for this class.

setAADToken(value)[source]

Parameters: AADToken¶ – AAD Token used for authentication

setAADTokenCol(value)[source]

Parameters: AADToken¶ – AAD Token used for authentication

setBatchSize(value)[source]

Parameters: batchSize¶ – The max size of the buffer

setConcurrency(value)[source]

Parameters: concurrency¶ – max number of concurrent calls

setConcurrentTimeout(value)[source]

Parameters: concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]

setDefaultInternalEndpoint(value)[source]

setDisableServiceLogs(value)[source]

Parameters: disableServiceLogs¶ – disableServiceLogs option

setDisableServiceLogsCol(value)[source]

Parameters: disableServiceLogs¶ – disableServiceLogs option

setEndpoint(value)[source]

setErrorCol(value)[source]

Parameters: errorCol¶ – column to hold http errors

setHandler(value)[source]

Parameters: handler¶ – Which strategy to use when handling requests

setLanguage(value)[source]

Parameters: language¶ – the language code of the text (optional for some services)

setLanguageCol(value)[source]

Parameters: language¶ – the language code of the text (optional for some services)

setLinkedService(value)[source]

setLocation(value)[source]

setModelVersion(value)[source]

Parameters: modelVersion¶ – Version of the model

setModelVersionCol(value)[source]

Parameters: modelVersion¶ – Version of the model

setOutputCol(value)[source]

Parameters: outputCol¶ – The name of the output column

setParams(AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='NER_5faec32c2188_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_5faec32c2188_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]: Set the (keyword only) parameters

setShowStats(value)[source]

Parameters: showStats¶ – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]

Parameters: showStats¶ – Whether to include detailed statistics in the response

setStringIndexType(value)[source]

Parameters: stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]

Parameters: stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]

Parameters: subscriptionKey¶ – the API key to use

setSubscriptionKeyCol(value)[source]

Parameters: subscriptionKey¶ – the API key to use

setText(value)[source]

Parameters: text¶ – the text in the request body

setTextCol(value)[source]

Parameters: text¶ – the text in the request body

setTimeout(value)[source]

Parameters: timeout¶ – number of seconds to wait before closing the connection

setUrl(value)[source]

Parameters: url¶ – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')

stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')

text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')

timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')

url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.text.PII module

class synapse.ml.cognitive.text.PII.PII(java_obj=None, AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, domain=None, domainCol=None, errorCol='PII_53078cdd5e7c_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_53078cdd5e7c_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters

AADToken¶ (object) – AAD Token used for authentication
batchSize¶ (int) – The max size of the buffer
concurrency¶ (int) – max number of concurrent calls
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs¶ (object) – disableServiceLogs option
domain¶ (object) – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
errorCol¶ (str) – column to hold http errors
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language code of the text (optional for some services)
modelVersion¶ (object) – Version of the model
outputCol¶ (str) – The name of the output column
piiCategories¶ (object) – describes the PII categories to return
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey¶ (object) – the API key to use
text¶ (object) – the text in the request body
timeout¶ (float) – number of seconds to wait before closing the connection
url¶ (str) – Url of the service
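
The sketch below assembles keyword arguments for the domain and piiCategories parameters described above. The helper pii_settings is hypothetical. The reference only says the possible domain values "include" 'PHI' and 'none', so rejecting anything else is a conservative local choice rather than documented behaviour. The category name in the usage line is illustrative only.

```python
from typing import Any, Dict, List, Optional

# The two domain values named in this reference; the service may accept others.
DOCUMENTED_DOMAINS = ("PHI", "none")


def pii_settings(
    text_col: str,
    domain: Optional[str] = None,
    categories: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build constructor keyword arguments for the PII transformer (hypothetical helper)."""
    settings: Dict[str, Any] = {"textCol": text_col}
    if domain is not None:
        if domain not in DOCUMENTED_DOMAINS:
            raise ValueError(f"domain should be one of {DOCUMENTED_DOMAINS}, got {domain!r}")
        settings["domain"] = domain
    if categories:
        settings["piiCategories"] = list(categories)
    return settings


settings = pii_settings("text", domain="PHI", categories=["Person"])
print(settings)
# With SynapseML installed, these could be passed as PII(**settings),
# since textCol, domain and piiCategories are constructor keywords.
```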

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')

batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')

concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')

concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')

disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')

domain = Param(parent='undefined', name='domain', doc="ServiceParam: if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: 'PHI', 'none'.")

errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')

getAADToken()[source]

Returns: AAD Token used for authentication
Return type: AADToken

getBatchSize()[source]

Returns: The max size of the buffer
Return type: batchSize

getConcurrency()[source]

Returns: max number of concurrent calls
Return type: concurrency

getConcurrentTimeout()[source]

Returns: max number of seconds to wait on futures if concurrency >= 1
Return type: concurrentTimeout

getDisableServiceLogs()[source]

Returns: disableServiceLogs option
Return type: disableServiceLogs

getDomain()[source]

Returns: if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
Return type: domain

getErrorCol()[source]

Returns: column to hold http errors
Return type: errorCol

getHandler()[source]

Returns: Which strategy to use when handling requests
Return type: handler

static getJavaPackage()[source]: Returns package name String.

getLanguage()[source]

Returns: the language code of the text (optional for some services)
Return type: language

getModelVersion()[source]

Returns: Version of the model
Return type: modelVersion

getOutputCol()[source]

Returns: The name of the output column
Return type: outputCol

getPiiCategories()[source]

Returns: describes the PII categories to return
Return type: piiCategories

getShowStats()[source]

Returns: Whether to include detailed statistics in the response
Return type: showStats

getStringIndexType()[source]

Returns: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
Return type: stringIndexType

getSubscriptionKey()[source]

Returns: the API key to use
Return type: subscriptionKey

getText()[source]

Returns: the text in the request body
Return type: text

getTimeout()[source]

Returns: number of seconds to wait before closing the connection
Return type: timeout

getUrl()[source]

Returns: Url of the service
Return type: url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')

language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')

modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')

outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')

piiCategories = Param(parent='undefined', name='piiCategories', doc='ServiceParam: describes the PII categories to return')

classmethod read()[source]: Returns an MLReader instance for this class.

setAADToken(value)[source]

Parameters: AADToken¶ – AAD Token used for authentication

setAADTokenCol(value)[source]

Parameters: AADToken¶ – AAD Token used for authentication

setBatchSize(value)[source]

Parameters: batchSize¶ – The max size of the buffer

setConcurrency(value)[source]

Parameters: concurrency¶ – max number of concurrent calls

setConcurrentTimeout(value)[source]

Parameters: concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]

setDefaultInternalEndpoint(value)[source]

setDisableServiceLogs(value)[source]

Parameters: disableServiceLogs¶ – disableServiceLogs option

setDisableServiceLogsCol(value)[source]

Parameters: disableServiceLogs¶ – disableServiceLogs option

setDomain(value)[source]

Parameters: domain¶ – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.

setDomainCol(value)[source]

Parameters: domain¶ – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.

setEndpoint(value)[source]

setErrorCol(value)[source]

Parameters: errorCol¶ – column to hold http errors

setHandler(value)[source]

Parameters: handler¶ – Which strategy to use when handling requests

setLanguage(value)[source]

Parameters: language¶ – the language code of the text (optional for some services)

setLanguageCol(value)[source]

Parameters: language¶ – the language code of the text (optional for some services)

setLinkedService(value)[source]

setLocation(value)[source]

setModelVersion(value)[source]

Parameters: modelVersion¶ – Version of the model

setModelVersionCol(value)[source]

Parameters: modelVersion¶ – Version of the model

setOutputCol(value)[source]

Parameters: outputCol¶ – The name of the output column

setParams(AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, domain=None, domainCol=None, errorCol='PII_53078cdd5e7c_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_53078cdd5e7c_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]: Set the (keyword only) parameters

setPiiCategories(value)[source]

Parameters: piiCategories¶ – describes the PII categories to return

setPiiCategoriesCol(value)[source]

Parameters: piiCategories¶ – describes the PII categories to return

setShowStats(value)[source]

Parameters: showStats¶ – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]

Parameters: showStats¶ – Whether to include detailed statistics in the response

setStringIndexType(value)[source]

Parameters: stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]

Parameters: stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]

Parameters: subscriptionKey¶ – the API key to use

setSubscriptionKeyCol(value)[source]

Parameters: subscriptionKey¶ – the API key to use

setText(value)[source]

Parameters: text¶ – the text in the request body

setTextCol(value)[source]

Parameters: text¶ – the text in the request body

setTimeout(value)[source]

Parameters: timeout¶ – number of seconds to wait before closing the connection

setUrl(value)[source]

Parameters: url¶ – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')

stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')

text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')

timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')

url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.text.TextAnalyze module

class synapse.ml.cognitive.text.TextAnalyze.TextAnalyze(java_obj=None, AADToken=None, AADTokenCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, entityLinkingParams={'model-version': 'latest'}, entityRecognitionParams={'model-version': 'latest'}, errorCol='TextAnalyze_bc5be416b014_error', includeEntityLinking=True, includeEntityRecognition=True, includeKeyPhraseExtraction=True, includePii=True, includeSentimentAnalysis=True, initialPollingDelay=300, keyPhraseExtractionParams={'model-version': 'latest'}, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='TextAnalyze_bc5be416b014_output', piiParams={'model-version': 'latest'}, pollingDelay=300, sentimentAnalysisParams={'model-version': 'latest'}, showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters

AADToken¶ (object) – AAD Token used for authentication
backoffs¶ (list) – array of backoffs to use in the handler
batchSize¶ (int) – The max size of the buffer
concurrency¶ (int) – max number of concurrent calls
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs¶ (object) – disableServiceLogs option
entityLinkingParams¶ (dict) – the parameters to pass to the entityLinking model
entityRecognitionParams¶ (dict) – the parameters to pass to the entity recognition model
errorCol¶ (str) – column to hold http errors
includeEntityLinking¶ (bool) – Whether to perform EntityLinking
includeEntityRecognition¶ (bool) – Whether to perform entity recognition
includeKeyPhraseExtraction¶ (bool) – Whether to perform key phrase extraction
includePii¶ (bool) – Whether to perform PII Detection
includeSentimentAnalysis¶ (bool) – Whether to perform SentimentAnalysis
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
keyPhraseExtractionParams¶ (dict) – the parameters to pass to the keyPhraseExtraction model
language¶ (object) – the language code of the text (optional for some services)
maxPollingRetries¶ (int) – number of times to poll
modelVersion¶ (object) – Version of the model
outputCol¶ (str) – The name of the output column
piiParams¶ (dict) – the parameters to pass to the PII model
pollingDelay¶ (int) – number of milliseconds to wait between polling
sentimentAnalysisParams¶ (dict) – the parameters to pass to the sentimentAnalysis model
showStats¶ (object) – Whether to include detailed statistics in the response
subscriptionKey¶ (object) – the API key to use
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
text¶ (object) – the text in the request body
timeout¶ (float) – number of seconds to wait before closing the connection
url¶ (str) – Url of the service
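
The three polling parameters together bound how long a single job is waited on. As a rough estimate, assuming one initialPollingDelay followed by one pollingDelay per retry and ignoring the time spent in the requests themselves, the defaults give about five minutes. The function name max_polling_wait_ms is hypothetical.

```python
def max_polling_wait_ms(
    initial_polling_delay: int = 300,
    polling_delay: int = 300,
    max_polling_retries: int = 1000,
) -> int:
    """Approximate upper bound, in milliseconds, on time spent waiting between polls."""
    return initial_polling_delay + polling_delay * max_polling_retries


default_wait = max_polling_wait_ms()
print(default_wait)          # 300300
print(default_wait / 1000)   # 300.3 (seconds)
```

If jobs routinely need longer than this, raising maxPollingRetries or pollingDelay extends the budget. Setting suppressMaxRetriesException to true reports the failure in the error column instead of raising.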

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')

backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')

batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')

concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')

concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')

disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')

entityLinkingParams = Param(parent='undefined', name='entityLinkingParams', doc='the parameters to pass to the entityLinking model')

entityRecognitionParams = Param(parent='undefined', name='entityRecognitionParams', doc='the parameters to pass to the entity recognition model')

errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')

getAADToken()[source]

Returns: AAD Token used for authentication
Return type: AADToken

getBackoffs()[source]

Returns: array of backoffs to use in the handler
Return type: backoffs
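
A sketch of how a list of backoffs such as the default [100, 500, 1000] can drive retries. It assumes the values are milliseconds and that one retry follows each backoff. The library's actual handler may decide differently which failures to retry, whereas this sketch retries on any exception. The function call_with_backoffs is hypothetical.

```python
import time
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def call_with_backoffs(
    request: Callable[[], T],
    backoffs_ms: Sequence[int] = (100, 500, 1000),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try once, then once more after each backoff; re-raise the final failure."""
    delays = list(backoffs_ms)
    while True:
        try:
            return request()
        except Exception:
            if not delays:
                raise  # backoffs exhausted: surface the last error
            sleep(delays.pop(0) / 1000.0)  # assumed unit: milliseconds


# Demo: a request that fails twice, then succeeds. Waits are recorded, not slept.
waits: List[float] = []
state = {"attempts": 0}


def flaky_request() -> str:
    state["attempts"] += 1
    if state["attempts"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = call_with_backoffs(flaky_request, sleep=waits.append)
print(result, waits)  # ok [0.1, 0.5]
```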

getBatchSize()[source]

Returns: The max size of the buffer
Return type: batchSize

getConcurrency()[source]

Returns: max number of concurrent calls
Return type: concurrency

getConcurrentTimeout()[source]

Returns: max number of seconds to wait on futures if concurrency >= 1
Return type: concurrentTimeout

getDisableServiceLogs()[source]

Returns: disableServiceLogs option
Return type: disableServiceLogs

getEntityLinkingParams()[source]

Returns: the parameters to pass to the entityLinking model
Return type: entityLinkingParams

getEntityRecognitionParams()[source]

Returns: the parameters to pass to the entity recognition model
Return type: entityRecognitionParams

getErrorCol()[source]

Returns: column to hold http errors
Return type: errorCol

getIncludeEntityLinking()[source]

Returns: Whether to perform EntityLinking
Return type: includeEntityLinking

getIncludeEntityRecognition()[source]

Returns: Whether to perform entity recognition
Return type: includeEntityRecognition

getIncludeKeyPhraseExtraction()[source]

Returns: Whether to perform key phrase extraction
Return type: includeKeyPhraseExtraction

getIncludePii()[source]

Returns: Whether to perform PII Detection
Return type: includePii

getIncludeSentimentAnalysis()[source]

Returns: Whether to perform SentimentAnalysis
Return type: includeSentimentAnalysis
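
Each include* flag has a matching *Params dictionary (for example includePii and piiParams). The sketch below models that pairing in plain Python: it returns the tasks that would be enabled for a given set of keyword arguments, along with the parameters for each. The defaults mirror the constructor signature (all flags True, params {'model-version': 'latest'}). The helper selected_tasks and the task labels are hypothetical, and the version string in the usage example is a made-up placeholder.

```python
from typing import Any, Dict

# Labels derived from the include*/…Params names in this reference.
TASKS = (
    "EntityLinking",
    "EntityRecognition",
    "KeyPhraseExtraction",
    "Pii",
    "SentimentAnalysis",
)


def selected_tasks(settings: Dict[str, Any]) -> Dict[str, Dict[str, str]]:
    """Return {task: params} for every task whose include flag is on."""
    selected: Dict[str, Dict[str, str]] = {}
    for task in TASKS:
        if settings.get("include" + task, True):
            params_key = task[0].lower() + task[1:] + "Params"
            selected[task] = dict(settings.get(params_key, {"model-version": "latest"}))
    return selected


tasks = selected_tasks({
    "includePii": False,
    "sentimentAnalysisParams": {"model-version": "example-version"},
})
print(sorted(tasks))
# ['EntityLinking', 'EntityRecognition', 'KeyPhraseExtraction', 'SentimentAnalysis']
```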

getInitialPollingDelay()[source]

Returns: number of milliseconds to wait before first poll for result
Return type: initialPollingDelay

static getJavaPackage()[source]: Returns package name String.

getKeyPhraseExtractionParams()[source]

Returns: the parameters to pass to the keyPhraseExtraction model
Return type: keyPhraseExtractionParams

getLanguage()[source]

Returns: the language code of the text (optional for some services)
Return type: language

getMaxPollingRetries()[source]

Returns: number of times to poll
Return type: maxPollingRetries

getModelVersion()[source]

Returns: Version of the model
Return type: modelVersion

getOutputCol()[source]

Returns: The name of the output column
Return type: outputCol

getPiiParams()[source]

Returns: the parameters to pass to the PII model
Return type: piiParams

getPollingDelay()[source]

Returns: number of milliseconds to wait between polling
Return type: pollingDelay

getSentimentAnalysisParams()[source]

Returns: the parameters to pass to the sentimentAnalysis model
Return type: sentimentAnalysisParams

getShowStats()[source]

Returns: Whether to include detailed statistics in the response
Return type: showStats

getSubscriptionKey()[source]

Returns: the API key to use
Return type: subscriptionKey

getSuppressMaxRetriesException()[source]

Returns: set to true to suppress the maximum retries exception and report it in the error column
Return type: suppressMaxRetriesException
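
The polling parameters above (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesException) describe a wait-then-poll loop. The following is a minimal pure-Python sketch of how such a loop might combine them; it is not the library's implementation. The names `poll_for_result`, `check`, `MaxRetriesExceeded`, and the injectable `sleep` argument are hypothetical and exist only for illustration.

```python
import time


class MaxRetriesExceeded(Exception):
    """Hypothetical error type used only in this sketch."""


def poll_for_result(check, initial_polling_delay=300, polling_delay=300,
                    max_polling_retries=1000,
                    suppress_max_retries_exception=False, sleep=time.sleep):
    """Poll `check` until it returns a non-None result.

    Delays are in milliseconds, matching the documented parameters.
    Returns a (result, error) pair, mirroring an output and an error column.
    """
    # Wait once before the first poll.
    sleep(initial_polling_delay / 1000.0)
    for attempt in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
        # Wait between polls, but not after the final attempt.
        if attempt < max_polling_retries - 1:
            sleep(polling_delay / 1000.0)
    message = "no result after %d polls" % max_polling_retries
    if suppress_max_retries_exception:
        # Assumption: the failure is reported as an error value instead of raised.
        return None, message
    raise MaxRetriesExceeded(message)
```

With the defaults shown in the signature (300 ms delays, 1000 retries), a job that never completes would be polled for roughly five minutes before the retry limit is reached.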

getText()[source]

Returns: the text in the request body
Return type: text

getTimeout()[source]

Returns: number of seconds to wait before closing the connection
Return type: timeout

getUrl()[source]

Returns: Url of the service
Return type: url

includeEntityLinking = Param(parent='undefined', name='includeEntityLinking', doc='Whether to perform EntityLinking')

includeEntityRecognition = Param(parent='undefined', name='includeEntityRecognition', doc='Whether to perform entity recognition')

includeKeyPhraseExtraction = Param(parent='undefined', name='includeKeyPhraseExtraction', doc='Whether to perform key phrase extraction')

includePii = Param(parent='undefined', name='includePii', doc='Whether to perform PII Detection')

includeSentimentAnalysis = Param(parent='undefined', name='includeSentimentAnalysis', doc='Whether to perform SentimentAnalysis')

initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')

keyPhraseExtractionParams = Param(parent='undefined', name='keyPhraseExtractionParams', doc='the parameters to pass to the keyPhraseExtraction model')

language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')

maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')

modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')

outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')

piiParams = Param(parent='undefined', name='piiParams', doc='the parameters to pass to the PII model')

pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')

classmethod read()[source]: Returns an MLReader instance for this class.

sentimentAnalysisParams = Param(parent='undefined', name='sentimentAnalysisParams', doc='the parameters to pass to the sentimentAnalysis model')

setAADToken(value)[source]

Parameters: AADToken¶ – AAD Token used for authentication

setAADTokenCol(value)[source]

Parameters: AADToken¶ – AAD Token used for authentication

setBackoffs(value)[source]

Parameters: backoffs¶ – array of backoffs to use in the handler
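
The description "array of backoffs to use in the handler" is terse. One plausible reading, sketched below in plain Python, is that each entry is a wait in milliseconds applied after a failed attempt before retrying. This is an assumption about the semantics, not the library's handler; `call_with_backoffs`, `request`, and `sleep` are hypothetical names.

```python
import time


def call_with_backoffs(request, backoffs=(100, 500, 1000), sleep=time.sleep):
    """Call `request`, retrying once per backoff entry on failure.

    Assumes each backoff is a delay in milliseconds.
    """
    last_error = None
    # One initial attempt plus one retry per backoff entry.
    for attempt in range(len(backoffs) + 1):
        try:
            return request()
        except Exception as exc:  # a real handler would inspect the failure
            last_error = exc
            if attempt < len(backoffs):
                sleep(backoffs[attempt] / 1000.0)
    # All attempts failed: surface the last error.
    raise last_error
```

Under this reading, the default `[100, 500, 1000]` allows up to four attempts in total.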

setBatchSize(value)[source]

Parameters: batchSize¶ – The max size of the buffer

setConcurrency(value)[source]

Parameters: concurrency¶ – max number of concurrent calls

setConcurrentTimeout(value)[source]

Parameters: concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]

setDefaultInternalEndpoint(value)[source]

setDisableServiceLogs(value)[source]

Parameters: disableServiceLogs¶ – disableServiceLogs option

setDisableServiceLogsCol(value)[source]

Parameters: disableServiceLogs¶ – disableServiceLogs option

setEndpoint(value)[source]

setEntityLinkingParams(value)[source]

Parameters: entityLinkingParams¶ – the parameters to pass to the entityLinking model

setEntityRecognitionParams(value)[source]

Parameters: entityRecognitionParams¶ – the parameters to pass to the entity recognition model

setErrorCol(value)[source]

Parameters: errorCol¶ – column to hold http errors

setIncludeEntityLinking(value)[source]

Parameters: includeEntityLinking¶ – Whether to perform EntityLinking

setIncludeEntityRecognition(value)[source]

Parameters: includeEntityRecognition¶ – Whether to perform entity recognition

setIncludeKeyPhraseExtraction(value)[source]

Parameters: includeKeyPhraseExtraction¶ – Whether to perform key phrase extraction

setIncludePii(value)[source]

Parameters: includePii¶ – Whether to perform PII Detection

setIncludeSentimentAnalysis(value)[source]

Parameters: includeSentimentAnalysis¶ – Whether to perform SentimentAnalysis

setInitialPollingDelay(value)[source]

Parameters: initialPollingDelay¶ – number of milliseconds to wait before first poll for result

setKeyPhraseExtractionParams(value)[source]

Parameters: keyPhraseExtractionParams¶ – the parameters to pass to the keyPhraseExtraction model

setLanguage(value)[source]

Parameters: language¶ – the language code of the text (optional for some services)

setLanguageCol(value)[source]

Parameters: language¶ – the language code of the text (optional for some services)

setLinkedService(value)[source]

setLocation(value)[source]

setMaxPollingRetries(value)[source]

Parameters: maxPollingRetries¶ – number of times to poll

setModelVersion(value)[source]

Parameters: modelVersion¶ – Version of the model

setModelVersionCol(value)[source]

Parameters: modelVersion¶ – Version of the model

setOutputCol(value)[source]

Parameters: outputCol¶ – The name of the output column

setParams(AADToken=None, AADTokenCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, entityLinkingParams={'model-version': 'latest'}, entityRecognitionParams={'model-version': 'latest'}, errorCol='TextAnalyze_bc5be416b014_error', includeEntityLinking=True, includeEntityRecognition=True, includeKeyPhraseExtraction=True, includePii=True, includeSentimentAnalysis=True, initialPollingDelay=300, keyPhraseExtractionParams={'model-version': 'latest'}, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='TextAnalyze_bc5be416b014_output', piiParams={'model-version': 'latest'}, pollingDelay=300, sentimentAnalysisParams={'model-version': 'latest'}, showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]: Set the (keyword only) parameters
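
The setParams signature pairs each task toggle (includeEntityLinking, includePii, and so on) with a matching parameter dictionary (entityLinkingParams, piiParams, and so on). The sketch below builds such keyword arguments in plain Python so the naming pattern is visible. The helper `task_kwargs` and the `TASKS` tuple are hypothetical; only the resulting key names come from the signature above.

```python
# Task names as they appear inside the documented parameter names.
TASKS = ("EntityLinking", "EntityRecognition", "KeyPhraseExtraction",
         "Pii", "SentimentAnalysis")


def task_kwargs(enabled, model_version="latest"):
    """Build include*/…Params keyword arguments for the given enabled tasks."""
    unknown = set(enabled) - set(TASKS)
    if unknown:
        raise ValueError("unknown tasks: %s" % sorted(unknown))
    kwargs = {}
    for task in TASKS:
        # e.g. "Pii" -> includePii
        kwargs["include" + task] = task in enabled
        # e.g. "Pii" -> piiParams
        param_name = task[0].lower() + task[1:] + "Params"
        kwargs[param_name] = {"model-version": model_version}
    return kwargs
```

Because setParams takes keyword-only parameters, a dictionary like this could presumably be passed as `setParams(**task_kwargs({"Pii"}))` on a transformer instance; that call is not exercised here.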

setPiiParams(value)[source]

Parameters: piiParams¶ – the parameters to pass to the PII model

setPollingDelay(value)[source]

Parameters: pollingDelay¶ – number of milliseconds to wait between polling

setSentimentAnalysisParams(value)[source]

Parameters: sentimentAnalysisParams¶ – the parameters to pass to the sentimentAnalysis model

setShowStats(value)[source]

Parameters: showStats¶ – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]

Parameters: showStats¶ – Whether to include detailed statistics in the response

setSubscriptionKey(value)[source]

Parameters: subscriptionKey¶ – the API key to use

setSubscriptionKeyCol(value)[source]

Parameters: subscriptionKey¶ – the API key to use

setSuppressMaxRetriesException(value)[source]

Parameters: suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column

setText(value)[source]

Parameters: text¶ – the text in the request body

setTextCol(value)[source]

Parameters: text¶ – the text in the request body

setTimeout(value)[source]

Parameters: timeout¶ – number of seconds to wait before closing the connection

setUrl(value)[source]

Parameters: url¶ – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')

suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set to true to suppress the maximum retries exception and report it in the error column')

text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')

timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')

url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.cognitive.text.TextSentiment module

class synapse.ml.cognitive.text.TextSentiment.TextSentiment(java_obj=None, AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='TextSentiment_3df38a3624ef_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_3df38a3624ef_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters

AADToken¶ (object) – AAD Token used for authentication
batchSize¶ (int) – The max size of the buffer
concurrency¶ (int) – max number of concurrent calls
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs¶ (object) – disableServiceLogs option
errorCol¶ (str) – column to hold http errors
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language code of the text (optional for some services)
modelVersion¶ (object) – Version of the model
opinionMining¶ (object) – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
outputCol¶ (str) – The name of the output column
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey¶ (object) – the API key to use
text¶ (object) – the text in the request body
timeout¶ (float) – number of seconds to wait before closing the connection
url¶ (str) – Url of the service
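
Each parameter listed above has a setter named by capitalizing its first letter and prefixing "set" (for example outputCol and setOutputCol). The sketch below applies a dictionary of settings through that convention. It runs against a stand-in object so that it needs neither Spark nor SynapseML; `setter_name`, `apply_settings`, and `StubTransformer` are hypothetical helpers, and the key value is a placeholder.

```python
def setter_name(param):
    """Map a parameter name to its setter, e.g. textCol -> setTextCol."""
    return "set" + param[0].upper() + param[1:]


def apply_settings(transformer, settings):
    """Call the matching setter on `transformer` for every entry."""
    for param, value in settings.items():
        getattr(transformer, setter_name(param))(value)
    return transformer


class StubTransformer:
    """Stand-in that records setter calls; not part of SynapseML."""

    def __init__(self):
        self.values = {}

    def __getattr__(self, name):
        # Only invoked for attributes that are not found normally.
        if not name.startswith("set"):
            raise AttributeError(name)

        def setter(value):
            self.values[name] = value
            return self

        return setter


stub = apply_settings(StubTransformer(), {
    "subscriptionKey": "<your-key>",  # placeholder, not a real key
    "textCol": "text",
    "outputCol": "sentiment",
})
```

With SynapseML installed, the same `apply_settings` call could presumably be pointed at a `TextSentiment()` instance instead of the stub.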

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')

batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')

concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')

concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')

disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')

errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')

getAADToken()[source]

Returns: AAD Token used for authentication
Return type: AADToken

getBatchSize()[source]

Returns: The max size of the buffer
Return type: batchSize
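
"The max size of the buffer" suggests that input rows are grouped into batches of at most batchSize before a request is sent. Assuming that reading, the grouping can be sketched in plain Python as follows; `batches` is a hypothetical helper, not a SynapseML function.

```python
def batches(rows, batch_size=10):
    """Yield consecutive lists of at most `batch_size` rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Emit the final, possibly shorter, batch.
    if batch:
        yield batch
```

With the default of 10, 25 input rows would produce three batches, the last one holding the remaining 5 rows.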

getConcurrency()[source]

Returns: max number of concurrent calls
Return type: concurrency

getConcurrentTimeout()[source]

Returns: max number of seconds to wait on futures if concurrency >= 1
Return type: concurrentTimeout

getDisableServiceLogs()[source]

Returns: disableServiceLogs option
Return type: disableServiceLogs

getErrorCol()[source]

Returns: column to hold http errors
Return type: errorCol

getHandler()[source]

Returns: Which strategy to use when handling requests
Return type: handler

static getJavaPackage()[source]: Returns package name String.

getLanguage()[source]

Returns: the language code of the text (optional for some services)
Return type: language

getModelVersion()[source]

Returns: Version of the model
Return type: modelVersion

getOpinionMining()[source]

Returns: if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
Return type: opinionMining

getOutputCol()[source]

Returns: The name of the output column
Return type: outputCol

getShowStats()[source]

Returns: Whether to include detailed statistics in the response
Return type: showStats

getStringIndexType()[source]

Returns: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
Return type: stringIndexType

getSubscriptionKey()[source]

Returns: the API key to use
Return type: subscriptionKey

getText()[source]

Returns: the text in the request body
Return type: text

getTimeout()[source]

Returns: number of seconds to wait before closing the connection
Return type: timeout

getUrl()[source]

Returns: Url of the service
Return type: url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')

language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')

modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')

opinionMining = Param(parent='undefined', name='opinionMining', doc='ServiceParam: if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.')

outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')

classmethod read()[source]: Returns an MLReader instance for this class.

setAADToken(value)[source]

Parameters: AADToken¶ – AAD Token used for authentication

setAADTokenCol(value)[source]

Parameters: AADToken¶ – AAD Token used for authentication

setBatchSize(value)[source]

Parameters: batchSize¶ – The max size of the buffer

setConcurrency(value)[source]

Parameters: concurrency¶ – max number of concurrent calls

setConcurrentTimeout(value)[source]

Parameters: concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1

setCustomServiceName(value)[source]

setDefaultInternalEndpoint(value)[source]

setDisableServiceLogs(value)[source]

Parameters: disableServiceLogs¶ – disableServiceLogs option

setDisableServiceLogsCol(value)[source]

Parameters: disableServiceLogs¶ – disableServiceLogs option

setEndpoint(value)[source]

setErrorCol(value)[source]

Parameters: errorCol¶ – column to hold http errors

setHandler(value)[source]

Parameters: handler¶ – Which strategy to use when handling requests

setLanguage(value)[source]

Parameters: language¶ – the language code of the text (optional for some services)

setLanguageCol(value)[source]

Parameters: language¶ – the language code of the text (optional for some services)

setLinkedService(value)[source]

setLocation(value)[source]

setModelVersion(value)[source]

Parameters: modelVersion¶ – Version of the model

setModelVersionCol(value)[source]

Parameters: modelVersion¶ – Version of the model

setOpinionMining(value)[source]

Parameters: opinionMining¶ – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.

setOpinionMiningCol(value)[source]

Parameters: opinionMining¶ – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.

setOutputCol(value)[source]

Parameters: outputCol¶ – The name of the output column

setParams(AADToken=None, AADTokenCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='TextSentiment_3df38a3624ef_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_3df38a3624ef_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]: Set the (keyword only) parameters

setShowStats(value)[source]

Parameters: showStats¶ – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]

Parameters: showStats¶ – Whether to include detailed statistics in the response

setStringIndexType(value)[source]

Parameters: stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]

Parameters: stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]

Parameters: subscriptionKey¶ – the API key to use

setSubscriptionKeyCol(value)[source]

Parameters: subscriptionKey¶ – the API key to use

setText(value)[source]

Parameters: text¶ – the text in the request body

setTextCol(value)[source]

Parameters: text¶ – the text in the request body
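
Parameters whose Param doc begins with "ServiceParam:" come with two setters, such as setText and setTextCol. A reasonable reading is that setText supplies one constant value for every row, while setTextCol names an input column whose value is read per row. The plain-Python sketch below illustrates that distinction only. `resolve_service_param` is hypothetical, and rejecting both settings at once is a choice made for this sketch, not documented library behavior.

```python
def resolve_service_param(row, value=None, col=None):
    """Return the parameter value to use for one row.

    `row` is a dict standing in for a DataFrame row.
    """
    if value is not None and col is not None:
        # Assumption for this sketch: treat the two forms as exclusive.
        raise ValueError("set either a constant value or a column, not both")
    if col is not None:
        # Column form: look the value up in the current row.
        return row.get(col)
    # Constant form: the same value for every row (or None if unset).
    return value


rows = [{"review": "Great service", "lang": "en"},
        {"review": "Tres bien", "lang": "fr"}]
per_row_language = [resolve_service_param(r, col="lang") for r in rows]
constant_language = [resolve_service_param(r, value="en") for r in rows]
```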

setTimeout(value)[source]

Parameters: timeout¶ – number of seconds to wait before closing the connection

setUrl(value)[source]

Parameters: url¶ – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')

stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')

subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')

text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')

timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')

url = Param(parent='undefined', name='url', doc='Url of the service')

Module contents

SynapseML is an ecosystem of tools that extends the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM, and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.

SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services, backed by your Spark cluster.

SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.
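
The stated minimums can be checked before starting a job. The sketch below compares version tuples using only the standard library; `parse_version` and `meets_minimum` are hypothetical helpers, and the Spark version string is an example value rather than one read from a live session.

```python
import re
import sys


def parse_version(text):
    """Turn a version string such as '3.2.1' into a tuple of ints."""
    # Ignore any suffix after a dash, e.g. '3.3.0-SNAPSHOT'.
    return tuple(int(part) for part in re.findall(r"\d+", text.split("-")[0]))


def meets_minimum(version, minimum):
    """Compare only as many components as the minimum specifies."""
    return tuple(version[:len(minimum)]) >= tuple(minimum)


python_ok = meets_minimum(sys.version_info, (3, 6))
spark_ok = meets_minimum(parse_version("3.2.1"), (3, 0))  # example version string
```

In a running session, the example string could be replaced with the version reported by the active Spark session.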