synapse.ml.services.text package

Submodules

synapse.ml.services.text.AnalyzeHealthText module

class synapse.ml.services.text.AnalyzeHealthText.AnalyzeHealthText(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='AnalyzeHealthText_3e023055f752_error', initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeHealthText_3e023055f752_output', pollingDelay=300, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • AADToken (object) – AAD Token used for authentication

  • CustomAuthHeader (object) – A Custom Value for Authorization Header

  • backoffs (list) – array of backoffs to use in the handler

  • batchSize (int) – The max size of the buffer

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • disableServiceLogs (object) – disableServiceLogs option

  • errorCol (str) – column to hold http errors

  • initialPollingDelay (int) – number of milliseconds to wait before first poll for result

  • language (object) – the language code of the text (optional for some services)

  • maxPollingRetries (int) – number of times to poll

  • modelVersion (object) – Version of the model

  • outputCol (str) – The name of the output column

  • pollingDelay (int) – number of milliseconds to wait between polling

  • showStats (object) – Whether to include detailed statistics in the response

  • stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

  • subscriptionKey (object) – the API key to use

  • suppressMaxRetriesException (bool) – set true to suppress the maximum retries exception and report it in the error column

  • text (object) – the text in the request body

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
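
A minimal sketch of how the constructor keywords listed above might be assembled. The keyword names are taken from the signature; the column names, key, and URL are placeholder assumptions, and the import is deferred into a helper so the configuration part can be read and run without SynapseML installed.

```python
def health_text_config(text_col, output_col, subscription_key, url):
    """Collect keyword arguments that appear in the AnalyzeHealthText signature."""
    return {
        "textCol": text_col,                # column that holds the input text
        "outputCol": output_col,            # column that receives the service response
        "errorCol": output_col + "_error",  # column that receives HTTP errors
        "subscriptionKey": subscription_key,
        "url": url,
        "batchSize": 10,                    # documented default
        "timeout": 60.0,                    # documented default, in seconds
    }


def make_transformer(config):
    """Build the transformer; this step needs SynapseML and a Spark session."""
    from synapse.ml.services.text.AnalyzeHealthText import AnalyzeHealthText
    return AnalyzeHealthText(**config)


# Placeholder values, chosen only for illustration.
config = health_text_config(
    "notes", "health", "<your-key>", "https://example.invalid/health"
)
print(sorted(config))
```

Calling make_transformer(config) would then produce a stage whose transform method can be applied to a DataFrame containing a "notes" column, assuming the key and URL are replaced with real values.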

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns:

AAD Token used for authentication

Return type:

AADToken

getBackoffs()[source]
Returns:

array of backoffs to use in the handler

Return type:

backoffs
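
To make the idea of an "array of backoffs" concrete, here is a small sketch of a retry loop driven by such a list. It assumes the values are delays in milliseconds applied between attempts; the function name and the retry-on-any-exception policy are illustrative choices, not the library's handler.

```python
import time


def call_with_backoffs(action, backoffs_ms, sleep=time.sleep):
    """Try `action` once, then once more after each delay in `backoffs_ms`."""
    last_error = None
    for delay_ms in [0] + list(backoffs_ms):
        if delay_ms:
            sleep(delay_ms / 1000.0)  # convert milliseconds to seconds
        try:
            return action()
        except Exception as error:  # a real handler would retry only retryable failures
            last_error = error
    raise last_error


# A stand-in action that fails twice and then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


waits = []  # record the requested sleeps instead of actually sleeping
result = call_with_backoffs(flaky, [100, 500, 1000], sleep=waits.append)
print(result, waits)
```

With the default list [100, 500, 1000], this loop makes at most four attempts before giving up.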

getBatchSize()[source]
Returns:

The max size of the buffer

Return type:

batchSize

getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getCustomAuthHeader()[source]
Returns:

A Custom Value for Authorization Header

Return type:

CustomAuthHeader

getDisableServiceLogs()[source]
Returns:

disableServiceLogs option

Return type:

disableServiceLogs

getErrorCol()[source]
Returns:

column to hold http errors

Return type:

errorCol

getInitialPollingDelay()[source]
Returns:

number of milliseconds to wait before first poll for result

Return type:

initialPollingDelay

static getJavaPackage()[source]

Returns package name String.

getLanguage()[source]
Returns:

the language code of the text (optional for some services)

Return type:

language

getMaxPollingRetries()[source]
Returns:

number of times to poll

Return type:

maxPollingRetries

getModelVersion()[source]
Returns:

Version of the model

Return type:

modelVersion

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getPollingDelay()[source]
Returns:

number of milliseconds to wait between polling

Return type:

pollingDelay
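
The three polling parameters (initialPollingDelay, pollingDelay, maxPollingRetries) together bound how long a job is waited on. The sketch below estimates that bound under a simple additive model, which is an assumption: one initial wait, then one pollingDelay between consecutive polls, with HTTP round-trip time ignored.

```python
def polling_wait_budget_ms(initial_polling_delay=300,
                           polling_delay=300,
                           max_polling_retries=1000):
    """Total sleep time in milliseconds if every allowed poll is used."""
    if max_polling_retries < 1:
        return 0
    gaps = max_polling_retries - 1  # waits between consecutive polls
    return initial_polling_delay + gaps * polling_delay


budget = polling_wait_budget_ms()
print(budget / 1000.0, "seconds")
```

Under this model the documented defaults allow about five minutes of waiting before the retry limit is reached.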

getShowStats()[source]
Returns:

Whether to include detailed statistics in the response

Return type:

showStats

getStringIndexType()[source]
Returns:

Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

Return type:

stringIndexType

getSubscriptionKey()[source]
Returns:

the API key to use

Return type:

subscriptionKey

getSuppressMaxRetriesException()[source]
Returns:

set true to suppress the maximum retries exception and report it in the error column

Return type:

suppressMaxRetriesException
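
The effect of suppressMaxRetriesException can be pictured with a small polling loop. This is a sketch under assumptions, not the library's code: the `check` callback, the error message, and the use of TimeoutError are all hypothetical, and the sleeps between polls are left out for brevity.

```python
def poll_until_done(check, max_polling_retries,
                    suppress_max_retries_exception=False):
    """Poll `check` up to `max_polling_retries` times.

    Returns a (result, error) pair. When the limit is reached, either raise
    or hand the message back as the error, depending on the flag.
    """
    for attempt in range(max_polling_retries):
        result = check(attempt)
        if result is not None:
            return result, None
    message = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exception:
        return None, message  # the caller can store this in an error column
    raise TimeoutError(message)


# A job that never finishes, polled three times with the exception suppressed.
outcome = poll_until_done(lambda attempt: None, 3,
                          suppress_max_retries_exception=True)
print(outcome)
```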

getText()[source]
Returns:

the text in the request body

Return type:

text

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

getUrl()[source]
Returns:

Url of the service

Return type:

url

initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setBackoffs(value)[source]
Parameters:

backoffs – array of backoffs to use in the handler

setBatchSize(value)[source]
Parameters:

batchSize – The max size of the buffer

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomAuthHeader(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomAuthHeaderCol(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setDisableServiceLogs(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setDisableServiceLogsCol(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters:

errorCol – column to hold http errors

setInitialPollingDelay(value)[source]
Parameters:

initialPollingDelay – number of milliseconds to wait before first poll for result

setLanguage(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLanguageCol(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLinkedService(value)[source]
setLocation(value)[source]
setMaxPollingRetries(value)[source]
Parameters:

maxPollingRetries – number of times to poll

setModelVersion(value)[source]
Parameters:

modelVersion – Version of the model

setModelVersionCol(value)[source]
Parameters:

modelVersion – Version of the model

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='AnalyzeHealthText_3e023055f752_error', initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeHealthText_3e023055f752_output', pollingDelay=300, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters

setPollingDelay(value)[source]
Parameters:

pollingDelay – number of milliseconds to wait between polling

setShowStats(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setStringIndexType(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]
Parameters:

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters:

subscriptionKey – the API key to use

setSuppressMaxRetriesException(value)[source]
Parameters:

suppressMaxRetriesException – set true to suppress the maximum retries exception and report it in the error column

setText(value)[source]
Parameters:

text – the text in the request body

setTextCol(value)[source]
Parameters:

text – the text in the request body

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters:

url – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maximum retries exception and report it in the error column')
text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.services.text.EntityDetector module

class synapse.ml.services.text.EntityDetector.EntityDetector(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='EntityDetector_0b2a4e8494ba_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_0b2a4e8494ba_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • AADToken (object) – AAD Token used for authentication

  • CustomAuthHeader (object) – A Custom Value for Authorization Header

  • batchSize (int) – The max size of the buffer

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • disableServiceLogs (object) – disableServiceLogs option

  • errorCol (str) – column to hold http errors

  • handler (object) – Which strategy to use when handling requests

  • language (object) – the language code of the text (optional for some services)

  • modelVersion (object) – Version of the model

  • outputCol (str) – The name of the output column

  • showStats (object) – Whether to include detailed statistics in the response

  • stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

  • subscriptionKey (object) – the API key to use

  • text (object) – the text in the request body

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns:

AAD Token used for authentication

Return type:

AADToken

getBatchSize()[source]
Returns:

The max size of the buffer

Return type:

batchSize
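
One way to read "the max size of the buffer" is that rows are grouped into batches of at most batchSize before being sent. The sketch below illustrates that grouping on a plain list; treating the buffer as a per-request batch of rows is an assumption, and the helper name is hypothetical.

```python
def batches(rows, batch_size=10):
    """Yield consecutive groups of at most `batch_size` rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


documents = [f"doc {i}" for i in range(23)]
sizes = [len(group) for group in batches(documents)]
print(sizes)
```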

getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getCustomAuthHeader()[source]
Returns:

A Custom Value for Authorization Header

Return type:

CustomAuthHeader

getDisableServiceLogs()[source]
Returns:

disableServiceLogs option

Return type:

disableServiceLogs

getErrorCol()[source]
Returns:

column to hold http errors

Return type:

errorCol

getHandler()[source]
Returns:

Which strategy to use when handling requests

Return type:

handler

static getJavaPackage()[source]

Returns package name String.

getLanguage()[source]
Returns:

the language code of the text (optional for some services)

Return type:

language

getModelVersion()[source]
Returns:

Version of the model

Return type:

modelVersion

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getShowStats()[source]
Returns:

Whether to include detailed statistics in the response

Return type:

showStats

getStringIndexType()[source]
Returns:

Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

Return type:

stringIndexType
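
The choice of string index type matters because the same position in a text has different numeric offsets depending on the unit being counted. The sketch below compares Unicode code points with UTF-16 code units using only the standard library; grapheme counting is not shown because it needs a third-party package. The helper names are illustrative.

```python
def utf16_length(text):
    """Number of UTF-16 code units needed to encode `text`."""
    return len(text.encode("utf-16-le")) // 2


def utf16_offset(text, code_point_offset):
    """Convert an offset counted in code points to one counted in UTF-16 units."""
    return utf16_length(text[:code_point_offset])


sample = "a\U0001F600b"  # 'a', an emoji outside the Basic Multilingual Plane, 'b'
print(len(sample), utf16_length(sample), utf16_offset(sample, 2))
```

Here the letter "b" sits at offset 2 when counting code points but at offset 3 when counting UTF-16 code units, so offsets returned by the service should be interpreted with the same unit that was requested.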

getSubscriptionKey()[source]
Returns:

the API key to use

Return type:

subscriptionKey

getText()[source]
Returns:

the text in the request body

Return type:

text

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

getUrl()[source]
Returns:

Url of the service

Return type:

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setBatchSize(value)[source]
Parameters:

batchSize – The max size of the buffer

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomAuthHeader(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomAuthHeaderCol(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setDisableServiceLogs(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setDisableServiceLogsCol(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters:

errorCol – column to hold http errors

setHandler(value)[source]
Parameters:

handler – Which strategy to use when handling requests

setLanguage(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLanguageCol(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLinkedService(value)[source]
setLocation(value)[source]
setModelVersion(value)[source]
Parameters:

modelVersion – Version of the model

setModelVersionCol(value)[source]
Parameters:

modelVersion – Version of the model

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='EntityDetector_0b2a4e8494ba_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_0b2a4e8494ba_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters
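
As an alternative to passing everything as keywords, settings can also be applied through the individual set* methods documented for this class. The sketch below derives each setter name from the parameter name; RecordingStage is a stand-in written only for this example so that the helper can be exercised without Spark, and whether the two routes behave identically in the library is not verified here.

```python
def setter_name(param):
    """Map a parameter name such as 'textCol' to its setter, 'setTextCol'."""
    return "set" + param[0].upper() + param[1:]


def apply_settings(stage, settings):
    """Call the matching set* method on `stage` for every entry in `settings`."""
    for param, value in settings.items():
        getattr(stage, setter_name(param))(value)
    return stage


class RecordingStage:
    """Stand-in for a transformer; it records which setters are called."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only invoked for attributes that do not exist, i.e. the setters.
        if not name.startswith("set"):
            raise AttributeError(name)

        def record(value):
            self.calls.append((name, value))

        return record


stage = apply_settings(
    RecordingStage(),
    {"textCol": "text", "batchSize": 5, "AADToken": "<token>"},
)
print(stage.calls)
```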

setShowStats(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setStringIndexType(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]
Parameters:

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters:

subscriptionKey – the API key to use

setText(value)[source]
Parameters:

text – the text in the request body

setTextCol(value)[source]
Parameters:

text – the text in the request body

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters:

url – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.services.text.KeyPhraseExtractor module

class synapse.ml.services.text.KeyPhraseExtractor.KeyPhraseExtractor(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='KeyPhraseExtractor_9d795e02778d_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_9d795e02778d_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • AADToken (object) – AAD Token used for authentication

  • CustomAuthHeader (object) – A Custom Value for Authorization Header

  • batchSize (int) – The max size of the buffer

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • disableServiceLogs (object) – disableServiceLogs option

  • errorCol (str) – column to hold http errors

  • handler (object) – Which strategy to use when handling requests

  • language (object) – the language code of the text (optional for some services)

  • modelVersion (object) – Version of the model

  • outputCol (str) – The name of the output column

  • showStats (object) – Whether to include detailed statistics in the response

  • stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

  • subscriptionKey (object) – the API key to use

  • text (object) – the text in the request body

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
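
Several parameters in the signature come in pairs, such as text/textCol and language/languageCol. The sketch below assumes the plain form supplies one constant for every row while the Col form names a DataFrame column that supplies a value per row. The helper is hypothetical, and rejecting a call that sets both forms is this sketch's own choice rather than documented library behavior.

```python
def service_param(name, value=None, column=None):
    """Return the constructor keyword for either a constant or a column."""
    if (value is None) == (column is None):
        raise ValueError(f"set exactly one of {name} or {name}Col")
    if column is not None:
        return {name + "Col": column}  # per-row values read from a column
    return {name: value}               # one constant for all rows


kwargs = {}
kwargs.update(service_param("text", column="review"))
kwargs.update(service_param("language", value="en"))
print(kwargs)
```

The resulting dictionary uses only keyword names that appear in the KeyPhraseExtractor signature, so it could be passed to the constructor with `**kwargs`.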

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns:

AAD Token used for authentication

Return type:

AADToken

getBatchSize()[source]
Returns:

The max size of the buffer

Return type:

batchSize

getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getCustomAuthHeader()[source]
Returns:

A Custom Value for Authorization Header

Return type:

CustomAuthHeader

getDisableServiceLogs()[source]
Returns:

disableServiceLogs option

Return type:

disableServiceLogs

getErrorCol()[source]
Returns:

column to hold http errors

Return type:

errorCol
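
Because HTTP errors are written to their own column, successful and failed rows can be separated after the transform. The sketch below shows the idea on plain dictionaries and assumes the error column is empty (None) for rows that succeeded; the column names and sample values are made up for illustration.

```python
def split_by_error(rows, error_col):
    """Partition rows into (succeeded, failed) based on the error column."""
    succeeded, failed = [], []
    for row in rows:
        if row.get(error_col) is None:
            succeeded.append(row)
        else:
            failed.append(row)
    return succeeded, failed


rows = [
    {"text": "good service", "phrases": ["good service"], "error": None},
    {"text": "", "phrases": None, "error": "400 Bad Request"},
]
ok_rows, bad_rows = split_by_error(rows, "error")
print(len(ok_rows), len(bad_rows))
```

On a Spark DataFrame the same split would typically be written as a filter on whether the error column is null.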

getHandler()[source]
Returns:

Which strategy to use when handling requests

Return type:

handler

static getJavaPackage()[source]

Returns package name String.

getLanguage()[source]
Returns:

the language code of the text (optional for some services)

Return type:

language

getModelVersion()[source]
Returns:

Version of the model

Return type:

modelVersion

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getShowStats()[source]
Returns:

Whether to include detailed statistics in the response

Return type:

showStats

getStringIndexType()[source]
Returns:

Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

Return type:

stringIndexType

getSubscriptionKey()[source]
Returns:

the API key to use

Return type:

subscriptionKey

getText()[source]
Returns:

the text in the request body

Return type:

text

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

getUrl()[source]
Returns:

Url of the service

Return type:

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setBatchSize(value)[source]
Parameters:

batchSize – The max size of the buffer

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomAuthHeader(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomAuthHeaderCol(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setDisableServiceLogs(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setDisableServiceLogsCol(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters:

errorCol – column to hold http errors

setHandler(value)[source]
Parameters:

handler – Which strategy to use when handling requests

setLanguage(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLanguageCol(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLinkedService(value)[source]
setLocation(value)[source]
setModelVersion(value)[source]
Parameters:

modelVersion – Version of the model

setModelVersionCol(value)[source]
Parameters:

modelVersion – Version of the model

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='KeyPhraseExtractor_9d795e02778d_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_9d795e02778d_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters

setShowStats(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setStringIndexType(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]
Parameters:

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters:

subscriptionKey – the API key to use

setText(value)[source]
Parameters:

text – the text in the request body

setTextCol(value)[source]
Parameters:

text – the text in the request body

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters:

url – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.services.text.LanguageDetector module

class synapse.ml.services.text.LanguageDetector.LanguageDetector(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='LanguageDetector_d20525a93bc1_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_d20525a93bc1_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • AADToken (object) – AAD Token used for authentication

  • CustomAuthHeader (object) – A Custom Value for Authorization Header

  • batchSize (int) – The max size of the buffer

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • disableServiceLogs (object) – disableServiceLogs option

  • errorCol (str) – column to hold http errors

  • handler (object) – Which strategy to use when handling requests

  • language (object) – the language code of the text (optional for some services)

  • modelVersion (object) – Version of the model

  • outputCol (str) – The name of the output column

  • showStats (object) – Whether to include detailed statistics in the response

  • stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

  • subscriptionKey (object) – the API key to use

  • text (object) – the text in the request body

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
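
As a minimal sketch of how the constructor parameters above might fit together, the helper below collects keyword arguments for LanguageDetector and then applies them. The helper names, the column names ("text", "detected_language", "detection_error") and the placeholder key are illustrative assumptions, not library defaults. The import is deferred so that the argument builder can be inspected without SynapseML or a Spark session.

```python
def language_detector_kwargs(subscription_key, text_col="text"):
    """Collect keyword arguments named after the documented constructor parameters."""
    return {
        "subscriptionKey": subscription_key,  # the API key to use
        "textCol": text_col,                  # column supplying the request text
        "outputCol": "detected_language",     # hypothetical output column name
        "errorCol": "detection_error",        # hypothetical error column name
        "batchSize": 10,                      # documented default
        "concurrency": 1,                     # documented default
        "timeout": 60.0,                      # seconds, documented default
    }


def build_language_detector(kwargs, location):
    """Instantiate the transformer; needs SynapseML and an active Spark session."""
    # Deferred import: the module path is taken from the class name documented above.
    from synapse.ml.services.text.LanguageDetector import LanguageDetector

    detector = LanguageDetector(**kwargs)
    detector.setLocation(location)  # setLocation is listed among the setters below
    return detector


kwargs = language_detector_kwargs("<your-key>")
print(sorted(kwargs))
```

Given a DataFrame df with a "text" column, build_language_detector(kwargs, "<your-region>").transform(df) would be expected to add the output and error columns named above.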

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns:

AAD Token used for authentication

Return type:

AADToken

getBatchSize()[source]
Returns:

The max size of the buffer

Return type:

batchSize

getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getCustomAuthHeader()[source]
Returns:

A Custom Value for Authorization Header

Return type:

CustomAuthHeader

getDisableServiceLogs()[source]
Returns:

disableServiceLogs option

Return type:

disableServiceLogs

getErrorCol()[source]
Returns:

column to hold http errors

Return type:

errorCol

getHandler()[source]
Returns:

Which strategy to use when handling requests

Return type:

handler

static getJavaPackage()[source]

Returns package name String.

getLanguage()[source]
Returns:

the language code of the text (optional for some services)

Return type:

language

getModelVersion()[source]
Returns:

Version of the model

Return type:

modelVersion

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getShowStats()[source]
Returns:

Whether to include detailed statistics in the response

Return type:

showStats

getStringIndexType()[source]
Returns:

Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

Return type:

stringIndexType

getSubscriptionKey()[source]
Returns:

the API key to use

Return type:

subscriptionKey

getText()[source]
Returns:

the text in the request body

Return type:

text

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

getUrl()[source]
Returns:

Url of the service

Return type:

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setBatchSize(value)[source]
Parameters:

batchSize – The max size of the buffer

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomAuthHeader(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomAuthHeaderCol(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setDisableServiceLogs(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setDisableServiceLogsCol(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters:

errorCol – column to hold http errors

setHandler(value)[source]
Parameters:

handler – Which strategy to use when handling requests

setLanguage(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLanguageCol(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLinkedService(value)[source]
setLocation(value)[source]
setModelVersion(value)[source]
Parameters:

modelVersion – Version of the model

setModelVersionCol(value)[source]
Parameters:

modelVersion – Version of the model

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='LanguageDetector_d20525a93bc1_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_d20525a93bc1_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters

setShowStats(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setStringIndexType(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]
Parameters:

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters:

subscriptionKey – the API key to use

setText(value)[source]
Parameters:

text – the text in the request body

setTextCol(value)[source]
Parameters:

text – the text in the request body

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters:

url – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.services.text.NER module

class synapse.ml.services.text.NER.NER(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='NER_6ad183376c32_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_6ad183376c32_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • AADToken (object) – AAD Token used for authentication

  • CustomAuthHeader (object) – A Custom Value for Authorization Header

  • batchSize (int) – The max size of the buffer

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • disableServiceLogs (object) – disableServiceLogs option

  • errorCol (str) – column to hold http errors

  • handler (object) – Which strategy to use when handling requests

  • language (object) – the language code of the text (optional for some services)

  • modelVersion (object) – Version of the model

  • outputCol (str) – The name of the output column

  • showStats (object) – Whether to include detailed statistics in the response

  • stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

  • subscriptionKey (object) – the API key to use

  • text (object) – the text in the request body

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
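
The constructor signature pairs most service parameters with a ...Col variant (text/textCol, language/languageCol, subscriptionKey/subscriptionKeyCol, and so on). The sketch below assumes the pair means "either one constant for every row, or the name of a column holding per-row values"; the signature suggests this reading, but this page does not state it. The helper service_param is hypothetical and is not part of the library.

```python
def service_param(name, value=None, col=None):
    """Return a one-entry kwargs dict for either `name` or `name + "Col"`.

    Assumption: a service parameter is supplied either as a constant applied to
    every row or as the name of a column holding per-row values, not both.
    """
    if (value is None) == (col is None):
        raise ValueError(f"set exactly one of {name} or {name}Col")
    if value is not None:
        return {name: value}
    return {f"{name}Col": col}


ner_kwargs = {}
ner_kwargs.update(service_param("text", col="text"))      # per-row text column
ner_kwargs.update(service_param("language", value="en"))  # same language for all rows
ner_kwargs.update(service_param("subscriptionKey", value="<your-key>"))
print(ner_kwargs)
```

The resulting dictionary uses only names that appear in the signature above, so it could be passed on as NER(**ner_kwargs).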

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns:

AAD Token used for authentication

Return type:

AADToken

getBatchSize()[source]
Returns:

The max size of the buffer

Return type:

batchSize

getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getCustomAuthHeader()[source]
Returns:

A Custom Value for Authorization Header

Return type:

CustomAuthHeader

getDisableServiceLogs()[source]
Returns:

disableServiceLogs option

Return type:

disableServiceLogs

getErrorCol()[source]
Returns:

column to hold http errors

Return type:

errorCol

getHandler()[source]
Returns:

Which strategy to use when handling requests

Return type:

handler

static getJavaPackage()[source]

Returns package name String.

getLanguage()[source]
Returns:

the language code of the text (optional for some services)

Return type:

language

getModelVersion()[source]
Returns:

Version of the model

Return type:

modelVersion

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getShowStats()[source]
Returns:

Whether to include detailed statistics in the response

Return type:

showStats

getStringIndexType()[source]
Returns:

Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

Return type:

stringIndexType

getSubscriptionKey()[source]
Returns:

the API key to use

Return type:

subscriptionKey

getText()[source]
Returns:

the text in the request body

Return type:

text

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

getUrl()[source]
Returns:

Url of the service

Return type:

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setBatchSize(value)[source]
Parameters:

batchSize – The max size of the buffer

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomAuthHeader(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomAuthHeaderCol(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setDisableServiceLogs(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setDisableServiceLogsCol(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters:

errorCol – column to hold http errors

setHandler(value)[source]
Parameters:

handler – Which strategy to use when handling requests

setLanguage(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLanguageCol(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLinkedService(value)[source]
setLocation(value)[source]
setModelVersion(value)[source]
Parameters:

modelVersion – Version of the model

setModelVersionCol(value)[source]
Parameters:

modelVersion – Version of the model

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='NER_6ad183376c32_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_6ad183376c32_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters

setShowStats(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setStringIndexType(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]
Parameters:

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters:

subscriptionKey – the API key to use

setText(value)[source]
Parameters:

text – the text in the request body

setTextCol(value)[source]
Parameters:

text – the text in the request body

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters:

url – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.services.text.PII module

class synapse.ml.services.text.PII.PII(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, domain=None, domainCol=None, errorCol='PII_f399057c8da9_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_f399057c8da9_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • AADToken (object) – AAD Token used for authentication

  • CustomAuthHeader (object) – A Custom Value for Authorization Header

  • batchSize (int) – The max size of the buffer

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • disableServiceLogs (object) – disableServiceLogs option

  • domain (object) – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.

  • errorCol (str) – column to hold http errors

  • handler (object) – Which strategy to use when handling requests

  • language (object) – the language code of the text (optional for some services)

  • modelVersion (object) – Version of the model

  • outputCol (str) – The name of the output column

  • piiCategories (object) – describes the PII categories to return

  • showStats (object) – Whether to include detailed statistics in the response

  • stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

  • subscriptionKey (object) – the API key to use

  • text (object) – the text in the request body

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
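
The two PII-specific parameters, domain and piiCategories, are optional. As a rough sketch, the helper below builds keyword arguments for them and checks domain locally before any request is sent. It treats the two values named in the description ('PHI' and 'none') as the full set and assumes piiCategories takes a list of category-name strings; both are assumptions, since the description only says "possible values include" and types the parameter as object. The helper name pii_kwargs is hypothetical.

```python
ALLOWED_PII_DOMAINS = ("PHI", "none")  # values named in the `domain` description


def pii_kwargs(domain=None, categories=None):
    """Build optional PII keyword arguments, validating `domain` locally."""
    kwargs = {}
    if domain is not None:
        if domain not in ALLOWED_PII_DOMAINS:
            raise ValueError(
                f"domain must be one of {ALLOWED_PII_DOMAINS}, got {domain!r}"
            )
        kwargs["domain"] = domain
    if categories:
        # Assumption: categories are passed as a list of category-name strings.
        kwargs["piiCategories"] = list(categories)
    return kwargs


print(pii_kwargs(domain="PHI"))
```

Leaving both arguments unset returns an empty dictionary, which matches the None defaults in the signature.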

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
domain = Param(parent='undefined', name='domain', doc="ServiceParam: if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: 'PHI', 'none'.")
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns:

AAD Token used for authentication

Return type:

AADToken

getBatchSize()[source]
Returns:

The max size of the buffer

Return type:

batchSize

getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getCustomAuthHeader()[source]
Returns:

A Custom Value for Authorization Header

Return type:

CustomAuthHeader

getDisableServiceLogs()[source]
Returns:

disableServiceLogs option

Return type:

disableServiceLogs

getDomain()[source]
Returns:

if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.

Return type:

domain

getErrorCol()[source]
Returns:

column to hold http errors

Return type:

errorCol

getHandler()[source]
Returns:

Which strategy to use when handling requests

Return type:

handler

static getJavaPackage()[source]

Returns package name String.

getLanguage()[source]
Returns:

the language code of the text (optional for some services)

Return type:

language

getModelVersion()[source]
Returns:

Version of the model

Return type:

modelVersion

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getPiiCategories()[source]
Returns:

describes the PII categories to return

Return type:

piiCategories

getShowStats()[source]
Returns:

Whether to include detailed statistics in the response

Return type:

showStats

getStringIndexType()[source]
Returns:

Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

Return type:

stringIndexType

getSubscriptionKey()[source]
Returns:

the API key to use

Return type:

subscriptionKey

getText()[source]
Returns:

the text in the request body

Return type:

text

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

getUrl()[source]
Returns:

Url of the service

Return type:

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
piiCategories = Param(parent='undefined', name='piiCategories', doc='ServiceParam: describes the PII categories to return')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setBatchSize(value)[source]
Parameters:

batchSize – The max size of the buffer

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomAuthHeader(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomAuthHeaderCol(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setDisableServiceLogs(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setDisableServiceLogsCol(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setDomain(value)[source]
Parameters:

domain – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.

setDomainCol(value)[source]
Parameters:

domain – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.

setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters:

errorCol – column to hold http errors

setHandler(value)[source]
Parameters:

handler – Which strategy to use when handling requests

setLanguage(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLanguageCol(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLinkedService(value)[source]
setLocation(value)[source]
setModelVersion(value)[source]
Parameters:

modelVersion – Version of the model

setModelVersionCol(value)[source]
Parameters:

modelVersion – Version of the model

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, domain=None, domainCol=None, errorCol='PII_f399057c8da9_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_f399057c8da9_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters

setPiiCategories(value)[source]
Parameters:

piiCategories – describes the PII categories to return

setPiiCategoriesCol(value)[source]
Parameters:

piiCategories – describes the PII categories to return

setShowStats(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setStringIndexType(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setSubscriptionKey(value)[source]
Parameters:

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters:

subscriptionKey – the API key to use

setText(value)[source]
Parameters:

text – the text in the request body

setTextCol(value)[source]
Parameters:

text – the text in the request body

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters:

url – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.services.text.TextAnalyze module

class synapse.ml.services.text.TextAnalyze.TextAnalyze(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, entityLinkingParams={'model-version': 'latest'}, entityRecognitionParams={'model-version': 'latest'}, errorCol='TextAnalyze_a0768b0f71b2_error', includeEntityLinking=True, includeEntityRecognition=True, includeKeyPhraseExtraction=True, includePii=True, includeSentimentAnalysis=True, initialPollingDelay=300, keyPhraseExtractionParams={'model-version': 'latest'}, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='TextAnalyze_a0768b0f71b2_output', piiParams={'model-version': 'latest'}, pollingDelay=300, sentimentAnalysisParams={'model-version': 'latest'}, showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • AADToken (object) – AAD Token used for authentication

  • CustomAuthHeader (object) – A Custom Value for Authorization Header

  • backoffs (list) – array of backoffs to use in the handler

  • batchSize (int) – The max size of the buffer

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • disableServiceLogs (object) – disableServiceLogs option

  • entityLinkingParams (dict) – the parameters to pass to the entityLinking model

  • entityRecognitionParams (dict) – the parameters to pass to the entity recognition model

  • errorCol (str) – column to hold http errors

  • includeEntityLinking (bool) – Whether to perform EntityLinking

  • includeEntityRecognition (bool) – Whether to perform entity recognition

  • includeKeyPhraseExtraction (bool) – Whether to perform key phrase extraction

  • includePii (bool) – Whether to perform PII Detection

  • includeSentimentAnalysis (bool) – Whether to perform SentimentAnalysis

  • initialPollingDelay (int) – number of milliseconds to wait before first poll for result

  • keyPhraseExtractionParams (dict) – the parameters to pass to the keyPhraseExtraction model

  • language (object) – the language code of the text (optional for some services)

  • maxPollingRetries (int) – number of times to poll

  • modelVersion (object) – Version of the model

  • outputCol (str) – The name of the output column

  • piiParams (dict) – the parameters to pass to the PII model

  • pollingDelay (int) – number of milliseconds to wait between polling

  • sentimentAnalysisParams (dict) – the parameters to pass to the sentimentAnalysis model

  • showStats (object) – Whether to include detailed statistics in the response

  • subscriptionKey (object) – the API key to use

  • suppressMaxRetriesException (bool) – set true to suppress the maximum retries exception and report in the error column

  • text (object) – the text in the request body

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
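
The polling parameters above (initialPollingDelay, pollingDelay, maxPollingRetries) bound how long a row can wait for a result. The sketch below estimates that bound and lists which include* tasks are switched on. It assumes one initial delay followed by one polling delay per retry, which may not match the library's exact loop, and it ignores request latency. Both helper names are hypothetical. With the documented defaults (300 ms, 300 ms, 1000 retries) the estimate is 300 + 300 × 1000 = 300300 ms, or about 300.3 seconds.

```python
def polling_wait_budget_ms(initial_polling_delay=300, polling_delay=300,
                           max_polling_retries=1000):
    """Rough upper bound on time spent waiting between polls, in milliseconds.

    Assumption: one initial delay, then one polling delay per retry.
    """
    return initial_polling_delay + polling_delay * max_polling_retries


def enabled_tasks(**include_flags):
    """List the include* flags that are switched on, in alphabetical order."""
    return sorted(name for name, on in include_flags.items() if on)


budget = polling_wait_budget_ms()
print(f"{budget} ms is about {budget / 1000:.1f} s")

# Flag names come from the signature above; the chosen values are illustrative.
flags = {
    "includeEntityLinking": False,
    "includeEntityRecognition": True,
    "includeKeyPhraseExtraction": True,
    "includePii": False,
    "includeSentimentAnalysis": True,
}
print(enabled_tasks(**flags))
```

Because every include* flag defaults to True, switching off the tasks that are not needed is one way to keep the response smaller.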

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
entityLinkingParams = Param(parent='undefined', name='entityLinkingParams', doc='the parameters to pass to the entityLinking model')
entityRecognitionParams = Param(parent='undefined', name='entityRecognitionParams', doc='the parameters to pass to the entity recognition model')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns:

AAD Token used for authentication

Return type:

AADToken

getBackoffs()[source]
Returns:

array of backoffs to use in the handler

Return type:

backoffs

getBatchSize()[source]
Returns:

The max size of the buffer

Return type:

batchSize

getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getCustomAuthHeader()[source]
Returns:

A Custom Value for Authorization Header

Return type:

CustomAuthHeader

getDisableServiceLogs()[source]
Returns:

disableServiceLogs option

Return type:

disableServiceLogs

getEntityLinkingParams()[source]
Returns:

the parameters to pass to the entityLinking model

Return type:

entityLinkingParams

getEntityRecognitionParams()[source]
Returns:

the parameters to pass to the entity recognition model

Return type:

entityRecognitionParams

getErrorCol()[source]
Returns:

column to hold http errors

Return type:

errorCol

getIncludeEntityLinking()[source]
Returns:

Whether to perform EntityLinking

Return type:

includeEntityLinking

getIncludeEntityRecognition()[source]
Returns:

Whether to perform entity recognition

Return type:

includeEntityRecognition

getIncludeKeyPhraseExtraction()[source]
Returns:

Whether to perform key phrase extraction

Return type:

includeKeyPhraseExtraction

getIncludePii()[source]
Returns:

Whether to perform PII Detection

Return type:

includePii

getIncludeSentimentAnalysis()[source]
Returns:

Whether to perform SentimentAnalysis

Return type:

includeSentimentAnalysis

getInitialPollingDelay()[source]
Returns:

number of milliseconds to wait before first poll for result

Return type:

initialPollingDelay

static getJavaPackage()[source]

Returns package name String.

getKeyPhraseExtractionParams()[source]
Returns:

the parameters to pass to the keyPhraseExtraction model

Return type:

keyPhraseExtractionParams

getLanguage()[source]
Returns:

the language code of the text (optional for some services)

Return type:

language

getMaxPollingRetries()[source]
Returns:

number of times to poll

Return type:

maxPollingRetries

getModelVersion()[source]
Returns:

Version of the model

Return type:

modelVersion

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getPiiParams()[source]
Returns:

the parameters to pass to the PII model

Return type:

piiParams

getPollingDelay()[source]
Returns:

number of milliseconds to wait between polling

Return type:

pollingDelay

getSentimentAnalysisParams()[source]
Returns:

the parameters to pass to the sentimentAnalysis model

Return type:

sentimentAnalysisParams

getShowStats()[source]
Returns:

Whether to include detailed statistics in the response

Return type:

showStats

getSubscriptionKey()[source]
Returns:

the API key to use

Return type:

subscriptionKey

getSuppressMaxRetriesException()[source]
Returns:

set true to suppress the maximum retries exception and report in the error column

Return type:

suppressMaxRetriesException

getText()[source]
Returns:

the text in the request body

Return type:

text

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

getUrl()[source]
Returns:

Url of the service

Return type:

url

includeEntityLinking = Param(parent='undefined', name='includeEntityLinking', doc='Whether to perform EntityLinking')
includeEntityRecognition = Param(parent='undefined', name='includeEntityRecognition', doc='Whether to perform entity recognition')
includeKeyPhraseExtraction = Param(parent='undefined', name='includeKeyPhraseExtraction', doc='Whether to perform key phrase extraction')
includePii = Param(parent='undefined', name='includePii', doc='Whether to perform PII Detection')
includeSentimentAnalysis = Param(parent='undefined', name='includeSentimentAnalysis', doc='Whether to perform SentimentAnalysis')
initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
keyPhraseExtractionParams = Param(parent='undefined', name='keyPhraseExtractionParams', doc='the parameters to pass to the keyPhraseExtraction model')
language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
piiParams = Param(parent='undefined', name='piiParams', doc='the parameters to pass to the PII model')
pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
classmethod read()[source]

Returns an MLReader instance for this class.

sentimentAnalysisParams = Param(parent='undefined', name='sentimentAnalysisParams', doc='the parameters to pass to the sentimentAnalysis model')
setAADToken(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setBackoffs(value)[source]
Parameters:

backoffs – array of backoffs to use in the handler

setBatchSize(value)[source]
Parameters:

batchSize – The max size of the buffer

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomAuthHeader(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomAuthHeaderCol(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setDisableServiceLogs(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setDisableServiceLogsCol(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setEndpoint(value)[source]
setEntityLinkingParams(value)[source]
Parameters:

entityLinkingParams – the parameters to pass to the entityLinking model

setEntityRecognitionParams(value)[source]
Parameters:

entityRecognitionParams – the parameters to pass to the entity recognition model

setErrorCol(value)[source]
Parameters:

errorCol – column to hold http errors

setIncludeEntityLinking(value)[source]
Parameters:

includeEntityLinking – Whether to perform EntityLinking

setIncludeEntityRecognition(value)[source]
Parameters:

includeEntityRecognition – Whether to perform entity recognition

setIncludeKeyPhraseExtraction(value)[source]
Parameters:

includeKeyPhraseExtraction – Whether to perform key phrase extraction

setIncludePii(value)[source]
Parameters:

includePii – Whether to perform PII Detection

setIncludeSentimentAnalysis(value)[source]
Parameters:

includeSentimentAnalysis – Whether to perform SentimentAnalysis

setInitialPollingDelay(value)[source]
Parameters:

initialPollingDelay – number of milliseconds to wait before first poll for result

setKeyPhraseExtractionParams(value)[source]
Parameters:

keyPhraseExtractionParams – the parameters to pass to the keyPhraseExtraction model

setLanguage(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLanguageCol(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLinkedService(value)[source]
setLocation(value)[source]
setMaxPollingRetries(value)[source]
Parameters:

maxPollingRetries – number of times to poll

setModelVersion(value)[source]
Parameters:

modelVersion – Version of the model

setModelVersionCol(value)[source]
Parameters:

modelVersion – Version of the model

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, entityLinkingParams={'model-version': 'latest'}, entityRecognitionParams={'model-version': 'latest'}, errorCol='TextAnalyze_a0768b0f71b2_error', includeEntityLinking=True, includeEntityRecognition=True, includeKeyPhraseExtraction=True, includePii=True, includeSentimentAnalysis=True, initialPollingDelay=300, keyPhraseExtractionParams={'model-version': 'latest'}, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='TextAnalyze_a0768b0f71b2_output', piiParams={'model-version': 'latest'}, pollingDelay=300, sentimentAnalysisParams={'model-version': 'latest'}, showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters
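
Because setParams takes one include* flag and one *Params dict per task, it can be convenient to build those keyword arguments from a set of task names. The sketch below is plain Python and does not import SynapseML; the helper build_task_kwargs and the TASKS table are hypothetical names, while the keyword names and the {'model-version': 'latest'} shape are copied from the signature above.

```python
# Maps a task name to its (include flag, params dict) keyword names,
# as they appear in the setParams signature.
TASKS = {
    "entityLinking": ("includeEntityLinking", "entityLinkingParams"),
    "entityRecognition": ("includeEntityRecognition", "entityRecognitionParams"),
    "keyPhraseExtraction": ("includeKeyPhraseExtraction", "keyPhraseExtractionParams"),
    "pii": ("includePii", "piiParams"),
    "sentimentAnalysis": ("includeSentimentAnalysis", "sentimentAnalysisParams"),
}


def build_task_kwargs(enabled, model_version="latest"):
    """Return setParams keyword arguments that enable only the given tasks."""
    unknown = set(enabled) - set(TASKS)
    if unknown:
        raise ValueError(f"unknown task(s): {sorted(unknown)}")
    kwargs = {}
    for task, (flag_name, params_name) in TASKS.items():
        kwargs[flag_name] = task in enabled
        if task in enabled:
            # Only pass a params dict for tasks that will actually run.
            kwargs[params_name] = {"model-version": model_version}
    return kwargs


kwargs = build_task_kwargs({"pii", "sentimentAnalysis"})
print(sorted(name for name, value in kwargs.items() if value is True))
# prints ['includePii', 'includeSentimentAnalysis']
```

The resulting dict could then be passed as transformer.setParams(**kwargs), where transformer is an instance of the class documented here.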

setPiiParams(value)[source]
Parameters:

piiParams – the parameters to pass to the PII model

setPollingDelay(value)[source]
Parameters:

pollingDelay – number of milliseconds to wait between polling

setSentimentAnalysisParams(value)[source]
Parameters:

sentimentAnalysisParams – the parameters to pass to the sentimentAnalysis model

setShowStats(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setSubscriptionKey(value)[source]
Parameters:

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters:

subscriptionKey – the API key to use

setSuppressMaxRetriesException(value)[source]
Parameters:

suppressMaxRetriesException – set true to suppress the maximum retries exception and report in the error column

setText(value)[source]
Parameters:

text – the text in the request body

setTextCol(value)[source]
Parameters:

text – the text in the request body

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters:

url – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maximum retries exception and report in the error column')
text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

synapse.ml.services.text.TextSentiment module

class synapse.ml.services.text.TextSentiment.TextSentiment(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='TextSentiment_76f69f73e53f_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_76f69f73e53f_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer

Parameters:
  • AADToken (object) – AAD Token used for authentication

  • CustomAuthHeader (object) – A Custom Value for Authorization Header

  • batchSize (int) – The max size of the buffer

  • concurrency (int) – max number of concurrent calls

  • concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1

  • disableServiceLogs (object) – disableServiceLogs option

  • errorCol (str) – column to hold http errors

  • handler (object) – Which strategy to use when handling requests

  • language (object) – the language code of the text (optional for some services)

  • modelVersion (object) – Version of the model

  • opinionMining (object) – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.

  • outputCol (str) – The name of the output column

  • showStats (object) – Whether to include detailed statistics in the response

  • stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

  • subscriptionKey (object) – the API key to use

  • text (object) – the text in the request body

  • timeout (float) – number of seconds to wait before closing the connection

  • url (str) – Url of the service
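
A minimal configuration sketch follows, using only setters listed in this reference. The function name build_text_sentiment, the column names "text", "sentiment" and "sentiment_error", and the default language "en" are illustrative assumptions. The import is placed inside the function so the snippet can be loaded without Spark; actually calling it assumes an active Spark session with SynapseML installed.

```python
def build_text_sentiment(subscription_key, location, text_col="text", language="en"):
    """Configure a TextSentiment transformer (sketch; needs Spark + SynapseML at call time)."""
    # Lazy import: only resolved when the function is called.
    from synapse.ml.services.text.TextSentiment import TextSentiment

    sentiment = TextSentiment()
    sentiment.setSubscriptionKey(subscription_key)  # the API key to use
    sentiment.setLocation(location)                 # service region, e.g. "eastus"
    sentiment.setTextCol(text_col)                  # input column holding the text
    sentiment.setLanguage(language)                 # language code of the text
    sentiment.setOutputCol("sentiment")             # column for the service response
    sentiment.setErrorCol("sentiment_error")        # column to hold http errors
    return sentiment
```

Given a DataFrame df with a "text" column, build_text_sentiment(key, "eastus").transform(df) would then add the output and error columns; rows that fail are expected to carry their error in the error column rather than a result.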

AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
getAADToken()[source]
Returns:

AAD Token used for authentication

Return type:

AADToken

getBatchSize()[source]
Returns:

The max size of the buffer

Return type:

batchSize

getConcurrency()[source]
Returns:

max number of concurrent calls

Return type:

concurrency

getConcurrentTimeout()[source]
Returns:

max number of seconds to wait on futures if concurrency >= 1

Return type:

concurrentTimeout

getCustomAuthHeader()[source]
Returns:

A Custom Value for Authorization Header

Return type:

CustomAuthHeader

getDisableServiceLogs()[source]
Returns:

disableServiceLogs option

Return type:

disableServiceLogs

getErrorCol()[source]
Returns:

column to hold http errors

Return type:

errorCol

getHandler()[source]
Returns:

Which strategy to use when handling requests

Return type:

handler

static getJavaPackage()[source]

Returns package name String.

getLanguage()[source]
Returns:

the language code of the text (optional for some services)

Return type:

language

getModelVersion()[source]
Returns:

Version of the model

Return type:

modelVersion

getOpinionMining()[source]
Returns:

if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.

Return type:

opinionMining

getOutputCol()[source]
Returns:

The name of the output column

Return type:

outputCol

getShowStats()[source]
Returns:

Whether to include detailed statistics in the response

Return type:

showStats

getStringIndexType()[source]
Returns:

Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

Return type:

stringIndexType

getSubscriptionKey()[source]
Returns:

the API key to use

Return type:

subscriptionKey

getText()[source]
Returns:

the text in the request body

Return type:

text

getTimeout()[source]
Returns:

number of seconds to wait before closing the connection

Return type:

timeout

getUrl()[source]
Returns:

Url of the service

Return type:

url

handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
opinionMining = Param(parent='undefined', name='opinionMining', doc='ServiceParam: if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setAADToken(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setAADTokenCol(value)[source]
Parameters:

AADToken – AAD Token used for authentication

setBatchSize(value)[source]
Parameters:

batchSize – The max size of the buffer
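
To illustrate what a buffer capped at batchSize means in practice, here is a plain-Python sketch that groups rows into batches of at most that size. It assumes batchSize limits how many rows are sent per request, which this reference does not spell out; the helper name batched is hypothetical and is not part of SynapseML.

```python
from itertools import islice


def batched(rows, batch_size=10):
    """Yield lists of at most `batch_size` items, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 25 rows with the default batchSize of 10 give two full batches and a remainder.
print([len(batch) for batch in batched(range(25))])  # prints [10, 10, 5]
```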

setConcurrency(value)[source]
Parameters:

concurrency – max number of concurrent calls

setConcurrentTimeout(value)[source]
Parameters:

concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1

setCustomAuthHeader(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomAuthHeaderCol(value)[source]
Parameters:

CustomAuthHeader – A Custom Value for Authorization Header

setCustomServiceName(value)[source]
setDefaultInternalEndpoint(value)[source]
setDisableServiceLogs(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setDisableServiceLogsCol(value)[source]
Parameters:

disableServiceLogs – disableServiceLogs option

setEndpoint(value)[source]
setErrorCol(value)[source]
Parameters:

errorCol – column to hold http errors

setHandler(value)[source]
Parameters:

handler – Which strategy to use when handling requests

setLanguage(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLanguageCol(value)[source]
Parameters:

language – the language code of the text (optional for some services)

setLinkedService(value)[source]
setLocation(value)[source]
setModelVersion(value)[source]
Parameters:

modelVersion – Version of the model

setModelVersionCol(value)[source]
Parameters:

modelVersion – Version of the model

setOpinionMining(value)[source]
Parameters:

opinionMining – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.

setOpinionMiningCol(value)[source]
Parameters:

opinionMining – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.

setOutputCol(value)[source]
Parameters:

outputCol – The name of the output column

setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='TextSentiment_76f69f73e53f_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_76f69f73e53f_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]

Set the (keyword only) parameters

setShowStats(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setShowStatsCol(value)[source]
Parameters:

showStats – Whether to include detailed statistics in the response

setStringIndexType(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets

setStringIndexTypeCol(value)[source]
Parameters:

stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
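
The choice of string index type matters because the same text has different lengths, and therefore different offsets, depending on the unit counted. The sketch below uses only the Python standard library to compare Unicode code points with UTF-16 code units. Grapheme (text element) counting needs a segmentation library and is not shown; the helper names are hypothetical.

```python
def code_point_length(text):
    """Number of Unicode code points (what Python's len() counts)."""
    return len(text)


def utf16_code_unit_length(text):
    """Number of UTF-16 code units (2 bytes each, no byte-order mark)."""
    return len(text.encode("utf-16-le")) // 2


samples = {
    "ascii": "ok",
    "combining accent": "e\u0301",   # 'e' followed by a combining acute accent
    "emoji": "\U0001F44D",           # thumbs up, outside the Basic Multilingual Plane
}

for label, text in samples.items():
    print(label, code_point_length(text), utf16_code_unit_length(text))
# prints:
# ascii 2 2
# combining accent 2 2
# emoji 1 2
```

The combining-accent sample renders as one visible character but is two code points, which is the kind of case where grapheme-based offsets diverge from the other two.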

setSubscriptionKey(value)[source]
Parameters:

subscriptionKey – the API key to use

setSubscriptionKeyCol(value)[source]
Parameters:

subscriptionKey – the API key to use

setText(value)[source]
Parameters:

text – the text in the request body

setTextCol(value)[source]
Parameters:

text – the text in the request body

setTimeout(value)[source]
Parameters:

timeout – number of seconds to wait before closing the connection

setUrl(value)[source]
Parameters:

url – Url of the service

showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
url = Param(parent='undefined', name='url', doc='Url of the service')

Module contents

SynapseML is an ecosystem of tools aimed at expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.

SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services, backed by your Spark cluster.

SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.
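
As a small sketch of checking the Python part of that requirement before importing the package, the snippet below compares the running interpreter against the stated minimum. The helper name meets_python_requirement is hypothetical; the Scala and Spark versions are not checked here.

```python
import sys

MIN_PYTHON = (3, 6)  # minimum Python version stated above


def meets_python_requirement(version_info=None):
    """Return True if the given (or running) Python version is at least MIN_PYTHON."""
    source = sys.version_info if version_info is None else version_info
    major_minor = tuple(source)[:2]
    return major_minor >= MIN_PYTHON


if not meets_python_requirement():
    raise RuntimeError("SynapseML requires Python 3.6 or newer")
```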