synapse.ml.services.text package
Submodules
synapse.ml.services.text.AnalyzeHealthText module
- class synapse.ml.services.text.AnalyzeHealthText.AnalyzeHealthText(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='AnalyzeHealthText_1dce87557fa2_error', initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeHealthText_1dce87557fa2_output', pollingDelay=300, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader (object) – A Custom Value for Authorization Header
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
language (object) – the language code of the text (optional for some services)
pollingDelay (int) – number of milliseconds to wait between polling
showStats (object) – Whether to include detailed statistics in the response
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
suppressMaxRetriesException (bool) – set true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getSuppressMaxRetriesException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader – A Custom Value for Authorization Header
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='AnalyzeHealthText_1dce87557fa2_error', initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeHealthText_1dce87557fa2_output', pollingDelay=300, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setShowStats(value)[source]
- Parameters
showStats – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException – set true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
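The polling parameters above (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesException) suggest a wait-then-poll loop. The following is a minimal sketch of such a loop in plain Python, not the library's implementation: the function name `poll_until_done`, the `check` callback, and the injectable `sleep` argument are all hypothetical, and the choice of `TimeoutError` is an assumption.

```python
import time
from typing import Callable, Optional, Tuple


def poll_until_done(
    check: Callable[[], Optional[str]],
    initial_polling_delay_ms: int = 300,
    polling_delay_ms: int = 300,
    max_polling_retries: int = 1000,
    suppress_max_retries_exception: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (output, error). `check` returns None while the job is still running."""
    # Wait once before the first poll.
    sleep(initial_polling_delay_ms / 1000.0)
    for attempt in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
        # Wait between polls, but not after the final attempt.
        if attempt < max_polling_retries - 1:
            sleep(polling_delay_ms / 1000.0)
    message = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exception:
        # Hypothetical analogue of reporting the failure in the error column.
        return None, message
    raise TimeoutError(message)
```

Passing a recording function as `sleep` makes the delays observable without actually waiting, which is convenient when experimenting with different delay values.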
synapse.ml.services.text.EntityDetector module
- class synapse.ml.services.text.EntityDetector.EntityDetector(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='EntityDetector_667a29f1ec75_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_667a29f1ec75_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader (object) – A Custom Value for Authorization Header
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
showStats (object) – Whether to include detailed statistics in the response
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout (float) – number of seconds to wait before closing the connection
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader – A Custom Value for Authorization Header
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='EntityDetector_667a29f1ec75_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_667a29f1ec75_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
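The stringIndexType parameter matters because the same entity can sit at different numeric offsets depending on the unit used to count characters. The sketch below illustrates this in plain Python by comparing Unicode code points with UTF-16 code units. These two units are chosen as an assumption for illustration (the text above names only the grapheme default), and both helper names are hypothetical.

```python
def code_point_offset(text: str, needle: str) -> int:
    """Offset of `needle` counted in Unicode code points (Python's native str index)."""
    return text.index(needle)


def utf16_offset(text: str, needle: str) -> int:
    """Offset of `needle` counted in UTF-16 code units."""
    prefix = text[: text.index(needle)]
    # Each UTF-16 code unit is two bytes; characters outside the BMP take two units.
    return len(prefix.encode("utf-16-le")) // 2


# U+1F600 is one code point but two UTF-16 code units.
sample = "\U0001F600 aspirin"
print(code_point_offset(sample, "aspirin"))  # 2
print(utf16_offset(sample, "aspirin"))       # 3
```

For pure ASCII input the two offsets coincide, so a mismatch usually only shows up once emoji or other supplementary-plane characters appear in the text.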
synapse.ml.services.text.KeyPhraseExtractor module
- class synapse.ml.services.text.KeyPhraseExtractor.KeyPhraseExtractor(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='KeyPhraseExtractor_77e27fd24c9c_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_77e27fd24c9c_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader (object) – A Custom Value for Authorization Header
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
showStats (object) – Whether to include detailed statistics in the response
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout (float) – number of seconds to wait before closing the connection
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader – A Custom Value for Authorization Header
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='KeyPhraseExtractor_77e27fd24c9c_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_77e27fd24c9c_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
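The batchSize parameter is described only as "The max size of the buffer". One plausible reading, stated here as an assumption rather than documented behavior, is that input rows are grouped into chunks of at most batchSize before being sent. A minimal plain-Python sketch of that grouping, with the hypothetical helper name `batched`:

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(rows: Iterable[T], batch_size: int = 10) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` rows, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 25 rows with the default size of 10 give batches of 10, 10 and 5 rows.
print([len(b) for b in batched(range(25))])
```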
synapse.ml.services.text.LanguageDetector module
- class synapse.ml.services.text.LanguageDetector.LanguageDetector(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='LanguageDetector_f4d36e19a734_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_f4d36e19a734_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader (object) – A Custom Value for Authorization Header
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
showStats (object) – Whether to include detailed statistics in the response
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout (float) – number of seconds to wait before closing the connection
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader – A Custom Value for Authorization Header
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='LanguageDetector_f4d36e19a734_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_f4d36e19a734_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
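The concurrency and concurrentTimeout parameters describe a cap on simultaneous calls and a maximum wait on the resulting futures. As a rough illustration only, the sketch below reproduces that pattern with Python's standard concurrent.futures module. The helper `call_concurrently` is hypothetical and says nothing about how the library schedules its requests internally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def call_concurrently(
    fn: Callable[[A], B],
    items: Sequence[A],
    concurrency: int = 1,
    concurrent_timeout: Optional[float] = None,
) -> List[B]:
    """Run `fn` over `items` with at most `concurrency` calls in flight.

    Each future is awaited for up to `concurrent_timeout` seconds
    (None waits indefinitely); results come back in input order.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(fn, item) for item in items]
        # Future.result raises a timeout error if the wait is exceeded.
        return [future.result(timeout=concurrent_timeout) for future in futures]


print(call_concurrently(len, ["a", "bb", "ccc"], concurrency=2, concurrent_timeout=5.0))
```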
synapse.ml.services.text.NER module
- class synapse.ml.services.text.NER.NER(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='NER_7fe98473e73d_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_7fe98473e73d_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader (object) – A Custom Value for Authorization Header
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
showStats (object) – Whether to include detailed statistics in the response
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout (float) – number of seconds to wait before closing the connection
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader – A Custom Value for Authorization Header
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='NER_7fe98473e73d_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_7fe98473e73d_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
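The stringIndexType parameter documented above determines the unit in which the service reports offsets into the input text. The sketch below illustrates why that unit matters when slicing results in Python. The option names mentioned in the comments (UnicodeCodePoint, Utf16CodeUnit) are assumptions that do not appear on this page, and the helper names are purely illustrative.

```python
# Illustrative only: compares two ways of counting string positions.
# The option names "UnicodeCodePoint" and "Utf16CodeUnit" are assumed values
# for stringIndexType; confirm them at https://aka.ms/text-analytics-offsets.

def code_point_offset(text: str, needle: str) -> int:
    """Offset counted in Unicode code points (how Python indexes str)."""
    return text.index(needle)


def utf16_offset(text: str, needle: str) -> int:
    """Offset counted in UTF-16 code units (how JVM strings are indexed)."""
    prefix = text[: text.index(needle)]
    return len(prefix.encode("utf-16-le")) // 2


# One emoji outside the Basic Multilingual Plane: 1 code point, 2 UTF-16 units.
sample = "a\U0001F600 fever"
print(code_point_offset(sample, "fever"), utf16_offset(sample, "fever"))  # prints: 3 4
```

Because Python slices strings by code point, offsets reported in a different unit may need converting before an expression such as `text[offset:offset + length]` returns the intended span.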
synapse.ml.services.text.PII module
- class synapse.ml.services.text.PII.PII(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, domain=None, domainCol=None, errorCol='PII_1c628ec3e727_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_1c628ec3e727_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader¶ (object) – A Custom Value for Authorization Header
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
domain¶ (object) – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language code of the text (optional for some services)
piiCategories¶ (object) – describes the PII categories to return
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout¶ (float) – number of seconds to wait before closing the connection
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- domain = Param(parent='undefined', name='domain', doc="ServiceParam: if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: 'PHI', 'none'.")
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getDomain()[source]
- Returns
if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- Return type
domain
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getPiiCategories()[source]
- Returns
describes the PII categories to return
- Return type
piiCategories
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- piiCategories = Param(parent='undefined', name='piiCategories', doc='ServiceParam: describes the PII categories to return')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setDomain(value)[source]
- Parameters
domain¶ – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- setDomainCol(value)[source]
- Parameters
domain¶ – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- setLanguage(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, domain=None, domainCol=None, errorCol='PII_1c628ec3e727_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_1c628ec3e727_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPiiCategoriesCol(value)[source]
- Parameters
piiCategories¶ – describes the PII categories to return
- setShowStats(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
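As a minimal sketch of how the PII keyword arguments listed above might be assembled, the example below collects them in a plain dictionary and validates domain against the two values named in its description. The key, endpoint, and column names are placeholders, and build_pii_kwargs and make_pii are hypothetical helpers, not part of SynapseML.

```python
# Sketch under assumptions: the key, endpoint, and column names are
# placeholders; build_pii_kwargs and make_pii are hypothetical helpers.
from typing import Optional

ALLOWED_DOMAINS = (None, "PHI", "none")  # values listed in the domain description


def build_pii_kwargs(key: str, endpoint: str, domain: Optional[str] = None) -> dict:
    """Collect keyword arguments that appear in the PII signature above."""
    if domain not in ALLOWED_DOMAINS:
        raise ValueError(f"unsupported domain: {domain!r}")
    kwargs = {
        "subscriptionKey": key,
        "url": endpoint,
        "textCol": "text",         # input column holding the documents
        "outputCol": "pii",        # column that receives the service response
        "errorCol": "pii_error",   # column that receives HTTP errors
        "language": "en",
    }
    if domain is not None:
        kwargs["domain"] = domain
    return kwargs


def make_pii(kwargs: dict):
    """Build the transformer; needs SynapseML installed and an active SparkSession."""
    from synapse.ml.services.text.PII import PII  # imported lazily on purpose
    return PII(**kwargs)


kwargs = build_pii_kwargs("<your-key>", "<your-endpoint-url>", domain="PHI")
print(sorted(kwargs))
```

In a Spark session, `make_pii(kwargs).transform(df)` on a DataFrame with a `text` column would be expected to add the `pii` and `pii_error` columns.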
synapse.ml.services.text.TextAnalyze module
- class synapse.ml.services.text.TextAnalyze.TextAnalyze(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, entityLinkingParams={'model-version': 'latest'}, entityRecognitionParams={'model-version': 'latest'}, errorCol='TextAnalyze_132ccce05ce0_error', includeEntityLinking=True, includeEntityRecognition=True, includeKeyPhraseExtraction=True, includePii=True, includeSentimentAnalysis=True, initialPollingDelay=300, keyPhraseExtractionParams={'model-version': 'latest'}, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='TextAnalyze_132ccce05ce0_output', piiParams={'model-version': 'latest'}, pollingDelay=300, sentimentAnalysisParams={'model-version': 'latest'}, showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader¶ (object) – A Custom Value for Authorization Header
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
entityLinkingParams¶ (dict) – the parameters to pass to the entityLinking model
entityRecognitionParams¶ (dict) – the parameters to pass to the entity recognition model
includeEntityLinking¶ (bool) – Whether to perform EntityLinking
includeEntityRecognition¶ (bool) – Whether to perform entity recognition
includeKeyPhraseExtraction¶ (bool) – Whether to perform key phrase extraction
includeSentimentAnalysis¶ (bool) – Whether to perform SentimentAnalysis
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
keyPhraseExtractionParams¶ (dict) – the parameters to pass to the keyPhraseExtraction model
language¶ (object) – the language code of the text (optional for some services)
pollingDelay¶ (int) – number of milliseconds to wait between polling
sentimentAnalysisParams¶ (dict) – the parameters to pass to the sentimentAnalysis model
showStats¶ (object) – Whether to include detailed statistics in the response
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout¶ (float) – number of seconds to wait before closing the connection
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- entityLinkingParams = Param(parent='undefined', name='entityLinkingParams', doc='the parameters to pass to the entityLinking model')
- entityRecognitionParams = Param(parent='undefined', name='entityRecognitionParams', doc='the parameters to pass to the entity recognition model')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getEntityLinkingParams()[source]
- Returns
the parameters to pass to the entityLinking model
- Return type
entityLinkingParams
- getEntityRecognitionParams()[source]
- Returns
the parameters to pass to the entity recognition model
- Return type
entityRecognitionParams
- getIncludeEntityLinking()[source]
- Returns
Whether to perform EntityLinking
- Return type
includeEntityLinking
- getIncludeEntityRecognition()[source]
- Returns
Whether to perform entity recognition
- Return type
includeEntityRecognition
- getIncludeKeyPhraseExtraction()[source]
- Returns
Whether to perform key phrase extraction
- Return type
includeKeyPhraseExtraction
- getIncludeSentimentAnalysis()[source]
- Returns
Whether to perform SentimentAnalysis
- Return type
includeSentimentAnalysis
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getKeyPhraseExtractionParams()[source]
- Returns
the parameters to pass to the keyPhraseExtraction model
- Return type
keyPhraseExtractionParams
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSentimentAnalysisParams()[source]
- Returns
the parameters to pass to the sentimentAnalysis model
- Return type
sentimentAnalysisParams
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- includeEntityLinking = Param(parent='undefined', name='includeEntityLinking', doc='Whether to perform EntityLinking')
- includeEntityRecognition = Param(parent='undefined', name='includeEntityRecognition', doc='Whether to perform entity recognition')
- includeKeyPhraseExtraction = Param(parent='undefined', name='includeKeyPhraseExtraction', doc='Whether to perform EntityLinking')
- includePii = Param(parent='undefined', name='includePii', doc='Whether to perform PII Detection')
- includeSentimentAnalysis = Param(parent='undefined', name='includeSentimentAnalysis', doc='Whether to perform SentimentAnalysis')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- keyPhraseExtractionParams = Param(parent='undefined', name='keyPhraseExtractionParams', doc='the parameters to pass to the keyPhraseExtraction model')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- piiParams = Param(parent='undefined', name='piiParams', doc='the parameters to pass to the PII model')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- sentimentAnalysisParams = Param(parent='undefined', name='sentimentAnalysisParams', doc='the parameters to pass to the sentimentAnalysis model')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setEntityLinkingParams(value)[source]
- Parameters
entityLinkingParams¶ – the parameters to pass to the entityLinking model
- setEntityRecognitionParams(value)[source]
- Parameters
entityRecognitionParams¶ – the parameters to pass to the entity recognition model
- setIncludeEntityLinking(value)[source]
- Parameters
includeEntityLinking¶ – Whether to perform EntityLinking
- setIncludeEntityRecognition(value)[source]
- Parameters
includeEntityRecognition¶ – Whether to perform entity recognition
- setIncludeKeyPhraseExtraction(value)[source]
- Parameters
includeKeyPhraseExtraction¶ – Whether to perform key phrase extraction
- setIncludeSentimentAnalysis(value)[source]
- Parameters
includeSentimentAnalysis¶ – Whether to perform SentimentAnalysis
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setKeyPhraseExtractionParams(value)[source]
- Parameters
keyPhraseExtractionParams¶ – the parameters to pass to the keyPhraseExtraction model
- setLanguage(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, entityLinkingParams={'model-version': 'latest'}, entityRecognitionParams={'model-version': 'latest'}, errorCol='TextAnalyze_132ccce05ce0_error', includeEntityLinking=True, includeEntityRecognition=True, includeKeyPhraseExtraction=True, includePii=True, includeSentimentAnalysis=True, initialPollingDelay=300, keyPhraseExtractionParams={'model-version': 'latest'}, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='TextAnalyze_132ccce05ce0_output', piiParams={'model-version': 'latest'}, pollingDelay=300, sentimentAnalysisParams={'model-version': 'latest'}, showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setSentimentAnalysisParams(value)[source]
- Parameters
sentimentAnalysisParams¶ – the parameters to pass to the sentimentAnalysis model
- setShowStats(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maximum retries exception and report in the error column')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
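The polling parameters of TextAnalyze (initialPollingDelay, pollingDelay, maxPollingRetries) together bound how long a request may wait for a result. The worked arithmetic below uses the defaults from the signature above. It assumes the delays simply add up and ignores the time each HTTP call takes, so treat it as a rough estimate rather than a description of the client's exact behavior. The function name is hypothetical.

```python
# Rough upper bound on time spent polling, using the defaults shown in the
# TextAnalyze signature. Assumption: the delays simply add up, and request
# latency is ignored; the real client may behave differently.

def max_polling_wait_ms(initial_polling_delay: int = 300,
                        polling_delay: int = 300,
                        max_polling_retries: int = 1000) -> int:
    """Initial delay once, then one pollingDelay per allowed poll."""
    return initial_polling_delay + polling_delay * max_polling_retries


default_budget_ms = max_polling_wait_ms()
print(default_budget_ms / 1000)  # seconds; prints: 300.3
```

Under these assumptions the defaults allow roughly five minutes of polling. If a job exceeds the retry budget, setting suppressMaxRetriesException to true is documented to report the failure in the error column rather than raising.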
synapse.ml.services.text.TextSentiment module
- class synapse.ml.services.text.TextSentiment.TextSentiment(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='TextSentiment_8f34ea423d0f_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_8f34ea423d0f_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader¶ (object) – A Custom Value for Authorization Header
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language code of the text (optional for some services)
opinionMining¶ (object) – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout¶ (float) – number of seconds to wait before closing the connection
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getOpinionMining()[source]
- Returns
if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- Return type
opinionMining
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- opinionMining = Param(parent='undefined', name='opinionMining', doc='ServiceParam: if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setLanguage(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setOpinionMining(value)[source]
- Parameters
opinionMining¶ – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- setOpinionMiningCol(value)[source]
- Parameters
opinionMining¶ – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='TextSentiment_8f34ea423d0f_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_8f34ea423d0f_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
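The setters documented for TextSentiment follow a visible naming pattern: a parameter such as opinionMining has a setOpinionMining setter for a fixed value and a setOpinionMiningCol setter that names a column. The sketch below applies a batch of settings by that convention. The names setter_name, apply_settings, and RecordingStage are hypothetical, and RecordingStage is only a stand-in so the example runs without Spark.

```python
# Illustrative helpers; setter_name, apply_settings and RecordingStage are
# hypothetical names, not part of SynapseML.

def setter_name(param: str, from_column: bool = False) -> str:
    """Map 'opinionMining' to 'setOpinionMining' (or 'setOpinionMiningCol')."""
    return "set" + param[0].upper() + param[1:] + ("Col" if from_column else "")


def apply_settings(stage, values=None, columns=None):
    """Call the matching setter for each fixed value and each column binding."""
    for param, value in (values or {}).items():
        getattr(stage, setter_name(param))(value)
    for param, column in (columns or {}).items():
        getattr(stage, setter_name(param, from_column=True))(column)
    return stage


class RecordingStage:
    """Stand-in that records setter calls instead of talking to Spark."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only reached for attributes that are not defined normally.
        if not name.startswith("set"):
            raise AttributeError(name)

        def setter(value):
            self.calls.append((name, value))
            return self

        return setter


stage = apply_settings(
    RecordingStage(),
    values={"opinionMining": True, "timeout": 30.0},
    columns={"language": "lang"},
)
print(stage.calls)
# prints: [('setOpinionMining', True), ('setTimeout', 30.0), ('setLanguageCol', 'lang')]
```

In a Spark session, a real TextSentiment instance could be passed in place of RecordingStage(), since setOpinionMining, setTimeout, and setLanguageCol all appear in the listing above.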
Module contents
SynapseML is an ecosystem of tools aimed at expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM, and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.
SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services backed by your Spark cluster.
SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.
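The Spark and Python requirements stated above can be checked before a job starts. The following is a minimal sketch with hypothetical helper names. It does not check the Scala version, which is not readily visible from Python, and it expects Spark versions with at least a major and minor component.

```python
# Sketch of a pre-flight check for the stated minimums (Spark 3.0+, Python 3.6+).
# parse_version and meets_requirements are hypothetical helpers.
import sys
from itertools import takewhile


def parse_version(version: str) -> tuple:
    """Turn '3.4.0-SNAPSHOT' into (3, 4, 0), keeping leading digits of each part."""
    parts = []
    for piece in version.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_requirements(spark_version: str,
                       python_version: tuple = tuple(sys.version_info[:2])) -> bool:
    """True when both the Spark and Python versions reach the stated minimums."""
    return parse_version(spark_version) >= (3, 0) and tuple(python_version) >= (3, 6)


print(meets_requirements("3.3.1", (3, 10)))  # prints: True
print(meets_requirements("2.4.8", (3, 10)))  # prints: False
```

With a live session, the `version` attribute of the SparkSession can be passed as `spark_version`.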