synapse.ml.services.language package
Submodules
synapse.ml.services.language.AnalyzeText module
- class synapse.ml.services.language.AnalyzeText.AnalyzeText(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, apiVersion=None, apiVersionCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, countryHint=None, countryHintCol=None, domain=None, domainCol=None, errorCol='AnalyzeText_4d8d8178d9b9_error', handler=None, kind=None, language=None, languageCol=None, loggingOptOut=None, loggingOptOutCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='AnalyzeText_4d8d8178d9b9_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)
Bases: ComplexParamsMixin, JavaMLReadable, JavaMLWritable, JavaTransformer
- Parameters:
CustomAuthHeader (object) – a custom value for the Authorization header
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
countryHint (object) – the countryHint for language detection
domain (object) – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
handler (object) – which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
opinionMining (object) – opinionMining option for SentimentAnalysisTask
piiCategories (object) – describes the PII categories to return
showStats (object) – whether to include detailed statistics in the response
stringIndexType (object) – specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout (float) – number of seconds to wait before closing the connection
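As a rough illustration of how the constructor parameters fit together, the sketch below collects keyword arguments whose names come from the signature above and passes them to AnalyzeText. The helper names (analyze_text_kwargs, build_analyze_text), the column names, and the kind value "SentimentAnalysis" are assumptions made for the example; the key and endpoint are placeholders supplied by the caller.

```python
def analyze_text_kwargs(subscription_key, url, kind="SentimentAnalysis"):
    """Hypothetical helper: gather keyword arguments for AnalyzeText.

    Only parameter names that appear in the class signature are used.
    The column names below are illustrative, not required by the library.
    """
    return {
        "kind": kind,                    # assumed task kind; check the service docs
        "textCol": "text",               # column holding the input text
        "languageCol": "language",       # column holding a per-row language code
        "outputCol": "analysis",         # explicit name instead of the generated default
        "errorCol": "analysis_error",    # explicit name instead of the generated default
        "subscriptionKey": subscription_key,
        "url": url,
        "batchSize": 10,                 # signature default
        "concurrency": 1,                # signature default
        "timeout": 60.0,                 # signature default, in seconds
    }


def build_analyze_text(**kwargs):
    """Construct the transformer; needs SynapseML and a running Spark session."""
    # Imported lazily so the kwargs helper can be used without Spark installed.
    from synapse.ml.services.language import AnalyzeText

    return AnalyzeText(**kwargs)


# Example usage (requires Spark, SynapseML, and valid credentials):
# transformer = build_analyze_text(**analyze_text_kwargs(my_key, my_endpoint))
# result = transformer.transform(df)  # df has "text" and "language" columns
```

Naming outputCol and errorCol explicitly avoids the generated defaults shown in the signature (for example 'AnalyzeText_4d8d8178d9b9_output'), which are awkward to reference in later pipeline stages.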
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- apiVersion = Param(parent='undefined', name='apiVersion', doc='ServiceParam: version of the api')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- countryHint = Param(parent='undefined', name='countryHint', doc='ServiceParam: the countryHint for language detection')
- domain = Param(parent='undefined', name='domain', doc="ServiceParam: if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: 'PHI', 'none'.")
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()
- Returns:
max number of seconds to wait on futures if concurrency >= 1
- Return type:
concurrentTimeout
- getCustomAuthHeader()
- Returns:
A Custom Value for Authorization Header
- Return type:
CustomAuthHeader
- getDomain()
- Returns:
if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- Return type:
domain
- getLanguage()
- Returns:
the language code of the text (optional for some services)
- Return type:
language
- getOpinionMining()
- Returns:
opinionMining option for SentimentAnalysisTask
- Return type:
opinionMining
- getPiiCategories()
- Returns:
describes the PII categories to return
- Return type:
piiCategories
- getShowStats()
- Returns:
Whether to include detailed statistics in the response
- Return type:
showStats
- getStringIndexType()
- Returns:
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type:
stringIndexType
- getTimeout()
- Returns:
number of seconds to wait before closing the connection
- Return type:
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- kind = Param(parent='undefined', name='kind', doc='Enumeration of supported Text Analysis tasks')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- loggingOptOut = Param(parent='undefined', name='loggingOptOut', doc='ServiceParam: loggingOptOut for task')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- opinionMining = Param(parent='undefined', name='opinionMining', doc='ServiceParam: opinionMining option for SentimentAnalysisTask')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- piiCategories = Param(parent='undefined', name='piiCategories', doc='ServiceParam: describes the PII categories to return')
- setConcurrentTimeout(value)
- Parameters:
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)
- Parameters:
CustomAuthHeader – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)
- Parameters:
CustomAuthHeader – A Custom Value for Authorization Header
- setDomain(value)
- Parameters:
domain – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- setDomainCol(value)
- Parameters:
domain – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- setLanguage(value)
- Parameters:
language – the language code of the text (optional for some services)
- setLanguageCol(value)
- Parameters:
language – the language code of the text (optional for some services)
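Several parameters here come in pairs, such as setLanguage and setLanguageCol. The sketch below assumes the usual SynapseML service-parameter convention: the plain setter supplies one constant used for every row, while the Col setter names a DataFrame column to read the value from per row. The helper set_language is hypothetical and only illustrates choosing one of the two.

```python
def set_language(transformer, value=None, column=None):
    """Hypothetical helper: set the language as a constant or from a column.

    `transformer` is expected to expose setLanguage and setLanguageCol,
    as AnalyzeText does. Exactly one of `value` or `column` must be given.
    """
    if (value is None) == (column is None):
        raise ValueError("pass exactly one of 'value' or 'column'")
    if value is not None:
        # Same language code for every row, e.g. "en".
        return transformer.setLanguage(value)
    # Language code read from the named column, row by row (assumed behavior).
    return transformer.setLanguageCol(column)


# Example usage (requires Spark and SynapseML):
# from synapse.ml.services.language import AnalyzeText
# t = set_language(AnalyzeText(), column="language")
```

The same pattern would apply to the other paired setters listed on this page (domain, opinionMining, piiCategories, showStats, stringIndexType, CustomAuthHeader).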
- setOpinionMining(value)
- Parameters:
opinionMining – opinionMining option for SentimentAnalysisTask
- setOpinionMiningCol(value)
- Parameters:
opinionMining – opinionMining option for SentimentAnalysisTask
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, apiVersion=None, apiVersionCol=None, batchSize=10, concurrency=1, concurrentTimeout=None, countryHint=None, countryHintCol=None, domain=None, domainCol=None, errorCol='AnalyzeText_4d8d8178d9b9_error', handler=None, kind=None, language=None, languageCol=None, loggingOptOut=None, loggingOptOutCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='AnalyzeText_4d8d8178d9b9_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)
Set the (keyword only) parameters
- setPiiCategories(value)
- Parameters:
piiCategories – describes the PII categories to return
- setPiiCategoriesCol(value)
- Parameters:
piiCategories – describes the PII categories to return
- setShowStats(value)
- Parameters:
showStats – Whether to include detailed statistics in the response
- setShowStatsCol(value)
- Parameters:
showStats – Whether to include detailed statistics in the response
- setStringIndexType(value)
- Parameters:
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)
- Parameters:
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
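Why the offset unit matters can be seen with plain Python strings, independent of the service. The sketch below compares the offset of a character counted in Unicode code points (how Python indexes str) with the same position counted in UTF-16 code units. The helper utf16_offset is hypothetical; grapheme-based offsets, the documented default, need a third-party segmentation library and are not shown. Which stringIndexType value selects which unit is left to the linked page.

```python
def utf16_offset(text, code_point_offset):
    """Convert a code point offset into a UTF-16 code unit offset."""
    prefix = text[:code_point_offset]
    # Each UTF-16 code unit is two bytes; characters outside the Basic
    # Multilingual Plane (such as most emoji) take two code units.
    return len(prefix.encode("utf-16-le")) // 2


sample = "a\U0001F600b"              # "a", an emoji outside the BMP, "b"
cp_offset = sample.index("b")        # offset in code points
u16_offset = utf16_offset(sample, cp_offset)

print(cp_offset)    # 2
print(u16_offset)   # 3
```

If offsets returned by the service are used to slice Python strings, a unit that matches code points avoids this kind of shift; consumers that index by UTF-16 code units (for example JVM or JavaScript strings) would want the corresponding setting instead.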
- setTimeout(value)
- Parameters:
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
Module contents
SynapseML is an ecosystem of tools that extends the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM, and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.
SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services, backed by your Spark cluster.
SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.