synapse.ml.cognitive package
Submodules
synapse.ml.cognitive.AddDocuments module
- class synapse.ml.cognitive.AddDocuments.AddDocuments(java_obj=None, actionCol='@search.action', batchSize=100, concurrency=1, concurrentTimeout=None, errorCol='AddDocuments_23d171c8e20c_error', handler=None, indexName=None, outputCol='AddDocuments_23d171c8e20c_output', serviceName=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
actionCol (str) – You can combine actions, such as an upload and a delete, in the same batch.
    upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
    merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
    mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
    delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
indexName (str) –
outputCol (str) – The name of the output column
serviceName (str) –
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
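The four actions described for actionCol can be illustrated with a small in-memory model. This is only a sketch of the documented semantics, not how the service or AddDocuments is implemented: the `apply_batch` helper, the `id` key field, and the dict standing in for an index are all hypothetical names introduced for the example, and a failed merge is modeled as a raised exception purely for simplicity.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for a search index: document key -> document fields.
Index = Dict[str, Dict[str, Any]]


def apply_batch(index: Index, batch: List[Dict[str, Any]],
                key_field: str = "id",
                action_col: str = "@search.action") -> Index:
    """Apply a mixed batch of actions to an in-memory index (illustration only)."""
    for row in batch:
        action = row[action_col]
        key = row[key_field]
        # Everything except the action column is document content.
        fields = {k: v for k, v in row.items() if k != action_col}
        if action == "upload":
            # Insert if new; otherwise replace the whole document.
            index[key] = fields
        elif action == "merge":
            if key not in index:
                raise KeyError(f"merge failed: no document with key {key!r}")
            # Specified fields replace existing ones, collections included.
            index[key].update(fields)
        elif action == "mergeOrUpload":
            if key in index:
                index[key].update(fields)
            else:
                index[key] = fields
        elif action == "delete":
            # Only the key matters; any other fields in the row are ignored.
            index.pop(key, None)
        else:
            raise ValueError(f"unknown action: {action!r}")
    return index


index = {"1": {"id": "1", "tags": ["budget"], "name": "Hotel A"}}
apply_batch(index, [
    {"@search.action": "merge", "id": "1", "tags": ["economy", "pool"]},
    {"@search.action": "mergeOrUpload", "id": "2", "name": "Hotel B"},
])
print(index["1"]["tags"])  # the merged 'tags' value replaces ['budget']
```

Note that the merge leaves the untouched `name` field in place, while an upload against an existing key would have dropped it.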
- actionCol = Param(parent='undefined', name='actionCol', doc=" You can combine actions, such as an upload and a delete, in the same batch. upload: An upload action is similar to an 'upsert' where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case. merge: Merge updates an existing document with the specified fields. If the document doesn't exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field 'tags' with value ['budget'] and you execute a merge with value ['economy', 'pool'] for 'tags', the final value of the 'tags' field will be ['economy', 'pool']. It will not be ['budget', 'economy', 'pool']. mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document. delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null. ")
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getActionCol()[source]
- Returns
You can combine actions, such as an upload and a delete, in the same batch. upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case. merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’]. mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document. delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
- Return type
actionCol
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- indexName = Param(parent='undefined', name='indexName', doc='')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- serviceName = Param(parent='undefined', name='serviceName', doc='')
- setActionCol(value)[source]
- Parameters
actionCol – You can combine actions, such as an upload and a delete, in the same batch. upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case. merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’]. mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document. delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(actionCol='@search.action', batchSize=100, concurrency=1, concurrentTimeout=None, errorCol='AddDocuments_23d171c8e20c_error', handler=None, indexName=None, outputCol='AddDocuments_23d171c8e20c_output', serviceName=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeBusinessCards module
- class synapse.ml.cognitive.AnalyzeBusinessCards.AnalyzeBusinessCards(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeBusinessCards_78867d030dc8_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeBusinessCards_78867d030dc8_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
maxPollingRetries (int) – number of times to poll
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pages (object) – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set to true to suppress the maximum-retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
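The page-selection syntax described for pages can be made concrete with a small parser. This is a sketch written only from the rules stated above; the service does its own parsing, and the `expand_pages` function and its `page_count` argument are hypothetical names introduced for illustration.

```python
from typing import List


def expand_pages(selection: str, page_count: int) -> List[int]:
    """Expand a selection such as '-5, 1, 3, 5-10' into sorted page numbers.

    Pages outside 1..page_count are clipped, mirroring the rule that a
    request is valid if at least one selected page exists.
    """
    pages = set()
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An empty side makes the range open-ended on that side.
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        # Keep only pages that actually exist in the document.
        pages.update(range(max(start, 1), min(end, page_count) + 1))
    if not pages:
        raise ValueError("selection does not match any page of the document")
    return sorted(pages)


print(expand_pages("-5, 1, 3, 5-10", page_count=12))
print(expand_pages("5-100", page_count=5))
```

Overlapping entries collapse naturally because the pages are collected in a set before sorting.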
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPages()[source]
- Returns
Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set to true to suppress the maximum-retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setPages(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeBusinessCards_78867d030dc8_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeBusinessCards_78867d030dc8_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set to true to suppress the maximum-retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeCustomModel module
- class synapse.ml.cognitive.AnalyzeCustomModel.AnalyzeCustomModel(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeCustomModel_355269a761b5_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelId=None, modelIdCol=None, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeCustomModel_355269a761b5_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
maxPollingRetries (int) – number of times to poll
modelId (object) – Model identifier.
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set to true to suppress the maximum-retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
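The polling parameters above (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesExceededException) suggest a loop along the following lines. This is a rough sketch of how such parameters typically interact, not the library's actual implementation; `poll_for_result`, `MaxRetriesExceeded`, and the `check` callback are hypothetical names, and returning the error as the second tuple element merely stands in for reporting it in the error column.

```python
import time
from typing import Any, Callable, Optional, Tuple


class MaxRetriesExceeded(Exception):
    """Hypothetical error raised when polling runs out of attempts."""


def poll_for_result(check: Callable[[], Optional[Any]],
                    initial_polling_delay: int = 300,
                    polling_delay: int = 300,
                    max_polling_retries: int = 1000,
                    suppress_max_retries_exceeded: bool = False,
                    sleep: Callable[[float], None] = time.sleep
                    ) -> Tuple[Optional[Any], Optional[str]]:
    """Poll `check` until it returns a result; delays are in milliseconds."""
    # Wait once before the first poll.
    sleep(initial_polling_delay / 1000.0)
    for attempt in range(max_polling_retries):
        if attempt > 0:
            # Wait between consecutive polls.
            sleep(polling_delay / 1000.0)
        result = check()
        if result is not None:
            return result, None
    message = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exceeded:
        # Report the failure as data instead of raising.
        return None, message
    raise MaxRetriesExceeded(message)


# Usage with a fake operation that finishes on the third poll.
answers = iter([None, None, "succeeded"])
print(poll_for_result(lambda: next(answers), sleep=lambda seconds: None))
```

Passing `sleep` as an argument keeps the sketch easy to exercise without real waiting.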
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set to true to suppress the maximum-retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelId = Param(parent='undefined', name='modelId', doc='ServiceParam: Model identifier.')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeCustomModel_355269a761b5_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelId=None, modelIdCol=None, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeCustomModel_355269a761b5_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set to true to suppress the maximum-retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeDocument module
- class synapse.ml.cognitive.AnalyzeDocument.AnalyzeDocument(java_obj=None, apiVersion=None, apiVersionCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeDocument_668b4d7ee574_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeDocument_668b4d7ee574_output', pages=None, pagesCol=None, pollingDelay=300, prebuiltModelId=None, prebuiltModelIdCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
apiVersion (object) – version of the api
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
maxPollingRetries (int) – number of times to poll
outputCol (str) – The name of the output column
pages (object) – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay (int) – number of milliseconds to wait between polling
prebuiltModelId (object) – Prebuilt model identifier for Form Recognizer V3.0. Supported model IDs: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom modelId
stringIndexType (object) – Method used to compute string offset and length.
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set to true to suppress the maximum-retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
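The backoffs parameter is described only as an "array of backoffs to use in the handler". One common reading is a fixed schedule of waits between retries, sketched below. The millisecond unit, the set of retried status codes, and the names `call_with_backoffs` and `send` are all assumptions made for this illustration and are not confirmed by this reference.

```python
import time
from typing import Callable, Sequence, Tuple

Response = Tuple[int, str]  # (HTTP status code, body); a simplified stand-in


def call_with_backoffs(send: Callable[[], Response],
                       backoffs: Sequence[int] = (100, 500, 1000),
                       retry_statuses: Sequence[int] = (429, 503),
                       sleep: Callable[[float], None] = time.sleep
                       ) -> Response:
    """Send a request, retrying once per backoff entry on retryable statuses.

    Assumes each backoff is a wait in milliseconds before the next attempt.
    """
    status, body = send()
    for backoff_ms in backoffs:
        if status not in retry_statuses:
            break
        # Wait for the scheduled backoff, then try again.
        sleep(backoff_ms / 1000.0)
        status, body = send()
    # Either a non-retryable response or the last response after all retries.
    return status, body


# Usage with a fake service that is throttled twice before succeeding.
replies = iter([(429, ""), (503, ""), (200, "ok")])
print(call_with_backoffs(lambda: next(replies), sleep=lambda seconds: None))
```

With the default schedule, a request is attempted at most four times: once up front and once after each of the three backoffs.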
- apiVersion = Param(parent='undefined', name='apiVersion', doc='ServiceParam: version of the api')
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getPages()[source]
- Returns
Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getPrebuiltModelId()[source]
- Returns
Prebuilt model identifier for Form Recognizer V3.0. Supported model IDs: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom modelId
- Return type
prebuiltModelId
- getStringIndexType()[source]
- Returns
Method used to compute string offset and length.
- Return type
stringIndexType
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- prebuiltModelId = Param(parent='undefined', name='prebuiltModelId', doc='ServiceParam: Prebuilt Model identifier for Form Recognizer V3.0, supported modelId: prebuilt-read, prebuilt-layout,prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt,or your custom modelId')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setPages(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(apiVersion=None, apiVersionCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeDocument_668b4d7ee574_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeDocument_668b4d7ee574_output', pages=None, pagesCol=None, pollingDelay=300, prebuiltModelId=None, prebuiltModelIdCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
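The polling defaults in the signature above (`initialPollingDelay=300`, `pollingDelay=300`, `maxPollingRetries=1000`) imply a rough ceiling on how long a row may wait for a result. The sketch below assumes one initial wait followed by at most one `pollingDelay` per poll, and ignores the time spent in the HTTP requests themselves; `max_polling_wait_ms` is a hypothetical helper, not part of the library.

```python
def max_polling_wait_ms(initial_polling_delay=300, polling_delay=300,
                        max_polling_retries=1000):
    """Rough upper bound, in milliseconds, on time spent waiting between polls
    (assumed schedule: one initial delay, then one delay per poll)."""
    return initial_polling_delay + polling_delay * max_polling_retries


default_budget_ms = max_polling_wait_ms()
default_budget_s = default_budget_ms / 1000.0
```

Under these assumptions the defaults allow roughly five minutes of polling per document before the retry limit is reached.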
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setPrebuiltModelId(value)[source]
- Parameters
prebuiltModelId – Prebuilt model identifier for Form Recognizer V3.0. Supported modelId values: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom modelId.
- setPrebuiltModelIdCol(value)[source]
- Parameters
prebuiltModelId – Prebuilt model identifier for Form Recognizer V3.0. Supported modelId values: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom modelId.
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Method used to compute string offset and length.
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Method used to compute string offset and length.
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Method used to compute string offset and length.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeIDDocuments module
- class synapse.ml.cognitive.AnalyzeIDDocuments.AnalyzeIDDocuments(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeIDDocuments_d43966693cb6_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeIDDocuments_d43966693cb6_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
maxPollingRetries (int) – number of times to poll
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pages (object) – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
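A minimal sketch of how the parameters listed above might be gathered for the constructor. The keyword names come from the signature of `AnalyzeIDDocuments`; every value (endpoint, key, column names) is a hypothetical placeholder, and `make_analyze_id_documents` is an illustrative helper, not part of the library. Constructing the transformer requires SynapseML and a running Spark session, so the import is deferred into the function.

```python
# Keyword arguments named as in the constructor signature above.
# All values are placeholders chosen for illustration.
analyze_id_kwargs = {
    "url": "https://<your-resource>/<service-path>",  # hypothetical endpoint
    "subscriptionKey": "<your-api-key>",              # hypothetical key
    "imageUrlCol": "source",       # hypothetical column of document URLs
    "outputCol": "id_documents",   # hypothetical output column name
    "errorCol": "id_errors",       # hypothetical error column name
    "pages": "1-2",
    "includeTextDetails": True,
    "concurrency": 2,
    "timeout": 60.0,
}


def make_analyze_id_documents(kwargs):
    """Build the transformer; needs SynapseML installed and a Spark session."""
    from synapse.ml.cognitive.AnalyzeIDDocuments import AnalyzeIDDocuments
    return AnalyzeIDDocuments(**kwargs)
```

With a Spark DataFrame `df` that has a `source` column, the transformer returned by `make_analyze_id_documents(analyze_id_kwargs)` could then be applied with `.transform(df)`, as with any `JavaTransformer`.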
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPages()[source]
- Returns
The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
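The effect of `suppressMaxRetriesExceededException` can be pictured with a small simulation: when polling runs out of retries, the failure either propagates as an exception or is returned as an error value next to an empty output. This is a sketch under assumptions; `MaxRetriesExceeded`, `poll_for_result`, and the error message are hypothetical and do not reproduce the library's internals.

```python
class MaxRetriesExceeded(Exception):
    """Hypothetical exception raised when polling gives up."""


def poll_for_result(check, max_polling_retries=1000, suppress=False):
    """Poll `check` until it returns a result or the retry budget runs out.

    Returns an (output, error) pair, mirroring the output and error columns.
    """
    for _ in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
    message = f"no result after {max_polling_retries} polls"
    if suppress:
        return None, message  # reported as an error value instead of raising
    raise MaxRetriesExceeded(message)


# A check that never completes, to exercise both behaviours.
suppressed = poll_for_result(lambda: None, max_polling_retries=3, suppress=True)
try:
    poll_for_result(lambda: None, max_polling_retries=3)
    raised = False
except MaxRetriesExceeded:
    raised = True
```

Suppressing the exception lets the remaining rows of a batch finish while the failing row carries its error alongside it.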
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setPages(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeIDDocuments_d43966693cb6_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeIDDocuments_d43966693cb6_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeImage module
- class synapse.ml.cognitive.AnalyzeImage.AnalyzeImage(java_obj=None, concurrency=1, concurrentTimeout=None, details=None, detailsCol=None, errorCol='AnalyzeImage_0f35ae1afc62_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='AnalyzeImage_0f35ae1afc62_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, visualFeatures=None, visualFeaturesCol=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
details (object) – what visual feature types to return
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – the language of the response (en if none given)
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
visualFeatures (object) – what visual feature types to return
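Several parameters above come in pairs in the constructor signature, such as `language` and `languageCol`. A plausible reading, assumed here rather than stated in this reference, is that the plain parameter supplies one value for every row while the `...Col` variant names a column that supplies a value per row. The helper `resolve_service_param` and the sample rows are hypothetical.

```python
def resolve_service_param(row, value=None, col=None, default=None):
    """Pick the per-row column value when `col` is set, else the constant."""
    if col is not None:
        return row.get(col, default)
    return value if value is not None else default


rows = [
    {"url": "a.jpg", "lang": "es"},
    {"url": "b.jpg", "lang": "ja"},
]

# One constant for all rows, as with language="en".
constant = [resolve_service_param(row, value="en") for row in rows]
# A value read from each row, as with languageCol="lang".
per_row = [resolve_service_param(row, col="lang", default="en") for row in rows]
```

The `default="en"` fallback mirrors the documented behaviour of `language` ("en if none given").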
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- details = Param(parent='undefined', name='details', doc='ServiceParam: what visual feature types to return')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language of the response (en if none given)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguageCol(value)[source]
- Parameters
language – the language of the response (en if none given)
- setParams(concurrency=1, concurrentTimeout=None, details=None, detailsCol=None, errorCol='AnalyzeImage_0f35ae1afc62_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='AnalyzeImage_0f35ae1afc62_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, visualFeatures=None, visualFeaturesCol=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setVisualFeaturesCol(value)[source]
- Parameters
visualFeatures – what visual feature types to return
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- visualFeatures = Param(parent='undefined', name='visualFeatures', doc='ServiceParam: what visual feature types to return')
synapse.ml.cognitive.AnalyzeInvoices module
- class synapse.ml.cognitive.AnalyzeInvoices.AnalyzeInvoices(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeInvoices_43612dccb408_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeInvoices_43612dccb408_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
maxPollingRetries (int) – number of times to poll
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pages (object) – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
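The `backoffs` parameter (default `[100, 500, 1000]`) is described only as an array of backoffs used in the handler. One way to picture it, assuming the values are milliseconds and that each entry allows one further attempt, is the retry loop below. `call_with_backoffs` and `flaky_request` are hypothetical, and the real handler may retry only on particular responses.

```python
import time


def call_with_backoffs(request, backoffs=(100, 500, 1000), sleep=time.sleep):
    """Try `request` once, then once more after each backoff
    (values assumed to be milliseconds)."""
    last_error = None
    for wait_ms in (0, *backoffs):
        if wait_ms:
            sleep(wait_ms / 1000.0)
        try:
            return request()
        except IOError as error:
            last_error = error  # remember the failure and try the next backoff
    raise last_error


attempts = {"count": 0}
waits = []


def flaky_request():
    """Hypothetical request that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise IOError("temporary failure")
    return "ok"


# Record the waits instead of sleeping so the example runs instantly.
outcome = call_with_backoffs(flaky_request, sleep=waits.append)
```

Passing `sleep` as an argument keeps the retry schedule observable without real delays.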
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPages()[source]
- Returns
The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setPages(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeInvoices_43612dccb408_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeInvoices_43612dccb408_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maximum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
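The page selection syntax described for the pages parameter can be illustrated with a small pure-Python sketch. This is not the service's parser; `expand_pages` is a hypothetical helper written only to mirror the examples given in the parameter description, and clipping to the document's page count is an assumption drawn from the ‘5-100’ example.

```python
def expand_pages(spec, page_count):
    """Return the sorted page numbers selected by `spec` (hypothetical helper).

    `spec` uses the syntax from the `pages` parameter description:
    single pages ('1, 2'), finite ranges ('2-5'), open-ended ranges ('5-', '-10').
    """
    # No page range provided: the entire document is selected.
    if spec is None or not spec.strip():
        return list(range(1, page_count + 1))

    selected = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        if "-" in token:
            start_text, end_text = token.split("-", 1)
            # A missing bound means "from the first page" or "to the last page".
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(token)
        # Assumption: pages outside the document are simply ignored.
        selected.update(range(max(start, 1), min(end, page_count) + 1))

    # The description says at least one page must be processable.
    if not selected:
        raise ValueError(f"{spec!r} selects no page of a {page_count}-page document")
    return sorted(selected)
```

For instance, `expand_pages("-5, 1, 3, 5-10", 12)` yields pages 1 through 10, matching the overlapping-range example in the description.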
synapse.ml.cognitive.AnalyzeLayout module
- class synapse.ml.cognitive.AnalyzeLayout.AnalyzeLayout(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeLayout_42cda16b2446_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeLayout_42cda16b2446_output', pages=None, pagesCol=None, pollingDelay=300, readingOrder=None, readingOrderCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
language (object) – The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
maxPollingRetries (int) – number of times to poll
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pages (object) – The page selection, used only for multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
pollingDelay (int) – number of milliseconds to wait between polling
readingOrder (object) – Optional parameter to specify which reading order algorithm should be applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Will default to basic if not specified.
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set true to suppress the maximum retries exception and report in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
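As a rough sketch of how the constructor keywords listed above might be assembled, the helper below validates readingOrder against the two documented values and builds a keyword dictionary. `layout_params` and `make_analyze_layout` are hypothetical names, the import path simply follows the module path shown in this reference, and constructing the transformer assumes SynapseML is installed with a Spark session available.

```python
VALID_READING_ORDERS = {"basic", "natural"}  # the two values named in the docs


def layout_params(subscription_key, url, image_url_col, output_col="layout",
                  reading_order=None, pages=None, language=None):
    """Build constructor keywords for AnalyzeLayout (hypothetical helper)."""
    if reading_order is not None and reading_order not in VALID_READING_ORDERS:
        raise ValueError(f"readingOrder must be one of {sorted(VALID_READING_ORDERS)}")
    params = {
        "subscriptionKey": subscription_key,
        "url": url,
        "imageUrlCol": image_url_col,
        "outputCol": output_col,
    }
    # Optional service parameters are only passed when explicitly set,
    # so the documented defaults apply otherwise.
    optional = (("readingOrder", reading_order), ("pages", pages), ("language", language))
    for name, value in optional:
        if value is not None:
            params[name] = value
    return params


def make_analyze_layout(**params):
    """Construct the transformer; needs SynapseML on the Spark classpath."""
    # Imported lazily so layout_params can be used without Spark installed.
    from synapse.ml.cognitive.AnalyzeLayout import AnalyzeLayout
    return AnalyzeLayout(**params)
```

A call such as `make_analyze_layout(**layout_params(key, endpoint, "url", pages="1-3"))` would then be followed by `.transform(df)` on a DataFrame holding image URLs; `key`, `endpoint` and the column name are placeholders.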
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLanguage()[source]
- Returns
The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPages()[source]
- Returns
The page selection, used only for multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getReadingOrder()[source]
- Returns
Optional parameter to specify which reading order algorithm should be applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Will default to basic if not specified.
- Return type
readingOrder
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- language = Param(parent='undefined', name='language', doc='ServiceParam: The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection, used only for multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. '1, 2' -> pages 1 and 2 will be processed), finite ranges (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. '-5, 1, 3, 5-10' -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5-page document is valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- readingOrder = Param(parent='undefined', name='readingOrder', doc="ServiceParam: Optional parameter to specify which reading order algorithm should be applied when ordering the extracted text elements. Can be either 'basic' or 'natural'. Will default to basic if not specified.")
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLanguage(value)[source]
- Parameters
language – The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- setLanguageCol(value)[source]
- Parameters
language – The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setPages(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setPagesCol(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeLayout_42cda16b2446_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeLayout_42cda16b2446_output', pages=None, pagesCol=None, pollingDelay=300, readingOrder=None, readingOrderCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setReadingOrder(value)[source]
- Parameters
readingOrder – Optional parameter to specify which reading order algorithm should be applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Will default to basic if not specified.
- setReadingOrderCol(value)[source]
- Parameters
readingOrder – Optional parameter to specify which reading order algorithm should be applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Will default to basic if not specified.
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maximum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
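The polling parameters documented above (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesExceededException) suggest a loop along the following lines. This is only an illustrative Python sketch under that reading, not the transformer's actual implementation; `poll_for_result`, `MaxRetriesExceeded` and the `check_status` callback are hypothetical, and returning an error message stands in for reporting in the error column.

```python
import time


class MaxRetriesExceeded(Exception):
    """Raised when polling gives up (hypothetical stand-in for the library's exception)."""


def poll_for_result(check_status, initial_polling_delay=300, polling_delay=300,
                    max_polling_retries=1000, suppress_max_retries_exceeded=False,
                    sleep=time.sleep):
    """Poll `check_status` until it returns a non-None result.

    Delays are in milliseconds, as in the parameter descriptions.
    Returns a (result, error) pair.
    """
    # Wait before the first poll for a result.
    sleep(initial_polling_delay / 1000.0)
    for attempt in range(max_polling_retries):
        if attempt > 0:
            # Wait between consecutive polls.
            sleep(polling_delay / 1000.0)
        result = check_status()
        if result is not None:
            return result, None
    message = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exceeded:
        # Report the failure as data instead of raising.
        return None, message
    raise MaxRetriesExceeded(message)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.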
synapse.ml.cognitive.AnalyzeReceipts module
- class synapse.ml.cognitive.AnalyzeReceipts.AnalyzeReceipts(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeReceipts_cb89a48db0fa_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeReceipts_cb89a48db0fa_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
maxPollingRetries (int) – number of times to poll
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pages (object) – The page selection, used only for multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set true to suppress the maximum retries exception and report in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPages()[source]
- Returns
The page selection, used only for multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection, used only for multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. '1, 2' -> pages 1 and 2 will be processed), finite ranges (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. '-5, 1, 3, 5-10' -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5-page document is valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setPages(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setPagesCol(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeReceipts_cb89a48db0fa_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeReceipts_cb89a48db0fa_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maximum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
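Because the locale parameter above lists a fixed set of supported values, a caller might want to check a locale before sending a request. The sketch below is one way to do that; `normalize_receipt_locale` is a hypothetical helper, and the case and underscore normalization is an assumption for convenience, not documented service behavior.

```python
# Supported locales as listed in the `locale` parameter description.
SUPPORTED_RECEIPT_LOCALES = ("en-AU", "en-CA", "en-GB", "en-IN", "en-US")


def normalize_receipt_locale(locale):
    """Return a supported locale string, or None if no locale was given."""
    if locale is None:
        return None
    # Assumption: accept 'en_us' or 'EN-us' style input and canonicalize it.
    language, _, region = locale.strip().replace("_", "-").partition("-")
    candidate = f"{language.lower()}-{region.upper()}"
    if candidate not in SUPPORTED_RECEIPT_LOCALES:
        raise ValueError(
            f"unsupported receipt locale {locale!r}; "
            f"expected one of {', '.join(SUPPORTED_RECEIPT_LOCALES)}"
        )
    return candidate
```

The returned value could then be passed as the locale keyword of AnalyzeReceipts or to setLocale.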
synapse.ml.cognitive.AzureSearchWriter module
synapse.ml.cognitive.BingImageSearch module
- class synapse.ml.cognitive.BingImageSearch.BingImageSearch(java_obj=None, aspect=None, aspectCol=None, color=None, colorCol=None, concurrency=1, concurrentTimeout=None, count=None, countCol=None, errorCol='BingImageSearch_a5631379bdbb_error', freshness=None, freshnessCol=None, handler=None, height=None, heightCol=None, imageContent=None, imageContentCol=None, imageType=None, imageTypeCol=None, license=None, licenseCol=None, maxFileSize=None, maxFileSizeCol=None, maxHeight=None, maxHeightCol=None, maxWidth=None, maxWidthCol=None, minFileSize=None, minFileSizeCol=None, minHeight=None, minHeightCol=None, minWidth=None, minWidthCol=None, mkt=None, mktCol=None, offset=None, offsetCol=None, outputCol='BingImageSearch_a5631379bdbb_output', q=None, qCol=None, size=None, sizeCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url='https://api.bing.microsoft.com/v7.0/images/search', width=None, widthCol=None)[source]
Bases:
synapse.ml.cognitive._BingImageSearch._BingImageSearch
synapse.ml.cognitive.BreakSentence module
- class synapse.ml.cognitive.BreakSentence.BreakSentence(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_eda0bcf1ca97_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_eda0bcf1ca97_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
outputCol (str) – The name of the output column
script (object) – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
text (object) – the string to translate
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- Return type
language
- getScript()[source]
- Returns
Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- Return type
script
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- script = Param(parent='undefined', name='script', doc='ServiceParam: Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setLanguageCol(value)[source]
- Parameters
language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_eda0bcf1ca97_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_eda0bcf1ca97_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setScript(value)[source]
- Parameters
script – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- setScriptCol(value)[source]
- Parameters
script – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
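The relationship between concurrency and concurrentTimeout can be pictured with a plain Python analogy. The transformer itself wraps a JVM object (it derives from JavaTransformer), so the sketch below is not its implementation; `call_concurrently` is a hypothetical function that only shows the general idea of bounding the number of in-flight calls and the time spent waiting on each future.

```python
from concurrent.futures import ThreadPoolExecutor


def call_concurrently(func, inputs, concurrency=1, concurrent_timeout=None):
    """Apply `func` to each input with at most `concurrency` calls in flight.

    `concurrent_timeout` is the maximum number of seconds to wait on each
    future; None waits indefinitely. Results keep the input order.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(func, item) for item in inputs]
        # result() raises a timeout error if a call takes longer than allowed.
        return [future.result(timeout=concurrent_timeout) for future in futures]
```

With `concurrency=1` the calls run one after another; larger values allow overlapping requests at the cost of more simultaneous load on the service.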
synapse.ml.cognitive.ConversationTranscription module
- class synapse.ml.cognitive.ConversationTranscription.ConversationTranscription(java_obj=None, audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
audioDataCol (str) – Column holding audio data, must be either ByteArrays or Strings representing file URIs
endpointId (str) – endpoint for custom speech models
extraFfmpegArgs (list) – extra arguments for ffmpeg output decoding
fileType (object) – The file type of the sound files, supported types: wav, ogg, mp3
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
language (object) – Identifies the spoken language that is being recognized.
outputCol (str) – The name of the output column
participantsJson (object) – a json representation of a list of conversation participants (email, language, user)
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
recordAudioData (bool) – Whether to record audio data to a file location, for use only with m3u8 streams
recordedFileNameCol (str) – Column holding file names to write audio data to if recordAudioData is set to true
streamIntermediateResults (bool) – Whether to return intermediate results immediately or group them in a sequence
subscriptionKey (object) – the API key to use
url (str) – Url of the service
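The participantsJson parameter expects a JSON string describing a list of participants. A minimal sketch of building such a string with the standard library follows. The field names (email, language, user) come from the parameter description above; the sample values and the exact schema the service accepts are assumptions.

```python
import json

# Hypothetical participants. The keys follow the parameter description
# (email, language, user); the values are made up for illustration.
participants = [
    {"email": "alice@example.com", "language": "en-US", "user": "alice"},
    {"email": "bob@example.com", "language": "en-US", "user": "bob"},
]

# Serialize the list into the JSON string form the parameter describes.
participants_json = json.dumps(participants)
print(participants_json)
```

The resulting string is what one would pass to setParticipantsJson, or store in the column named by setParticipantsJsonCol.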
- audioDataCol = Param(parent='undefined', name='audioDataCol', doc='Column holding audio data, must be either ByteArrays or Strings representing file URIs')
- endpointId = Param(parent='undefined', name='endpointId', doc='endpoint for custom speech models')
- extraFfmpegArgs = Param(parent='undefined', name='extraFfmpegArgs', doc='extra arguments to for ffmpeg output decoding')
- fileType = Param(parent='undefined', name='fileType', doc='ServiceParam: The file type of the sound files, supported types: wav, ogg, mp3')
- format = Param(parent='undefined', name='format', doc='ServiceParam: Specifies the result format. Accepted values are simple and detailed. Default is simple. ')
- getAudioDataCol()[source]
- Returns
Column holding audio data, must be either ByteArrays or Strings representing file URIs
- Return type
audioDataCol
- getExtraFfmpegArgs()[source]
- Returns
extra arguments for ffmpeg output decoding
- Return type
extraFfmpegArgs
- getFileType()[source]
- Returns
The file type of the sound files, supported types: wav, ogg, mp3
- Return type
fileType
- getFormat()[source]
- Returns
Specifies the result format. Accepted values are simple and detailed. Default is simple.
- Return type
format
- getLanguage()[source]
- Returns
Identifies the spoken language that is being recognized.
- Return type
language
- getParticipantsJson()[source]
- Returns
a json representation of a list of conversation participants (email, language, user)
- Return type
participantsJson
- getProfanity()[source]
- Returns
Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- Return type
profanity
- getRecordAudioData()[source]
- Returns
Whether to record audio data to a file location, for use only with m3u8 streams
- Return type
recordAudioData
- getRecordedFileNameCol()[source]
- Returns
Column holding file names to write audio data to if recordAudioData is set to true
- Return type
recordedFileNameCol
- getStreamIntermediateResults()[source]
- Returns
Whether to return intermediate results immediately or group them in a sequence
- Return type
streamIntermediateResults
- language = Param(parent='undefined', name='language', doc='ServiceParam: Identifies the spoken language that is being recognized. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- participantsJson = Param(parent='undefined', name='participantsJson', doc='ServiceParam: a json representation of a list of conversation participants (email, language, user)')
- profanity = Param(parent='undefined', name='profanity', doc='ServiceParam: Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which remove all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked. ')
- recordAudioData = Param(parent='undefined', name='recordAudioData', doc='Whether to record audio data to a file location, for use only with m3u8 streams')
- recordedFileNameCol = Param(parent='undefined', name='recordedFileNameCol', doc="Column holding file names to write audio data to if ``recordAudioData'' is set to true")
- setAudioDataCol(value)[source]
- Parameters
audioDataCol – Column holding audio data, must be either ByteArrays or Strings representing file URIs
- setExtraFfmpegArgs(value)[source]
- Parameters
extraFfmpegArgs – extra arguments for ffmpeg output decoding
- setFileType(value)[source]
- Parameters
fileType – The file type of the sound files, supported types: wav, ogg, mp3
- setFileTypeCol(value)[source]
- Parameters
fileType – The file type of the sound files, supported types: wav, ogg, mp3
- setFormat(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setFormatCol(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setLanguage(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setLanguageCol(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setParams(audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Set the (keyword only) parameters
- setParticipantsJson(value)[source]
- Parameters
participantsJson – a json representation of a list of conversation participants (email, language, user)
- setParticipantsJsonCol(value)[source]
- Parameters
participantsJson – a json representation of a list of conversation participants (email, language, user)
- setProfanity(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setProfanityCol(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setRecordAudioData(value)[source]
- Parameters
recordAudioData – Whether to record audio data to a file location, for use only with m3u8 streams
- setRecordedFileNameCol(value)[source]
- Parameters
recordedFileNameCol – Column holding file names to write audio data to if recordAudioData is set to true
- setStreamIntermediateResults(value)[source]
- Parameters
streamIntermediateResults – Whether to return intermediate results immediately or group them in a sequence
- streamIntermediateResults = Param(parent='undefined', name='streamIntermediateResults', doc='Whether or not to immediately return itermediate results, or group in a sequence')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- url = Param(parent='undefined', name='url', doc='Url of the service')
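To make the three profanity settings documented above concrete, here is a toy, offline illustration of masked, removed and raw. It does not call the speech service and is not the service's algorithm; the word list and the helper name apply_profanity_mode are hypothetical.

```python
import re

# Hypothetical word list, for illustration only.
PROFANITY = {"darn", "heck"}

# Match any listed word as a whole word, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in sorted(PROFANITY)) + r")\b",
    re.IGNORECASE,
)


def apply_profanity_mode(text, mode="masked"):
    """Toy version of the documented modes: masked, removed, raw."""
    if mode == "raw":
        # Leave the text untouched.
        return text
    if mode == "masked":
        # Replace each matched word with asterisks of the same length.
        return _PATTERN.sub(lambda m: "*" * len(m.group(0)), text)
    if mode == "removed":
        # Drop matched words, then collapse the leftover double spaces.
        return re.sub(r"\s{2,}", " ", _PATTERN.sub("", text)).strip()
    raise ValueError(f"unknown profanity mode: {mode!r}")


for mode in ("masked", "removed", "raw"):
    print(mode, "->", apply_profanity_mode("well darn it", mode))
```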
synapse.ml.cognitive.DescribeImage module
- class synapse.ml.cognitive.DescribeImage.DescribeImage(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DescribeImage_cac8d6bc9fcc_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxCandidates=None, maxCandidatesCol=None, outputCol='DescribeImage_cac8d6bc9fcc_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – Language of image description
maxCandidates (object) – Maximum candidate descriptions to return
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
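The concurrency and concurrentTimeout parameters describe a bounded pool of in-flight calls and a maximum wait on each pending future. The sketch below shows that general pattern with Python's standard library only. It is an illustration of the idea, not SynapseML's internal implementation; call_service and run_with_timeout are hypothetical names, and the sleep stands in for an HTTP request.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def call_service(value, delay):
    """Stand-in for a remote call; sleeps instead of doing HTTP."""
    time.sleep(delay)
    return value * 2


def run_with_timeout(items, concurrency=2, concurrent_timeout=0.2):
    """Run calls with at most `concurrency` workers.

    Each result is awaited for at most `concurrent_timeout` seconds;
    calls that take longer yield None in the output list.
    """
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(call_service, v, d) for v, d in items]
        for future in futures:
            try:
                results.append(future.result(timeout=concurrent_timeout))
            except TimeoutError:
                results.append(None)
    return results


# The first call returns quickly; the second exceeds the timeout.
print(run_with_timeout([(1, 0.0), (2, 1.0)]))
```

Note that leaving the `with` block still waits for the slow worker to finish, so the timeout here bounds how long the caller waits for a result, not how long the worker runs.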
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getMaxCandidates()[source]
- Returns
Maximum candidate descriptions to return
- Return type
maxCandidates
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Language of image description')
- maxCandidates = Param(parent='undefined', name='maxCandidates', doc='ServiceParam: Maximum candidate descriptions to return')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setMaxCandidates(value)[source]
- Parameters
maxCandidates – Maximum candidate descriptions to return
- setMaxCandidatesCol(value)[source]
- Parameters
maxCandidates – Maximum candidate descriptions to return
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DescribeImage_cac8d6bc9fcc_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxCandidates=None, maxCandidatesCol=None, outputCol='DescribeImage_cac8d6bc9fcc_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.Detect module
- class synapse.ml.cognitive.Detect.Detect(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='Detect_852ec33b4f2f_error', handler=None, outputCol='Detect_852ec33b4f2f_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
text (object) – the string to translate
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='Detect_852ec33b4f2f_error', handler=None, outputCol='Detect_852ec33b4f2f_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.DetectAnomalies module
- class synapse.ml.cognitive.DetectAnomalies.DetectAnomalies(java_obj=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectAnomalies_bf669301e314_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectAnomalies_bf669301e314_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval (object) – Custom Interval is used to set a non-standard time interval. For example, if the series has 5-minute intervals, the request can set granularity=minutely and customInterval=5.
errorCol (str) – column to hold http errors
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler (object) – Which strategy to use when handling requests
imputeFixedValue (object) – Optional argument, fixed value to use when imputeMode is set to “fixed”
imputeMode (object) – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (str) – The name of the output column
period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicated timestamps, the API will not work and an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
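Because the series parameter requires points sorted by ascending timestamp with no duplicates, it can help to check the data locally before sending it. The following is a minimal sketch of such a check. The helper name validate_series and the point layout (dictionaries with "timestamp" and "value" keys, ISO 8601 timestamps) are assumptions made for illustration.

```python
from datetime import datetime


def validate_series(points):
    """Return a list of ordering problems found in `points`.

    An empty list means timestamps are strictly ascending.
    """
    problems = []
    stamps = [datetime.fromisoformat(p["timestamp"]) for p in points]
    for prev, cur in zip(stamps, stamps[1:]):
        if cur == prev:
            problems.append(f"duplicate timestamp: {cur.isoformat()}")
        elif cur < prev:
            problems.append(
                f"out of order: {cur.isoformat()} after {prev.isoformat()}"
            )
    return problems


# Hypothetical data points.
series = [
    {"timestamp": "2021-01-01T00:00:00", "value": 1.0},
    {"timestamp": "2021-01-01T00:10:00", "value": 3.0},
    {"timestamp": "2021-01-01T00:05:00", "value": 2.0},
]

print(validate_series(series))

# Sorting by parsed timestamp fixes ordering (but not duplicates).
fixed = sorted(series, key=lambda p: datetime.fromisoformat(p["timestamp"]))
print(validate_series(fixed))
```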
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc='ServiceParam: Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomInterval()[source]
- Returns
Custom Interval is used to set a non-standard time interval. For example, if the series has 5-minute intervals, the request can set granularity=minutely and customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getImputeFixedValue()[source]
- Returns
Optional argument, fixed value to use when imputeMode is set to “fixed”
- Return type
imputeFixedValue
- getImputeMode()[source]
- Returns
Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- Return type
imputeMode
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicated timestamps, the API will not work and an error message will be returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- granularity = Param(parent='undefined', name='granularity', doc='ServiceParam: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imputeFixedValue = Param(parent='undefined', name='imputeFixedValue', doc='ServiceParam: Optional argument, fixed value to use when imputeMode is set to "fixed" ')
- imputeMode = Param(parent='undefined', name='imputeMode', doc='ServiceParam: Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill ')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc='ServiceParam: Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc='ServiceParam: Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc='ServiceParam: Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc='ServiceParam: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomInterval(value)[source]
- Parameters
customInterval – Custom Interval is used to set a non-standard time interval. For example, if the series has 5-minute intervals, the request can set granularity=minutely and customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval – Custom Interval is used to set a non-standard time interval. For example, if the series has 5-minute intervals, the request can set granularity=minutely and customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setImputeFixedValue(value)[source]
- Parameters
imputeFixedValue – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeFixedValueCol(value)[source]
- Parameters
imputeFixedValue – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeMode(value)[source]
- Parameters
imputeMode – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setImputeModeCol(value)[source]
- Parameters
imputeMode – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectAnomalies_bf669301e314_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectAnomalies_bf669301e314_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setPeriodCol(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- setSensitivityCol(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- setSeries(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicated timestamps, the API will not work and an error message will be returned.
- setSeriesCol(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicated timestamps, the API will not work and an error message will be returned.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
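The granularity and customInterval parameters of DetectAnomalies together describe the expected spacing between points (for example granularity=minutely with customInterval=5 for a 5-minute series). The sketch below checks a list of timestamps against that spacing locally. It is an illustration under stated assumptions, not the service's validation logic: the helper names are hypothetical, and monthly and yearly are left out because they do not map to a fixed timedelta.

```python
from datetime import datetime, timedelta

# Fixed-length granularities only; monthly and yearly are omitted
# because their length varies.
BASE_STEP = {
    "minutely": timedelta(minutes=1),
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def expected_step(granularity, custom_interval=1):
    """Spacing implied by a granularity and a custom interval multiplier."""
    return BASE_STEP[granularity] * custom_interval


def has_regular_spacing(timestamps, granularity, custom_interval=1):
    """True if every consecutive pair is exactly one expected step apart."""
    step = expected_step(granularity, custom_interval)
    stamps = [datetime.fromisoformat(t) for t in timestamps]
    return all(b - a == step for a, b in zip(stamps, stamps[1:]))


# Hypothetical 5-minute series.
timestamps = [
    "2021-01-01T00:00:00",
    "2021-01-01T00:05:00",
    "2021-01-01T00:10:00",
]

print(has_regular_spacing(timestamps, "minutely", 5))
print(has_regular_spacing(timestamps, "minutely"))
```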
synapse.ml.cognitive.DetectFace module
- class synapse.ml.cognitive.DetectFace.DetectFace(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DetectFace_ec61c3846cd1_error', handler=None, imageUrl=None, imageUrlCol=None, outputCol='DetectFace_ec61c3846cd1_output', returnFaceAttributes=None, returnFaceAttributesCol=None, returnFaceId=None, returnFaceIdCol=None, returnFaceLandmarks=None, returnFaceLandmarksCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageUrl (object) – the url of the image to use
outputCol (str) – The name of the output column
returnFaceAttributes (object) – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
returnFaceId (object) – Whether to return faceIds of the detected faces. The default value is true.
returnFaceLandmarks (object) – Whether to return face landmarks of the detected faces. The default value is false.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
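Since returnFaceAttributes accepts only the attributes listed above, a small local check can catch typos before a request is made. The sketch below validates a requested list against the documented names and joins them into one string. The helper name build_face_attributes is hypothetical, and the comma-separated output format is an assumption about how the value is passed, not something this page states.

```python
# Attribute names copied from the returnFaceAttributes description.
SUPPORTED_FACE_ATTRIBUTES = (
    "age", "gender", "headPose", "smile", "facialHair", "glasses",
    "emotion", "hair", "makeup", "occlusion", "accessories", "blur",
    "exposure", "noise",
)


def build_face_attributes(requested):
    """Validate attribute names and join them with commas.

    Raises ValueError for any name not in the documented list.
    Duplicates are dropped while keeping first-seen order.
    """
    unknown = [a for a in requested if a not in SUPPORTED_FACE_ATTRIBUTES]
    if unknown:
        raise ValueError(f"unsupported face attributes: {unknown}")
    return ",".join(dict.fromkeys(requested))


print(build_face_attributes(["age", "smile", "age"]))
```

Requesting fewer attributes keeps the call cheaper, since the description notes that attribute analysis adds computational and time cost.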
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getReturnFaceAttributes()[source]
- Returns
Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- Return type
returnFaceAttributes
- getReturnFaceId()[source]
- Returns
Whether to return faceIds of the detected faces. The default value is true.
- Return type
returnFaceId
- getReturnFaceLandmarks()[source]
- Returns
Whether to return face landmarks of the detected faces. The default value is false.
- Return type
returnFaceLandmarks
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- returnFaceAttributes = Param(parent='undefined', name='returnFaceAttributes', doc='ServiceParam: Analyze and return the one or more specified face attributes Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.')
- returnFaceId = Param(parent='undefined', name='returnFaceId', doc='ServiceParam: Return faceIds of the detected faces or not. The default value is true')
- returnFaceLandmarks = Param(parent='undefined', name='returnFaceLandmarks', doc='ServiceParam: Return face landmarks of the detected faces or not. The default value is false.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DetectFace_ec61c3846cd1_error', handler=None, imageUrl=None, imageUrlCol=None, outputCol='DetectFace_ec61c3846cd1_output', returnFaceAttributes=None, returnFaceAttributesCol=None, returnFaceId=None, returnFaceIdCol=None, returnFaceLandmarks=None, returnFaceLandmarksCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setReturnFaceAttributes(value)[source]
- Parameters
returnFaceAttributes – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- setReturnFaceAttributesCol(value)[source]
- Parameters
returnFaceAttributes – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- setReturnFaceId(value)[source]
- Parameters
returnFaceId – Whether to return faceIds of the detected faces. The default value is true.
- setReturnFaceIdCol(value)[source]
- Parameters
returnFaceId – Whether to return faceIds of the detected faces. The default value is true.
- setReturnFaceLandmarks(value)[source]
- Parameters
returnFaceLandmarks – Whether to return face landmarks of the detected faces. The default value is false.
- setReturnFaceLandmarksCol(value)[source]
- Parameters
returnFaceLandmarks – Whether to return face landmarks of the detected faces. The default value is false.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
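The list of supported face attributes above can be checked before a request is built. The following is a minimal sketch in plain Python; `validate_face_attributes` and `SUPPORTED_FACE_ATTRIBUTES` are hypothetical names introduced for illustration and are not part of synapse.ml. The attribute names are copied from the returnFaceAttributes description.

```python
# Supported attribute names, copied from the returnFaceAttributes description.
SUPPORTED_FACE_ATTRIBUTES = frozenset([
    "age", "gender", "headPose", "smile", "facialHair", "glasses", "emotion",
    "hair", "makeup", "occlusion", "accessories", "blur", "exposure", "noise",
])


def validate_face_attributes(requested):
    """Return the requested attributes as a list, or raise on unknown names.

    Hypothetical helper for illustration; not part of synapse.ml.
    """
    requested = list(requested)
    unknown = [name for name in requested if name not in SUPPORTED_FACE_ATTRIBUTES]
    if unknown:
        raise ValueError("unsupported face attributes: " + ", ".join(unknown))
    return requested


# Names are case-sensitive in this sketch, e.g. "headPose" rather than "headpose".
attributes = validate_face_attributes(["age", "smile", "headPose"])
print(attributes)
```

Since attribute analysis has additional computational and time cost, requesting only the attributes that are actually needed is likely the cheaper choice.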
synapse.ml.cognitive.DetectLastAnomaly module
- class synapse.ml.cognitive.DetectLastAnomaly.DetectLastAnomaly(java_obj=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectLastAnomaly_49ae8cd4adb7_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectLastAnomaly_49ae8cd4adb7_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval (object) – Sets a non-standard time interval. For example, if the series has 5-minute intervals, the request can be set as granularity=minutely, customInterval=5.
errorCol (str) – column to hold http errors
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler (object) – Which strategy to use when handling requests
imputeFixedValue (object) – Optional argument, fixed value to use when imputeMode is set to “fixed”
imputeMode (object) – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (str) – The name of the output column
period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
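The relationship between granularity and customInterval described above (a 5-minute series becomes granularity=minutely, customInterval=5) can be sketched as follows. `interval_to_granularity` is a hypothetical helper, not a synapse.ml function, and choosing the largest unit that divides the interval evenly is an assumption made for this example.

```python
# Seconds per unit for the fixed-length granularities named in the parameter list.
# yearly and monthly are left out because their length in seconds varies.
_UNIT_SECONDS = [
    ("weekly", 7 * 24 * 3600),
    ("daily", 24 * 3600),
    ("hourly", 3600),
    ("minutely", 60),
]


def interval_to_granularity(interval_seconds):
    """Map a fixed sampling interval to a (granularity, customInterval) pair.

    Hypothetical helper. customInterval is None when the interval is exactly
    one unit of the chosen granularity.
    """
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    # Assumption: prefer the largest unit that divides the interval evenly.
    for name, unit in _UNIT_SECONDS:
        if interval_seconds % unit == 0:
            multiple = interval_seconds // unit
            return name, (None if multiple == 1 else multiple)
    raise ValueError("interval is not a whole number of minutes")


print(interval_to_granularity(5 * 60))  # ('minutely', 5)
print(interval_to_granularity(3600))    # ('hourly', None)
```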
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc='ServiceParam: Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomInterval()[source]
- Returns
Sets a non-standard time interval. For example, if the series has 5-minute intervals, the request can be set as granularity=minutely, customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getImputeFixedValue()[source]
- Returns
Optional argument, fixed value to use when imputeMode is set to “fixed”
- Return type
imputeFixedValue
- getImputeMode()[source]
- Returns
Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- Return type
imputeMode
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- granularity = Param(parent='undefined', name='granularity', doc='ServiceParam: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imputeFixedValue = Param(parent='undefined', name='imputeFixedValue', doc='ServiceParam: Optional argument, fixed value to use when imputeMode is set to "fixed" ')
- imputeMode = Param(parent='undefined', name='imputeMode', doc='ServiceParam: Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill ')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc='ServiceParam: Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc='ServiceParam: Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc='ServiceParam: Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc='ServiceParam: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomInterval(value)[source]
- Parameters
customInterval – Sets a non-standard time interval. For example, if the series has 5-minute intervals, the request can be set as granularity=minutely, customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval – Sets a non-standard time interval. For example, if the series has 5-minute intervals, the request can be set as granularity=minutely, customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setImputeFixedValue(value)[source]
- Parameters
imputeFixedValue – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeFixedValueCol(value)[source]
- Parameters
imputeFixedValue – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeMode(value)[source]
- Parameters
imputeMode – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setImputeModeCol(value)[source]
- Parameters
imputeMode – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectLastAnomaly_49ae8cd4adb7_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectLastAnomaly_49ae8cd4adb7_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setPeriodCol(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- setSensitivityCol(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- setSeries(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- setSeriesCol(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
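Because the series parameter must be sorted by timestamp in ascending order and must not contain duplicated timestamps, it may help to check the data before it is sent. The sketch below is plain Python and makes one assumption that the text above does not state: each point is a dict with an ISO 8601 `"timestamp"` string and a numeric `"value"`. `prepare_series` is a hypothetical helper, not part of synapse.ml.

```python
from datetime import datetime


def _parse(timestamp):
    # datetime.fromisoformat does not accept a trailing "Z" on older Python
    # versions, so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def prepare_series(points):
    """Return the points sorted by timestamp; raise on duplicated timestamps.

    Hypothetical helper. Assumes every point is a dict with "timestamp"
    (ISO 8601 string) and "value" keys.
    """
    ordered = sorted(points, key=lambda point: _parse(point["timestamp"]))
    for earlier, later in zip(ordered, ordered[1:]):
        if _parse(earlier["timestamp"]) == _parse(later["timestamp"]):
            raise ValueError("duplicated timestamp: " + later["timestamp"])
    return ordered


series = prepare_series([
    {"timestamp": "2020-01-02T00:00:00Z", "value": 2.0},
    {"timestamp": "2020-01-01T00:00:00Z", "value": 1.0},
])
print([point["timestamp"] for point in series])
```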
synapse.ml.cognitive.DetectMultivariateAnomaly module
- class synapse.ml.cognitive.DetectMultivariateAnomaly.DetectMultivariateAnomaly(java_obj=None, backoffs=[100, 500, 1000], connectionString=None, containerName=None, endTime=None, endpoint=None, errorCol='DetectMultivariateAnomaly_df81e7f9071e_error', initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, modelId=None, outputCol='DetectMultivariateAnomaly_df81e7f9071e_output', pollingDelay=300, sasToken=None, startTime=None, storageKey=None, storageName=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timestampCol='timestamp', url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaModel
- Parameters
backoffs (list) – array of backoffs to use in the handler
connectionString (str) – Connection String for your storage account used for uploading files.
containerName (str) – Container that will be used to upload files to.
endTime (str) – Required. End time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
endpoint (str) – End Point for your storage account used for uploading files.
errorCol (str) – column to hold http errors
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
inputCols (list) – The names of the input columns
intermediateSaveDir (str) – Name of the directory in which to save the intermediate data produced while training.
maxPollingRetries (int) – number of times to poll
modelId (str) – Format - uuid. Model identifier.
outputCol (str) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
sasToken (str) – SAS Token for your storage account used for uploading files.
startTime (str) – Required. Start time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
storageKey (str) – Storage Key for your storage account used for uploading files.
storageName (str) – Storage Name for your storage account used for uploading files.
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set true to suppress the maximum retries exception and report it in the error column
timestampCol (str) – Timestamp column name
url (str) – Url of the service
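startTime and endTime are strings that should be date-times. A minimal sketch of producing such strings from Python `datetime` objects follows. The ISO 8601 UTC layout with a trailing `Z`, and treating naive datetimes as UTC, are assumptions made here; the text above only says "date-time". `to_date_time` and `detection_window` are hypothetical helpers.

```python
from datetime import datetime, timezone


def _as_utc(moment):
    # Assumption: naive datetimes are already expressed in UTC.
    if moment.tzinfo is None:
        return moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc)


def to_date_time(moment):
    """Format a datetime as an ISO 8601 UTC string, e.g. 2021-01-01T00:00:00Z."""
    return _as_utc(moment).strftime("%Y-%m-%dT%H:%M:%SZ")


def detection_window(start, end):
    """Return (startTime, endTime) strings after checking that start < end."""
    if _as_utc(start) >= _as_utc(end):
        raise ValueError("startTime must be earlier than endTime")
    return to_date_time(start), to_date_time(end)


start_time, end_time = detection_window(datetime(2021, 1, 1), datetime(2021, 1, 8))
print(start_time, end_time)
```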
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- connectionString = Param(parent='undefined', name='connectionString', doc='Connection String for your storage account used for uploading files.')
- containerName = Param(parent='undefined', name='containerName', doc='Container that will be used to upload files to.')
- endTime = Param(parent='undefined', name='endTime', doc='A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- endpoint = Param(parent='undefined', name='endpoint', doc='End Point for your storage account used for uploading files.')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConnectionString()[source]
- Returns
Connection String for your storage account used for uploading files.
- Return type
connectionString
- getContainerName()[source]
- Returns
Container that will be used to upload files to.
- Return type
containerName
- getEndTime()[source]
- Returns
Required. End time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- Return type
endTime
- getEndpoint()[source]
- Returns
End Point for your storage account used for uploading files.
- Return type
endpoint
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getIntermediateSaveDir()[source]
- Returns
Name of the directory in which to save the intermediate data produced while training.
- Return type
intermediateSaveDir
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSasToken()[source]
- Returns
SAS Token for your storage account used for uploading files.
- Return type
sasToken
- getStartTime()[source]
- Returns
Required. Start time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- Return type
startTime
- getStorageKey()[source]
- Returns
Storage Key for your storage account used for uploading files.
- Return type
storageKey
- getStorageName()[source]
- Returns
Storage Name for your storage account used for uploading files.
- Return type
storageName
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- inputCols = Param(parent='undefined', name='inputCols', doc='The names of the input columns')
- intermediateSaveDir = Param(parent='undefined', name='intermediateSaveDir', doc='Directory name of which you want to save the intermediate data produced while training.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelId = Param(parent='undefined', name='modelId', doc='Format - uuid. Model identifier.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- sasToken = Param(parent='undefined', name='sasToken', doc='SAS Token for your storage account used for uploading files.')
- setConnectionString(value)[source]
- Parameters
connectionString – Connection String for your storage account used for uploading files.
- setContainerName(value)[source]
- Parameters
containerName – Container that will be used to upload files to.
- setEndTime(value)[source]
- Parameters
endTime – Required. End time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- setEndpoint(value)[source]
- Parameters
endpoint – End Point for your storage account used for uploading files.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setIntermediateSaveDir(value)[source]
- Parameters
intermediateSaveDir – Name of the directory in which to save the intermediate data produced while training.
- setParams(backoffs=[100, 500, 1000], connectionString=None, containerName=None, endTime=None, endpoint=None, errorCol='DetectMultivariateAnomaly_df81e7f9071e_error', initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, modelId=None, outputCol='DetectMultivariateAnomaly_df81e7f9071e_output', pollingDelay=300, sasToken=None, startTime=None, storageKey=None, storageName=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timestampCol='timestamp', url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSasToken(value)[source]
- Parameters
sasToken – SAS Token for your storage account used for uploading files.
- setStartTime(value)[source]
- Parameters
startTime – Required. Start time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- setStorageKey(value)[source]
- Parameters
storageKey – Storage Key for your storage account used for uploading files.
- setStorageName(value)[source]
- Parameters
storageName – Storage Name for your storage account used for uploading files.
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report it in the error column
- startTime = Param(parent='undefined', name='startTime', doc='A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- storageKey = Param(parent='undefined', name='storageKey', doc='Storage Key for your storage account used for uploading files.')
- storageName = Param(parent='undefined', name='storageName', doc='Storage Name for your storage account used for uploading files.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timestampCol = Param(parent='undefined', name='timestampCol', doc='Timestamp column name')
- url = Param(parent='undefined', name='url', doc='Url of the service')
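The three polling parameters (initialPollingDelay, pollingDelay, maxPollingRetries) together bound how long a long-running operation is waited on. The sketch below estimates that bound in plain Python. It assumes one initial delay followed by up to maxPollingRetries polls separated by pollingDelay, and ignores request latency and backoffs; the actual scheduling inside the library may differ. Both function names are hypothetical.

```python
def max_polling_wait_ms(initial_polling_delay=300, polling_delay=300,
                        max_polling_retries=1000):
    """Rough upper bound, in milliseconds, on time spent waiting between polls.

    Defaults mirror the constructor defaults listed above. Assumption: one
    initial delay, then one pollingDelay per retry.
    """
    return initial_polling_delay + polling_delay * max_polling_retries


def retries_for_budget(budget_ms, initial_polling_delay=300, polling_delay=300):
    """Smallest maxPollingRetries whose bound covers budget_ms (same assumption)."""
    remaining = max(0, budget_ms - initial_polling_delay)
    return -(-remaining // polling_delay)  # ceiling division on integers


print(max_polling_wait_ms() / 1000.0)  # 300.3 (seconds, with the defaults)
print(retries_for_budget(600000))      # 1999
```

Under these assumptions the defaults allow roughly five minutes of polling, so a longer training job would likely need a larger maxPollingRetries or pollingDelay.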
synapse.ml.cognitive.DictionaryExamples module
- class synapse.ml.cognitive.DictionaryExamples.DictionaryExamples(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryExamples_556e40a8dc6e_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryExamples_556e40a8dc6e_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, textAndTranslation=None, textAndTranslationCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
fromLanguage (object) – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
handler (object) – Which strategy to use when handling requests
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
textAndTranslation (object) – A string specifying the translated text previously returned by the Dictionary lookup operation.
timeout (float) – number of seconds to wait before closing the connection
toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- Return type
fromLanguage
- getTextAndTranslation()[source]
- Returns
A string specifying the translated text previously returned by the Dictionary lookup operation.
- Return type
textAndTranslation
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DictionaryExamples_556e40a8dc6e_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryExamples_556e40a8dc6e_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, textAndTranslation=None, textAndTranslationCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Set the (keyword only) parameters
- setTextAndTranslation(value)[source]
- Parameters
textAndTranslation – A string specifying the translated text previously returned by the Dictionary lookup operation.
- setTextAndTranslationCol(value)[source]
- Parameters
textAndTranslation – A string specifying the translated text previously returned by the Dictionary lookup operation.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- setToLanguageCol(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- textAndTranslation = Param(parent='undefined', name='textAndTranslation', doc='ServiceParam: A string specifying the translated text previously returned by the Dictionary lookup operation.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc='ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.DictionaryLookup module
- class synapse.ml.cognitive.DictionaryLookup.DictionaryLookup(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryLookup_ca9d06ff4409_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryLookup_ca9d06ff4409_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
fromLanguage (object) – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
handler (object) – Which strategy to use when handling requests
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
text (object) – the string to translate
timeout (float) – number of seconds to wait before closing the connection
toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- Return type
fromLanguage
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DictionaryLookup_ca9d06ff4409_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryLookup_ca9d06ff4409_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- setToLanguageCol(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc='ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
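Many parameters in these classes come as a pair of setters, such as setToLanguage and setToLanguageCol. One plausible reading, stated here as an assumption rather than as the library's documented behavior, is that the plain setter fixes a single value for every row while the Col setter names a DataFrame column to read the value from per row. The toy model below illustrates that idea in plain Python with dict rows; `ToyServiceParam` is a hypothetical class and does not reproduce synapse.ml internals.

```python
class ToyServiceParam:
    """Toy stand-in for a parameter that is either a constant or a column name."""

    def __init__(self):
        self._value = None
        self._col = None

    def set_value(self, value):
        # Analogous to setX(value): the same value is used for every row.
        self._value, self._col = value, None
        return self

    def set_col(self, col):
        # Analogous to setXCol(name): the value is looked up in each row.
        self._value, self._col = None, col
        return self

    def resolve(self, row):
        if self._col is not None:
            return row[self._col]
        return self._value


rows = [{"text": "fly", "lang": "es"}, {"text": "fly", "lang": "fr"}]

to_language = ToyServiceParam().set_value("de")
print([to_language.resolve(row) for row in rows])  # ['de', 'de']

to_language.set_col("lang")
print([to_language.resolve(row) for row in rows])  # ['es', 'fr']
```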
synapse.ml.cognitive.DocumentTranslator module
- class synapse.ml.cognitive.DocumentTranslator.DocumentTranslator(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='DocumentTranslator_05248ebcc0a0_error', filterPrefix=None, filterPrefixCol=None, filterSuffix=None, filterSuffixCol=None, initialPollingDelay=300, maxPollingRetries=1000, outputCol='DocumentTranslator_05248ebcc0a0_output', pollingDelay=300, serviceName=None, sourceLanguage=None, sourceLanguageCol=None, sourceStorageSource=None, sourceStorageSourceCol=None, sourceUrl=None, sourceUrlCol=None, storageType=None, storageTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, targets=None, targetsCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
filterPrefix (object) – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
filterSuffix (object) – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
maxPollingRetries (int) – number of times to poll
outputCol (str) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
serviceName (str) –
sourceLanguage (object) – Language code. If none is specified, we will perform auto detect on the document.
sourceStorageSource (object) – Storage source of source input.
sourceUrl (object) – Location of the folder / container or single file with your documents.
storageType (object) – Storage type of the input documents source string. Required for single document translation only.
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set true to suppress the maximum retries exception and report it in the error column
targets (object) – Destination for the finished translated documents.
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- filterPrefix = Param(parent='undefined', name='filterPrefix', doc='ServiceParam: A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.')
- filterSuffix = Param(parent='undefined', name='filterSuffix', doc='ServiceParam: A case-sensitive suffix string to filter documents in the source path for translation. This is most often use for file extensions.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFilterPrefix()[source]
- Returns
A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
- Return type
filterPrefix
- getFilterSuffix()[source]
- Returns
A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- Return type
filterSuffix
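The prefix and suffix filters above are described as case-sensitive string matches on document paths. The following is a minimal sketch of that matching, not the library's code: the helper select_documents and the blob names are hypothetical, and the real filtering is expected to happen on the translation service side.

```python
from typing import Iterable, List, Optional


def select_documents(names: Iterable[str],
                     prefix: Optional[str] = None,
                     suffix: Optional[str] = None) -> List[str]:
    """Hypothetical helper: keep names matching a case-sensitive prefix and suffix."""
    selected = []
    for name in names:
        if prefix is not None and not name.startswith(prefix):
            continue
        if suffix is not None and not name.endswith(suffix):
            continue
        selected.append(name)
    return selected


# Hypothetical blob names inside a source container.
blobs = [
    "reports/2021/q1.docx",
    "reports/2021/q2.DOCX",   # suffix differs in case, so it is skipped
    "Reports/2021/q3.docx",   # prefix differs in case, so it is skipped
    "notes/todo.txt",
]
matched = select_documents(blobs, prefix="reports/", suffix=".docx")
print(matched)
```

With these inputs only reports/2021/q1.docx is kept, which shows why the casing of folder names and file extensions matters when setting filterPrefix and filterSuffix.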
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
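initialPollingDelay, pollingDelay, maxPollingRetries and suppressMaxRetriesExceededException together bound how long a translation job is waited on and what happens when the bound is hit. The sketch below shows one plausible reading of those parameters: an initial delay, then up to maxPollingRetries polls separated by pollingDelay. The exact scheduling used by the library is not stated in this reference, and poll_until_done, MaxRetriesExceeded and worst_case_wait_seconds are hypothetical names used only for illustration.

```python
import time
from typing import Callable, Optional


class MaxRetriesExceeded(Exception):
    """Hypothetical exception standing in for the library's max-retries error."""


def poll_until_done(check_status: Callable[[], Optional[str]],
                    initial_polling_delay_ms: int = 300,
                    polling_delay_ms: int = 300,
                    max_polling_retries: int = 1000,
                    suppress_max_retries_exceeded: bool = False,
                    sleep: Callable[[float], None] = time.sleep):
    """Return (result, error). check_status returns None while the job is running."""
    sleep(initial_polling_delay_ms / 1000.0)        # wait before the first poll
    for attempt in range(max_polling_retries):
        result = check_status()
        if result is not None:
            return result, None
        if attempt < max_polling_retries - 1:
            sleep(polling_delay_ms / 1000.0)        # wait between polls
    if suppress_max_retries_exceeded:
        # Report the failure as data instead of raising, as an error column would.
        return None, "maximum polling retries exceeded"
    raise MaxRetriesExceeded("maximum polling retries exceeded")


def worst_case_wait_seconds(initial_ms: int, delay_ms: int, retries: int) -> float:
    """Upper bound on time spent sleeping under the assumptions above."""
    return (initial_ms + delay_ms * max(retries - 1, 0)) / 1000.0


# Defaults from the class signature: 300 ms, 300 ms, 1000 retries.
print(worst_case_wait_seconds(300, 300, 1000))  # 300.0
```

Under this reading the defaults allow roughly five minutes of waiting per job, not counting the time each status request takes, so long-running translations may need a larger maxPollingRetries or pollingDelay.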
- getSourceLanguage()[source]
- Returns
Language code. If none is specified, we will perform auto detect on the document.
- Return type
sourceLanguage
- getSourceStorageSource()[source]
- Returns
Storage source of source input.
- Return type
sourceStorageSource
- getSourceUrl()[source]
- Returns
Location of the folder / container or single file with your documents.
- Return type
sourceUrl
- getStorageType()[source]
- Returns
Storage type of the input documents source string. Required for single document translation only.
- Return type
storageType
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTargets()[source]
- Returns
Destination for the finished translated documents.
- Return type
targets
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- serviceName = Param(parent='undefined', name='serviceName', doc='')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFilterPrefix(value)[source]
- Parameters
filterPrefix – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
- setFilterPrefixCol(value)[source]
- Parameters
filterPrefix – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
- setFilterSuffix(value)[source]
- Parameters
filterSuffix – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- setFilterSuffixCol(value)[source]
- Parameters
filterSuffix – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='DocumentTranslator_05248ebcc0a0_error', filterPrefix=None, filterPrefixCol=None, filterSuffix=None, filterSuffixCol=None, initialPollingDelay=300, maxPollingRetries=1000, outputCol='DocumentTranslator_05248ebcc0a0_output', pollingDelay=300, serviceName=None, sourceLanguage=None, sourceLanguageCol=None, sourceStorageSource=None, sourceStorageSourceCol=None, sourceUrl=None, sourceUrlCol=None, storageType=None, storageTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, targets=None, targetsCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSourceLanguage(value)[source]
- Parameters
sourceLanguage – Language code. If none is specified, we will perform auto detect on the document.
- setSourceLanguageCol(value)[source]
- Parameters
sourceLanguage – Language code. If none is specified, we will perform auto detect on the document.
- setSourceStorageSource(value)[source]
- Parameters
sourceStorageSource – Storage source of source input.
- setSourceStorageSourceCol(value)[source]
- Parameters
sourceStorageSource – Storage source of source input.
- setSourceUrl(value)[source]
- Parameters
sourceUrl – Location of the folder / container or single file with your documents.
- setSourceUrlCol(value)[source]
- Parameters
sourceUrl – Location of the folder / container or single file with your documents.
- setStorageType(value)[source]
- Parameters
storageType – Storage type of the input documents source string. Required for single document translation only.
- setStorageTypeCol(value)[source]
- Parameters
storageType – Storage type of the input documents source string. Required for single document translation only.
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report it in the error column
- setTargetsCol(value)[source]
- Parameters
targets – Destination for the finished translated documents.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- sourceLanguage = Param(parent='undefined', name='sourceLanguage', doc='ServiceParam: Language code. If none is specified, we will perform auto detect on the document.')
- sourceStorageSource = Param(parent='undefined', name='sourceStorageSource', doc='ServiceParam: Storage source of source input.')
- sourceUrl = Param(parent='undefined', name='sourceUrl', doc='ServiceParam: Location of the folder / container or single file with your documents.')
- storageType = Param(parent='undefined', name='storageType', doc='ServiceParam: Storage type of the input documents source string. Required for single document translation only.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- targets = Param(parent='undefined', name='targets', doc='ServiceParam: Destination for the finished translated documents.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.EntityDetector module
- class synapse.ml.cognitive.EntityDetector.EntityDetector(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='EntityDetector_7958cc13daf9_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_7958cc13daf9_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
showStats (object) – if set to true, response will contain input and document level statistics.
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
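concurrency and concurrentTimeout describe bounded parallel calls with a cap on how long each pending future is waited on. The sketch below illustrates that idea with Python's standard concurrent.futures module. It is not how the transformer is implemented (the class wraps a JVM object, per its JavaTransformer base); call_all and fake_send are hypothetical names, and the (output, error) pairs only mimic the separate outputCol and errorCol columns.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, List, Optional, Tuple


def call_all(requests: List[str],
             send: Callable[[str], str],
             concurrency: int = 1,
             concurrent_timeout: Optional[float] = None
             ) -> List[Tuple[Optional[str], Optional[str]]]:
    """Hypothetical helper: run `send` on each request, at most `concurrency` at a time.

    Returns (output, error) pairs in input order.
    """
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send, r) for r in requests]
        for future in futures:
            try:
                # timeout=None means wait indefinitely for this future.
                results.append((future.result(timeout=concurrent_timeout), None))
            except FutureTimeout:
                results.append((None, "timed out waiting on future"))
    return results


def fake_send(text: str) -> str:
    # Stand-in for an HTTP call; "slow" simulates a request that takes too long.
    if text == "slow":
        time.sleep(1.0)
    return text.upper()


print(call_all(["a", "slow", "b"], fake_send, concurrency=3, concurrent_timeout=0.2))
```

In this sketch the slow request yields an error entry while the other two still return their outputs. Leaving concurrent_timeout as None corresponds to the default concurrentTimeout=None in the signature and waits without a limit.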
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
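The offsets returned for an entity depend on which unit is counted, which is what stringIndexType selects. The sketch below compares two common units, Unicode code points and UTF-16 code units, for the same string. Grapheme (text element) counting needs a third-party segmentation library and is left out, and the accepted values of stringIndexType are not listed in this reference, so none are shown. The helper names are hypothetical.

```python
def code_point_offset(text: str, substring: str) -> int:
    """Offset of `substring` counted in Unicode code points (Python's native str indexing)."""
    return text.index(substring)


def utf16_offset(text: str, substring: str) -> int:
    """Offset of `substring` counted in UTF-16 code units (as in Java or JavaScript strings)."""
    prefix = text[:text.index(substring)]
    # Each UTF-16 code unit is 2 bytes; characters outside the BMP take two units.
    return len(prefix.encode("utf-16-le")) // 2


# The sample contains one emoji outside the Basic Multilingual Plane.
sample = "I \U0001F600 Seattle"
print(code_point_offset(sample, "Seattle"))  # 4
print(utf16_offset(sample, "Seattle"))       # 5
```

The one-unit difference comes from the emoji, which is a single code point but two UTF-16 code units. Slicing a Python string with offsets expressed in a different unit would therefore cut the entity in the wrong place.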
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='EntityDetector_7958cc13daf9_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_7958cc13daf9_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: if set to true, response will contain input and document level statistics.')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.EntityDetectorSDK module
- class synapse.ml.cognitive.EntityDetectorSDK.EntityDetectorSDK(java_obj=None, batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs (object) – disableServiceLogs option
errorCol (str) – column to hold http errors
includeStatistics (object) – includeStatistics option
language (object) – the language code of the text (optional for some services)
modelVersion (object) – modelVersion option
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
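batchSize is documented only as the max size of the buffer. Assuming the buffer groups consecutive input rows into one request, the grouping can be sketched as follows. fixed_size_batches is a hypothetical helper, not part of the library, and the default of 5 is taken from the class signature.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def fixed_size_batches(items: Iterable[T], batch_size: int = 5) -> Iterator[List[T]]:
    """Hypothetical helper: yield consecutive batches of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


texts = [f"document {i}" for i in range(12)]
sizes = [len(b) for b in fixed_size_batches(texts, batch_size=5)]
print(sizes)  # [5, 5, 2]
```

Under that assumption, twelve input rows with the default batchSize would be sent as three requests instead of twelve.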
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- includeStatistics = Param(parent='undefined', name='includeStatistics', doc='ServiceParam: includeStatistics option')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: modelVersion option')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.EntityDetectorV2 module
- class synapse.ml.cognitive.EntityDetectorV2.EntityDetectorV2(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='EntityDetectorV2_87ab05d95641_error', handler=None, language=None, languageCol=None, outputCol='EntityDetectorV2_87ab05d95641_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(concurrency=1, concurrentTimeout=None, errorCol='EntityDetectorV2_87ab05d95641_error', handler=None, language=None, languageCol=None, outputCol='EntityDetectorV2_87ab05d95641_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.FindSimilarFace module
- class synapse.ml.cognitive.FindSimilarFace.FindSimilarFace(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='FindSimilarFace_14c9314727f8_error', faceId=None, faceIdCol=None, faceIds=None, faceIdsCol=None, faceListId=None, faceListIdCol=None, handler=None, largeFaceListId=None, largeFaceListIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, mode=None, modeCol=None, outputCol='FindSimilarFace_14c9314727f8_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
faceId (object) – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
faceIds (object) – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
faceListId (object) – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
handler (object) – Which strategy to use when handling requests
largeFaceListId (object) – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
maxNumOfCandidatesReturned (object) – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
mode (object) – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
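Several of the parameters above carry constraints: faceIds, faceListId and largeFaceListId should not be combined, faceIds is limited to 1000 entries, maxNumOfCandidatesReturned must lie in [1, 1000], and mode accepts two values. A small pre-flight check can catch violations before any request is sent. The function below is a hypothetical sketch and is not part of the library; it only encodes the constraints stated in the parameter descriptions.

```python
from typing import List, Optional

VALID_MODES = ("matchPerson", "matchFace")


def check_find_similar_args(face_ids: Optional[List[str]] = None,
                            face_list_id: Optional[str] = None,
                            large_face_list_id: Optional[str] = None,
                            max_num_of_candidates_returned: int = 20,
                            mode: str = "matchPerson") -> List[str]:
    """Hypothetical pre-flight check; returns a list of problems (empty if none)."""
    problems = []
    provided = [v for v in (face_ids, face_list_id, large_face_list_id) if v is not None]
    if len(provided) > 1:
        problems.append("provide only one of faceIds, faceListId, largeFaceListId")
    if face_ids is not None and len(face_ids) > 1000:
        problems.append("faceIds is limited to 1000 entries")
    if not 1 <= max_num_of_candidates_returned <= 1000:
        problems.append("maxNumOfCandidatesReturned must be in [1, 1000]")
    if mode not in VALID_MODES:
        problems.append("mode must be 'matchPerson' or 'matchFace'")
    return problems


# Two candidate sources at once is flagged; a single source passes.
print(check_find_similar_args(face_ids=["id-1"], face_list_id="my-list"))
print(check_find_similar_args(face_list_id="my-list"))
```

The defaults in the sketch (20 and matchPerson) mirror the service defaults quoted in the descriptions. The constructor itself defaults both parameters to None, which presumably leaves the choice to the service.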
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceId = Param(parent='undefined', name='faceId', doc='ServiceParam: faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.')
- faceIds = Param(parent='undefined', name='faceIds', doc='ServiceParam: An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.')
- faceListId = Param(parent='undefined', name='faceListId', doc='ServiceParam: An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFaceId()[source]
- Returns
faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
- Return type
faceId
- getFaceIds()[source]
- Returns
An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
faceIds
- getFaceListId()[source]
- Returns
An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
faceListId
- getLargeFaceListId()[source]
- Returns
An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
largeFaceListId
- getMaxNumOfCandidatesReturned()[source]
- Returns
Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
- Return type
maxNumOfCandidatesReturned
- getMode()[source]
- Returns
Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
- Return type
mode
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- largeFaceListId = Param(parent='undefined', name='largeFaceListId', doc='ServiceParam: An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.')
- maxNumOfCandidatesReturned = Param(parent='undefined', name='maxNumOfCandidatesReturned', doc='ServiceParam: Optional parameter. The number of top similar faces returned. The valid range is [1, 1000].It defaults to 20.')
- mode = Param(parent='undefined', name='mode', doc="ServiceParam: Optional parameter. Similar face searching mode. It can be 'matchPerson' or 'matchFace'. It defaults to 'matchPerson'.")
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFaceId(value)[source]
- Parameters
faceId – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
- setFaceIdCol(value)[source]
- Parameters
faceId – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
- setFaceIds(value)[source]
- Parameters
faceIds – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setFaceIdsCol(value)[source]
- Parameters
faceIds – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setFaceListId(value)[source]
- Parameters
faceListId – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setFaceListIdCol(value)[source]
- Parameters
faceListId – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setLargeFaceListId(value)[source]
- Parameters
largeFaceListId – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setLargeFaceListIdCol(value)[source]
- Parameters
largeFaceListId – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setMaxNumOfCandidatesReturned(value)[source]
- Parameters
maxNumOfCandidatesReturned – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
- setMaxNumOfCandidatesReturnedCol(value)[source]
- Parameters
maxNumOfCandidatesReturned – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
- setMode(value)[source]
- Parameters
mode – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
- setModeCol(value)[source]
- Parameters
mode – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='FindSimilarFace_14c9314727f8_error', faceId=None, faceIdCol=None, faceIds=None, faceIdsCol=None, faceListId=None, faceListIdCol=None, handler=None, largeFaceListId=None, largeFaceListIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, mode=None, modeCol=None, outputCol='FindSimilarFace_14c9314727f8_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
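The constraints stated in the parameter descriptions above (only one of faceIds, faceListId and largeFaceListId; at most 1000 faceIds; maxNumOfCandidatesReturned in [1, 1000]; mode one of two values) can be checked before a request is built. The following is a minimal standalone sketch, not part of synapse.ml: the helper name check_find_similar_args and its argument names are hypothetical, and whether the service enforces the same rules in the same way is an assumption.

```python
from typing import List, Optional, Sequence

# Values taken from the parameter descriptions above.
VALID_MODES = {"matchPerson", "matchFace"}
MAX_FACE_IDS = 1000


def check_find_similar_args(
    face_ids: Optional[Sequence[str]] = None,
    face_list_id: Optional[str] = None,
    large_face_list_id: Optional[str] = None,
    max_candidates: Optional[int] = None,
    mode: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: return a list of problems (empty means OK)."""
    problems = []
    # faceIds, faceListId and largeFaceListId should not be provided together.
    sources = [s for s in (face_ids, face_list_id, large_face_list_id) if s is not None]
    if len(sources) > 1:
        problems.append("provide only one of faceIds, faceListId, largeFaceListId")
    # The number of candidate faceIds is limited to 1000.
    if face_ids is not None and len(face_ids) > MAX_FACE_IDS:
        problems.append("faceIds is limited to 1000 entries")
    # maxNumOfCandidatesReturned is optional; when set it must be in [1, 1000].
    if max_candidates is not None and not 1 <= max_candidates <= 1000:
        problems.append("maxNumOfCandidatesReturned must be in [1, 1000]")
    # mode is optional; when set it must be one of the two documented values.
    if mode is not None and mode not in VALID_MODES:
        problems.append("mode must be 'matchPerson' or 'matchFace'")
    return problems
```

An empty result means the combination is consistent with the documented limits; it does not check that a faceId is still within its 24-hour lifetime.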
synapse.ml.cognitive.FitMultivariateAnomaly module
- class synapse.ml.cognitive.FitMultivariateAnomaly.FitMultivariateAnomaly(java_obj=None, alignMode=None, backoffs=[100, 500, 1000], connectionString=None, containerName=None, diagnosticsInfo=None, displayName=None, endTime=None, endpoint=None, errorCol='FitMultivariateAnomaly_a9311a34e5b0_error', fillNAMethod=None, initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, outputCol='FitMultivariateAnomaly_a9311a34e5b0_output', paddingValue=None, pollingDelay=300, sasToken=None, slidingWindow=None, startTime=None, storageKey=None, storageName=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timestampCol='timestamp', url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaEstimator
- Parameters
alignMode (str) – An optional field that indicates how different variables are aligned to the same time range, which is required by the model. One of {Inner, Outer}.
backoffs (list) – array of backoffs to use in the handler
connectionString (str) – Connection String for your storage account used for uploading files.
containerName (str) – Container that will be used to upload files to.
diagnosticsInfo (object) – diagnosticsInfo for training a multivariate anomaly detection model
displayName (str) – optional field, name of the model
endTime (str) – A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
endpoint (str) – End Point for your storage account used for uploading files.
errorCol (str) – column to hold http errors
fillNAMethod (str) – An optional field that indicates how missing values are filled. Cannot be set to NotFill when alignMode is Outer. One of {Previous, Subsequent, Linear, Zero, Fixed}.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
inputCols (list) – The names of the input columns
intermediateSaveDir (str) – Name of the directory in which to save the intermediate data produced during training.
maxPollingRetries (int) – number of times to poll
outputCol (str) – The name of the output column
paddingValue (int) – optional field, is only useful if FillNAMethod is set to Fixed.
pollingDelay (int) – number of milliseconds to wait between polling
sasToken (str) – SAS Token for your storage account used for uploading files.
slidingWindow (int) – An optional field, indicates how many history points will be used to determine the anomaly score of one subsequent point.
startTime (str) – A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
storageKey (str) – Storage Key for your storage account used for uploading files.
storageName (str) – Storage Name for your storage account used for uploading files.
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set to true to suppress the maximum retries exception and report it in the error column
timestampCol (str) – Timestamp column name
url (str) – Url of the service
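Several of the parameters above constrain one another: startTime and endTime are required, fillNAMethod cannot be NotFill when alignMode is Outer, and paddingValue only matters when fillNAMethod is Fixed. The following standalone sketch checks those rules on plain values. The function name check_fit_settings is hypothetical and not part of synapse.ml, and treating NotFill as an accepted fillNAMethod value (it is mentioned but not listed in the option set) is an assumption.

```python
from typing import List, Optional

# Option sets taken from the parameter descriptions above.
ALIGN_MODES = {"Inner", "Outer"}
# Assumption: NotFill is accepted in addition to the five listed values.
FILL_METHODS = {"Previous", "Subsequent", "Linear", "Zero", "Fixed", "NotFill"}


def check_fit_settings(
    start_time: Optional[str],
    end_time: Optional[str],
    align_mode: Optional[str] = None,
    fill_na_method: Optional[str] = None,
    padding_value: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: return a list of problems (empty means OK)."""
    problems = []
    # startTime and endTime are described as required fields.
    if start_time is None or end_time is None:
        problems.append("startTime and endTime are required")
    if align_mode is not None and align_mode not in ALIGN_MODES:
        problems.append("alignMode must be Inner or Outer")
    if fill_na_method is not None and fill_na_method not in FILL_METHODS:
        problems.append("unknown fillNAMethod")
    # NotFill is not allowed together with alignMode Outer.
    if align_mode == "Outer" and fill_na_method == "NotFill":
        problems.append("fillNAMethod cannot be NotFill when alignMode is Outer")
    # paddingValue is only useful with fillNAMethod Fixed.
    if padding_value is not None and fill_na_method != "Fixed":
        problems.append("paddingValue is only useful when fillNAMethod is Fixed")
    return problems
```

The sketch does not parse the timestamps; the date-time format the service expects is not stated here, so that check is left out.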
- alignMode = Param(parent='undefined', name='alignMode', doc='An optional field, indicates how we align different variables into the same time-range which is required by the model.{Inner, Outer}')
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- connectionString = Param(parent='undefined', name='connectionString', doc='Connection String for your storage account used for uploading files.')
- containerName = Param(parent='undefined', name='containerName', doc='Container that will be used to upload files to.')
- diagnosticsInfo = Param(parent='undefined', name='diagnosticsInfo', doc='diagnosticsInfo for training a multivariate anomaly detection model')
- displayName = Param(parent='undefined', name='displayName', doc='optional field, name of the model')
- endTime = Param(parent='undefined', name='endTime', doc='A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- endpoint = Param(parent='undefined', name='endpoint', doc='End Point for your storage account used for uploading files.')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fillNAMethod = Param(parent='undefined', name='fillNAMethod', doc='An optional field, indicates how missed values will be filled with. Can not be set to NotFill, when alignMode is Outer.{Previous, Subsequent, Linear, Zero, Fixed}')
- getAlignMode()[source]
- Returns
An optional field that indicates how different variables are aligned to the same time range, which is required by the model. One of {Inner, Outer}.
- Return type
alignMode
- getConnectionString()[source]
- Returns
Connection String for your storage account used for uploading files.
- Return type
connectionString
- getContainerName()[source]
- Returns
Container that will be used to upload files to.
- Return type
containerName
- getDiagnosticsInfo()[source]
- Returns
diagnosticsInfo for training a multivariate anomaly detection model
- Return type
diagnosticsInfo
- getEndTime()[source]
- Returns
A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
- Return type
endTime
- getEndpoint()[source]
- Returns
End Point for your storage account used for uploading files.
- Return type
endpoint
- getFillNAMethod()[source]
- Returns
An optional field that indicates how missing values are filled. Cannot be set to NotFill when alignMode is Outer. One of {Previous, Subsequent, Linear, Zero, Fixed}.
- Return type
fillNAMethod
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getIntermediateSaveDir()[source]
- Returns
Name of the directory in which to save the intermediate data produced during training.
- Return type
intermediateSaveDir
- getPaddingValue()[source]
- Returns
optional field, is only useful if FillNAMethod is set to Fixed.
- Return type
paddingValue
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSasToken()[source]
- Returns
SAS Token for your storage account used for uploading files.
- Return type
sasToken
- getSlidingWindow()[source]
- Returns
An optional field, indicates how many history points will be used to determine the anomaly score of one subsequent point.
- Return type
slidingWindow
- getStartTime()[source]
- Returns
A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
- Return type
startTime
- getStorageKey()[source]
- Returns
Storage Key for your storage account used for uploading files.
- Return type
storageKey
- getStorageName()[source]
- Returns
Storage Name for your storage account used for uploading files.
- Return type
storageName
- getSuppressMaxRetriesExceededException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- inputCols = Param(parent='undefined', name='inputCols', doc='The names of the input columns')
- intermediateSaveDir = Param(parent='undefined', name='intermediateSaveDir', doc='Directory name of which you want to save the intermediate data produced while training.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- paddingValue = Param(parent='undefined', name='paddingValue', doc='optional field, is only useful if FillNAMethod is set to Fixed.')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- sasToken = Param(parent='undefined', name='sasToken', doc='SAS Token for your storage account used for uploading files.')
- setAlignMode(value)[source]
- Parameters
alignMode – An optional field that indicates how different variables are aligned to the same time range, which is required by the model. One of {Inner, Outer}.
- setConnectionString(value)[source]
- Parameters
connectionString – Connection String for your storage account used for uploading files.
- setContainerName(value)[source]
- Parameters
containerName – Container that will be used to upload files to.
- setDiagnosticsInfo(value)[source]
- Parameters
diagnosticsInfo – diagnosticsInfo for training a multivariate anomaly detection model
- setEndTime(value)[source]
- Parameters
endTime – A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
- setEndpoint(value)[source]
- Parameters
endpoint – End Point for your storage account used for uploading files.
- setFillNAMethod(value)[source]
- Parameters
fillNAMethod – An optional field that indicates how missing values are filled. Cannot be set to NotFill when alignMode is Outer. One of {Previous, Subsequent, Linear, Zero, Fixed}.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setIntermediateSaveDir(value)[source]
- Parameters
intermediateSaveDir – Name of the directory in which to save the intermediate data produced during training.
- setPaddingValue(value)[source]
- Parameters
paddingValue – optional field, is only useful if FillNAMethod is set to Fixed.
- setParams(alignMode=None, backoffs=[100, 500, 1000], connectionString=None, containerName=None, diagnosticsInfo=None, displayName=None, endTime=None, endpoint=None, errorCol='FitMultivariateAnomaly_a9311a34e5b0_error', fillNAMethod=None, initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, outputCol='FitMultivariateAnomaly_a9311a34e5b0_output', paddingValue=None, pollingDelay=300, sasToken=None, slidingWindow=None, startTime=None, storageKey=None, storageName=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timestampCol='timestamp', url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSasToken(value)[source]
- Parameters
sasToken – SAS Token for your storage account used for uploading files.
- setSlidingWindow(value)[source]
- Parameters
slidingWindow – An optional field, indicates how many history points will be used to determine the anomaly score of one subsequent point.
- setStartTime(value)[source]
- Parameters
startTime – A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
- setStorageKey(value)[source]
- Parameters
storageKey – Storage Key for your storage account used for uploading files.
- setStorageName(value)[source]
- Parameters
storageName – Storage Name for your storage account used for uploading files.
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set to true to suppress the maximum retries exception and report it in the error column
- slidingWindow = Param(parent='undefined', name='slidingWindow', doc='An optional field, indicates how many history points will be used to determine the anomaly score of one subsequent point.')
- startTime = Param(parent='undefined', name='startTime', doc='A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- storageKey = Param(parent='undefined', name='storageKey', doc='Storage Key for your storage account used for uploading files.')
- storageName = Param(parent='undefined', name='storageName', doc='Storage Name for your storage account used for uploading files.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set to true to suppress the maximum retries exception and report it in the error column')
- timestampCol = Param(parent='undefined', name='timestampCol', doc='Timestamp column name')
- url = Param(parent='undefined', name='url', doc='Url of the service')
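The polling parameters of FitMultivariateAnomaly (initialPollingDelay, pollingDelay, maxPollingRetries) together bound how long training is polled before the retry limit is hit. The sketch below estimates that bound from the documented defaults. It assumes one initial wait followed by one pollingDelay wait per retry and ignores request latency; the actual scheduling inside the library may differ, and the function name is hypothetical.

```python
def worst_case_polling_ms(
    initial_polling_delay: int = 300,
    polling_delay: int = 300,
    max_polling_retries: int = 1000,
) -> int:
    """Estimate the longest total wait between polls, in milliseconds.

    Assumption: one initial delay, then one polling delay per retry.
    """
    if min(initial_polling_delay, polling_delay, max_polling_retries) < 0:
        raise ValueError("delays and retries must be non-negative")
    return initial_polling_delay + max_polling_retries * polling_delay


# With the documented defaults (300 ms, 300 ms, 1000 retries).
budget_ms = worst_case_polling_ms()
print(f"{budget_ms} ms, about {budget_ms / 1000:.1f} s")  # 300300 ms, about 300.3 s
```

Under these assumptions the defaults allow roughly five minutes of polling, so a longer training run would need a larger pollingDelay or maxPollingRetries, or suppressMaxRetriesExceededException set to true to have the failure reported in the error column instead of raised.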
synapse.ml.cognitive.FormOntologyLearner module
- class synapse.ml.cognitive.FormOntologyLearner.FormOntologyLearner(java_obj=None, inputCol=None, outputCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaEstimator
- Parameters
inputCol (str) – The name of the input column
outputCol (str) – The name of the output column
- inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
synapse.ml.cognitive.FormOntologyTransformer module
- class synapse.ml.cognitive.FormOntologyTransformer.FormOntologyTransformer(java_obj=None, inputCol=None, ontology=None, outputCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaModel
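- Parameters
inputCol (str) – The name of the input column
ontology – The ontology to cast values to
outputCol (str) – The name of the output column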
- inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
- ontology = Param(parent='undefined', name='ontology', doc='The ontology to cast values to')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
synapse.ml.cognitive.GenerateThumbnails module
- class synapse.ml.cognitive.GenerateThumbnails.GenerateThumbnails(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='GenerateThumbnails_d864601fc100_error', handler=None, height=None, heightCol=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, outputCol='GenerateThumbnails_d864601fc100_output', smartCropping=None, smartCroppingCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, width=None, widthCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
height (object) – the desired height of the image
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
outputCol (str) – The name of the output column
smartCropping (object) – whether to intelligently crop the image
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
width (object) – the desired width of the image
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getSmartCropping()[source]
- Returns
whether to intelligently crop the image
- Return type
smartCropping
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- height = Param(parent='undefined', name='height', doc='ServiceParam: the desired height of the image')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='GenerateThumbnails_d864601fc100_error', handler=None, height=None, heightCol=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, outputCol='GenerateThumbnails_d864601fc100_output', smartCropping=None, smartCroppingCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, width=None, widthCol=None)[source]
Set the (keyword only) parameters
- setSmartCroppingCol(value)[source]
- Parameters
smartCropping – whether to intelligently crop the image
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- smartCropping = Param(parent='undefined', name='smartCropping', doc='ServiceParam: whether to intelligently crop the image')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- width = Param(parent='undefined', name='width', doc='ServiceParam: the desired width of the image')
synapse.ml.cognitive.GetCustomModel module
- class synapse.ml.cognitive.GetCustomModel.GetCustomModel(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='GetCustomModel_ba92f2acdbbb_error', handler=None, includeKeys=None, includeKeysCol=None, modelId=None, modelIdCol=None, outputCol='GetCustomModel_ba92f2acdbbb_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
includeKeys (object) – Include list of extracted keys in model information.
modelId (object) – Model identifier.
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeKeys()[source]
- Returns
Include list of extracted keys in model information.
- Return type
includeKeys
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- includeKeys = Param(parent='undefined', name='includeKeys', doc='ServiceParam: Include list of extracted keys in model information.')
- modelId = Param(parent='undefined', name='modelId', doc='ServiceParam: Model identifier.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeKeys(value)[source]
- Parameters
includeKeys – Include list of extracted keys in model information.
- setIncludeKeysCol(value)[source]
- Parameters
includeKeys – Include list of extracted keys in model information.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='GetCustomModel_ba92f2acdbbb_error', handler=None, includeKeys=None, includeKeysCol=None, modelId=None, modelIdCol=None, outputCol='GetCustomModel_ba92f2acdbbb_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.GroupFaces module
- class synapse.ml.cognitive.GroupFaces.GroupFaces(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='GroupFaces_aaf3c4aaf003_error', faceIds=None, faceIdsCol=None, handler=None, outputCol='GroupFaces_aaf3c4aaf003_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
faceIds (object) – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
handler (object) – Which strategy to use when handling requests
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceIds = Param(parent='undefined', name='faceIds', doc='ServiceParam: Array of candidate faceId created by Face - Detect. The maximum is 1000 faces.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFaceIds()[source]
- Returns
Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
- Return type
faceIds
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFaceIds(value)[source]
- Parameters
faceIds – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
- setFaceIdsCol(value)[source]
- Parameters
faceIds – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='GroupFaces_aaf3c4aaf003_error', faceIds=None, faceIdsCol=None, handler=None, outputCol='GroupFaces_aaf3c4aaf003_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.HealthcareSDK module
- class synapse.ml.cognitive.HealthcareSDK.HealthcareSDK(java_obj=None, batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs (object) – disableServiceLogs option
errorCol (str) – column to hold http errors
includeStatistics (object) – includeStatistics option
language (object) – the language code of the text (optional for some services)
modelVersion (object) – modelVersion option
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- includeStatistics = Param(parent='undefined', name='includeStatistics', doc='ServiceParam: includeStatistics option')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: modelVersion option')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.IdentifyFaces module
- class synapse.ml.cognitive.IdentifyFaces.IdentifyFaces(java_obj=None, concurrency=1, concurrentTimeout=None, confidenceThreshold=None, confidenceThresholdCol=None, errorCol='IdentifyFaces_b41be4f19b96_error', faceIds=None, faceIdsCol=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, outputCol='IdentifyFaces_b41be4f19b96_output', personGroupId=None, personGroupIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
confidenceThreshold (object) – Optional parameter. Customized identification confidence threshold, in the range of [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note there is no guarantee of this threshold value working on other data or after algorithm updates.
errorCol (str) – column to hold http errors
faceIds (object) – Array of query face faceIds, created by Face - Detect. Each face is identified independently. The number of faceIds must be in the range [1, 10].
handler (object) – Which strategy to use when handling requests
largePersonGroupId (object) – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
maxNumOfCandidatesReturned (object) – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
outputCol (str) – The name of the output column
personGroupId (object) – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- confidenceThreshold = Param(parent='undefined', name='confidenceThreshold', doc='ServiceParam: Optional parameter.Customized identification confidence threshold, in the range of [0, 1].Advanced user can tweak this value to override defaultinternal threshold for better precision on their scenario data.Note there is no guarantee of this threshold value workingon other data and after algorithm updates.')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceIds = Param(parent='undefined', name='faceIds', doc='ServiceParam: Array of query faces faceIds, created by the Face - Detect. Each of the faces are identified independently. The valid number of faceIds is between [1, 10]. ')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getConfidenceThreshold()[source]
- Returns
Optional parameter. Customized identification confidence threshold, in the range [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note that there is no guarantee this threshold value will work on other data or after algorithm updates.
- Return type
confidenceThreshold
- getFaceIds()[source]
- Returns
Array of faceIds for the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be between 1 and 10.
- Return type
faceIds
- getLargePersonGroupId()[source]
- Returns
largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
largePersonGroupId
- getMaxNumOfCandidatesReturned()[source]
- Returns
The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
- Return type
maxNumOfCandidatesReturned
- getPersonGroupId()[source]
- Returns
personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
personGroupId
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- largePersonGroupId = Param(parent='undefined', name='largePersonGroupId', doc='ServiceParam: largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- maxNumOfCandidatesReturned = Param(parent='undefined', name='maxNumOfCandidatesReturned', doc='ServiceParam: The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- personGroupId = Param(parent='undefined', name='personGroupId', doc='ServiceParam: personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setConfidenceThreshold(value)[source]
- Parameters
confidenceThreshold – Optional parameter. Customized identification confidence threshold, in the range [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note that there is no guarantee this threshold value will work on other data or after algorithm updates.
- setConfidenceThresholdCol(value)[source]
- Parameters
confidenceThreshold – Optional parameter. Customized identification confidence threshold, in the range [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note that there is no guarantee this threshold value will work on other data or after algorithm updates.
- setFaceIds(value)[source]
- Parameters
faceIds – Array of faceIds for the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be between 1 and 10.
- setFaceIdsCol(value)[source]
- Parameters
faceIds – Array of faceIds for the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be between 1 and 10.
- setLargePersonGroupId(value)[source]
- Parameters
largePersonGroupId – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setLargePersonGroupIdCol(value)[source]
- Parameters
largePersonGroupId – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setMaxNumOfCandidatesReturned(value)[source]
- Parameters
maxNumOfCandidatesReturned – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
- setMaxNumOfCandidatesReturnedCol(value)[source]
- Parameters
maxNumOfCandidatesReturned – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
- setParams(concurrency=1, concurrentTimeout=None, confidenceThreshold=None, confidenceThresholdCol=None, errorCol='IdentifyFaces_b41be4f19b96_error', faceIds=None, faceIdsCol=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, outputCol='IdentifyFaces_b41be4f19b96_output', personGroupId=None, personGroupIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPersonGroupId(value)[source]
- Parameters
personGroupId – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setPersonGroupIdCol(value)[source]
- Parameters
personGroupId – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
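The IdentifyFaces parameter descriptions above state several constraints: 1 to 10 faceIds, maxNumOfCandidatesReturned between 1 and 100, confidenceThreshold within [0, 1], and personGroupId and largePersonGroupId not supplied together. A minimal sketch of checking those constraints before configuring the transformer could look like the following. The function `build_identify_request` and the dict it returns are hypothetical illustrations and are not part of synapse.ml.

```python
from typing import Optional, Sequence


def build_identify_request(
    face_ids: Sequence[str],
    person_group_id: Optional[str] = None,
    large_person_group_id: Optional[str] = None,
    max_candidates: int = 10,
    confidence_threshold: Optional[float] = None,
) -> dict:
    """Hypothetical helper: validate IdentifyFaces inputs and collect them in a dict."""
    # The valid number of faceIds is between 1 and 10.
    if not 1 <= len(face_ids) <= 10:
        raise ValueError("faceIds must contain between 1 and 10 ids")
    # personGroupId and largePersonGroupId should not be provided at the same time.
    if person_group_id is not None and large_person_group_id is not None:
        raise ValueError("provide personGroupId or largePersonGroupId, not both")
    # maxNumOfCandidatesReturned ranges from 1 to 100 (default 10).
    if not 1 <= max_candidates <= 100:
        raise ValueError("maxNumOfCandidatesReturned must be between 1 and 100")
    # confidenceThreshold is optional; when given it must lie in [0, 1].
    if confidence_threshold is not None and not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidenceThreshold must be in the range [0, 1]")

    body = {"faceIds": list(face_ids), "maxNumOfCandidatesReturned": max_candidates}
    if person_group_id is not None:
        body["personGroupId"] = person_group_id
    if large_person_group_id is not None:
        body["largePersonGroupId"] = large_person_group_id
    if confidence_threshold is not None:
        body["confidenceThreshold"] = confidence_threshold
    return body


print(build_identify_request(["face-1", "face-2"], person_group_id="group-a"))
```

Validating on the driver like this is only a convenience; the service itself remains the authority on what it accepts.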
synapse.ml.cognitive.KeyPhraseExtractor module
- class synapse.ml.cognitive.KeyPhraseExtractor.KeyPhraseExtractor(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='KeyPhraseExtractor_cc81a1eef0f1_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_cc81a1eef0f1_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
showStats (object) – if set to true, response will contain input and document level statistics.
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='KeyPhraseExtractor_cc81a1eef0f1_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_cc81a1eef0f1_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: if set to true, response will contain input and document level statistics.')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
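The KeyPhraseExtractor signature pairs each ServiceParam keyword (language, modelVersion, showStats, stringIndexType, subscriptionKey, text) with a `...Col` counterpart. The sketch below assumes the usual reading of that pairing: the plain keyword carries one constant for every row, while the `...Col` keyword names a DataFrame column that holds a per-row value, and only one of the two is set for a given parameter. The helper `service_param_kwargs` and the column name `review` are hypothetical and not part of synapse.ml.

```python
from typing import Any, Dict, Optional


def service_param_kwargs(
    name: str, value: Any = None, col: Optional[str] = None
) -> Dict[str, Any]:
    """Hypothetical helper: choose between `name=value` and `nameCol=column`."""
    if value is not None and col is not None:
        raise ValueError(f"set either {name} or {name}Col, not both")
    if col is not None:
        # Per-row value read from a column.
        return {f"{name}Col": col}
    if value is not None:
        # One constant applied to every row.
        return {name: value}
    return {}


kwargs: Dict[str, Any] = {}
kwargs.update(service_param_kwargs("text", col="review"))     # text varies per row
kwargs.update(service_param_kwargs("language", value="en"))   # same language for all rows
kwargs.update(service_param_kwargs("showStats", value=True))
print(kwargs)
```

With synapse.ml installed, a dict built this way could be passed on as `KeyPhraseExtractor(**kwargs)` or `setParams(**kwargs)`, since both accept these keywords according to the signatures above.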
synapse.ml.cognitive.KeyPhraseExtractorSDK module
- class synapse.ml.cognitive.KeyPhraseExtractorSDK.KeyPhraseExtractorSDK(java_obj=None, batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs (object) – disableServiceLogs option
errorCol (str) – column to hold http errors
includeStatistics (object) – includeStatistics option
language (object) – the language code of the text (optional for some services)
modelVersion (object) – modelVersion option
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- includeStatistics = Param(parent='undefined', name='includeStatistics', doc='ServiceParam: includeStatistics option')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: modelVersion option')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
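The SDK variants take a batchSize parameter (default 5), described only as the max size of the buffer. How the transformer groups rows internally is not documented here, so the following is just a plain-Python illustration of what "at most batchSize items per buffer" means. The function name `fixed_batches` and the sample documents are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def fixed_batches(rows: Iterable[T], batch_size: int = 5) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items from `rows`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        # Take up to batch_size items; an empty list means the input is exhausted.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


docs = [f"doc-{i}" for i in range(12)]
batches = list(fixed_batches(docs, batch_size=5))
print([len(b) for b in batches])  # [5, 5, 2]
```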
synapse.ml.cognitive.KeyPhraseExtractorV2 module
- class synapse.ml.cognitive.KeyPhraseExtractorV2.KeyPhraseExtractorV2(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='KeyPhraseExtractorV2_f07e4dfc4c82_error', handler=None, language=None, languageCol=None, outputCol='KeyPhraseExtractorV2_f07e4dfc4c82_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(concurrency=1, concurrentTimeout=None, errorCol='KeyPhraseExtractorV2_f07e4dfc4c82_error', handler=None, language=None, languageCol=None, outputCol='KeyPhraseExtractorV2_f07e4dfc4c82_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
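Each class documents setParams as setting "(keyword only) parameters". As a rough sketch of what that calling convention implies, the stand-in class below rejects positional arguments and updates only the keywords that were actually passed. `SketchTransformer` is a hypothetical illustration, not the synapse.ml implementation, and it models only a few of the documented parameters.

```python
from typing import Any, Dict


class SketchTransformer:
    """Hypothetical stand-in that mimics a keyword-only setParams."""

    def __init__(self) -> None:
        # Defaults copied from the documented signature for a few parameters.
        self._params: Dict[str, Any] = {
            "concurrency": 1,
            "concurrentTimeout": None,
            "language": None,
            "timeout": 60.0,
        }

    def setParams(self, *args: Any, **kwargs: Any) -> "SketchTransformer":
        # Keyword-only: positional arguments are rejected outright.
        if args:
            raise TypeError("setParams forces keyword arguments")
        unknown = set(kwargs) - set(self._params)
        if unknown:
            raise TypeError(f"unknown parameters: {sorted(unknown)}")
        # Only the keywords that were passed are changed; the rest keep their values.
        self._params.update(kwargs)
        return self


t = SketchTransformer().setParams(timeout=30.0, language="en")
print(t._params)
```

Returning `self` lets calls be chained, which is the same style the individual `set...` methods are commonly used in.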
synapse.ml.cognitive.LanguageDetector module
- class synapse.ml.cognitive.LanguageDetector.LanguageDetector(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='LanguageDetector_e5c9240a65e9_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_e5c9240a65e9_output', showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
showStats (object) – if set to true, response will contain input and document level statistics.
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='LanguageDetector_e5c9240a65e9_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_e5c9240a65e9_output', showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: if set to true, response will contain input and document level statistics.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
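The concurrency and concurrentTimeout parameters are described as the max number of concurrent calls and the max number of seconds to wait on futures. The transformer's internals are not shown in this reference, so the sketch below only illustrates the general idea with Python's standard `concurrent.futures` module: a bounded pool of workers plus an overall wait limit. `call_concurrently` and `fake_detect` are hypothetical; `fake_detect` stands in for a service call and does no real language detection.

```python
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Callable, List, Optional, Sequence


def call_concurrently(
    fn: Callable[[str], str],
    inputs: Sequence[str],
    concurrency: int = 1,
    concurrent_timeout: Optional[float] = None,
) -> List[Optional[str]]:
    """Run `fn` over `inputs` with at most `concurrency` calls in flight."""
    pool = ThreadPoolExecutor(max_workers=concurrency)
    try:
        futures = [pool.submit(fn, item) for item in inputs]
        # Wait for all futures, but no longer than concurrent_timeout seconds
        # (None means wait indefinitely).
        done, not_done = wait(futures, timeout=concurrent_timeout)
        for future in not_done:
            future.cancel()  # only prevents calls that have not started yet
        # Results keep input order; calls that did not finish in time yield None.
        return [f.result() if f in done else None for f in futures]
    finally:
        pool.shutdown(wait=False)


def fake_detect(text: str) -> str:
    """Stand-in for a remote call; not a real language detector."""
    return "en" if text.isascii() else "unknown"


print(call_concurrently(fake_detect, ["hello", "héllo"], concurrency=2, concurrent_timeout=5.0))
```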
synapse.ml.cognitive.LanguageDetectorSDK module
- class synapse.ml.cognitive.LanguageDetectorSDK.LanguageDetectorSDK(java_obj=None, batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs (object) – disableServiceLogs option
errorCol (str) – column to hold http errors
includeStatistics (object) – includeStatistics option
language (object) – the language code of the text (optional for some services)
modelVersion (object) – modelVersion option
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- includeStatistics = Param(parent='undefined', name='includeStatistics', doc='ServiceParam: includeStatistics option')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: modelVersion option')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.LanguageDetectorV2 module
- class synapse.ml.cognitive.LanguageDetectorV2.LanguageDetectorV2(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='LanguageDetectorV2_a03d227bff97_error', handler=None, language=None, languageCol=None, outputCol='LanguageDetectorV2_a03d227bff97_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
languageCol – name of the column that holds the language code of the text (optional for some services)
- setParams(concurrency=1, concurrentTimeout=None, errorCol='LanguageDetectorV2_a03d227bff97_error', handler=None, language=None, languageCol=None, outputCol='LanguageDetectorV2_a03d227bff97_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
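Several parameters above come in pairs, such as language/languageCol and text/textCol. The sketch below is a plain-Python illustration, not SynapseML code, of the convention assumed here: the bare name fixes one value for every row, while the Col variant names a column that is read per row. The resolve helper and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, Optional


def resolve(rows: List[Dict[str, Any]],
            value: Optional[Any] = None,
            col: Optional[str] = None) -> List[Any]:
    """Return the per-row setting from either a constant or a column name.

    Hypothetical helper that mirrors the assumed meaning of `language=`
    versus `languageCol=`; it is not part of the synapse.ml API.
    """
    if col is not None:
        # Column form: look the value up in each row.
        return [row[col] for row in rows]
    # Constant form: the same value applies to every row.
    return [value] * len(rows)


rows = [
    {"text": "Hello world", "lang": "en"},
    {"text": "Bonjour tout le monde", "lang": "fr"},
]

constant = resolve(rows, value="en")   # analogous to language="en"
per_row = resolve(rows, col="lang")    # analogous to languageCol="lang"
print(constant, per_row)
```

The column form is the one to reach for when rows differ, for example a DataFrame mixing documents in several languages.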
synapse.ml.cognitive.ListCustomModels module
- class synapse.ml.cognitive.ListCustomModels.ListCustomModels(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='ListCustomModels_6e1ef12926b2_error', handler=None, op=None, opCol=None, outputCol='ListCustomModels_6e1ef12926b2_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
op (object) – Specify whether to return summary or full list of models.
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- op = Param(parent='undefined', name='op', doc='ServiceParam: Specify whether to return summary or full list of models.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='ListCustomModels_6e1ef12926b2_error', handler=None, op=None, opCol=None, outputCol='ListCustomModels_6e1ef12926b2_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
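The op parameter selects between a summary and a full listing. As a rough sketch of how such a switch usually travels to a REST service, the snippet below appends it as a query parameter. The accepted values ("summary", "full") are inferred from the wording of the description, and the endpoint host and path are placeholders modelled on a Form Recognizer v2.1 style layout; treat all of them as assumptions rather than as the request SynapseML actually sends.

```python
from urllib.parse import urlencode

# Assumed values, taken from the wording of the `op` description.
ALLOWED_OPS = ("summary", "full")


def list_models_url(base_url: str, op: str) -> str:
    """Append the `op` query parameter to a service URL (hypothetical helper)."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"op must be one of {ALLOWED_OPS}, got {op!r}")
    return f"{base_url}?{urlencode({'op': op})}"


# Placeholder endpoint; substitute the `url` of your own resource.
base = "https://example.cognitiveservices.azure.com/formrecognizer/v2.1/custom/models"
print(list_models_url(base, "summary"))
```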
synapse.ml.cognitive.NER module
- class synapse.ml.cognitive.NER.NER(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='NER_ed9225701f0e_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_ed9225701f0e_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
showStats (object) – if set to true, response will contain input and document level statistics.
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
languageCol – name of the column that holds the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersionCol – name of the column that holds the model version. This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='NER_ed9225701f0e_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_ed9225701f0e_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStatsCol – name of the column that holds the flag; if set to true, response will contain input and document level statistics.
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexTypeCol – name of the column that holds the string index type, which specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: if set to true, response will contain input and document level statistics.')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
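The stringIndexType setting matters whenever the text contains characters outside the Basic Multilingual Plane, because the same entity then starts at a different offset depending on the unit being counted. The following is a minimal standard-library sketch that compares Unicode code points with UTF-16 code units. Grapheme (text element) counting needs a segmentation library and is not shown. The sample string and helper names are illustrative only.

```python
def offset_code_points(text: str, needle: str) -> int:
    """Offset of `needle` counted in Unicode code points (Python's native unit)."""
    return text.index(needle)


def offset_utf16_units(text: str, needle: str) -> int:
    """Offset of `needle` counted in UTF-16 code units."""
    prefix = text[: text.index(needle)]
    # Each UTF-16 code unit is two bytes; characters above U+FFFF take two units.
    return len(prefix.encode("utf-16-le")) // 2


sample = "ok \U0001F44D Seattle"  # one emoji above U+FFFF precedes the entity

print(offset_code_points(sample, "Seattle"))  # 5
print(offset_utf16_units(sample, "Seattle"))  # 6
```

When slicing the original text with offsets from the service response, count in the same unit that was requested, otherwise entities after an emoji or similar character will be cut one position off.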
synapse.ml.cognitive.NERSDK module
- class synapse.ml.cognitive.NERSDK.NERSDK(java_obj=None, batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs (object) – disableServiceLogs option
errorCol (str) – column to hold http errors
includeStatistics (object) – includeStatistics option
language (object) – the language code of the text (optional for some services)
modelVersion (object) – modelVersion option
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- includeStatistics = Param(parent='undefined', name='includeStatistics', doc='ServiceParam: includeStatistics option')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: modelVersion option')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
languageCol – name of the column that holds the language code of the text (optional for some services)
- setParams(batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
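The batchSize parameter is described only as "the max size of the buffer". Assuming this means rows are grouped into requests of at most batchSize documents (default 5), the grouping can be pictured with the plain-Python sketch below. The batched helper is hypothetical and is not how SynapseML implements buffering.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int = 5) -> Iterator[List[T]]:
    """Yield consecutive lists holding at most `batch_size` items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


documents = [f"document {i}" for i in range(12)]
sizes = [len(batch) for batch in batched(documents)]
print(sizes)  # [5, 5, 2]
```

Under this reading, a larger batchSize means fewer requests with more documents in each.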
synapse.ml.cognitive.NERV2 module
- class synapse.ml.cognitive.NERV2.NERV2(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='NERV2_8929facac6a6_error', handler=None, language=None, languageCol=None, outputCol='NERV2_8929facac6a6_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
languageCol – name of the column that holds the language code of the text (optional for some services)
- setParams(concurrency=1, concurrentTimeout=None, errorCol='NERV2_8929facac6a6_error', handler=None, language=None, languageCol=None, outputCol='NERV2_8929facac6a6_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
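The concurrency and concurrentTimeout parameters describe a bounded pool of in-flight calls and a limit on how long to wait for each pending result. The following is a rough standard-library analogy, not SynapseML's implementation: fake_call stands in for an HTTP request, and run_calls is a hypothetical helper that records None for any call whose result does not arrive within the timeout.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def fake_call(text: str, delay: float) -> str:
    """Stand-in for one HTTP request; sleeps instead of calling a service."""
    time.sleep(delay)
    return text.upper()


def run_calls(jobs, concurrency=1, concurrent_timeout=None):
    """Run jobs with at most `concurrency` in flight; None marks a timed-out job."""
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(fake_call, text, delay) for text, delay in jobs]
        for future in futures:
            try:
                # Wait at most `concurrent_timeout` seconds (None waits forever).
                results.append(future.result(timeout=concurrent_timeout))
            except TimeoutError:
                results.append(None)
    return results


jobs = [("fast", 0.0), ("slow", 0.4)]
print(run_calls(jobs, concurrency=2, concurrent_timeout=0.1))  # ['FAST', None]
```

With concurrent_timeout=None, the same call waits for both jobs and returns both results.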
synapse.ml.cognitive.OCR module
- class synapse.ml.cognitive.OCR.OCR(java_obj=None, concurrency=1, concurrentTimeout=None, detectOrientation=None, detectOrientationCol=None, errorCol='OCR_7dcd8ae74e88_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='OCR_7dcd8ae74e88_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
detectOrientation (object) – whether to detect image orientation prior to processing
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – the language to use
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- detectOrientation = Param(parent='undefined', name='detectOrientation', doc='ServiceParam: whether to detect image orientation prior to processing')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getDetectOrientation()[source]
- Returns
whether to detect image orientation prior to processing
- Return type
detectOrientation
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language to use')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setDetectOrientation(value)[source]
- Parameters
detectOrientation – whether to detect image orientation prior to processing
- setDetectOrientationCol(value)[source]
- Parameters
detectOrientationCol – name of the column that indicates whether to detect image orientation prior to processing
- setParams(concurrency=1, concurrentTimeout=None, detectOrientation=None, detectOrientationCol=None, errorCol='OCR_7dcd8ae74e88_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='OCR_7dcd8ae74e88_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.OpenAICompletion module
- class synapse.ml.cognitive.OpenAICompletion.OpenAICompletion(java_obj=None, apiVersion=None, apiVersionCol=None, batchIndexPrompt=None, batchIndexPromptCol=None, batchPrompt=None, batchPromptCol=None, bestOf=None, bestOfCol=None, cacheLevel=None, cacheLevelCol=None, concurrency=1, concurrentTimeout=None, deploymentName=None, deploymentNameCol=None, echo=None, echoCol=None, errorCol='OpenAPICompletion_9fade8566b4a_error', frequencyPenalty=None, frequencyPenaltyCol=None, handler=None, indexPrompt=None, indexPromptCol=None, logProbs=None, logProbsCol=None, maxTokens=None, maxTokensCol=None, model=None, modelCol=None, n=None, nCol=None, outputCol='OpenAPICompletion_9fade8566b4a_output', presencePenalty=None, presencePenaltyCol=None, prompt=None, promptCol=None, stop=None, stopCol=None, subscriptionKey=None, subscriptionKeyCol=None, temperature=None, temperatureCol=None, timeout=60.0, topP=None, topPCol=None, url=None, user=None, userCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
apiVersion (object) – version of the api
batchIndexPrompt (object) – Sequence of index sequences to complete
batchPrompt (object) – Sequence of prompts to complete
bestOf (object) – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
cacheLevel (object) – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
deploymentName (object) – The name of the deployment
echo (object) – Echo back the prompt in addition to the completion
errorCol (str) – column to hold http errors
frequencyPenalty (object) – How much to penalize new tokens based on whether they appear in the text so far. Increases the model’s likelihood to talk about new topics.
handler (object) – Which strategy to use when handling requests
indexPrompt (object) – Sequence of indexes to complete
logProbs (object) – Include the log probabilities of the logprobs most likely tokens, as well as the chosen tokens. For example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
maxTokens (object) – The maximum number of tokens to generate. Has minimum of 0.
model (object) – The name of the model to use
n (object) – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
outputCol (str) – The name of the output column
presencePenalty (object) – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the model’s likelihood to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
prompt (object) – The text to complete
stop (object) – A sequence which indicates the end of the current document.
subscriptionKey (object) – the API key to use
temperature (object) – What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
timeout (float) – number of seconds to wait before closing the connection
topP (object) – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
url (str) – Url of the service
user (object) – The ID of the end-user, for use in tracking and rate-limiting.
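Several of the parameters above state numeric limits. The following is a minimal sketch of a pre-flight check that collects those limits in one table and reports violations before any request is made. The check_completion_params helper is hypothetical and not part of synapse.ml. The bounds are copied from the descriptions above, with the cacheLevel range inferred from its three listed values.

```python
from typing import Any, Dict, List, Optional, Tuple

# (minimum, maximum) bounds copied from the parameter descriptions;
# None means the description states no bound on that side.
BOUNDS: Dict[str, Tuple[Optional[float], Optional[float]]] = {
    "bestOf": (None, 128),
    "cacheLevel": (0, 2),
    "logProbs": (0, 100),
    "maxTokens": (0, None),
    "n": (1, 128),
    "presencePenalty": (-2, 2),
    "temperature": (0, 2),
    "topP": (0, 1),
}


def check_completion_params(params: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for name, value in params.items():
        if name not in BOUNDS or value is None:
            continue
        low, high = BOUNDS[name]
        if low is not None and value < low:
            problems.append(f"{name}={value} is below the minimum {low}")
        if high is not None and value > high:
            problems.append(f"{name}={value} is above the maximum {high}")
    if params.get("temperature") is not None and params.get("topP") is not None:
        # The descriptions recommend setting one of the two, not both.
        problems.append("temperature and topP are both set; the docs recommend one")
    return problems


params = {"maxTokens": 64, "temperature": 0.0, "n": 1}
print(check_completion_params(params))                      # []
print(check_completion_params({"temperature": 3, "n": 0}))
```

If the list comes back empty, the same dictionary can be passed as keyword arguments, for example OpenAICompletion(**params), since the constructor signature above accepts these names.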
- apiVersion = Param(parent='undefined', name='apiVersion', doc='ServiceParam: version of the api')
- batchIndexPrompt = Param(parent='undefined', name='batchIndexPrompt', doc='ServiceParam: Sequence of index sequences to complete')
- batchPrompt = Param(parent='undefined', name='batchPrompt', doc='ServiceParam: Sequence of prompts to complete')
- bestOf = Param(parent='undefined', name='bestOf', doc='ServiceParam: How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.')
- cacheLevel = Param(parent='undefined', name='cacheLevel', doc='ServiceParam: can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- deploymentName = Param(parent='undefined', name='deploymentName', doc='ServiceParam: The name of the deployment')
- echo = Param(parent='undefined', name='echo', doc='ServiceParam: Echo back the prompt in addition to the completion')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- frequencyPenalty = Param(parent='undefined', name='frequencyPenalty', doc="ServiceParam: How much to penalize new tokens based on whether they appear in the text so far. Increases the model's likelihood to talk about new topics.")
- getBatchIndexPrompt()[source]
- Returns
Sequence of index sequences to complete
- Return type
batchIndexPrompt
- getBestOf()[source]
- Returns
How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- Return type
bestOf
- getCacheLevel()[source]
- Returns
can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- Return type
cacheLevel
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFrequencyPenalty()[source]
- Returns
How much to penalize new tokens based on whether they appear in the text so far. Increases the model’s likelihood to talk about new topics.
- Return type
frequencyPenalty
- getLogProbs()[source]
- Returns
Include the log probabilities of the logprobs most likely tokens, as well as the chosen tokens. For example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- Return type
logProbs
- getMaxTokens()[source]
- Returns
The maximum number of tokens to generate. Has minimum of 0.
- Return type
maxTokens
- getN()[source]
- Returns
How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- Return type
n
- getPresencePenalty()[source]
- Returns
How much to penalize new tokens based on their existing frequency in the text so far. Decreases the model’s likelihood to repeat the same line verbatim. Has minimum of -2 and maximum of 2.
- Return type
presencePenalty
- getStop()[source]
- Returns
A sequence which indicates the end of the current document.
- Return type
stop
- getTemperature()[source]
- Returns
What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- Return type
temperature
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getTopP()[source]
- Returns
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- Return type
topP
- getUser()[source]
- Returns
The ID of the end-user, for use in tracking and rate-limiting.
- Return type
user
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- indexPrompt = Param(parent='undefined', name='indexPrompt', doc='ServiceParam: Sequence of indexes to complete')
- logProbs = Param(parent='undefined', name='logProbs', doc='ServiceParam: Include the log probabilities on the `logprobs` most likely tokens, as well the chosen tokens. So for example, if `logprobs` is 10, the API will return a list of the 10 most likely tokens. If `logprobs` is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.')
- maxTokens = Param(parent='undefined', name='maxTokens', doc='ServiceParam: The maximum number of tokens to generate. Has minimum of 0.')
- model = Param(parent='undefined', name='model', doc='ServiceParam: The name of the model to use')
- n = Param(parent='undefined', name='n', doc='ServiceParam: How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- presencePenalty = Param(parent='undefined', name='presencePenalty', doc="ServiceParam: How much to penalize new tokens based on whether they appear in the text so far. Increases the model's likelihood to talk about new topics. Has minimum of -2 and maximum of 2.")
- prompt = Param(parent='undefined', name='prompt', doc='ServiceParam: The text to complete')
- setBatchIndexPrompt(value)[source]
- Parameters
batchIndexPrompt – Sequence of index sequences to complete
- setBatchIndexPromptCol(value)[source]
- Parameters
batchIndexPrompt – Sequence of index sequences to complete
- setBestOf(value)[source]
- Parameters
bestOf – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- setBestOfCol(value)[source]
- Parameters
bestOf – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- setCacheLevel(value)[source]
- Parameters
cacheLevel – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- setCacheLevelCol(value)[source]
- Parameters
cacheLevel – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFrequencyPenalty(value)[source]
- Parameters
frequencyPenalty – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the model’s likelihood to repeat the same line verbatim.
- setFrequencyPenaltyCol(value)[source]
- Parameters
frequencyPenalty – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the model’s likelihood to repeat the same line verbatim.
- setLogProbs(value)[source]
- Parameters
logProbs – Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- setLogProbsCol(value)[source]
- Parameters
logProbs – Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. So for example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- setMaxTokens(value)[source]
- Parameters
maxTokens – The maximum number of tokens to generate. Has minimum of 0.
- setMaxTokensCol(value)[source]
- Parameters
maxTokens – The maximum number of tokens to generate. Has minimum of 0.
- setN(value)[source]
- Parameters
n – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- setNCol(value)[source]
- Parameters
n – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- setParams(apiVersion=None, apiVersionCol=None, batchIndexPrompt=None, batchIndexPromptCol=None, batchPrompt=None, batchPromptCol=None, bestOf=None, bestOfCol=None, cacheLevel=None, cacheLevelCol=None, concurrency=1, concurrentTimeout=None, deploymentName=None, deploymentNameCol=None, echo=None, echoCol=None, errorCol='OpenAPICompletion_9fade8566b4a_error', frequencyPenalty=None, frequencyPenaltyCol=None, handler=None, indexPrompt=None, indexPromptCol=None, logProbs=None, logProbsCol=None, maxTokens=None, maxTokensCol=None, model=None, modelCol=None, n=None, nCol=None, outputCol='OpenAPICompletion_9fade8566b4a_output', presencePenalty=None, presencePenaltyCol=None, prompt=None, promptCol=None, stop=None, stopCol=None, subscriptionKey=None, subscriptionKeyCol=None, temperature=None, temperatureCol=None, timeout=60.0, topP=None, topPCol=None, url=None, user=None, userCol=None)[source]
Set the (keyword only) parameters
- setPresencePenalty(value)[source]
- Parameters
presencePenalty – How much to penalize new tokens based on whether they appear in the text so far. Increases the model’s likelihood to talk about new topics. Has minimum of -2 and maximum of 2.
- setPresencePenaltyCol(value)[source]
- Parameters
presencePenalty – How much to penalize new tokens based on whether they appear in the text so far. Increases the model’s likelihood to talk about new topics. Has minimum of -2 and maximum of 2.
- setStop(value)[source]
- Parameters
stop – A sequence which indicates the end of the current document.
- setStopCol(value)[source]
- Parameters
stop – A sequence which indicates the end of the current document.
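To make the effect of a stop sequence concrete, here is a minimal sketch that assumes generation ends at the first occurrence of the sequence and that the sequence itself is not part of the returned text. The helper name and sample strings are hypothetical.

```python
def truncate_at_stop(text, stop):
    """Cut `text` just before the first occurrence of `stop`, if any."""
    index = text.find(stop)
    return text if index == -1 else text[:index]

# A blank line is used as the stop sequence in this toy example.
completion = "Paris is the capital of France.\n\nQ: What about Spain?"
answer = truncate_at_stop(completion, "\n\n")
```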
- setTemperature(value)[source]
- Parameters
temperature – What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- setTemperatureCol(value)[source]
- Parameters
temperature – What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setTopP(value)[source]
- Parameters
topP – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- setTopPCol(value)[source]
- Parameters
topP – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- setUser(value)[source]
- Parameters
user – The ID of the end-user, for use in tracking and rate-limiting.
- setUserCol(value)[source]
- Parameters
user – The ID of the end-user, for use in tracking and rate-limiting.
- stop = Param(parent='undefined', name='stop', doc='ServiceParam: A sequence which indicates the end of the current document.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- temperature = Param(parent='undefined', name='temperature', doc='ServiceParam: What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or `top_p` but not both. Minimum of 0 and maximum of 2 allowed.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- topP = Param(parent='undefined', name='topP', doc='ServiceParam: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend using this or `temperature` but not both. Minimum of 0 and maximum of 1 allowed.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- user = Param(parent='undefined', name='user', doc='ServiceParam: The ID of the end-user, for use in tracking and rate-limiting.')
synapse.ml.cognitive.PII module
- class synapse.ml.cognitive.PII.PII(java_obj=None, concurrency=1, concurrentTimeout=None, domain=None, domainCol=None, errorCol='PII_de2b592e6f4e_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_de2b592e6f4e_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
domain (object) – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
piiCategories (object) – describes the PII categories to return
showStats (object) – if set to true, response will contain input and document level statistics.
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
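A possible way to configure the transformer, using only keyword arguments that appear in the constructor signature above. Every value is a placeholder, the column names are hypothetical, and the construction itself is wrapped in a function that is not called here because it needs a Spark session with SynapseML installed.

```python
# Keyword arguments taken from the documented signature; all values are placeholders.
pii_params = {
    "textCol": "text",            # hypothetical input column name
    "language": "en",
    "domain": "PHI",              # the documented values are 'PHI' and 'none'
    "outputCol": "pii",           # hypothetical output column name
    "errorCol": "pii_error",      # hypothetical error column name
    "concurrency": 2,
    "timeout": 60.0,
    "subscriptionKey": "<your-key>",
    "url": "<your-endpoint-url>",
}

def build_pii(params):
    """Create the transformer; requires SynapseML and an active Spark session."""
    from synapse.ml.cognitive.PII import PII  # imported lazily on purpose
    return PII(**params)
```

The resulting object is a JavaTransformer, so it would typically be applied with `transform(df)` on a DataFrame that contains the text column.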
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- domain = Param(parent='undefined', name='domain', doc="ServiceParam: if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: 'PHI', 'none'.")
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
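How concurrency and concurrentTimeout might interact can be sketched with the standard library. This is an analogy only: the real transformer runs on the JVM, and the helper names, the fake request function, and the choice to record None on a timeout are all assumptions of this sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def call_all(requests, send, concurrency=1, concurrent_timeout=None):
    """Run `send` over `requests` with bounded parallelism and a per-future wait."""
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send, r) for r in requests]
        for future in futures:
            try:
                results.append(future.result(timeout=concurrent_timeout))
            except FutureTimeout:
                results.append(None)  # gave up waiting on this call
    return results

def fake_send(delay):
    """Hypothetical stand-in for an HTTP call that takes `delay` seconds."""
    time.sleep(delay)
    return delay

outcome = call_all([0.01, 0.5], fake_send, concurrency=2, concurrent_timeout=0.2)
```

Here the fast call returns its value and the slow one is abandoned after the wait expires, while `timeout` (the connection timeout) would be a separate, per-request setting.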
- getDomain()[source]
- Returns
if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- Return type
domain
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPiiCategories()[source]
- Returns
describes the PII categories to return
- Return type
piiCategories
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
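The reason the offset convention matters is that the same substring can start at different numeric offsets depending on the unit counted. The sketch below compares Unicode code points with UTF-16 code units in plain Python; grapheme counting needs a third-party library and is left out. The sample text is hypothetical.

```python
# "ok", a thumbs-up emoji (outside the Basic Multilingual Plane), then "café".
text = "ok \U0001F44D caf\u00e9"

code_points = len(text)                           # Python strings count code points
utf16_units = len(text.encode("utf-16-le")) // 2  # two bytes per UTF-16 code unit

# Offset of "café" under each convention.
offset_code_points = text.index("caf\u00e9")
offset_utf16 = len(text[:offset_code_points].encode("utf-16-le")) // 2
```

The emoji occupies one code point but two UTF-16 code units, so every offset after it differs by one between the two conventions.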
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- piiCategories = Param(parent='undefined', name='piiCategories', doc='ServiceParam: describes the PII categories to return')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setDomain(value)[source]
- Parameters
domain – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- setDomainCol(value)[source]
- Parameters
domain – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(concurrency=1, concurrentTimeout=None, domain=None, domainCol=None, errorCol='PII_de2b592e6f4e_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_de2b592e6f4e_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPiiCategoriesCol(value)[source]
- Parameters
piiCategories – describes the PII categories to return
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: if set to true, response will contain input and document level statistics.')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.PIISDK module
- class synapse.ml.cognitive.PIISDK.PIISDK(java_obj=None, batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs (object) – disableServiceLogs option
errorCol (str) – column to hold http errors
includeStatistics (object) – includeStatistics option
language (object) – the language code of the text (optional for some services)
modelVersion (object) – modelVersion option
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
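The documentation only describes batchSize as the maximum size of the buffer. Assuming it caps how many rows are grouped into a single request, the grouping could look like this sketch; the helper name and sample documents are hypothetical.

```python
def batches(rows, batch_size=5):
    """Yield consecutive chunks of at most `batch_size` rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

documents = [f"doc-{i}" for i in range(12)]
grouped = list(batches(documents, batch_size=5))  # chunks of 5, 5 and 2 rows
```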
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- includeStatistics = Param(parent='undefined', name='includeStatistics', doc='ServiceParam: includeStatistics option')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: modelVersion option')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.ReadImage module
- class synapse.ml.cognitive.ReadImage.ReadImage(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='ReadImage_d7542375d0c7_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, outputCol='ReadImage_d7542375d0c7_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
language (object) – The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
maxPollingRetries (int) – number of times to poll
outputCol (str) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set true to suppress the maximum retries exception and report in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
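The polling parameters suggest a submit-then-poll pattern. The following is a minimal sketch of how initialPollingDelay, pollingDelay, maxPollingRetries and suppressMaxRetriesExceededException could fit together; it is not the library's implementation, and the function, exception class and fake status check are hypothetical.

```python
import time

class MaxRetriesExceeded(Exception):
    """Hypothetical error type used only in this sketch."""

def poll(check, initial_delay_ms=300, delay_ms=300, max_retries=1000, suppress=False):
    """Poll `check()` until it returns a non-None result or retries run out."""
    time.sleep(initial_delay_ms / 1000.0)  # wait before the first poll
    for _ in range(max_retries):
        result = check()
        if result is not None:
            return result, None
        time.sleep(delay_ms / 1000.0)      # wait between polls
    if suppress:
        # Report the failure as data (an error value) instead of raising.
        return None, "max polling retries exceeded"
    raise MaxRetriesExceeded(f"no result after {max_retries} polls")

# Hypothetical operation that finishes on the third poll.
state = {"polls": 0}

def fake_check():
    state["polls"] += 1
    return "done" if state["polls"] >= 3 else None

result, error = poll(fake_check, initial_delay_ms=1, delay_ms=1, max_retries=5)
```

With `suppress=True` the caller gets an error value it can store alongside the row, which mirrors the "report in the error column" wording above.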
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLanguage()[source]
- Returns
The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- Return type
language
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- language = Param(parent='undefined', name='language', doc='ServiceParam: The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLanguage(value)[source]
- Parameters
language – The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- setLanguageCol(value)[source]
- Parameters
language – The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='ReadImage_d7542375d0c7_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, outputCol='ReadImage_d7542375d0c7_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maximum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.RecognizeDomainSpecificContent module
- class synapse.ml.cognitive.RecognizeDomainSpecificContent.RecognizeDomainSpecificContent(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='RecognizeDomainSpecificContent_924c6e4f205a_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, model=None, modelCol=None, outputCol='RecognizeDomainSpecificContent_924c6e4f205a_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
model (object) – the domain specific model: celebrities, landmarks
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
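Several parameters in the signature come in pairs such as model/modelCol and imageUrl/imageUrlCol. One plausible reading, assumed here, is that the plain form supplies one constant for every row while the Col form names a column holding a per-row value. The sketch below illustrates that idea on plain dictionaries; the helper, the row data, and the precedence when both are given are assumptions.

```python
def resolve(row, value=None, value_col=None):
    """Pick a parameter value: the per-row column if named, else the constant."""
    if value_col is not None:
        return row[value_col]
    return value

# Hypothetical rows; the URLs and column names are placeholders.
rows = [
    {"img": "https://example.com/a.jpg", "domain_model": "celebrities"},
    {"img": "https://example.com/b.jpg", "domain_model": "landmarks"},
]

# One constant model for every row versus a model read from each row.
constant = [resolve(r, value="landmarks") for r in rows]
per_row = [resolve(r, value_col="domain_model") for r in rows]
```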
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- model = Param(parent='undefined', name='model', doc='ServiceParam: the domain specific model: celebrities, landmarks')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='RecognizeDomainSpecificContent_924c6e4f205a_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, model=None, modelCol=None, outputCol='RecognizeDomainSpecificContent_924c6e4f205a_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.RecognizeText module
- class synapse.ml.cognitive.RecognizeText.RecognizeText(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='RecognizeText_b9888bc3c0c3_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, maxPollingRetries=1000, mode=None, modeCol=None, outputCol='RecognizeText_b9888bc3c0c3_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
maxPollingRetries (int) – number of times to poll
mode (object) – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
outputCol (str) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set true to suppress the maximum retries exception and report in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
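The polling parameters above (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesExceededException) describe a wait-then-poll loop. The following is a minimal sketch of how such a loop could behave, written in plain Python; it is not the library's implementation. The names `poll_for_result`, `check`, and `MaxRetriesExceeded` are hypothetical and exist only for this illustration.

```python
import time


class MaxRetriesExceeded(Exception):
    """Hypothetical error raised when polling gives up before a result is ready."""


def poll_for_result(check, initial_polling_delay=300, polling_delay=300,
                    max_polling_retries=1000,
                    suppress_max_retries_exceeded=False, sleep=time.sleep):
    """Poll check() until it returns something other than None.

    Delays are in milliseconds, as in the parameter descriptions above.
    Returns a (result, error) pair in which exactly one element is None.
    """
    sleep(initial_polling_delay / 1000.0)  # wait before the first poll
    for attempt in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
        if attempt < max_polling_retries - 1:
            sleep(polling_delay / 1000.0)  # wait between polls
    message = "no result after %d polls" % max_polling_retries
    if suppress_max_retries_exceeded:
        # Report the failure as data instead of raising, in the spirit of
        # writing it to the error column.
        return None, message
    raise MaxRetriesExceeded(message)
```

The `sleep` argument is injectable so the loop can be exercised without real waiting. With suppression enabled, the caller receives the failure message as the second element of the pair instead of an exception.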
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getMode()[source]
- Returns
If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- Return type
mode
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- mode = Param(parent='undefined', name='mode', doc="ServiceParam: If this parameter is set to 'Printed', printed text recognition is performed. If 'Handwritten' is specified, handwriting recognition is performed")
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setMode(value)[source]
- Parameters
mode – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- setModeCol(value)[source]
- Parameters
mode – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='RecognizeText_b9888bc3c0c3_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, maxPollingRetries=1000, mode=None, modeCol=None, outputCol='RecognizeText_b9888bc3c0c3_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.SimpleDetectAnomalies module
- class synapse.ml.cognitive.SimpleDetectAnomalies.SimpleDetectAnomalies(java_obj=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='SimpleDetectAnomalies_0e210d55647d_error', granularity=None, granularityCol=None, groupbyCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='SimpleDetectAnomalies_0e210d55647d_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, timestampCol='timestamp', url=None, valueCol='value')[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval (object) – Custom Interval is used to set a non-standard time interval. For example, if the series is 5 minutes, the request can be set as granularity=minutely, customInterval=5.
errorCol (str) – column to hold http errors
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
groupbyCol (str) – column that groups the series
handler (object) – Which strategy to use when handling requests
imputeFixedValue (object) – Optional argument, fixed value to use when imputeMode is set to “fixed”
imputeMode (object) – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (str) – The name of the output column
period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
timestampCol (str) – column representing the time of the series
url (str) – Url of the service
valueCol (str) – column representing the value of the series
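Because the series parameter requires points sorted by timestamp in ascending order with no duplicated timestamps, it can help to check the data before sending it. The sketch below shows one way to do that in plain Python. The helper `prepare_series` is hypothetical, and the `timestamp` and `value` keys are an assumption taken from the default timestampCol and valueCol names above.

```python
from datetime import datetime


def prepare_series(points):
    """Sort (timestamp, value) pairs ascending and reject duplicated timestamps.

    points: iterable of (datetime, number) pairs.
    Returns a list of dicts keyed by 'timestamp' and 'value' (assumed names).
    """
    ordered = sorted(points, key=lambda p: p[0])
    # After sorting, any duplicates are adjacent, so compare neighbours.
    for (earlier, _), (later, _) in zip(ordered, ordered[1:]):
        if earlier == later:
            raise ValueError("duplicated timestamp: %s" % earlier.isoformat())
    return [{"timestamp": ts.isoformat(), "value": float(v)}
            for ts, v in ordered]


if __name__ == "__main__":
    sample = [(datetime(2021, 1, 2), 2), (datetime(2021, 1, 1), 1)]
    print(prepare_series(sample))
```

Failing early on the client side gives a clearer message than waiting for the service to return an error for the whole series.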
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc='ServiceParam: Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomInterval()[source]
- Returns
Custom Interval is used to set a non-standard time interval. For example, if the series is 5 minutes, the request can be set as granularity=minutely, customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getImputeFixedValue()[source]
- Returns
Optional argument, fixed value to use when imputeMode is set to “fixed”
- Return type
imputeFixedValue
- getImputeMode()[source]
- Returns
Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- Return type
imputeMode
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getTimestampCol()[source]
- Returns
column representing the time of the series
- Return type
timestampCol
- granularity = Param(parent='undefined', name='granularity', doc='ServiceParam: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- groupbyCol = Param(parent='undefined', name='groupbyCol', doc='column that groups the series')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imputeFixedValue = Param(parent='undefined', name='imputeFixedValue', doc='ServiceParam: Optional argument, fixed value to use when imputeMode is set to "fixed" ')
- imputeMode = Param(parent='undefined', name='imputeMode', doc='ServiceParam: Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill ')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc='ServiceParam: Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc='ServiceParam: Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc='ServiceParam: Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc='ServiceParam: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomInterval(value)[source]
- Parameters
customInterval – Custom Interval is used to set a non-standard time interval. For example, if the series is 5 minutes, the request can be set as granularity=minutely, customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval – Custom Interval is used to set a non-standard time interval. For example, if the series is 5 minutes, the request can be set as granularity=minutely, customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setImputeFixedValue(value)[source]
- Parameters
imputeFixedValue – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeFixedValueCol(value)[source]
- Parameters
imputeFixedValue – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeMode(value)[source]
- Parameters
imputeMode – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setImputeModeCol(value)[source]
- Parameters
imputeMode – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='SimpleDetectAnomalies_0e210d55647d_error', granularity=None, granularityCol=None, groupbyCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='SimpleDetectAnomalies_0e210d55647d_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, timestampCol='timestamp', url=None, valueCol='value')[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setPeriodCol(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted
- setSensitivityCol(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted
- setSeries(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
- setSeriesCol(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setTimestampCol(value)[source]
- Parameters
timestampCol – column representing the time of the series
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- timestampCol = Param(parent='undefined', name='timestampCol', doc='column representing the time of the series')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- valueCol = Param(parent='undefined', name='valueCol', doc='column representing the value of the series')
synapse.ml.cognitive.SpeechToText module
- class synapse.ml.cognitive.SpeechToText.SpeechToText(java_obj=None, audioData=None, audioDataCol=None, concurrency=1, concurrentTimeout=None, errorCol='SpeechToText_6c8ebf2ab54f_error', format=None, formatCol=None, handler=None, language=None, languageCol=None, outputCol='SpeechToText_6c8ebf2ab54f_output', profanity=None, profanityCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
audioData (object) – The data sent to the service must be a .wav file
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
handler (object) – Which strategy to use when handling requests
language (object) – Identifies the spoken language that is being recognized.
outputCol (str) – The name of the output column
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
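The three profanity settings described above (masked, removed, raw) can be pictured with a toy example. The sketch below only illustrates the effect of each mode on a list of words; the service performs this itself, and nothing here reflects how it detects profanity. The function `apply_profanity_mode`, the `flagged` word set, and the choice of one asterisk per letter are all assumptions made for the illustration.

```python
def apply_profanity_mode(words, flagged, mode="masked"):
    """Show the effect of a profanity mode on a list of recognized words.

    words:   list of recognized words
    flagged: set of lower-case words treated as profanity (hypothetical)
    mode:    'masked', 'removed', or 'raw'
    """
    if mode == "raw":
        # Leave the text untouched.
        return " ".join(words)
    if mode == "masked":
        # Replace each flagged word with asterisks (one per letter, assumed).
        return " ".join("*" * len(w) if w.lower() in flagged else w
                        for w in words)
    if mode == "removed":
        # Drop flagged words entirely.
        return " ".join(w for w in words if w.lower() not in flagged)
    raise ValueError("mode must be 'masked', 'removed', or 'raw'")


if __name__ == "__main__":
    for mode in ("masked", "removed", "raw"):
        print(mode, "->", apply_profanity_mode(["that", "darn", "cat"],
                                               {"darn"}, mode))
```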
- audioData = Param(parent='undefined', name='audioData', doc='ServiceParam: The data sent to the service must be a .wav files ')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- format = Param(parent='undefined', name='format', doc='ServiceParam: Specifies the result format. Accepted values are simple and detailed. Default is simple. ')
- getAudioData()[source]
- Returns
The data sent to the service must be a .wav file
- Return type
audioData
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFormat()[source]
- Returns
Specifies the result format. Accepted values are simple and detailed. Default is simple.
- Return type
format
- getLanguage()[source]
- Returns
Identifies the spoken language that is being recognized.
- Return type
language
- getProfanity()[source]
- Returns
Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- Return type
profanity
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Identifies the spoken language that is being recognized. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- profanity = Param(parent='undefined', name='profanity', doc='ServiceParam: Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which remove all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked. ')
- setAudioData(value)[source]
- Parameters
audioData – The data sent to the service must be a .wav file
- setAudioDataCol(value)[source]
- Parameters
audioData – The data sent to the service must be a .wav file
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFormat(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setFormatCol(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setLanguage(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setLanguageCol(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setParams(audioData=None, audioDataCol=None, concurrency=1, concurrentTimeout=None, errorCol='SpeechToText_6c8ebf2ab54f_error', format=None, formatCol=None, handler=None, language=None, languageCol=None, outputCol='SpeechToText_6c8ebf2ab54f_output', profanity=None, profanityCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setProfanity(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- setProfanityCol(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.SpeechToTextSDK module
- class synapse.ml.cognitive.SpeechToTextSDK.SpeechToTextSDK(java_obj=None, audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
audioDataCol (str) – Column holding audio data, must be either ByteArrays or Strings representing file URIs
endpointId (str) – endpoint for custom speech models
extraFfmpegArgs (list) – extra arguments for ffmpeg output decoding
fileType (object) – The file type of the sound files, supported types: wav, ogg, mp3
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
language (object) – Identifies the spoken language that is being recognized.
outputCol (str) – The name of the output column
participantsJson (object) – a json representation of a list of conversation participants (email, language, user)
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
recordAudioData (bool) – Whether to record audio data to a file location, for use only with m3u8 streams
recordedFileNameCol (str) – Column holding file names to write audio data to if ‘recordAudioData’ is set to true
streamIntermediateResults (bool) – Whether to return intermediate results immediately, or group them in a sequence
subscriptionKey (object) – the API key to use
url (str) – Url of the service
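The participantsJson parameter expects a JSON representation of a list of conversation participants. The sketch below shows one way to build such a string with the standard library. The key names `email`, `language`, and `user` are an assumption based on the parenthetical hint in the description above, the sample values are made up, and the helper `build_participants_json` is hypothetical.

```python
import json

# Assumed key names, taken from "(email, language, user)" in the description.
REQUIRED_KEYS = ("email", "language", "user")


def build_participants_json(participants):
    """Serialize a list of participant dicts to a JSON array string."""
    rows = []
    for index, person in enumerate(participants):
        missing = [key for key in REQUIRED_KEYS if key not in person]
        if missing:
            raise ValueError("participant %d is missing %s" % (index, missing))
        # Keep only the expected keys, in a stable order.
        rows.append({key: person[key] for key in REQUIRED_KEYS})
    return json.dumps(rows)


if __name__ == "__main__":
    print(build_participants_json([
        {"email": "a@example.com", "language": "en-US", "user": "alice"},
        {"email": "b@example.com", "language": "en-US", "user": "bob"},
    ]))
```

Validating the keys before serializing keeps malformed entries out of the request.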
- audioDataCol = Param(parent='undefined', name='audioDataCol', doc='Column holding audio data, must be either ByteArrays or Strings representing file URIs')
- endpointId = Param(parent='undefined', name='endpointId', doc='endpoint for custom speech models')
- extraFfmpegArgs = Param(parent='undefined', name='extraFfmpegArgs', doc='extra arguments to for ffmpeg output decoding')
- fileType = Param(parent='undefined', name='fileType', doc='ServiceParam: The file type of the sound files, supported types: wav, ogg, mp3')
- format = Param(parent='undefined', name='format', doc='ServiceParam: Specifies the result format. Accepted values are simple and detailed. Default is simple. ')
- getAudioDataCol()[source]
- Returns
Column holding audio data, must be either ByteArrays or Strings representing file URIs
- Return type
audioDataCol
- getExtraFfmpegArgs()[source]
- Returns
extra arguments for ffmpeg output decoding
- Return type
extraFfmpegArgs
- getFileType()[source]
- Returns
The file type of the sound files, supported types: wav, ogg, mp3
- Return type
fileType
- getFormat()[source]
- Returns
Specifies the result format. Accepted values are simple and detailed. Default is simple.
- Return type
format
- getLanguage()[source]
- Returns
Identifies the spoken language that is being recognized.
- Return type
language
- getParticipantsJson()[source]
- Returns
a json representation of a list of conversation participants (email, language, user)
- Return type
participantsJson
- getProfanity()[source]
- Returns
Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- Return type
profanity
- getRecordAudioData()[source]
- Returns
Whether to record audio data to a file location, for use only with m3u8 streams
- Return type
recordAudioData
- getRecordedFileNameCol()[source]
- Returns
Column holding file names to write audio data to if ‘recordAudioData’ is set to true
- Return type
recordedFileNameCol
- getStreamIntermediateResults()[source]
- Returns
Whether to return intermediate results immediately, or group them in a sequence
- Return type
streamIntermediateResults
- language = Param(parent='undefined', name='language', doc='ServiceParam: Identifies the spoken language that is being recognized. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- participantsJson = Param(parent='undefined', name='participantsJson', doc='ServiceParam: a json representation of a list of conversation participants (email, language, user)')
- profanity = Param(parent='undefined', name='profanity', doc='ServiceParam: Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which remove all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked. ')
- recordAudioData = Param(parent='undefined', name='recordAudioData', doc='Whether to record audio data to a file location, for use only with m3u8 streams')
- recordedFileNameCol = Param(parent='undefined', name='recordedFileNameCol', doc="Column holding file names to write audio data to if ``recordAudioData'' is set to true")
- setAudioDataCol(value)[source]
- Parameters
audioDataCol – Column holding audio data, must be either ByteArrays or Strings representing file URIs
- setExtraFfmpegArgs(value)[source]
- Parameters
extraFfmpegArgs – extra arguments for ffmpeg output decoding
- setFileType(value)[source]
- Parameters
fileType – The file type of the sound files, supported types: wav, ogg, mp3
- setFileTypeCol(value)[source]
- Parameters
fileType – The file type of the sound files, supported types: wav, ogg, mp3
- setFormat(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setFormatCol(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setLanguage(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setLanguageCol(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setParams(audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Set the (keyword only) parameters
- setParticipantsJson(value)[source]
- Parameters
participantsJson – a json representation of a list of conversation participants (email, language, user)
- setParticipantsJsonCol(value)[source]
- Parameters
participantsJson – a json representation of a list of conversation participants (email, language, user)
- setProfanity(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- setProfanityCol(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- setRecordAudioData(value)[source]
- Parameters
recordAudioData – Whether to record audio data to a file location, for use only with m3u8 streams
- setRecordedFileNameCol(value)[source]
- Parameters
recordedFileNameCol – Column holding file names to write audio data to if ‘recordAudioData’ is set to true
- setStreamIntermediateResults(value)[source]
- Parameters
streamIntermediateResults – Whether to return intermediate results immediately, or group them in a sequence
- streamIntermediateResults = Param(parent='undefined', name='streamIntermediateResults', doc='Whether or not to immediately return itermediate results, or group in a sequence')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.TagImage module
- class synapse.ml.cognitive.TagImage.TagImage(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='TagImage_58a3c3e64af1_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='TagImage_58a3c3e64af1_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – The desired language for output generation.
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
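Several TagImage parameters come in pairs, such as imageUrl/imageUrlCol and imageBytes/imageBytesCol, where the plain name carries a constant value and the Col variant names a DataFrame column. A minimal sketch of assembling constructor keyword arguments follows. The build_tag_image_kwargs helper, and its rule that exactly one image source must be given, are assumptions made for illustration rather than documented behaviour; the key and endpoint values are placeholders.

```python
from typing import Any, Dict, Optional


def build_tag_image_kwargs(
    subscription_key: str,
    url: str,
    image_url: Optional[str] = None,
    image_url_col: Optional[str] = None,
    timeout: float = 60.0,
) -> Dict[str, Any]:
    """Return keyword arguments shaped like the TagImage signature above."""
    # Assumption: a constant URL and a URL column are mutually exclusive.
    if (image_url is None) == (image_url_col is None):
        raise ValueError("set exactly one of image_url or image_url_col")
    kwargs: Dict[str, Any] = {
        "subscriptionKey": subscription_key,
        "url": url,
        "timeout": timeout,
    }
    if image_url is not None:
        kwargs["imageUrl"] = image_url  # same image for every row
    else:
        kwargs["imageUrlCol"] = image_url_col  # per-row image URL column
    return kwargs
```

Given the keyword-only signature above, such a dictionary is meant to be usable as TagImage(**kwargs) in an environment where SynapseML and Spark are installed; that final step is not exercised here.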
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- language = Param(parent='undefined', name='language', doc='ServiceParam: The desired language for output generation.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='TagImage_58a3c3e64af1_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='TagImage_58a3c3e64af1_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.TextAnalyze module
- class synapse.ml.cognitive.TextAnalyze.TextAnalyze(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, entityLinkingTasks=[], entityRecognitionPiiTasks=[], entityRecognitionTasks=[], errorCol='TextAnalyze_1d5d6204eac4_error', initialPollingDelay=300, keyPhraseExtractionTasks=[], language=None, languageCol=None, maxPollingRetries=1000, outputCol='TextAnalyze_1d5d6204eac4_output', pollingDelay=300, sentimentAnalysisTasks=[], subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
entityLinkingTasks (object) – the entity linking tasks to perform on submitted documents
entityRecognitionPiiTasks (object) – the entity recognition pii tasks to perform on submitted documents
entityRecognitionTasks (object) – the entity recognition tasks to perform on submitted documents
errorCol (str) – column to hold http errors
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
keyPhraseExtractionTasks (object) – the key phrase extraction tasks to perform on submitted documents
language (object) – the language code of the text (optional for some services)
maxPollingRetries (int) – number of times to poll
outputCol (str) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
sentimentAnalysisTasks (object) – the sentiment analysis tasks to perform on submitted documents
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set true to suppress the maximum retries exception and report it in the error column
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
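The polling parameters above (initialPollingDelay, pollingDelay, maxPollingRetries and suppressMaxRetriesExceededException) can be read together as a simple polling loop. The following is a sketch under stated assumptions, not the library's implementation: it assumes one initial wait, then one wait between consecutive polls, and it uses a hypothetical MaxRetriesExceeded exception and poll_for_result helper. The sleep argument is injectable so the loop can be tried without real waiting.

```python
import time
from typing import Callable, Optional, Tuple


class MaxRetriesExceeded(Exception):
    """Hypothetical stand-in for the library's max-retries exception."""


def poll_for_result(
    check: Callable[[], Optional[str]],
    initial_polling_delay: int = 300,  # milliseconds
    polling_delay: int = 300,  # milliseconds
    max_polling_retries: int = 1000,
    suppress_max_retries_exceeded: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (output, error), polling `check` until it yields a result."""
    sleep(initial_polling_delay / 1000.0)
    for attempt in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
        if attempt < max_polling_retries - 1:
            sleep(polling_delay / 1000.0)
    message = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exceeded:
        return None, message  # reported like an error-column entry
    raise MaxRetriesExceeded(message)
```

With the defaults shown in the signature, this reading gives an upper bound of roughly 300 + 999 × 300 milliseconds of waiting, about five minutes, before the retries are exhausted.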
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
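The backoffs parameter is documented only as an array of backoffs used in the handler. One plausible reading, sketched here as an assumption rather than as the handler's actual logic, is that a failed request is retried once after each listed delay, with the delays taken to be milliseconds. The call_with_backoffs helper is a hypothetical name.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def call_with_backoffs(
    request: Callable[[], T],
    backoffs: Sequence[int] = (100, 500, 1000),  # assumed milliseconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try `request` once, then once more after each backoff delay."""
    last_error = None
    # None marks the first attempt, which happens without any delay.
    for delay_ms in (None, *backoffs):
        if delay_ms is not None:
            sleep(delay_ms / 1000.0)
        try:
            return request()
        except Exception as error:  # broad on purpose for the sketch
            last_error = error
    raise last_error
```

In this reading the default list allows four attempts in total, and the last failure is re-raised once the list is exhausted.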
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- entityLinkingTasks = Param(parent='undefined', name='entityLinkingTasks', doc='the entity linking tasks to perform on submitted documents')
- entityRecognitionPiiTasks = Param(parent='undefined', name='entityRecognitionPiiTasks', doc='the entity recognition pii tasks to perform on submitted documents')
- entityRecognitionTasks = Param(parent='undefined', name='entityRecognitionTasks', doc='the entity recognition tasks to perform on submitted documents')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getEntityLinkingTasks()[source]
- Returns
the entity linking tasks to perform on submitted documents
- Return type
entityLinkingTasks
- getEntityRecognitionPiiTasks()[source]
- Returns
the entity recognition pii tasks to perform on submitted documents
- Return type
entityRecognitionPiiTasks
- getEntityRecognitionTasks()[source]
- Returns
the entity recognition tasks to perform on submitted documents
- Return type
entityRecognitionTasks
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getKeyPhraseExtractionTasks()[source]
- Returns
the key phrase extraction tasks to perform on submitted documents
- Return type
keyPhraseExtractionTasks
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSentimentAnalysisTasks()[source]
- Returns
the sentiment analysis tasks to perform on submitted documents
- Return type
sentimentAnalysisTasks
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- keyPhraseExtractionTasks = Param(parent='undefined', name='keyPhraseExtractionTasks', doc='the key phrase extraction tasks to perform on submitted documents')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- sentimentAnalysisTasks = Param(parent='undefined', name='sentimentAnalysisTasks', doc='the sentiment analysis tasks to perform on submitted documents')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setEntityLinkingTasks(value)[source]
- Parameters
entityLinkingTasks – the entity linking tasks to perform on submitted documents
- setEntityRecognitionPiiTasks(value)[source]
- Parameters
entityRecognitionPiiTasks – the entity recognition pii tasks to perform on submitted documents
- setEntityRecognitionTasks(value)[source]
- Parameters
entityRecognitionTasks – the entity recognition tasks to perform on submitted documents
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setKeyPhraseExtractionTasks(value)[source]
- Parameters
keyPhraseExtractionTasks – the key phrase extraction tasks to perform on submitted documents
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
languageCol – name of the column holding the language code of the text (optional for some services)
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, entityLinkingTasks=[], entityRecognitionPiiTasks=[], entityRecognitionTasks=[], errorCol='TextAnalyze_1d5d6204eac4_error', initialPollingDelay=300, keyPhraseExtractionTasks=[], language=None, languageCol=None, maxPollingRetries=1000, outputCol='TextAnalyze_1d5d6204eac4_output', pollingDelay=300, sentimentAnalysisTasks=[], subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSentimentAnalysisTasks(value)[source]
- Parameters
sentimentAnalysisTasks – the sentiment analysis tasks to perform on submitted documents
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maximum retries exception and report in the error column')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.TextSentiment module
- class synapse.ml.cognitive.TextSentiment.TextSentiment(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='TextSentiment_2de9769f5a5b_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_2de9769f5a5b_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
opinionMining (object) – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
outputCol (str) – The name of the output column
showStats (object) – if set to true, response will contain input and document level statistics.
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
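The stringIndexType parameter matters because the same substring can sit at different offsets depending on the unit being counted. The sketch below compares two common units, Unicode code points and UTF-16 code units, using only the Python standard library. The helper names are hypothetical, and grapheme (text element) counting, the documented default, is not shown because it needs a Unicode segmentation library.

```python
def offset_in_code_points(text: str, needle: str) -> int:
    """Offset of `needle` counted in Unicode code points (Python's str unit)."""
    return text.index(needle)


def offset_in_utf16_units(text: str, needle: str) -> int:
    """Offset of `needle` counted in UTF-16 code units."""
    prefix = text[: text.index(needle)]
    # Each UTF-16 code unit is two bytes; characters outside the Basic
    # Multilingual Plane take two code units (a surrogate pair).
    return len(prefix.encode("utf-16-le")) // 2


sample = "\U0001F600 good"  # an emoji followed by a space and a word
print(offset_in_code_points(sample, "good"))  # 2
print(offset_in_utf16_units(sample, "good"))  # 3
```

A client that slices the original text with offsets from the response therefore needs to count in the same unit that was requested.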
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getOpinionMining()[source]
- Returns
if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- Return type
opinionMining
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- opinionMining = Param(parent='undefined', name='opinionMining', doc='ServiceParam: if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
languageCol – name of the column holding the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersionCol – name of the column holding the model version to use for scoring. If a model version is not specified, the API should default to the latest, non-preview version.
- setOpinionMining(value)[source]
- Parameters
opinionMining – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- setOpinionMiningCol(value)[source]
- Parameters
opinionMiningCol – name of the column holding the opinionMining flag; if true, the response will contain not only the sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='TextSentiment_2de9769f5a5b_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_2de9769f5a5b_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStatsCol – name of the column holding the showStats flag; if true, the response will contain input and document level statistics.
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexTypeCol – name of the column holding the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: if set to true, response will contain input and document level statistics.')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.TextSentimentSDK module
- class synapse.ml.cognitive.TextSentimentSDK.TextSentimentSDK(java_obj=None, batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeOpinionMining=None, includeOpinionMiningCol=None, includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
disableServiceLogs (object) – disableServiceLogs option
errorCol (str) – column to hold http errors
includeOpinionMining (object) – includeOpinionMining option
includeStatistics (object) – includeStatistics option
language (object) – the language code of the text (optional for some services)
modelVersion (object) – modelVersion option
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
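The batchSize parameter is documented only as the max size of the buffer. A minimal sketch of what such buffering could look like is given below, assuming that up to batchSize rows are grouped before being sent together; the batched helper is a hypothetical name and does not come from the library.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(rows: Iterable[T], batch_size: int = 5) -> Iterator[List[T]]:
    """Yield lists of at most `batch_size` rows, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    buffer: List[T] = []
    for row in rows:
        buffer.append(row)
        if len(buffer) == batch_size:
            yield buffer
            buffer = []
    if buffer:
        yield buffer  # final, possibly smaller, batch
```

With the default of 5 from the signature above, twelve rows would be split into batches of five, five and two.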
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeOpinionMining()[source]
- Returns
includeOpinionMining option
- Return type
includeOpinionMining
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- includeOpinionMining = Param(parent='undefined', name='includeOpinionMining', doc='ServiceParam: includeOpinionMining option')
- includeStatistics = Param(parent='undefined', name='includeStatistics', doc='ServiceParam: includeStatistics option')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: modelVersion option')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeOpinionMining(value)[source]
- Parameters
includeOpinionMining – includeOpinionMining option
- setIncludeOpinionMiningCol(value)[source]
- Parameters
includeOpinionMiningCol – name of the column holding the includeOpinionMining option
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
languageCol – name of the column holding the language code of the text (optional for some services)
- setParams(batchSize=5, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='Error', includeOpinionMining=None, includeOpinionMiningCol=None, includeStatistics=None, includeStatisticsCol=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.TextSentimentV2 module
- class synapse.ml.cognitive.TextSentimentV2.TextSentimentV2(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='TextSentimentV2_a0a158ac81a0_error', handler=None, language=None, languageCol=None, outputCol='TextSentimentV2_a0a158ac81a0_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
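The concurrency and concurrentTimeout parameters describe a bounded number of in-flight calls whose futures are awaited for a limited time. The transformer itself runs on the JVM, so the following is only a Python analogue of that idea, using the standard concurrent.futures module; map_concurrently is a hypothetical helper and not part of the library.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def map_concurrently(
    call: Callable[[T], R],
    items: Sequence[T],
    concurrency: int = 1,
    concurrent_timeout: Optional[float] = None,  # seconds; None waits forever
) -> List[R]:
    """Run `call` on each item with at most `concurrency` calls in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(call, item) for item in items]
        # result() raises concurrent.futures.TimeoutError if a call
        # does not finish within `concurrent_timeout` seconds.
        return [future.result(timeout=concurrent_timeout) for future in futures]
```

Results come back in input order. Note that in this sketch the executor still waits for running calls to finish when the block exits, even after a timeout has been raised.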
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
languageCol – name of the column holding the language code of the text (optional for some services)
- setParams(concurrency=1, concurrentTimeout=None, errorCol='TextSentimentV2_a0a158ac81a0_error', handler=None, language=None, languageCol=None, outputCol='TextSentimentV2_a0a158ac81a0_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.TextToSpeech module
- class synapse.ml.cognitive.TextToSpeech.TextToSpeech(java_obj=None, errorCol='TextToSpeech_d11ce307e91e_errors', language=None, languageCol=None, locale=None, localeCol=None, outputFileCol=None, outputFormat=None, outputFormatCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, url=None, voiceName=None, voiceNameCol=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
errorCol (str) – column to hold http errors
language (object) – The name of the language used for synthesis
locale (object) – The locale of the input text
outputFileCol (str) – The location of the saved file as an HDFS compliant URI
outputFormat (object) – The format for the output audio, one of: Raw8Khz8BitMonoMULaw, Riff16Khz16KbpsMonoSiren, Audio16Khz16KbpsMonoSiren, Audio16Khz32KBitRateMonoMp3, Audio16Khz128KBitRateMonoMp3, Audio16Khz64KBitRateMonoMp3, Audio24Khz48KBitRateMonoMp3, Audio24Khz96KBitRateMonoMp3, Audio24Khz160KBitRateMonoMp3, Raw16Khz16BitMonoTrueSilk, Riff16Khz16BitMonoPcm, Riff8Khz16BitMonoPcm, Riff24Khz16BitMonoPcm, Riff8Khz8BitMonoMULaw, Raw16Khz16BitMonoPcm, Raw24Khz16BitMonoPcm, Raw8Khz16BitMonoPcm, Ogg16Khz16BitMonoOpus, Ogg24Khz16BitMonoOpus
subscriptionKey (object) – the API key to use
text (object) – The text to synthesize
url (str) – Url of the service
voiceName (object) – The name of the voice used for synthesis
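Because outputFormat accepts only the names listed above, it can be useful to check a value before it reaches the service. The sketch below copies the documented names into a set; the OUTPUT_FORMATS constant and check_output_format helper are hypothetical, and whether the library itself performs a similar check is not stated here.

```python
# Names copied from the outputFormat description above.
OUTPUT_FORMATS = frozenset({
    "Raw8Khz8BitMonoMULaw", "Riff16Khz16KbpsMonoSiren",
    "Audio16Khz16KbpsMonoSiren", "Audio16Khz32KBitRateMonoMp3",
    "Audio16Khz128KBitRateMonoMp3", "Audio16Khz64KBitRateMonoMp3",
    "Audio24Khz48KBitRateMonoMp3", "Audio24Khz96KBitRateMonoMp3",
    "Audio24Khz160KBitRateMonoMp3", "Raw16Khz16BitMonoTrueSilk",
    "Riff16Khz16BitMonoPcm", "Riff8Khz16BitMonoPcm",
    "Riff24Khz16BitMonoPcm", "Riff8Khz8BitMonoMULaw",
    "Raw16Khz16BitMonoPcm", "Raw24Khz16BitMonoPcm",
    "Raw8Khz16BitMonoPcm", "Ogg16Khz16BitMonoOpus",
    "Ogg24Khz16BitMonoOpus",
})


def check_output_format(name: str) -> str:
    """Return `name` unchanged if it is a documented output format."""
    if name not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {name!r}")
    return name
```

The comparison is case-sensitive, so a value such as riff16khz16bitmonopcm would be rejected by this sketch.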
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getOutputFileCol()[source]
- Returns
The location of the saved file as an HDFS compliant URI
- Return type
outputFileCol
- getOutputFormat()[source]
- Returns
The format for the output audio, one of: Raw8Khz8BitMonoMULaw, Riff16Khz16KbpsMonoSiren, Audio16Khz16KbpsMonoSiren, Audio16Khz32KBitRateMonoMp3, Audio16Khz128KBitRateMonoMp3, Audio16Khz64KBitRateMonoMp3, Audio24Khz48KBitRateMonoMp3, Audio24Khz96KBitRateMonoMp3, Audio24Khz160KBitRateMonoMp3, Raw16Khz16BitMonoTrueSilk, Riff16Khz16BitMonoPcm, Riff8Khz16BitMonoPcm, Riff24Khz16BitMonoPcm, Riff8Khz8BitMonoMULaw, Raw16Khz16BitMonoPcm, Raw24Khz16BitMonoPcm, Raw8Khz16BitMonoPcm, Ogg16Khz16BitMonoOpus, Ogg24Khz16BitMonoOpus
- Return type
outputFormat
- language = Param(parent='undefined', name='language', doc='ServiceParam: The name of the language used for synthesis')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: The locale of the input text')
- outputFileCol = Param(parent='undefined', name='outputFileCol', doc='The location of the saved file as an HDFS compliant URI')
- outputFormat = Param(parent='undefined', name='outputFormat', doc='ServiceParam: The format for the output audio can be one of ArraySeq(Raw8Khz8BitMonoMULaw, Riff16Khz16KbpsMonoSiren, Audio16Khz16KbpsMonoSiren, Audio16Khz32KBitRateMonoMp3, Audio16Khz128KBitRateMonoMp3, Audio16Khz64KBitRateMonoMp3, Audio24Khz48KBitRateMonoMp3, Audio24Khz96KBitRateMonoMp3, Audio24Khz160KBitRateMonoMp3, Raw16Khz16BitMonoTrueSilk, Riff16Khz16BitMonoPcm, Riff8Khz16BitMonoPcm, Riff24Khz16BitMonoPcm, Riff8Khz8BitMonoMULaw, Raw16Khz16BitMonoPcm, Raw24Khz16BitMonoPcm, Raw8Khz16BitMonoPcm, Ogg16Khz16BitMonoOpus, Ogg24Khz16BitMonoOpus)')
- setOutputFileCol(value)[source]
- Parameters
outputFileCol – The location of the saved file as an HDFS compliant URI
- setOutputFormat(value)[source]
- Parameters
outputFormat – The format for the output audio, one of: Raw8Khz8BitMonoMULaw, Riff16Khz16KbpsMonoSiren, Audio16Khz16KbpsMonoSiren, Audio16Khz32KBitRateMonoMp3, Audio16Khz128KBitRateMonoMp3, Audio16Khz64KBitRateMonoMp3, Audio24Khz48KBitRateMonoMp3, Audio24Khz96KBitRateMonoMp3, Audio24Khz160KBitRateMonoMp3, Raw16Khz16BitMonoTrueSilk, Riff16Khz16BitMonoPcm, Riff8Khz16BitMonoPcm, Riff24Khz16BitMonoPcm, Riff8Khz8BitMonoMULaw, Raw16Khz16BitMonoPcm, Raw24Khz16BitMonoPcm, Raw8Khz16BitMonoPcm, Ogg16Khz16BitMonoOpus, Ogg24Khz16BitMonoOpus
- setOutputFormatCol(value)[source]
- Parameters
outputFormatCol – name of the column holding the format for the output audio, one of: Raw8Khz8BitMonoMULaw, Riff16Khz16KbpsMonoSiren, Audio16Khz16KbpsMonoSiren, Audio16Khz32KBitRateMonoMp3, Audio16Khz128KBitRateMonoMp3, Audio16Khz64KBitRateMonoMp3, Audio24Khz48KBitRateMonoMp3, Audio24Khz96KBitRateMonoMp3, Audio24Khz160KBitRateMonoMp3, Raw16Khz16BitMonoTrueSilk, Riff16Khz16BitMonoPcm, Riff8Khz16BitMonoPcm, Riff24Khz16BitMonoPcm, Riff8Khz8BitMonoMULaw, Raw16Khz16BitMonoPcm, Raw24Khz16BitMonoPcm, Raw8Khz16BitMonoPcm, Ogg16Khz16BitMonoOpus, Ogg24Khz16BitMonoOpus
- setParams(errorCol='TextToSpeech_d11ce307e91e_errors', language=None, languageCol=None, locale=None, localeCol=None, outputFileCol=None, outputFormat=None, outputFormatCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, url=None, voiceName=None, voiceNameCol=None)[source]
Set the (keyword only) parameters
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: The text to synthesize')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- voiceName = Param(parent='undefined', name='voiceName', doc='ServiceParam: The name of the voice used for synthesis')
synapse.ml.cognitive.Translate module
- class synapse.ml.cognitive.Translate.Translate(java_obj=None, allowFallback=None, allowFallbackCol=None, category=None, categoryCol=None, concurrency=1, concurrentTimeout=None, errorCol='Translate_b910fdcd035e_error', fromLanguage=None, fromLanguageCol=None, fromScript=None, fromScriptCol=None, handler=None, includeAlignment=None, includeAlignmentCol=None, includeSentenceLength=None, includeSentenceLengthCol=None, outputCol='Translate_b910fdcd035e_output', profanityAction=None, profanityActionCol=None, profanityMarker=None, profanityMarkerCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, suggestedFrom=None, suggestedFromCol=None, text=None, textCol=None, textType=None, textTypeCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, toScript=None, toScriptCol=None, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
allowFallback (object) – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
category (object) – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
fromLanguage (object) – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
fromScript (object) – Specifies the script of the input text.
handler (object) – Which strategy to use when handling requests
includeAlignment (object) – Specifies whether to include alignment projection from source text to translated text.
includeSentenceLength (object) – Specifies whether to include sentence boundaries for the input text and the translated text.
outputCol (str) – The name of the output column
profanityAction (object) – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
profanityMarker (object) – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
suggestedFrom (object) – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
text (object) – the string to translate
textType (object) – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
timeout (float) – number of seconds to wait before closing the connection
toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
toScript (object) – Specifies the script of the translated text.
url (str) – URL of the service
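The toLanguage entry above describes requesting several target languages by repeating the to parameter in the query string. The sketch below only illustrates that query-string shape with the Python standard library; it is not how the transformer itself is called, and build_translate_query is a hypothetical helper that is not part of synapse.ml.

```python
from urllib.parse import urlencode


def build_translate_query(to_languages, from_language=None, text_type=None):
    """Hypothetical helper: emit one `to` entry per target language."""
    params = []
    if from_language is not None:
        params.append(("from", from_language))
    if text_type is not None:
        params.append(("textType", text_type))
    # Repeating the key is what lets one request target several languages.
    for lang in to_languages:
        params.append(("to", lang))
    return urlencode(params)


query = build_translate_query(["de", "it"], from_language="en")
print(query)  # from=en&to=de&to=it
```

Omitting from_language leaves the from parameter out entirely, which corresponds to the automatic language detection case described for fromLanguage.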
- allowFallback = Param(parent='undefined', name='allowFallback', doc='ServiceParam: Specifies that the service is allowed to fall back to a general system when a custom system does not exist. ')
- category = Param(parent='undefined', name='category', doc='ServiceParam: A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.')
- fromScript = Param(parent='undefined', name='fromScript', doc='ServiceParam: Specifies the script of the input text.')
- getAllowFallback()[source]
- Returns
Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- Return type
allowFallback
- getCategory()[source]
- Returns
A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- Return type
category
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- Return type
fromLanguage
- getIncludeAlignment()[source]
- Returns
Specifies whether to include alignment projection from source text to translated text.
- Return type
includeAlignment
- getIncludeSentenceLength()[source]
- Returns
Specifies whether to include sentence boundaries for the input text and the translated text.
- Return type
includeSentenceLength
- getProfanityAction()[source]
- Returns
Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- Return type
profanityAction
- getProfanityMarker()[source]
- Returns
Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- Return type
profanityMarker
- getSuggestedFrom()[source]
- Returns
Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- Return type
suggestedFrom
- getTextType()[source]
- Returns
Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- Return type
textType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- includeAlignment = Param(parent='undefined', name='includeAlignment', doc='ServiceParam: Specifies whether to include alignment projection from source text to translated text.')
- includeSentenceLength = Param(parent='undefined', name='includeSentenceLength', doc='ServiceParam: Specifies whether to include sentence boundaries for the input text and the translated text. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- profanityAction = Param(parent='undefined', name='profanityAction', doc='ServiceParam: Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted. ')
- profanityMarker = Param(parent='undefined', name='profanityMarker', doc='ServiceParam: Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.')
- setAllowFallback(value)[source]
- Parameters
allowFallback – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- setAllowFallbackCol(value)[source]
- Parameters
allowFallback – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- setCategory(value)[source]
- Parameters
category – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- setCategoryCol(value)[source]
- Parameters
category – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- setIncludeAlignment(value)[source]
- Parameters
includeAlignment – Specifies whether to include alignment projection from source text to translated text.
- setIncludeAlignmentCol(value)[source]
- Parameters
includeAlignment – Specifies whether to include alignment projection from source text to translated text.
- setIncludeSentenceLength(value)[source]
- Parameters
includeSentenceLength – Specifies whether to include sentence boundaries for the input text and the translated text.
- setIncludeSentenceLengthCol(value)[source]
- Parameters
includeSentenceLength – Specifies whether to include sentence boundaries for the input text and the translated text.
- setParams(allowFallback=None, allowFallbackCol=None, category=None, categoryCol=None, concurrency=1, concurrentTimeout=None, errorCol='Translate_b910fdcd035e_error', fromLanguage=None, fromLanguageCol=None, fromScript=None, fromScriptCol=None, handler=None, includeAlignment=None, includeAlignmentCol=None, includeSentenceLength=None, includeSentenceLengthCol=None, outputCol='Translate_b910fdcd035e_output', profanityAction=None, profanityActionCol=None, profanityMarker=None, profanityMarkerCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, suggestedFrom=None, suggestedFromCol=None, text=None, textCol=None, textType=None, textTypeCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, toScript=None, toScriptCol=None, url=None)[source]
Set the (keyword only) parameters
- setProfanityAction(value)[source]
- Parameters
profanityAction – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- setProfanityActionCol(value)[source]
- Parameters
profanityAction – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- setProfanityMarker(value)[source]
- Parameters
profanityMarker – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- setProfanityMarkerCol(value)[source]
- Parameters
profanityMarker – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- setSuggestedFrom(value)[source]
- Parameters
suggestedFrom – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- setSuggestedFromCol(value)[source]
- Parameters
suggestedFrom – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- setTextType(value)[source]
- Parameters
textType – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- setTextTypeCol(value)[source]
- Parameters
textType – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
- setToLanguageCol(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- suggestedFrom = Param(parent='undefined', name='suggestedFrom', doc="ServiceParam: Specifies a fallback language if the language of the input text can't be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.")
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- textType = Param(parent='undefined', name='textType', doc='ServiceParam: Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc="ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It's possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.")
- toScript = Param(parent='undefined', name='toScript', doc='ServiceParam: Specifies the script of the translated text.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
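A minimal sketch of configuring Translate through the keyword arguments listed in the constructor signature above. The key, region, and column names are placeholders, and passing a list for toLanguage is an assumption (the reference only types it as object). make_translate is a hypothetical wrapper that imports the class lazily, so the configuration part can be read and run without Spark installed.

```python
# Keyword names come from the Translate constructor signature;
# every value here is a placeholder chosen for illustration.
translate_kwargs = {
    "subscriptionKey": "<your-translator-key>",
    "subscriptionRegion": "<your-region>",
    "textCol": "text",               # input column with the text to translate
    "toLanguage": ["de", "it"],      # assumption: a list of target languages
    "outputCol": "translation",
    "errorCol": "translation_error",
    "concurrency": 1,
    "timeout": 60.0,
}


def make_translate(**kwargs):
    """Hypothetical wrapper: needs synapseml and an active Spark session."""
    from synapse.ml.cognitive.Translate import Translate  # imported lazily
    return Translate(**kwargs)


# With a Spark DataFrame `df` that has a "text" column:
# translated = make_translate(**translate_kwargs).transform(df)
```

Naming outputCol and errorCol explicitly avoids the generated defaults such as Translate_b910fdcd035e_output, which are awkward to reference downstream.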
synapse.ml.cognitive.Transliterate module
- class synapse.ml.cognitive.Transliterate.Transliterate(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='Transliterate_0f6748921d49_error', fromScript=None, fromScriptCol=None, handler=None, language=None, languageCol=None, outputCol='Transliterate_0f6748921d49_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toScript=None, toScriptCol=None, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
fromScript (object) – Specifies the script of the input text.
handler (object) – Which strategy to use when handling requests
language (object) – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
text (object) – the string to transliterate
timeout (float) – number of seconds to wait before closing the connection
toScript (object) – Specifies the script of the translated text.
url (str) – URL of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromScript = Param(parent='undefined', name='fromScript', doc='ServiceParam: Specifies the script of the input text.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setLanguageCol(value)[source]
- Parameters
language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='Transliterate_0f6748921d49_error', fromScript=None, fromScriptCol=None, handler=None, language=None, languageCol=None, outputCol='Transliterate_0f6748921d49_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toScript=None, toScriptCol=None, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toScript = Param(parent='undefined', name='toScript', doc='ServiceParam: Specifies the script of the translated text.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
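The Transliterate signature pairs each service parameter with a Col variant (languageCol, fromScriptCol, toScriptCol). The sketch below assumes that a Col argument names a DataFrame column supplying the value per row, which this reference does not state explicitly. The sample rows, column names, and script codes are illustrative assumptions.

```python
# Hypothetical input: each row carries its own language and script codes.
columns = ["text", "language", "fromScript", "toScript"]
rows = [
    ("こんにちは", "ja", "Jpan", "Latn"),
    ("नमस्ते", "hi", "Deva", "Latn"),
]

# Keyword names come from the Transliterate constructor signature.
transliterate_kwargs = {
    "subscriptionKey": "<your-translator-key>",
    "subscriptionRegion": "<your-region>",
    "textCol": "text",
    "languageCol": "language",
    "fromScriptCol": "fromScript",
    "toScriptCol": "toScript",
    "outputCol": "transliteration",
}

# Every input *Col argument should name a column present in the data.
missing = [
    value
    for key, value in transliterate_kwargs.items()
    if key.endswith("Col") and key != "outputCol" and value not in columns
]
print(missing)  # []

# With synapseml and an active Spark session `spark`:
# from synapse.ml.cognitive.Transliterate import Transliterate
# df = spark.createDataFrame(rows, columns)
# out = Transliterate(**transliterate_kwargs).transform(df)
```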
synapse.ml.cognitive.VerifyFaces module
- class synapse.ml.cognitive.VerifyFaces.VerifyFaces(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='VerifyFaces_e076a02a2b51_error', faceId=None, faceIdCol=None, faceId1=None, faceId1Col=None, faceId2=None, faceId2Col=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, outputCol='VerifyFaces_e076a02a2b51_output', personGroupId=None, personGroupIdCol=None, personId=None, personIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
faceId (object) – faceId of the face, comes from Face - Detect.
faceId1 (object) – faceId of one face, comes from Face - Detect.
faceId2 (object) – faceId of another face, comes from Face - Detect.
handler (object) – Which strategy to use when handling requests
largePersonGroupId (object) – Using existing largePersonGroupId and personId for fast loading a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
outputCol (str) – The name of the output column
personGroupId (object) – Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
personId (object) – Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – URL of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceId = Param(parent='undefined', name='faceId', doc='ServiceParam: faceId of the face, comes from Face - Detect.')
- faceId1 = Param(parent='undefined', name='faceId1', doc='ServiceParam: faceId of one face, comes from Face - Detect.')
- faceId2 = Param(parent='undefined', name='faceId2', doc='ServiceParam: faceId of another face, comes from Face - Detect.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLargePersonGroupId()[source]
- Returns
Using existing largePersonGroupId and personId for fast loading a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
largePersonGroupId
- getPersonGroupId()[source]
- Returns
Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
personGroupId
- getPersonId()[source]
- Returns
Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
- Return type
personId
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- largePersonGroupId = Param(parent='undefined', name='largePersonGroupId', doc='ServiceParam: Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- personGroupId = Param(parent='undefined', name='personGroupId', doc='ServiceParam: Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- personId = Param(parent='undefined', name='personId', doc='ServiceParam: Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFaceId2Col(value)[source]
- Parameters
faceId2 – faceId of another face, comes from Face - Detect.
- setLargePersonGroupId(value)[source]
- Parameters
largePersonGroupId – Using existing largePersonGroupId and personId for fast loading a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setLargePersonGroupIdCol(value)[source]
- Parameters
largePersonGroupId – Using existing largePersonGroupId and personId for fast loading a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='VerifyFaces_e076a02a2b51_error', faceId=None, faceIdCol=None, faceId1=None, faceId1Col=None, faceId2=None, faceId2Col=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, outputCol='VerifyFaces_e076a02a2b51_output', personGroupId=None, personGroupIdCol=None, personId=None, personIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPersonGroupId(value)[source]
- Parameters
personGroupId – Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setPersonGroupIdCol(value)[source]
- Parameters
personGroupId – Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setPersonId(value)[source]
- Parameters
personId – Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
- setPersonIdCol(value)[source]
- Parameters
personId – Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
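The VerifyFaces parameters state that personGroupId and largePersonGroupId should not be provided at the same time. A small pre-flight check along those lines is sketched below. check_verify_faces_args is a hypothetical helper, not part of synapse.ml, and the split into a face-to-face mode (faceId1 plus faceId2) and a face-to-person mode (faceId plus personId plus one group id) is an assumption based on how the parameters are described.

```python
def check_verify_faces_args(faceId=None, faceId1=None, faceId2=None,
                            personId=None, personGroupId=None,
                            largePersonGroupId=None):
    """Hypothetical check: return the inferred mode or raise ValueError."""
    # Stated in the parameter docs: the two group ids are mutually exclusive.
    if personGroupId is not None and largePersonGroupId is not None:
        raise ValueError(
            "Provide personGroupId or largePersonGroupId, not both."
        )
    # Assumed mode 1: compare two detected faces directly.
    if faceId1 is not None and faceId2 is not None:
        return "face-to-face"
    # Assumed mode 2: compare one detected face against a stored person.
    has_group = personGroupId is not None or largePersonGroupId is not None
    if faceId is not None and personId is not None and has_group:
        return "face-to-person"
    raise ValueError(
        "Expected faceId1 and faceId2, or faceId, personId and one group id."
    )


print(check_verify_faces_args(faceId1="a", faceId2="b"))  # face-to-face
```

Running such a check before building the transformer surfaces a conflicting configuration locally instead of as an HTTP error in the errorCol column.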
Module contents
SynapseML is an ecosystem of tools aimed at expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.
SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond latency web services, backed by your Spark cluster.
SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.