synapse.ml.cognitive package
Submodules
synapse.ml.cognitive.AddDocuments module
- class synapse.ml.cognitive.AddDocuments.AddDocuments(java_obj=None, actionCol='@search.action', batchSize=100, concurrency=1, concurrentTimeout=None, errorCol='AddDocuments_23d171c8e20c_error', handler=None, indexName=None, outputCol='AddDocuments_23d171c8e20c_output', serviceName=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
actionCol (str) – You can combine actions, such as an upload and a delete, in the same batch. The supported actions are:
upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
indexName (str) –
outputCol (str) – The name of the output column
serviceName (str) –
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
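The action semantics described for actionCol can be illustrated with a small in-memory simulation. This is only a sketch of the documented behaviour, not the service's implementation: the apply_batch helper, the "id" key field, and the choice to raise an exception on a failed merge are all assumptions made for illustration.

```python
from copy import deepcopy


def apply_batch(index, batch, key_field="id", action_col="@search.action"):
    """Apply a batch of documents to an in-memory index (a dict keyed by document key).

    Hypothetical helper: it mimics the documented action semantics only.
    """
    index = deepcopy(index)  # leave the caller's index untouched
    for doc in batch:
        action = doc[action_col]
        key = doc[key_field]
        fields = {k: v for k, v in doc.items() if k != action_col}
        if action == "upload":
            # Insert if new; replace the whole document if it exists.
            index[key] = fields
        elif action == "merge":
            if key not in index:
                # Assumption: signal the failed merge with an exception.
                raise KeyError(f"merge failed: no document with key {key!r}")
            # Specified fields replace existing ones, collections included.
            index[key].update(fields)
        elif action == "mergeOrUpload":
            if key in index:
                index[key].update(fields)
            else:
                index[key] = fields
        elif action == "delete":
            # Only the key matters; any other field is ignored.
            index.pop(key, None)
        else:
            raise ValueError(f"unknown action: {action!r}")
    return index


index = {"1": {"id": "1", "tags": ["budget"], "name": "Inn"}}
batch = [
    {"@search.action": "merge", "id": "1", "tags": ["economy", "pool"]},
    {"@search.action": "upload", "id": "2", "name": "Lodge"},
]
result = apply_batch(index, batch)
print(result["1"]["tags"])  # ['economy', 'pool']
```

The merge replaces the ‘tags’ collection rather than appending to it, and fields that the merge does not mention (here ‘name’) are kept.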
- actionCol = Param(parent='undefined', name='actionCol', doc=" You can combine actions, such as an upload and a delete, in the same batch. upload: An upload action is similar to an 'upsert' where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case. merge: Merge updates an existing document with the specified fields. If the document doesn't exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field 'tags' with value ['budget'] and you execute a merge with value ['economy', 'pool'] for 'tags', the final value of the 'tags' field will be ['economy', 'pool']. It will not be ['budget', 'economy', 'pool']. mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document. delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null. ")
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getActionCol()[source]
- Returns
You can combine actions, such as an upload and a delete, in the same batch. upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case. merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’]. mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document. delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
- Return type
actionCol
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- indexName = Param(parent='undefined', name='indexName', doc='')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- serviceName = Param(parent='undefined', name='serviceName', doc='')
- setActionCol(value)[source]
- Parameters
actionCol – You can combine actions, such as an upload and a delete, in the same batch. upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case. merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’]. mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document. delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(actionCol='@search.action', batchSize=100, concurrency=1, concurrentTimeout=None, errorCol='AddDocuments_23d171c8e20c_error', handler=None, indexName=None, outputCol='AddDocuments_23d171c8e20c_output', serviceName=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
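The concurrency and concurrentTimeout parameters describe a cap on in-flight calls and a bound on how long to wait on their futures. A minimal sketch of that pattern using Python's standard library follows. The call_all helper is hypothetical and is not how the transformer is implemented internally; it only shows how the two settings relate.

```python
from concurrent.futures import ThreadPoolExecutor


def call_all(requests, call, concurrency=1, concurrent_timeout=None):
    """Run call(request) for every request with at most `concurrency` calls in flight.

    Hypothetical helper for illustration only.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(call, request) for request in requests]
        # Wait at most concurrent_timeout seconds on each future;
        # None means wait indefinitely. Results keep the request order.
        return [future.result(timeout=concurrent_timeout) for future in futures]


responses = call_all(["a", "b", "c"], lambda r: r.upper(),
                     concurrency=2, concurrent_timeout=5.0)
print(responses)  # ['A', 'B', 'C']
```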
synapse.ml.cognitive.AnalyzeBusinessCards module
- class synapse.ml.cognitive.AnalyzeBusinessCards.AnalyzeBusinessCards(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeBusinessCards_78867d030dc8_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeBusinessCards_78867d030dc8_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
maxPollingRetries (int) – number of times to poll
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pages (object) – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
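The page-selection syntax accepted by pages can be checked locally before sending a request. The sketch below expands a selection string into concrete page numbers following the rules listed above. The parse_pages function is a hypothetical helper, not part of the library, and the service performs its own validation.

```python
def parse_pages(spec, page_count):
    """Return the sorted page numbers selected by `spec` in a document of `page_count` pages.

    Hypothetical helper that mirrors the documented selection rules.
    """
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An empty side makes the range open-ended: '-10' or '5-'.
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        # Keep only pages that exist in the document.
        selected.update(range(max(start, 1), min(end, page_count) + 1))
    if not selected:
        # The request is acceptable only if at least one page can be processed.
        raise ValueError("no page in the document matches the selection")
    return sorted(selected)


print(parse_pages("-5, 1, 3, 5-10", page_count=12))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(parse_pages("5-100", page_count=5))            # [5]
```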
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPages()[source]
- Returns
The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setPages(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setPagesCol(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeBusinessCards_78867d030dc8_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeBusinessCards_78867d030dc8_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
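The backoffs parameter is documented only as an "array of backoffs to use in the handler". One plausible reading is a list of waits applied between successive retries of a failed request. The sketch below illustrates that reading under explicit assumptions: the values are treated as milliseconds, the retryable status codes (429 and 503) are chosen for illustration, and send_with_backoffs is a hypothetical helper rather than the library's handler.

```python
import time


def send_with_backoffs(send, backoffs=(100, 500, 1000),
                       retry_statuses=(429, 503), sleep=time.sleep):
    """Call send() and retry after each backoff while the status is retryable.

    Hypothetical helper. Assumptions: backoffs are in milliseconds and only
    the statuses in retry_statuses trigger a retry. Returns the last status.
    """
    status = send()
    for backoff_ms in backoffs:
        if status not in retry_statuses:
            break
        sleep(backoff_ms / 1000.0)
        status = send()
    return status


# Simulated service: throttled twice, then successful.
statuses = iter([429, 503, 200])
waits = []  # record the waits instead of actually sleeping
final_status = send_with_backoffs(lambda: next(statuses), sleep=waits.append)
print(final_status, waits)  # 200 [0.1, 0.5]
```

With the default list, a request is attempted at most four times: once initially and once after each of the three backoffs.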
synapse.ml.cognitive.AnalyzeCustomModel module
- class synapse.ml.cognitive.AnalyzeCustomModel.AnalyzeCustomModel(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeCustomModel_355269a761b5_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelId=None, modelIdCol=None, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeCustomModel_355269a761b5_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
maxPollingRetries (int) – number of times to poll
modelId (object) – Model identifier.
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
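The polling parameters above (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesExceededException) fit together roughly as in the following sketch. It is an illustration of the documented settings, not the library's code: poll_for_result and MaxRetriesExceeded are hypothetical names, and the check callable stands in for a request that asks the service whether the analysis has finished.

```python
import time


class MaxRetriesExceeded(Exception):
    """Hypothetical exception type used only in this sketch."""


def poll_for_result(check, initial_polling_delay=300, polling_delay=300,
                    max_polling_retries=1000,
                    suppress_max_retries_exceeded=False, sleep=time.sleep):
    """Poll check() until it returns a non-None result.

    Delays are in milliseconds, as in the parameter descriptions.
    Returns a (result, error) pair.
    """
    sleep(initial_polling_delay / 1000.0)  # wait before the first poll
    for attempt in range(max_polling_retries):
        if attempt > 0:
            sleep(polling_delay / 1000.0)  # wait between polls
        result = check()
        if result is not None:
            return result, None
    message = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exceeded:
        # Report the failure as data instead of raising.
        return None, message
    raise MaxRetriesExceeded(message)


# Simulated service: not ready twice, then finished.
responses = iter([None, None, {"status": "succeeded"}])
waits = []  # record the waits instead of actually sleeping
result, error = poll_for_result(lambda: next(responses),
                                initial_polling_delay=300, polling_delay=100,
                                sleep=waits.append)
print(result, error, waits)
```

Setting the suppress flag trades an exception for an error value, which matches the description of reporting the failure in the error column instead of failing the job.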
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelId = Param(parent='undefined', name='modelId', doc='ServiceParam: Model identifier.')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeCustomModel_355269a761b5_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelId=None, modelIdCol=None, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeCustomModel_355269a761b5_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeDocument module
- class synapse.ml.cognitive.AnalyzeDocument.AnalyzeDocument(java_obj=None, apiVersion=None, apiVersionCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeDocument_668b4d7ee574_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeDocument_668b4d7ee574_output', pages=None, pagesCol=None, pollingDelay=300, prebuiltModelId=None, prebuiltModelIdCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
apiVersion (object) – version of the api
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
maxPollingRetries (int) – number of times to poll
outputCol (str) – The name of the output column
pages (object) – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
pollingDelay (int) – number of milliseconds to wait between polling
prebuiltModelId (object) – Prebuilt model identifier for Form Recognizer V3.0. Supported model IDs: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom model ID
stringIndexType (object) – Method used to compute string offset and length.
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
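Because the constructor and setParams take keyword-only parameters, one way to keep a configuration readable is to build the keyword arguments first and pass them in a single call. The sketch below does that using only parameter names from the signature above. The analyze_document_params and is_prebuilt helpers, the column names, and the placeholder key and URL are hypothetical, and the final construction step is shown as a comment because it needs a Spark session with SynapseML installed.

```python
# Model IDs listed in the prebuiltModelId description.
PREBUILT_MODEL_IDS = {
    "prebuilt-read", "prebuilt-layout", "prebuilt-document",
    "prebuilt-businessCard", "prebuilt-idDocument",
    "prebuilt-invoice", "prebuilt-receipt",
}


def is_prebuilt(model_id):
    """True for a listed prebuilt model, False for a custom model ID."""
    return model_id in PREBUILT_MODEL_IDS


def analyze_document_params(subscription_key, url, image_url_col,
                            model_id="prebuilt-layout", pages=None):
    """Build keyword arguments for AnalyzeDocument (hypothetical helper)."""
    params = {
        "subscriptionKey": subscription_key,
        "url": url,
        "imageUrlCol": image_url_col,        # column holding the document URLs
        "prebuiltModelId": model_id,
        "outputCol": "analysis",             # hypothetical column name
        "errorCol": "analysis_error",        # hypothetical column name
        "suppressMaxRetriesExceededException": True,
    }
    if pages is not None:
        params["pages"] = pages              # e.g. '1-3'
    return params


params = analyze_document_params("<your-key>", "<service-url>", "source")
print(sorted(params))

# With SynapseML available, the dictionary can be passed as keyword arguments:
#   from synapse.ml.cognitive.AnalyzeDocument import AnalyzeDocument
#   analyzer = AnalyzeDocument(**params)
#   results = analyzer.transform(df)  # df: a DataFrame with a 'source' column
```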
- apiVersion = Param(parent='undefined', name='apiVersion', doc='ServiceParam: version of the api')
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getPages()[source]
- Returns
The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getPrebuiltModelId()[source]
- Returns
Prebuilt model identifier for Form Recognizer V3.0. Supported model IDs: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom model ID
- Return type
prebuiltModelId
- getStringIndexType()[source]
- Returns
Method used to compute string offset and length.
- Return type
stringIndexType
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- prebuiltModelId = Param(parent='undefined', name='prebuiltModelId', doc='ServiceParam: Prebuilt Model identifier for Form Recognizer V3.0, supported modelId: prebuilt-read, prebuilt-layout,prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt,or your custom modelId')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setPages(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setPagesCol(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setParams(apiVersion=None, apiVersionCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeDocument_668b4d7ee574_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeDocument_668b4d7ee574_output', pages=None, pagesCol=None, pollingDelay=300, prebuiltModelId=None, prebuiltModelIdCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
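As a rough illustration of what "keyword only" means for setParams, the sketch below uses a hypothetical stand-in class (SketchTransformer is not part of SynapseML) with three of the parameters and defaults listed above. It assumes the usual convention in which every argument must be passed by name and positional arguments are rejected; the real class may enforce this differently.

```python
class SketchTransformer:
    """Hypothetical stand-in for a transformer; not the SynapseML class."""

    def __init__(self):
        # Defaults copied from the signature documented above.
        self.params = {"concurrency": 1, "timeout": 60.0, "pollingDelay": 300}

    def setParams(self, *, concurrency=None, timeout=None, pollingDelay=None):
        # The bare * makes every argument keyword-only.
        updates = {
            "concurrency": concurrency,
            "timeout": timeout,
            "pollingDelay": pollingDelay,
        }
        for name, value in updates.items():
            if value is not None:
                self.params[name] = value
        return self  # returning self allows call chaining


t = SketchTransformer().setParams(timeout=30.0, pollingDelay=500)
print(t.params)

try:
    t.setParams(2)  # positional arguments are rejected
except TypeError as err:
    print("rejected:", err)
```

Passing parameters by name keeps long calls readable when a transformer has more than twenty options, which is presumably why the convention is used here.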
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setPrebuiltModelId(value)[source]
- Parameters
prebuiltModelId – Prebuilt model identifier for Form Recognizer V3.0. Supported modelId values: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom modelId.
- setPrebuiltModelIdCol(value)[source]
- Parameters
prebuiltModelId – Prebuilt model identifier for Form Recognizer V3.0. Supported modelId values: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom modelId.
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Method used to compute string offset and length.
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Method used to compute string offset and length.
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Method used to compute string offset and length.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
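The page-selection syntax accepted by the pages parameter can be easier to follow as code. The sketch below is a local illustration only: the real parsing happens inside the service, and the helper name expand_pages, its page_count argument, and the choice to raise ValueError when nothing matches are assumptions made for this example.

```python
def expand_pages(spec, page_count):
    """Expand a page-selection string into a sorted list of 1-based page numbers."""
    if not spec or not spec.strip():
        # No page range provided: the entire document is processed.
        return list(range(1, page_count + 1))
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An empty bound makes the range open-ended on that side.
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        # Keep only pages that exist; overlapping ranges collapse in the set.
        selected.update(range(max(start, 1), min(end, page_count) + 1))
    if not selected:
        raise ValueError(
            "selection %r matches no page of a %d-page document" % (spec, page_count)
        )
    return sorted(selected)


print(expand_pages("-5, 1, 3, 5-10", 12))  # mixed, overlapping ranges
print(expand_pages("5-100", 5))            # only page 5 exists in the range
```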
synapse.ml.cognitive.AnalyzeIDDocuments module
- class synapse.ml.cognitive.AnalyzeIDDocuments.AnalyzeIDDocuments(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeIDDocuments_d43966693cb6_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeIDDocuments_d43966693cb6_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold HTTP errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
maxPollingRetries (int) – number of times to poll
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pages (object) – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – URL of the service
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPages()[source]
- Returns
Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setPages(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setPagesCol(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeIDDocuments_d43966693cb6_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeIDDocuments_d43966693cb6_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
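The polling parameters of this class (initialPollingDelay, pollingDelay, maxPollingRetries and suppressMaxRetriesExceededException) interact in a way that a small loop can make concrete. The following is a sketch of one plausible reading of the descriptions above, not SynapseML's implementation: the function name, the MaxRetriesExceeded exception, the injectable sleep argument and the (result, error) return shape are all assumptions for illustration.

```python
import time


class MaxRetriesExceeded(Exception):
    """Hypothetical exception name used only in this sketch."""


def poll_for_result(check, initial_polling_delay=300, polling_delay=300,
                    max_polling_retries=1000,
                    suppress_max_retries_exceeded=False, sleep=time.sleep):
    """Poll `check` until it returns a non-None result.

    Returns a (result, error) pair, loosely mirroring outputCol and errorCol.
    Delays are in milliseconds, as in the parameter descriptions.
    """
    sleep(initial_polling_delay / 1000.0)  # wait before the first poll
    for attempt in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
        if attempt < max_polling_retries - 1:
            sleep(polling_delay / 1000.0)  # wait between polls
    message = "no result after %d polls" % max_polling_retries
    if suppress_max_retries_exceeded:
        return None, message  # report the failure instead of raising
    raise MaxRetriesExceeded(message)


# Simulated service that finishes on the third poll; waits are recorded, not slept.
answers = iter([None, None, "succeeded"])
waits = []
print(poll_for_result(lambda: next(answers), sleep=waits.append))
print(waits)
```

With the defaults shown in the signature above (300 ms delays, 1000 polls), a loop of this kind would give up after roughly five minutes of waiting, so long-running analyses may need a larger maxPollingRetries or pollingDelay.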
synapse.ml.cognitive.AnalyzeImage module
- class synapse.ml.cognitive.AnalyzeImage.AnalyzeImage(java_obj=None, concurrency=1, concurrentTimeout=None, details=None, detailsCol=None, errorCol='AnalyzeImage_0f35ae1afc62_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='AnalyzeImage_0f35ae1afc62_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, visualFeatures=None, visualFeaturesCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
details (object) – what visual feature types to return
errorCol (str) – column to hold HTTP errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – the language of the response (en if none given)
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – URL of the service
visualFeatures (object) – what visual feature types to return
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- details = Param(parent='undefined', name='details', doc='ServiceParam: what visual feature types to return')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language of the response (en if none given)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguageCol(value)[source]
- Parameters
language – the language of the response (en if none given)
- setParams(concurrency=1, concurrentTimeout=None, details=None, detailsCol=None, errorCol='AnalyzeImage_0f35ae1afc62_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='AnalyzeImage_0f35ae1afc62_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, visualFeatures=None, visualFeaturesCol=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setVisualFeaturesCol(value)[source]
- Parameters
visualFeatures – what visual feature types to return
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- visualFeatures = Param(parent='undefined', name='visualFeatures', doc='ServiceParam: what visual feature types to return')
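To show how concurrency, concurrentTimeout and the output and error columns might relate, here is a minimal sketch built on the Python standard library. It is not how SynapseML issues requests: call_concurrently, fake_send and the (output, error) tuples are hypothetical names and shapes chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def call_concurrently(requests, send, concurrency=1, concurrent_timeout=None):
    """Send each request with at most `concurrency` calls in flight.

    Returns (output, error) pairs in input order. `concurrent_timeout` is the
    max number of seconds to wait on each future; None waits indefinitely.
    """
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send, request) for request in requests]
        for future in futures:
            try:
                results.append((future.result(timeout=concurrent_timeout), None))
            except Exception as err:  # includes the futures timeout error
                results.append((None, repr(err)))
    return results


def fake_send(url):
    # Stand-in for a service call; fails for one input to populate the error slot.
    if url.endswith("bad.png"):
        raise ValueError("unsupported image")
    return {"url": url, "tags": []}


for row in call_concurrently(["a.png", "bad.png"], fake_send,
                             concurrency=2, concurrent_timeout=5.0):
    print(row)
```

Keeping failures next to their inputs, rather than raising on the first error, matches the idea of a dedicated error column: one bad image does not abort the rest of the batch.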
synapse.ml.cognitive.AnalyzeInvoices module
- class synapse.ml.cognitive.AnalyzeInvoices.AnalyzeInvoices(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeInvoices_43612dccb408_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeInvoices_43612dccb408_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold HTTP errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
maxPollingRetries (int) – number of times to poll
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pages (object) – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – URL of the service
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPages()[source]
- Returns
Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setPages(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setPagesCol(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeInvoices_43612dccb408_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeInvoices_43612dccb408_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maximum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
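The polling parameters above (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesExceededException) describe a wait-then-poll loop. The following is a minimal standalone sketch of such a loop, not SynapseML's implementation: `poll_until_done`, `MaxRetriesExceeded`, the `check` callback and the injectable `sleep` argument are all hypothetical names introduced for illustration. It assumes that suppressing the exception means returning the failure message in place of raising, mirroring the "report in the error column" wording.

```python
import time


class MaxRetriesExceeded(Exception):
    """Hypothetical error raised when polling gives up before a result arrives."""


def poll_until_done(check, initial_polling_delay=300, polling_delay=300,
                    max_polling_retries=1000,
                    suppress_max_retries_exceeded=False, sleep=time.sleep):
    """Poll `check()` until it returns something other than None.

    Returns an (output, error) pair; one of the two is always None.
    Delays are given in milliseconds, as in the parameter docs.
    """
    sleep(initial_polling_delay / 1000.0)      # wait before the first poll
    for attempt in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
        if attempt < max_polling_retries - 1:  # no wait after the final poll
            sleep(polling_delay / 1000.0)
    message = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exceeded:
        return None, message                   # report instead of raising
    raise MaxRetriesExceeded(message)


# Usage with a fake operation that finishes on the second poll.
states = iter([None, "succeeded"])
output, error = poll_until_done(lambda: next(states), sleep=lambda seconds: None)
print(output, error)
```

Passing `sleep` as an argument keeps the sketch testable without real waiting; a real caller would leave the default.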
synapse.ml.cognitive.AnalyzeLayout module
- class synapse.ml.cognitive.AnalyzeLayout.AnalyzeLayout(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeLayout_42cda16b2446_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeLayout_42cda16b2446_output', pages=None, pagesCol=None, pollingDelay=300, readingOrder=None, readingOrderCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
language (object) – The BCP-47 language code of the text in the document. Layout supports automatic language identification and multilanguage documents, so provide a language code only if you want to force the document to be processed as that specific language.
maxPollingRetries (int) – number of times to poll
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pages (object) – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay (int) – number of milliseconds to wait between polling
readingOrder (object) – Optional parameter specifying which reading order algorithm is applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
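The page-selection syntax described for pages can be made concrete with a small parser. This is a sketch of the documented semantics only; the service performs its own parsing, and `select_pages` and its `page_count` argument are hypothetical names introduced here. It assumes that pages outside the document are silently clipped and that a selection matching no page is rejected, which is how the "at least one page" rule reads.

```python
def select_pages(spec, page_count):
    """Return the sorted page numbers that `spec` selects in a `page_count`-page document."""
    if spec is None or not spec.strip():
        return list(range(1, page_count + 1))   # no range given: whole document
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An empty side makes the range open-ended ('-10' or '5-').
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        # Clip to pages that exist; overlapping ranges merge through the set.
        selected.update(range(max(start, 1), min(end, page_count) + 1))
    if not selected:
        raise ValueError(f"{spec!r} selects no page of a {page_count}-page document")
    return sorted(selected)


print(select_pages("-5, 1, 3, 5-10", 12))
print(select_pages("5-100", 5))
```

A helper like this can be handy for validating a pages value before sending a request, for example when the value comes from user input.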
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLanguage()[source]
- Returns
The BCP-47 language code of the text in the document. Layout supports automatic language identification and multilanguage documents, so provide a language code only if you want to force the document to be processed as that specific language.
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPages()[source]
- Returns
The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getReadingOrder()[source]
- Returns
Optional parameter specifying which reading order algorithm is applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
- Return type
readingOrder
- getSuppressMaxRetriesExceededException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- language = Param(parent='undefined', name='language', doc='ServiceParam: The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- readingOrder = Param(parent='undefined', name='readingOrder', doc="ServiceParam: Optional parameter to specify which reading order algorithm should be applied when ordering the extract text elements. Can be either 'basic' or 'natural'. Will default to basic if not specified")
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLanguage(value)[source]
- Parameters
language – The BCP-47 language code of the text in the document. Layout supports automatic language identification and multilanguage documents, so provide a language code only if you want to force the document to be processed as that specific language.
- setLanguageCol(value)[source]
- Parameters
language – The BCP-47 language code of the text in the document. Layout supports automatic language identification and multilanguage documents, so provide a language code only if you want to force the document to be processed as that specific language.
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setPages(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeLayout_42cda16b2446_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeLayout_42cda16b2446_output', pages=None, pagesCol=None, pollingDelay=300, readingOrder=None, readingOrderCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setReadingOrder(value)[source]
- Parameters
readingOrder – Optional parameter specifying which reading order algorithm is applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
- setReadingOrderCol(value)[source]
- Parameters
readingOrder – Optional parameter specifying which reading order algorithm is applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maximum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
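The backoffs parameter is documented only as an "array of backoffs to use in the handler". One plausible reading, sketched below, is a list of waits applied between successive retries of a failed request. This is an illustration under stated assumptions rather than the library's handler: it assumes the values are milliseconds, that only HTTP 429 and 503 are retried, and `call_with_backoffs`, `send`, `retryable` and `sleep` are hypothetical names.

```python
import time


def call_with_backoffs(send, backoffs=(100, 500, 1000),
                       retryable=(429, 503), sleep=time.sleep):
    """Call `send()` and retry once per backoff while the status is retryable.

    `send` returns a (status_code, body) pair. The last pair seen is returned,
    so a caller can store a failure in an error column instead of raising.
    """
    status, body = send()
    for wait_ms in backoffs:
        if status not in retryable:
            break                     # success or a non-retryable failure
        sleep(wait_ms / 1000.0)       # assumed to be milliseconds
        status, body = send()
    return status, body


# Usage with a fake service that throttles twice before answering.
responses = iter([(429, ""), (503, ""), (200, "ok")])
print(call_with_backoffs(lambda: next(responses), sleep=lambda seconds: None))
```

With the default three backoffs, a request is attempted at most four times in this sketch.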
synapse.ml.cognitive.AnalyzeReceipts module
- class synapse.ml.cognitive.AnalyzeReceipts.AnalyzeReceipts(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeReceipts_cb89a48db0fa_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeReceipts_cb89a48db0fa_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
maxPollingRetries (int) – number of times to poll
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (str) – The name of the output column
pages (object) – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
suppressMaxRetriesExceededException (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
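The concurrency and concurrentTimeout parameters bound how many calls run at once and how long to wait on each future. The sketch below shows that pattern with Python's standard `concurrent.futures`; it illustrates the idea and is not how the underlying transformer is implemented. `call_concurrently`, `items` and `call` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def call_concurrently(items, call, concurrency=1, concurrent_timeout=None):
    """Apply `call` to every item with at most `concurrency` calls in flight.

    Results come back in input order. `concurrent_timeout` is the maximum
    number of seconds to wait on each future; None waits indefinitely.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(call, item) for item in items]
        # Future.result raises a timeout error if the wait is exceeded.
        return [future.result(timeout=concurrent_timeout) for future in futures]


print(call_concurrently(["a", "b", "c"], str.upper,
                        concurrency=2, concurrent_timeout=5.0))
```

Raising concurrency increases throughput per partition at the cost of hitting service rate limits sooner, so it is usually tuned together with the retry settings.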
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPages()[source]
- Returns
The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesExceededException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesExceededException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
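Service parameters such as locale come in pairs: setLocale and setLocaleCol. The sketch below assumes the usual reading of that pairing, namely that the plain setter fixes one value for every row while the Col setter names a DataFrame column from which the value is read row by row. It is a plain-Python illustration, with rows modelled as dictionaries; `resolve_locale` and `SUPPORTED_RECEIPT_LOCALES` are hypothetical names, and the client-side check against the documented locale list is an addition for illustration, not something the library is known to do.

```python
# Locales listed in the parameter description above.
SUPPORTED_RECEIPT_LOCALES = {"en-AU", "en-CA", "en-GB", "en-IN", "en-US"}


def resolve_locale(row, locale=None, locale_col=None):
    """Pick the locale for one row.

    If `locale_col` is given, the value is read from that column of the row;
    otherwise the constant `locale` applies to every row.
    """
    value = row.get(locale_col) if locale_col is not None else locale
    if value is not None and value not in SUPPORTED_RECEIPT_LOCALES:
        raise ValueError(f"unsupported receipt locale: {value!r}")
    return value


rows = [{"url": "r1.png", "loc": "en-GB"}, {"url": "r2.png", "loc": "en-IN"}]
print([resolve_locale(r, locale_col="loc") for r in rows])   # per-row values
print([resolve_locale(r, locale="en-US") for r in rows])     # one constant
```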
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setPages(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeReceipts_cb89a48db0fa_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeReceipts_cb89a48db0fa_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesExceededException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesExceededException(value)[source]
- Parameters
suppressMaxRetriesExceededException – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesExceededException = Param(parent='undefined', name='suppressMaxRetriesExceededException', doc='set true to suppress the maximum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AzureSearchWriter module
synapse.ml.cognitive.BingImageSearch module
- class synapse.ml.cognitive.BingImageSearch.BingImageSearch(java_obj=None, aspect=None, aspectCol=None, color=None, colorCol=None, concurrency=1, concurrentTimeout=None, count=None, countCol=None, errorCol='BingImageSearch_a5631379bdbb_error', freshness=None, freshnessCol=None, handler=None, height=None, heightCol=None, imageContent=None, imageContentCol=None, imageType=None, imageTypeCol=None, license=None, licenseCol=None, maxFileSize=None, maxFileSizeCol=None, maxHeight=None, maxHeightCol=None, maxWidth=None, maxWidthCol=None, minFileSize=None, minFileSizeCol=None, minHeight=None, minHeightCol=None, minWidth=None, minWidthCol=None, mkt=None, mktCol=None, offset=None, offsetCol=None, outputCol='BingImageSearch_a5631379bdbb_output', q=None, qCol=None, size=None, sizeCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url='https://api.bing.microsoft.com/v7.0/images/search', width=None, widthCol=None)[source]
Bases:
synapse.ml.cognitive._BingImageSearch._BingImageSearch
synapse.ml.cognitive.BreakSentence module
- class synapse.ml.cognitive.BreakSentence.BreakSentence(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_eda0bcf1ca97_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_eda0bcf1ca97_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
outputCol (str) – The name of the output column
script (object) – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
text (object) – the string to translate
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- Return type
language
- getScript()[source]
- Returns
Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- Return type
script
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- script = Param(parent='undefined', name='script', doc='ServiceParam: Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setLanguageCol(value)[source]
- Parameters
language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_eda0bcf1ca97_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_eda0bcf1ca97_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setScript(value)[source]
- Parameters
script – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- setScriptCol(value)[source]
- Parameters
script – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.ConversationTranscription module
- class synapse.ml.cognitive.ConversationTranscription.ConversationTranscription(java_obj=None, audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
audioDataCol (str) – Column holding audio data, must be either ByteArrays or Strings representing file URIs
endpointId (str) – endpoint for custom speech models
extraFfmpegArgs (list) – extra arguments for ffmpeg output decoding
fileType (object) – The file type of the sound files, supported types: wav, ogg, mp3
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
language (object) – Identifies the spoken language that is being recognized.
outputCol (str) – The name of the output column
participantsJson (object) – a json representation of a list of conversation participants (email, language, user)
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
recordAudioData (bool) – Whether to record audio data to a file location, for use only with m3u8 streams
recordedFileNameCol (str) – Column holding file names to write audio data to if recordAudioData is set to true
streamIntermediateResults (bool) – Whether to return intermediate results immediately, or group them in a sequence
subscriptionKey (object) – the API key to use
url (str) – Url of the service
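The participantsJson parameter above expects a JSON string describing the conversation participants. A minimal sketch of building such a string with the standard library follows. The field names mirror the parameter description (email, language, user), but the exact schema the service expects is an assumption, and the participant values and the helper name to_participants_json are hypothetical.

```python
import json

# Hypothetical participants. The keys follow the parameter description
# (email, language, user); the exact schema is an assumption.
participants = [
    {"email": "alice@example.com", "language": "en-US", "user": "alice"},
    {"email": "bob@example.com", "language": "en-US", "user": "bob"},
]

REQUIRED_FIELDS = {"email", "language", "user"}


def to_participants_json(items):
    """Serialize a list of participant dicts to a JSON string.

    Raises ValueError if any participant lacks one of the assumed fields.
    """
    for item in items:
        missing = REQUIRED_FIELDS - item.keys()
        if missing:
            raise ValueError(f"participant is missing fields: {sorted(missing)}")
    return json.dumps(items)


participants_json = to_participants_json(participants)
```

The resulting string could then be supplied through setParticipantsJson, or stored in a DataFrame column referenced by setParticipantsJsonCol.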
- audioDataCol = Param(parent='undefined', name='audioDataCol', doc='Column holding audio data, must be either ByteArrays or Strings representing file URIs')
- endpointId = Param(parent='undefined', name='endpointId', doc='endpoint for custom speech models')
- extraFfmpegArgs = Param(parent='undefined', name='extraFfmpegArgs', doc='extra arguments to for ffmpeg output decoding')
- fileType = Param(parent='undefined', name='fileType', doc='ServiceParam: The file type of the sound files, supported types: wav, ogg, mp3')
- format = Param(parent='undefined', name='format', doc='ServiceParam: Specifies the result format. Accepted values are simple and detailed. Default is simple. ')
- getAudioDataCol()[source]
- Returns
Column holding audio data, must be either ByteArrays or Strings representing file URIs
- Return type
audioDataCol
- getExtraFfmpegArgs()[source]
- Returns
extra arguments for ffmpeg output decoding
- Return type
extraFfmpegArgs
- getFileType()[source]
- Returns
The file type of the sound files, supported types: wav, ogg, mp3
- Return type
fileType
- getFormat()[source]
- Returns
Specifies the result format. Accepted values are simple and detailed. Default is simple.
- Return type
format
- getLanguage()[source]
- Returns
Identifies the spoken language that is being recognized.
- Return type
language
- getParticipantsJson()[source]
- Returns
a json representation of a list of conversation participants (email, language, user)
- Return type
participantsJson
- getProfanity()[source]
- Returns
Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- Return type
profanity
- getRecordAudioData()[source]
- Returns
Whether to record audio data to a file location, for use only with m3u8 streams
- Return type
recordAudioData
- getRecordedFileNameCol()[source]
- Returns
Column holding file names to write audio data to if recordAudioData is set to true
- Return type
recordedFileNameCol
- getStreamIntermediateResults()[source]
- Returns
Whether to return intermediate results immediately, or group them in a sequence
- Return type
streamIntermediateResults
- language = Param(parent='undefined', name='language', doc='ServiceParam: Identifies the spoken language that is being recognized. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- participantsJson = Param(parent='undefined', name='participantsJson', doc='ServiceParam: a json representation of a list of conversation participants (email, language, user)')
- profanity = Param(parent='undefined', name='profanity', doc='ServiceParam: Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which remove all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked. ')
- recordAudioData = Param(parent='undefined', name='recordAudioData', doc='Whether to record audio data to a file location, for use only with m3u8 streams')
- recordedFileNameCol = Param(parent='undefined', name='recordedFileNameCol', doc="Column holding file names to write audio data to if ``recordAudioData'' is set to true")
- setAudioDataCol(value)[source]
- Parameters
audioDataCol – Column holding audio data, must be either ByteArrays or Strings representing file URIs
- setExtraFfmpegArgs(value)[source]
- Parameters
extraFfmpegArgs – extra arguments for ffmpeg output decoding
- setFileType(value)[source]
- Parameters
fileType – The file type of the sound files, supported types: wav, ogg, mp3
- setFileTypeCol(value)[source]
- Parameters
fileType – The file type of the sound files, supported types: wav, ogg, mp3
- setFormat(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setFormatCol(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setLanguage(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setLanguageCol(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setParams(audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Set the (keyword only) parameters
- setParticipantsJson(value)[source]
- Parameters
participantsJson – a json representation of a list of conversation participants (email, language, user)
- setParticipantsJsonCol(value)[source]
- Parameters
participantsJson – a json representation of a list of conversation participants (email, language, user)
- setProfanity(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setProfanityCol(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setRecordAudioData(value)[source]
- Parameters
recordAudioData – Whether to record audio data to a file location, for use only with m3u8 streams
- setRecordedFileNameCol(value)[source]
- Parameters
recordedFileNameCol – Column holding file names to write audio data to if recordAudioData is set to true
- setStreamIntermediateResults(value)[source]
- Parameters
streamIntermediateResults – Whether to return intermediate results immediately, or group them in a sequence
- streamIntermediateResults = Param(parent='undefined', name='streamIntermediateResults', doc='Whether or not to immediately return itermediate results, or group in a sequence')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.DescribeImage module
- class synapse.ml.cognitive.DescribeImage.DescribeImage(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DescribeImage_cac8d6bc9fcc_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxCandidates=None, maxCandidatesCol=None, outputCol='DescribeImage_cac8d6bc9fcc_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – Language of image description
maxCandidates (object) – Maximum candidate descriptions to return
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getMaxCandidates()[source]
- Returns
Maximum candidate descriptions to return
- Return type
maxCandidates
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Language of image description')
- maxCandidates = Param(parent='undefined', name='maxCandidates', doc='ServiceParam: Maximum candidate descriptions to return')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setMaxCandidates(value)[source]
- Parameters
maxCandidates – Maximum candidate descriptions to return
- setMaxCandidatesCol(value)[source]
- Parameters
maxCandidates – Maximum candidate descriptions to return
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DescribeImage_cac8d6bc9fcc_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxCandidates=None, maxCandidatesCol=None, outputCol='DescribeImage_cac8d6bc9fcc_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.Detect module
- class synapse.ml.cognitive.Detect.Detect(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='Detect_852ec33b4f2f_error', handler=None, outputCol='Detect_852ec33b4f2f_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
outputCol (str) – The name of the output column
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
text (object) – the string to translate
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='Detect_852ec33b4f2f_error', handler=None, outputCol='Detect_852ec33b4f2f_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.DetectAnomalies module
- class synapse.ml.cognitive.DetectAnomalies.DetectAnomalies(java_obj=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectAnomalies_bf669301e314_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectAnomalies_bf669301e314_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval (object) – Custom interval used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
errorCol (str) – column to hold http errors
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler (object) – Which strategy to use when handling requests
imputeFixedValue (object) – Optional argument, fixed value to use when imputeMode is set to “fixed”
imputeMode (object) – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (str) – The name of the output column
period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
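The series parameter above requires points sorted by ascending timestamp with no duplicated timestamps. A minimal sketch of preparing such a list before handing it to the transformer is shown below. The point layout (dicts with "timestamp" and "value" keys), the timestamp format, and the helper name prepare_series are assumptions made for illustration, not part of the documented API.

```python
from datetime import datetime

# Assumed input format for this sketch (ISO 8601, UTC, second precision).
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def prepare_series(points):
    """Return the points sorted by timestamp, raising on duplicated timestamps."""
    keyed = [(datetime.strptime(p["timestamp"], TIMESTAMP_FORMAT), p) for p in points]
    keyed.sort(key=lambda pair: pair[0])
    # After sorting, any duplicated timestamps are adjacent.
    for (prev_ts, _), (ts, _) in zip(keyed, keyed[1:]):
        if prev_ts == ts:
            raise ValueError(f"duplicated timestamp: {ts.isoformat()}")
    return [p for _, p in keyed]


# Hypothetical, unsorted sample data.
raw_points = [
    {"timestamp": "2021-01-03T00:00:00Z", "value": 3.0},
    {"timestamp": "2021-01-01T00:00:00Z", "value": 1.0},
    {"timestamp": "2021-01-02T00:00:00Z", "value": 2.0},
]
series = prepare_series(raw_points)
```

Checking the data locally like this surfaces ordering problems before a request is sent, rather than as an error message from the service.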
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc='ServiceParam: Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomInterval()[source]
- Returns
Custom interval used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getImputeFixedValue()[source]
- Returns
Optional argument, fixed value to use when imputeMode is set to “fixed”
- Return type
imputeFixedValue
- getImputeMode()[source]
- Returns
Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- Return type
imputeMode
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- granularity = Param(parent='undefined', name='granularity', doc='ServiceParam: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imputeFixedValue = Param(parent='undefined', name='imputeFixedValue', doc='ServiceParam: Optional argument, fixed value to use when imputeMode is set to "fixed" ')
- imputeMode = Param(parent='undefined', name='imputeMode', doc='ServiceParam: Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill ')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc='ServiceParam: Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc='ServiceParam: Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc='ServiceParam: Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc='ServiceParam: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomInterval(value)[source]
- Parameters
customInterval – Custom interval used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval – Custom interval used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setImputeFixedValue(value)[source]
- Parameters
imputeFixedValue – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeFixedValueCol(value)[source]
- Parameters
imputeFixedValue – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeMode(value)[source]
- Parameters
imputeMode – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setImputeModeCol(value)[source]
- Parameters
imputeMode – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectAnomalies_bf669301e314_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectAnomalies_bf669301e314_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setPeriodCol(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- setSensitivityCol(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- setSeries(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
- setSeriesCol(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
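Several DetectAnomalies parameters documented above accept only a fixed set of values (granularity, imputeMode) or a bounded range (sensitivity). The sketch below checks those constraints in plain Python before any request is made. The helper name check_anomaly_options is hypothetical, and treating imputeFixedValue as required when imputeMode is "fixed" is an assumption drawn from the parameter description rather than confirmed service behavior.

```python
# Allowed values as listed in the parameter descriptions.
GRANULARITIES = {"yearly", "monthly", "weekly", "daily", "hourly", "minutely"}
IMPUTE_MODES = {"auto", "previous", "linear", "fixed", "zero", "notFill"}


def check_anomaly_options(granularity, impute_mode=None,
                          impute_fixed_value=None, sensitivity=None):
    """Validate option values and return the ones that were set, keyed by parameter name."""
    if granularity not in GRANULARITIES:
        raise ValueError(f"unsupported granularity: {granularity!r}")
    if impute_mode is not None and impute_mode not in IMPUTE_MODES:
        raise ValueError(f"unsupported imputeMode: {impute_mode!r}")
    # Assumption: a fixed value must accompany imputeMode="fixed".
    if impute_mode == "fixed" and impute_fixed_value is None:
        raise ValueError("imputeFixedValue is needed when imputeMode is 'fixed'")
    if sensitivity is not None and not 0 <= sensitivity <= 99:
        raise ValueError("sensitivity must be between 0 and 99")
    options = {
        "granularity": granularity,
        "imputeMode": impute_mode,
        "imputeFixedValue": impute_fixed_value,
        "sensitivity": sensitivity,
    }
    # Drop options that were left unset.
    return {name: value for name, value in options.items() if value is not None}


options = check_anomaly_options("minutely", sensitivity=95)
```

The returned dictionary uses the documented parameter names, so it could be expanded into a setParams call as keyword arguments.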
synapse.ml.cognitive.DetectFace module
- class synapse.ml.cognitive.DetectFace.DetectFace(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DetectFace_ec61c3846cd1_error', handler=None, imageUrl=None, imageUrlCol=None, outputCol='DetectFace_ec61c3846cd1_output', returnFaceAttributes=None, returnFaceAttributesCol=None, returnFaceId=None, returnFaceIdCol=None, returnFaceLandmarks=None, returnFaceLandmarksCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (str) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageUrl (object) – the url of the image to use
outputCol (str) – The name of the output column
returnFaceAttributes (object) – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
returnFaceId (object) – Whether to return faceIds of the detected faces. The default value is true.
returnFaceLandmarks (object) – Return face landmarks of the detected faces or not. The default value is false.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
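The returnFaceAttributes parameter above accepts only the listed attribute names. A minimal sketch of validating a requested list against that set follows. The helper name check_face_attributes is hypothetical, and whether the transformer expects a list or another representation of these names is not stated here, so the return type is an assumption.

```python
# Attribute names as listed in the returnFaceAttributes description.
SUPPORTED_FACE_ATTRIBUTES = (
    "age", "gender", "headPose", "smile", "facialHair", "glasses", "emotion",
    "hair", "makeup", "occlusion", "accessories", "blur", "exposure", "noise",
)


def check_face_attributes(requested):
    """Return the requested attributes without duplicates, in their original order.

    Raises ValueError if any name is not in the supported list.
    """
    unknown = [name for name in requested if name not in SUPPORTED_FACE_ATTRIBUTES]
    if unknown:
        raise ValueError(f"unsupported face attributes: {unknown}")
    # dict.fromkeys keeps first-seen order while dropping repeats.
    return list(dict.fromkeys(requested))


attributes = check_face_attributes(["age", "smile", "age", "emotion"])
```

Because attribute analysis adds computational and time cost, requesting only the attributes that are actually needed keeps calls cheaper.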
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getReturnFaceAttributes()[source]
- Returns
Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- Return type
returnFaceAttributes
- getReturnFaceId()[source]
- Returns
Whether to return faceIds of the detected faces. The default value is true.
- Return type
returnFaceId
- getReturnFaceLandmarks()[source]
- Returns
Return face landmarks of the detected faces or not. The default value is false.
- Return type
returnFaceLandmarks
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- returnFaceAttributes = Param(parent='undefined', name='returnFaceAttributes', doc='ServiceParam: Analyze and return the one or more specified face attributes Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.')
- returnFaceId = Param(parent='undefined', name='returnFaceId', doc='ServiceParam: Return faceIds of the detected faces or not. The default value is true')
- returnFaceLandmarks = Param(parent='undefined', name='returnFaceLandmarks', doc='ServiceParam: Return face landmarks of the detected faces or not. The default value is false.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DetectFace_ec61c3846cd1_error', handler=None, imageUrl=None, imageUrlCol=None, outputCol='DetectFace_ec61c3846cd1_output', returnFaceAttributes=None, returnFaceAttributesCol=None, returnFaceId=None, returnFaceIdCol=None, returnFaceLandmarks=None, returnFaceLandmarksCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setReturnFaceAttributes(value)[source]
- Parameters
returnFaceAttributes – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- setReturnFaceAttributesCol(value)[source]
- Parameters
returnFaceAttributes – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- setReturnFaceId(value)[source]
- Parameters
returnFaceId – Whether to return faceIds of the detected faces. The default value is true.
- setReturnFaceIdCol(value)[source]
- Parameters
returnFaceId – Whether to return faceIds of the detected faces. The default value is true.
- setReturnFaceLandmarks(value)[source]
- Parameters
returnFaceLandmarks – Return face landmarks of the detected faces or not. The default value is false.
- setReturnFaceLandmarksCol(value)[source]
- Parameters
returnFaceLandmarks – Return face landmarks of the detected faces or not. The default value is false.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.DetectLastAnomaly module
- class synapse.ml.cognitive.DetectLastAnomaly.DetectLastAnomaly(java_obj=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectLastAnomaly_49ae8cd4adb7_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectLastAnomaly_49ae8cd4adb7_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval (object) – Sets a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
errorCol (str) – column to hold http errors
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler (object) – Which strategy to use when handling requests
imputeFixedValue (object) – Optional argument, fixed value to use when imputeMode is set to “fixed”
imputeMode (object) – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (str) – The name of the output column
period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (str) – Url of the service
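As a rough illustration of how the constructor parameters above fit together, the following sketch collects keyword arguments and passes them to the class. The helper names `detector_kwargs` and `build_detector`, the column names `series`, `anomaly` and `anomaly_error`, and the placeholder key and endpoint are all assumptions made for this example; only the keyword names come from the signature above.

```python
def detector_kwargs(key, url, series_col="series", granularity="minutely",
                    custom_interval=None):
    """Collect DetectLastAnomaly keyword arguments, dropping unset optional ones.

    Hypothetical helper: the keyword names match the constructor signature,
    the column names are placeholders chosen for this sketch.
    """
    kwargs = {
        "subscriptionKey": key,
        "url": url,
        "seriesCol": series_col,            # column holding the time series points
        "granularity": granularity,
        "customInterval": custom_interval,  # only sent when not None
        "outputCol": "anomaly",
        "errorCol": "anomaly_error",
        "concurrency": 1,
        "timeout": 60.0,
    }
    return {name: value for name, value in kwargs.items() if value is not None}


def build_detector(**kwargs):
    """Instantiate the transformer; requires SynapseML and a running Spark session."""
    # Imported lazily so the helper above can be used without Spark installed.
    from synapse.ml.cognitive.DetectLastAnomaly import DetectLastAnomaly
    return DetectLastAnomaly(**kwargs)
```

With a Spark session available, something like `build_detector(**detector_kwargs("<your-key>", "<your-endpoint-url>")).transform(df)` would then be applied to a DataFrame `df` that carries the series column.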
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc='ServiceParam: Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomInterval()[source]
- Returns
Sets a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- Return type
customInterval
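The relationship between granularity and customInterval can be sketched with a small helper that turns a sampling interval in minutes into the pair of values. The function name `interval_to_request` is hypothetical, and the hourly and daily branches are an assumption made by analogy with the documented 5-minute case.

```python
def interval_to_request(minutes):
    """Map a sampling interval in minutes to a (granularity, customInterval) pair.

    Assumption: intervals that are whole hours or whole days are expressed in
    the larger unit, by analogy with the documented minutely example.
    """
    if minutes <= 0:
        raise ValueError("interval must be a positive number of minutes")
    if minutes % 1440 == 0:
        return ("daily", minutes // 1440)
    if minutes % 60 == 0:
        return ("hourly", minutes // 60)
    return ("minutely", minutes)
```

For the documented case, `interval_to_request(5)` yields `("minutely", 5)`.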
- getGranularity()[source]
- Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getImputeFixedValue()[source]
- Returns
Optional argument, fixed value to use when imputeMode is set to “fixed”
- Return type
imputeFixedValue
- getImputeMode()[source]
- Returns
Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- Return type
imputeMode
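Because imputeFixedValue only applies when imputeMode is "fixed", it can help to check the two together before configuring the transformer. The sketch below is one way to do that; `impute_settings` is a hypothetical helper, and treating a missing fixed value as an error is a choice made for this example rather than documented service behaviour.

```python
# Possible values as listed in the imputeMode description.
IMPUTE_MODES = {"auto", "previous", "linear", "fixed", "zero", "notFill"}


def impute_settings(mode, fixed_value=None):
    """Return the impute-related keyword arguments for a given mode."""
    if mode not in IMPUTE_MODES:
        raise ValueError(f"unknown imputeMode: {mode!r}")
    settings = {"imputeMode": mode}
    if mode == "fixed":
        # Assumption: refuse to build a "fixed" configuration without a value.
        if fixed_value is None:
            raise ValueError("imputeFixedValue is needed when imputeMode is 'fixed'")
        settings["imputeFixedValue"] = fixed_value
    return settings
```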
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work and an error message will be returned.
- Return type
series
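The ordering and uniqueness requirements on series can be enforced before the data reaches the service. The sketch below sorts points by timestamp and rejects duplicates. The helper name `prepare_series`, the `timestamp`/`value` field names, and the timestamp layout in `TS_FORMAT` are assumptions made for this example.

```python
from datetime import datetime

# Assumed timestamp layout for this sketch, e.g. "2021-01-01T00:00:00Z".
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def prepare_series(points):
    """Sort (timestamp, value) pairs ascending and reject duplicate timestamps."""
    keyed = sorted(
        (datetime.strptime(ts, TS_FORMAT), ts, float(value)) for ts, value in points
    )
    # After sorting, duplicates are adjacent, so one pass over neighbours suffices.
    for (previous, _, _), (current, _, _) in zip(keyed, keyed[1:]):
        if previous == current:
            raise ValueError(f"duplicated timestamp: {current.strftime(TS_FORMAT)}")
    return [{"timestamp": ts, "value": value} for _, ts, value in keyed]
```

Raising locally on a duplicate surfaces the problem before any request is made, instead of reading it back from the error column afterwards.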
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- granularity = Param(parent='undefined', name='granularity', doc='ServiceParam: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imputeFixedValue = Param(parent='undefined', name='imputeFixedValue', doc='ServiceParam: Optional argument, fixed value to use when imputeMode is set to "fixed" ')
- imputeMode = Param(parent='undefined', name='imputeMode', doc='ServiceParam: Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill ')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc='ServiceParam: Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc='ServiceParam: Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc='ServiceParam: Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc='ServiceParam: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')