synapse.ml.cognitive package
Submodules
synapse.ml.cognitive.AddDocuments module
- class synapse.ml.cognitive.AddDocuments.AddDocuments(java_obj=None, actionCol='@search.action', batchSize=100, concurrency=1, concurrentTimeout=None, errorCol='AddDocuments_0798f12e5127_error', handler=None, indexName=None, outputCol='AddDocuments_0798f12e5127_output', serviceName=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
actionCol (str) – You can combine actions, such as an upload and a delete, in the same batch. Supported actions:
  - upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
  - merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
  - mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
  - delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
handler (object) – Which strategy to use when handling requests
timeout (float) – number of seconds to wait before closing the connection
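The four action types described for actionCol can be illustrated with a small in-memory model. This is only a sketch of the documented semantics, not the service's or the library's implementation: the function name `apply_batch`, the dict-based index, and the key field name `id` are assumptions made for illustration.

```python
from typing import Any, Dict, List


def apply_batch(
    index: Dict[str, Dict[str, Any]],
    batch: List[Dict[str, Any]],
    key_field: str = "id",                # hypothetical key field name
    action_col: str = "@search.action",   # default actionCol value
) -> List[bool]:
    """Apply a batch of actions to an in-memory index.

    Returns one success flag per document in the batch.
    """
    results = []
    for row in batch:
        action = row[action_col]
        key = row[key_field]
        # Everything except the action marker is document content.
        fields = {k: v for k, v in row.items() if k != action_col}
        exists = key in index
        if action == "upload":
            # Upsert: the whole document is replaced if it exists.
            index[key] = fields
            results.append(True)
        elif action == "merge":
            # Only the given fields are replaced; fails if the document is missing.
            if exists:
                index[key].update(fields)
            results.append(exists)
        elif action == "mergeOrUpload":
            if exists:
                index[key].update(fields)
            else:
                index[key] = fields
            results.append(True)
        elif action == "delete":
            # Fields other than the key are ignored.
            index.pop(key, None)
            results.append(True)
        else:
            raise ValueError(f"unknown action: {action!r}")
    return results
```

Merging `{'tags': ['economy', 'pool']}` into a document whose `tags` field is `['budget']` leaves `['economy', 'pool']`, because a merge replaces the field rather than appending to the collection.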
- actionCol = Param(parent='undefined', name='actionCol', doc=" You can combine actions, such as an upload and a delete, in the same batch. upload: An upload action is similar to an 'upsert' where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case. merge: Merge updates an existing document with the specified fields. If the document doesn't exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field 'tags' with value ['budget'] and you execute a merge with value ['economy', 'pool'] for 'tags', the final value of the 'tags' field will be ['economy', 'pool']. It will not be ['budget', 'economy', 'pool']. mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document. delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null. ")
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getActionCol()[source]
- Returns
You can combine actions, such as an upload and a delete, in the same batch. Supported actions:
  - upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
  - merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
  - mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
  - delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
- Return type
actionCol
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- indexName = Param(parent='undefined', name='indexName', doc='')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- serviceName = Param(parent='undefined', name='serviceName', doc='')
- setActionCol(value)[source]
- Parameters
actionCol – You can combine actions, such as an upload and a delete, in the same batch. Supported actions:
  - upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
  - merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
  - mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
  - delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(actionCol='@search.action', batchSize=100, concurrency=1, concurrentTimeout=None, errorCol='AddDocuments_0798f12e5127_error', handler=None, indexName=None, outputCol='AddDocuments_0798f12e5127_output', serviceName=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
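The batchSize parameter is documented only as "The max size of the buffer". A minimal sketch of what fixed-size buffering of rows could look like is shown below; the helper name `fixed_size_batches` is hypothetical and the code is not taken from the library.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def fixed_size_batches(rows: Iterable[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Group rows into lists of at most batch_size items, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    buffer: List[T] = []
    for row in rows:
        buffer.append(row)
        if len(buffer) == batch_size:
            # Buffer is full: emit it and start a new one.
            yield buffer
            buffer = []
    if buffer:
        # Emit the final, possibly smaller, batch.
        yield buffer
```

With the default of 100, an input of 250 rows would be sent as batches of 100, 100, and 50 rows under this model.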
synapse.ml.cognitive.AnalyzeBusinessCards module
- class synapse.ml.cognitive.AnalyzeBusinessCards.AnalyzeBusinessCards(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeBusinessCards_bda4d0726f15_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeBusinessCards_bda4d0726f15_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
pages (object) – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
pollingDelay (int) – number of milliseconds to wait between polling
suppressMaxRetriesException (bool) – set true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
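The page selection syntax described for pages can be made concrete with a small parser. This is an illustrative sketch of the documented rules only; the service performs the actual parsing, and the function name `select_pages` and its `page_count` argument are assumptions.

```python
from typing import List


def select_pages(selection: str, page_count: int) -> List[int]:
    """Expand a page selection string into a sorted list of page numbers."""
    if not selection.strip():
        # No page range provided: the entire document is processed.
        return list(range(1, page_count + 1))
    pages = set()
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An open start ('-10') begins at page 1; an open end ('5-') runs to the last page.
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        # Keep only pages that exist in the document.
        pages.update(range(max(start, 1), min(end, page_count) + 1))
    if not pages:
        raise ValueError("selection does not match any page of the document")
    return sorted(pages)
```

Under this model, `select_pages('-5, 1, 3, 5-10', 12)` covers pages 1 to 10, and `select_pages('5-100', 5)` covers page 5 only.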
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getPages()[source]
- Returns
The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. '1, 2' -> pages 1 and 2 will be processed), finite ranges (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. '-5, 1, 3, 5-10' -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setPages(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setPagesCol(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeBusinessCards_bda4d0726f15_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeBusinessCards_bda4d0726f15_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException – set true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maximum retries exception and report it in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
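The backoffs parameter is described as an "array of backoffs to use in the handler". One plausible reading, sketched below, is that a failed request is retried after each listed delay in turn. This is an assumption rather than the library's documented behaviour: the helper name `call_with_backoffs` is hypothetical, the delays are assumed to be milliseconds, and this sketch retries on any exception whereas a real handler would more likely look at HTTP status codes.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def call_with_backoffs(
    request: Callable[[], T],
    backoffs_ms: Sequence[int] = (100, 500, 1000),  # assumed to be milliseconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request once, then retry after each backoff until it succeeds.

    The last error is re-raised when all backoffs are used up.
    """
    delays = list(backoffs_ms)
    while True:
        try:
            return request()
        except Exception:
            if not delays:
                # No backoffs left: give up and surface the error.
                raise
            sleep(delays.pop(0) / 1000.0)
```

With the default list, a request is attempted at most four times: once immediately and once after each of the three delays.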
synapse.ml.cognitive.AnalyzeCustomModel module
- class synapse.ml.cognitive.AnalyzeCustomModel.AnalyzeCustomModel(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeCustomModel_a61c6e744e67_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelId=None, modelIdCol=None, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeCustomModel_a61c6e744e67_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
includeTextDetails (object) – Include text lines and element references in the result.
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
pollingDelay (int) – number of milliseconds to wait between polling
suppressMaxRetriesException (bool) – set true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
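The polling parameters (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesException) suggest a loop along the following lines. This is a sketch of how the parameters could interact, not the library's implementation; `poll_for_result`, `fetch_status`, and `MaxRetriesExceeded` are hypothetical names, and the returned `(output, error)` pair only mimics the output and error columns.

```python
import time
from typing import Any, Callable, Optional, Tuple


class MaxRetriesExceeded(RuntimeError):
    """Raised when no result is available after the allowed number of polls."""


def poll_for_result(
    fetch_status: Callable[[], Optional[Any]],  # returns None while the job is still running
    initial_polling_delay_ms: int = 300,
    polling_delay_ms: int = 300,
    max_polling_retries: int = 1000,
    suppress_max_retries_exception: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Optional[Any], Optional[str]]:
    """Poll until a result is available; return (output, error)."""
    # Wait before the first poll.
    sleep(initial_polling_delay_ms / 1000.0)
    for attempt in range(max_polling_retries):
        result = fetch_status()
        if result is not None:
            return result, None
        if attempt < max_polling_retries - 1:
            # Wait between consecutive polls.
            sleep(polling_delay_ms / 1000.0)
    message = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exception:
        # Report the failure as an error value instead of raising.
        return None, message
    raise MaxRetriesExceeded(message)
```

In this model, setting the suppress flag turns an exhausted polling budget into an error value for that row instead of an exception.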
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelId = Param(parent='undefined', name='modelId', doc='ServiceParam: Model identifier.')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeCustomModel_a61c6e744e67_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelId=None, modelIdCol=None, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeCustomModel_a61c6e744e67_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException – set true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maximum retries exception and report it in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeDocument module
- class synapse.ml.cognitive.AnalyzeDocument.AnalyzeDocument(java_obj=None, apiVersion=None, apiVersionCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeDocument_fbb2dd3b421a_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeDocument_fbb2dd3b421a_output', pages=None, pagesCol=None, pollingDelay=300, prebuiltModelId=None, prebuiltModelIdCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
initialPollingDelay (int) – number of milliseconds to wait before first poll for result
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
pages (object) – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
pollingDelay (int) – number of milliseconds to wait between polling
prebuiltModelId (object) – Prebuilt Model identifier for Form Recognizer V3.0. Supported modelId values: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom modelId
stringIndexType (object) – Method used to compute string offset and length.
suppressMaxRetriesException (bool) – set true to suppress the maximum retries exception and report it in the error column
timeout (float) – number of seconds to wait before closing the connection
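The stringIndexType parameter matters because the same text has different offsets and lengths depending on the unit used to count. The sketch below compares two counting methods in plain Python. The value names `unicodeCodePoint` and `utf16CodeUnit` are assumed here from the Azure service documentation and are not stated on this page; the helper names are hypothetical, and grapheme-based counting is left out because it needs a third-party library.

```python
def length_in_units(text: str, string_index_type: str) -> int:
    """Length of text measured in the unit named by string_index_type."""
    if string_index_type == "unicodeCodePoint":
        # Python strings are sequences of Unicode code points.
        return len(text)
    if string_index_type == "utf16CodeUnit":
        # Each UTF-16 code unit is two bytes; characters outside the BMP take two units.
        return len(text.encode("utf-16-le")) // 2
    raise ValueError(f"unsupported string index type: {string_index_type!r}")


def offset_in_units(text: str, char_index: int, string_index_type: str) -> int:
    """Offset of the character at char_index (a code point index) in the chosen unit."""
    return length_in_units(text[:char_index], string_index_type)
```

For a string such as `'a😀b'`, the offset of `'b'` is 2 in code points but 3 in UTF-16 code units, so offsets reported by the service need to be interpreted with the same unit that was requested.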
- apiVersion = Param(parent='undefined', name='apiVersion', doc='ServiceParam: version of the api')
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getPages()[source]
- Returns
The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getPrebuiltModelId()[source]
- Returns
Prebuilt Model identifier for Form Recognizer V3.0. Supported modelId values: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom modelId
- Return type
prebuiltModelId
- getStringIndexType()[source]
- Returns
Method used to compute string offset and length.
- Return type
stringIndexType
- getSuppressMaxRetriesException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. '1, 2' -> pages 1 and 2 will be processed), finite ranges (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. '-5, 1, 3, 5-10' -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- prebuiltModelId = Param(parent='undefined', name='prebuiltModelId', doc='ServiceParam: Prebuilt Model identifier for Form Recognizer V3.0. Supported modelId values: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom modelId')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setPages(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setPagesCol(value)[source]
- Parameters
pages – The page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all the pages from page 5 will be processed; e.g. ‘-10’ -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using ‘5-100’ on a 5-page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.
- setParams(apiVersion=None, apiVersionCol=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeDocument_fbb2dd3b421a_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeDocument_fbb2dd3b421a_output', pages=None, pagesCol=None, pollingDelay=300, prebuiltModelId=None, prebuiltModelIdCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setPrebuiltModelId(value)[source]
- Parameters
prebuiltModelId¶ – Prebuilt model identifier for Form Recognizer V3.0. Supported model IDs: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom model ID.
- setPrebuiltModelIdCol(value)[source]
- Parameters
prebuiltModelId¶ – Prebuilt model identifier for Form Recognizer V3.0. Supported model IDs: prebuilt-read, prebuilt-layout, prebuilt-document, prebuilt-businessCard, prebuilt-idDocument, prebuilt-invoice, prebuilt-receipt, or your custom model ID.
- setStringIndexType(value)[source]
- Parameters
stringIndexType¶ – Method used to compute string offset and length.
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType¶ – Method used to compute string offset and length.
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Method used to compute string offset and length.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
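The page-selection syntax described for pages above can be illustrated with a small parser. This is only a sketch of the documented semantics in plain Python: expand_pages and its page_count argument are hypothetical names, and the service performs its own parsing and validation.

```python
from typing import List


def expand_pages(selection: str, page_count: int) -> List[int]:
    """Expand a selection such as '-5, 1, 3, 5-10' into sorted page numbers.

    Hypothetical helper for illustration only; not part of synapse.ml.
    """
    if not selection.strip():
        # No page range provided: the entire document is processed.
        return list(range(1, page_count + 1))

    pages = set()
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An empty side makes the range open-ended ('-10' or '5-').
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        # Keep only the pages that actually exist in the document.
        pages.update(range(max(start, 1), min(end, page_count) + 1))

    if not pages:
        raise ValueError("selection does not match any page in the document")
    return sorted(pages)


print(expand_pages("-5, 1, 3, 5-10", page_count=12))  # pages 1 through 10
print(expand_pages("5-100", page_count=5))            # [5]
```

Overlapping ranges collapse because the pages are collected in a set, which mirrors the statement that ranges are allowed to overlap.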
synapse.ml.cognitive.AnalyzeHealthText module
- class synapse.ml.cognitive.AnalyzeHealthText.AnalyzeHealthText(java_obj=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='AnalyzeHealthText_84b69b2be382_error', initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeHealthText_84b69b2be382_output', pollingDelay=300, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
language¶ (object) – the language code of the text (optional for some services)
pollingDelay¶ (int) – number of milliseconds to wait between polling
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout¶ (float) – number of seconds to wait before closing the connection
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setLanguage(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setParams(backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='AnalyzeHealthText_84b69b2be382_error', initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeHealthText_84b69b2be382_output', pollingDelay=300, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setShowStats(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
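The stringIndexType parameter matters because the same text has different lengths and offsets depending on the unit used to count it. The sketch below contrasts two common units, Unicode code points and UTF-16 code units, using only the Python standard library. The helper names are hypothetical, the accepted stringIndexType values are not listed in this section, and the documented default (text elements, i.e. graphemes) needs a segmentation library and is not shown.

```python
def length_in_code_points(text: str) -> int:
    # A Python str is indexed by Unicode code point.
    return len(text)


def length_in_utf16_units(text: str) -> int:
    # Each UTF-16 code unit occupies two bytes; characters outside the
    # Basic Multilingual Plane take two units (a surrogate pair).
    return len(text.encode("utf-16-le")) // 2


def utf16_offset(text: str, code_point_offset: int) -> int:
    # Convert an offset counted in code points to one counted in UTF-16 units.
    return length_in_utf16_units(text[:code_point_offset])


sample = "fever \U0001F912"  # the last character lies outside the BMP
print(length_in_code_points(sample))  # 7
print(length_in_utf16_units(sample))  # 8

# The word 'fever' starts at code point 2 but at UTF-16 unit 3.
print(utf16_offset("\U0001F912 fever", 2))  # 3
```

Offsets returned by a service should be interpreted with the same unit that was requested, otherwise spans drift after any character that occupies more than one unit.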
synapse.ml.cognitive.AnalyzeIDDocuments module
- class synapse.ml.cognitive.AnalyzeIDDocuments.AnalyzeIDDocuments(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeIDDocuments_4f6d536ba804_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeIDDocuments_4f6d536ba804_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
includeTextDetails¶ (object) – Include text lines and element references in the result.
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
pages¶ (object) – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These forms can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay¶ (int) – number of milliseconds to wait between polling
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout¶ (float) – number of seconds to wait before closing the connection
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getPages()[source]
- Returns
Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These forms can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails¶ – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails¶ – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setPages(value)[source]
- Parameters
pages¶ – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These forms can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages¶ – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These forms can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeIDDocuments_4f6d536ba804_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeIDDocuments_4f6d536ba804_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
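The polling parameters above (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesException) describe a wait-then-poll loop. The following is a minimal sketch of one plausible reading of those descriptions, not the library's implementation: poll_for_result, its check callback, and the injectable sleep are hypothetical, and the exact ordering of waits inside the transformer is an assumption.

```python
import time
from typing import Callable, Optional


def poll_for_result(
    check: Callable[[], Optional[str]],
    initial_polling_delay_ms: int = 300,
    polling_delay_ms: int = 300,
    max_polling_retries: int = 1000,
    suppress_max_retries_exception: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll check() until it returns a result or the retry budget runs out."""
    # Wait once before the first poll.
    sleep(initial_polling_delay_ms / 1000.0)
    for attempt in range(max_polling_retries):
        result = check()
        if result is not None:
            return result
        if attempt < max_polling_retries - 1:
            # Wait between consecutive polls.
            sleep(polling_delay_ms / 1000.0)
    if suppress_max_retries_exception:
        # Assumed behaviour: return nothing so the caller can record an error.
        return None
    raise TimeoutError(f"no result after {max_polling_retries} polls")


# Simulated long-running operation that finishes on the third poll.
answers = iter([None, None, "succeeded"])
print(poll_for_result(lambda: next(answers),
                      initial_polling_delay_ms=1,
                      polling_delay_ms=1))  # succeeded
```

With the documented defaults (300 ms delays, 1000 retries), a loop of this shape would wait roughly five minutes before giving up.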
synapse.ml.cognitive.AnalyzeImage module
- class synapse.ml.cognitive.AnalyzeImage.AnalyzeImage(java_obj=None, concurrency=1, concurrentTimeout=None, descriptionExclude=None, descriptionExcludeCol=None, details=None, detailsCol=None, errorCol='AnalyzeImage_389262bb50bd_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='AnalyzeImage_389262bb50bd_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, visualFeatures=None, visualFeaturesCol=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
descriptionExclude¶ (object) – Whether to exclude certain parts of the model in the description
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language of the response (en if none given)
timeout¶ (float) – number of seconds to wait before closing the connection
visualFeatures¶ (object) – what visual feature types to return
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- descriptionExclude = Param(parent='undefined', name='descriptionExclude', doc='ServiceParam: Whether to exclude certain parts of the model in the description')
- details = Param(parent='undefined', name='details', doc='ServiceParam: what visual feature types to return')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getDescriptionExclude()[source]
- Returns
Whether to exclude certain parts of the model in the description
- Return type
descriptionExclude
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language of the response (en if none given)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setDescriptionExclude(value)[source]
- Parameters
descriptionExclude¶ – Whether to exclude certain parts of the model in the description
- setDescriptionExcludeCol(value)[source]
- Parameters
descriptionExclude¶ – Whether to exclude certain parts of the model in the description
- setLanguageCol(value)[source]
- Parameters
language¶ – the language of the response (en if none given)
- setParams(concurrency=1, concurrentTimeout=None, descriptionExclude=None, descriptionExcludeCol=None, details=None, detailsCol=None, errorCol='AnalyzeImage_389262bb50bd_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='AnalyzeImage_389262bb50bd_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, visualFeatures=None, visualFeaturesCol=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- setVisualFeaturesCol(value)[source]
- Parameters
visualFeatures¶ – what visual feature types to return
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- visualFeatures = Param(parent='undefined', name='visualFeatures', doc='ServiceParam: what visual feature types to return')
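The concurrency and concurrentTimeout parameters describe a bounded number of in-flight calls, each awaited for at most a given number of seconds. As a rough analogy in plain Python (the transformer itself runs on the JVM, so this is not its implementation), the same idea can be sketched with concurrent.futures. run_concurrently and its arguments are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")


def run_concurrently(
    handler: Callable[[In], Out],
    items: Iterable[In],
    concurrency: int = 1,
    concurrent_timeout: Optional[float] = None,
) -> List[Out]:
    """Apply handler to every item with at most `concurrency` calls in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(handler, item) for item in items]
        # Wait on each future for up to concurrent_timeout seconds;
        # None means wait indefinitely. Results keep the input order.
        return [future.result(timeout=concurrent_timeout) for future in futures]


print(run_concurrently(lambda n: n * 2, [1, 2, 3],
                       concurrency=2, concurrent_timeout=5.0))  # [2, 4, 6]
```

If a call exceeds the timeout, future.result raises concurrent.futures.TimeoutError, which a caller could catch and record alongside other request errors.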
synapse.ml.cognitive.AnalyzeInvoices module
- class synapse.ml.cognitive.AnalyzeInvoices.AnalyzeInvoices(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeInvoices_69130e2fcdca_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeInvoices_69130e2fcdca_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
includeTextDetails¶ (object) – Include text lines and element references in the result.
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
locale¶ (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
pages¶ (object) – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These forms can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay¶ (int) – number of milliseconds to wait between polling
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout¶ (float) – number of seconds to wait before closing the connection
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getPages()[source]
- Returns
Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These forms can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails¶ – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails¶ – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale¶ – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale¶ – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setPages(value)[source]
- Parameters
pages¶ – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These forms can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages¶ – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These forms can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeInvoices_69130e2fcdca_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeInvoices_69130e2fcdca_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
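The page selection syntax described for the pages parameter can be illustrated with a small stand-alone parser. This is only a sketch of the rules as stated in the parameter description, not the service's actual parser; the function name resolve_pages and the page_count argument are hypothetical.

```python
from typing import List


def resolve_pages(selection: str, page_count: int) -> List[int]:
    """Expand a page selection such as '-5, 1, 3, 5-10' into sorted page numbers.

    Illustrative only: this mirrors the rules in the parameter description,
    not the service's own implementation.
    """
    pages = set()
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An empty side makes the range open-ended: '-10' or '5-'.
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        # Keep only pages that exist in the document.
        pages.update(range(max(start, 1), min(end, page_count) + 1))
    if not pages:
        raise ValueError("selection matches no page in the document")
    return sorted(pages)


print(resolve_pages("-5, 1, 3, 5-10", page_count=12))  # pages 1 to 10
print(resolve_pages("5-100", page_count=5))            # only page 5
```

Overlapping ranges collapse naturally because the pages are collected in a set before sorting.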
synapse.ml.cognitive.AnalyzeLayout module
- class synapse.ml.cognitive.AnalyzeLayout.AnalyzeLayout(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeLayout_7e060026954b_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeLayout_7e060026954b_output', pages=None, pagesCol=None, pollingDelay=300, readingOrder=None, readingOrderCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – maximum number of seconds to wait on futures if concurrency >= 1
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
language¶ (object) – The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
pages¶ (object) – The page selection, which applies only to multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay¶ (int) – number of milliseconds to wait between polling
readingOrder¶ (object) – Optional parameter to specify which reading order algorithm should be applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout¶ (float) – number of seconds to wait before closing the connection
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
maximum number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLanguage()[source]
- Returns
The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- Return type
language
- getPages()[source]
- Returns
The page selection, which applies only to multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getReadingOrder()[source]
- Returns
Optional parameter to specify which reading order algorithm should be applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
- Return type
readingOrder
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- language = Param(parent='undefined', name='language', doc='ServiceParam: The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the documented to be processed as that specific language.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- readingOrder = Param(parent='undefined', name='readingOrder', doc="ServiceParam: Optional parameter to specify which reading order algorithm should be applied when ordering the extract text elements. Can be either 'basic' or 'natural'. Will default to basic if not specified")
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – maximum number of seconds to wait on futures if concurrency >= 1
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setLanguage(value)[source]
- Parameters
language¶ – The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- setLanguageCol(value)[source]
- Parameters
language¶ – The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- setPages(value)[source]
- Parameters
pages¶ – The page selection, which applies only to multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages¶ – The page selection, which applies only to multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeLayout_7e060026954b_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeLayout_7e060026954b_output', pages=None, pagesCol=None, pollingDelay=300, readingOrder=None, readingOrderCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setReadingOrder(value)[source]
- Parameters
readingOrder¶ – Optional parameter to specify which reading order algorithm should be applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
- setReadingOrderCol(value)[source]
- Parameters
readingOrder¶ – Optional parameter to specify which reading order algorithm should be applied when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
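The three polling parameters (initialPollingDelay, pollingDelay, maxPollingRetries) are documented separately, so it can help to see how they might add up. The sketch below estimates the longest time spent waiting between polls under one plausible reading of the descriptions. The exact polling loop is not described in this reference, so the formula is an assumption, and the helper name worst_case_polling_ms is hypothetical.

```python
def worst_case_polling_ms(initial_polling_delay: int = 300,
                          polling_delay: int = 300,
                          max_polling_retries: int = 1000) -> int:
    """Upper bound on time spent waiting between polls, in milliseconds.

    Assumption (not confirmed by this reference): the client waits
    initial_polling_delay once, then waits polling_delay after each
    unsuccessful poll, for at most max_polling_retries polls.
    The time taken by the HTTP requests themselves is not counted.
    """
    return initial_polling_delay + polling_delay * max_polling_retries


budget_ms = worst_case_polling_ms()
print(f"{budget_ms} ms, about {budget_ms / 60000:.1f} minutes")
```

Under this reading, the defaults in the signature above allow roughly five minutes of waiting before the maximum retries exception would be raised, or reported in the error column when suppressMaxRetriesException is true.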
synapse.ml.cognitive.AnalyzeReceipts module
- class synapse.ml.cognitive.AnalyzeReceipts.AnalyzeReceipts(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeReceipts_1abfb1ae124b_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeReceipts_1abfb1ae124b_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – maximum number of seconds to wait on futures if concurrency >= 1
includeTextDetails¶ (object) – Include text lines and element references in the result.
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
locale¶ (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
pages¶ (object) – The page selection, which applies only to multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay¶ (int) – number of milliseconds to wait between polling
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout¶ (float) – number of seconds to wait before closing the connection
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
maximum number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getPages()[source]
- Returns
The page selection, which applies only to multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='ServiceParam: Include text lines and element references in the result.')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="ServiceParam: The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed; e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – maximum number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails¶ – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails¶ – Include text lines and element references in the result.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setLocale(value)[source]
- Parameters
locale¶ – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale¶ – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setPages(value)[source]
- Parameters
pages¶ – The page selection, which applies only to multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages¶ – The page selection, which applies only to multi-page PDF and TIFF documents. Accepted input includes single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed) and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeReceipts_1abfb1ae124b_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, initialPollingDelay=300, locale=None, localeCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='AnalyzeReceipts_1abfb1ae124b_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
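Because the locale parameter lists a fixed set of supported values, a value can be checked before it is handed to setLocale. The following is a minimal sketch of such a check. The helper normalize_receipt_locale and its case-insensitive matching are hypothetical conveniences, not part of the library.

```python
# Supported locales as listed in the locale parameter description.
SUPPORTED_RECEIPT_LOCALES = ("en-AU", "en-CA", "en-GB", "en-IN", "en-US")


def normalize_receipt_locale(locale: str) -> str:
    """Return the locale in the documented casing, or raise if it is not listed."""
    wanted = locale.strip().lower()
    for supported in SUPPORTED_RECEIPT_LOCALES:
        if supported.lower() == wanted:
            return supported
    raise ValueError(f"unsupported receipt locale: {locale!r}")


print(normalize_receipt_locale("en-gb"))
```

Failing early on the driver is usually cheaper than discovering an unsupported locale through the error column after a request has been sent.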
synapse.ml.cognitive.AzureSearchWriter module
synapse.ml.cognitive.BingImageSearch module
- class synapse.ml.cognitive.BingImageSearch.BingImageSearch(java_obj=None, aspect=None, aspectCol=None, color=None, colorCol=None, concurrency=1, concurrentTimeout=None, count=None, countCol=None, errorCol='BingImageSearch_3db2ca0a3358_error', freshness=None, freshnessCol=None, handler=None, height=None, heightCol=None, imageContent=None, imageContentCol=None, imageType=None, imageTypeCol=None, license=None, licenseCol=None, maxFileSize=None, maxFileSizeCol=None, maxHeight=None, maxHeightCol=None, maxWidth=None, maxWidthCol=None, minFileSize=None, minFileSizeCol=None, minHeight=None, minHeightCol=None, minWidth=None, minWidthCol=None, mkt=None, mktCol=None, offset=None, offsetCol=None, outputCol='BingImageSearch_3db2ca0a3358_output', q=None, qCol=None, size=None, sizeCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url='https://api.bing.microsoft.com/v7.0/images/search', width=None, widthCol=None)[source]
Bases:
synapse.ml.cognitive._BingImageSearch._BingImageSearch
synapse.ml.cognitive.BreakSentence module
- class synapse.ml.cognitive.BreakSentence.BreakSentence(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_2077b8d5aa5a_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_2077b8d5aa5a_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – maximum number of seconds to wait on futures if concurrency >= 1
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
script¶ (object) – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
timeout¶ (float) – number of seconds to wait before closing the connection
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
maximum number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- Return type
language
- getScript()[source]
- Returns
Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- Return type
script
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- script = Param(parent='undefined', name='script', doc='ServiceParam: Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – maximum number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language¶ – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setLanguageCol(value)[source]
- Parameters
language¶ – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_2077b8d5aa5a_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_2077b8d5aa5a_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setScript(value)[source]
- Parameters
script¶ – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- setScriptCol(value)[source]
- Parameters
script¶ – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
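The relationship between concurrency and concurrentTimeout (a cap on how long to wait on each future) can be pictured with Python's standard concurrent.futures module. This is an analogy only: the transformer itself is JVM-backed, and the functions call_service and run_batch below are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def call_service(text: str) -> int:
    """Stand-in for one HTTP request; returns the text length."""
    return len(text)


def run_batch(texts, concurrency=1, concurrent_timeout=None):
    """Submit one call per text and wait on each future in order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(call_service, t) for t in texts]
        # result(timeout=None) blocks indefinitely; a number of seconds
        # raises concurrent.futures.TimeoutError once exceeded.
        return [f.result(timeout=concurrent_timeout) for f in futures]


print(run_batch(["Hello world.", "How are you?"], concurrency=2, concurrent_timeout=5.0))
```

Leaving the timeout as None, which is also the documented default for concurrentTimeout, means a slow call is waited on for as long as it takes.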
synapse.ml.cognitive.ConversationTranscription module
- class synapse.ml.cognitive.ConversationTranscription.ConversationTranscription(java_obj=None, audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
audioDataCol¶ (str) – Column holding audio data, must be either ByteArrays or Strings representing file URIs
extraFfmpegArgs¶ (list) – extra arguments for ffmpeg output decoding
fileType¶ (object) – The file type of the sound files, supported types: wav, ogg, mp3
format¶ (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
language¶ (object) – Identifies the spoken language that is being recognized.
participantsJson¶ (object) – a json representation of a list of conversation participants (email, language, user)
profanity¶ (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
recordAudioData¶ (bool) – Whether to record audio data to a file location, for use only with m3u8 streams
recordedFileNameCol¶ (str) – Column holding file names to write audio data to if recordAudioData is set to true
streamIntermediateResults¶ (bool) – Whether to return intermediate results immediately or group them in a sequence
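The three profanity modes described above can be illustrated on a plain string. This is a sketch of the documented behaviour only, not of how the speech service detects or rewrites profanity; the word list FLAGGED and the helper apply_profanity_mode are hypothetical.

```python
import re

# Hypothetical word list, for illustration only.
FLAGGED = {"darn", "heck"}


def apply_profanity_mode(text: str, mode: str = "masked") -> str:
    """Apply one of the three documented modes to a transcript string."""
    if mode not in ("masked", "removed", "raw"):
        raise ValueError(f"unknown profanity mode: {mode!r}")
    if mode == "raw":
        # raw: the text is returned unchanged.
        return text

    def replace(match):
        word = match.group(0)
        if word.lower() not in FLAGGED:
            return word
        # masked: asterisks of the same length; removed: drop the word.
        return "*" * len(word) if mode == "masked" else ""

    result = re.sub(r"[A-Za-z']+", replace, text)
    # Collapse doubled spaces left behind by removed words.
    return re.sub(r"\s{2,}", " ", result).strip()


for mode in ("masked", "removed", "raw"):
    print(mode, "->", apply_profanity_mode("well darn it", mode))
```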
- audioDataCol = Param(parent='undefined', name='audioDataCol', doc='Column holding audio data, must be either ByteArrays or Strings representing file URIs')
- endpointId = Param(parent='undefined', name='endpointId', doc='endpoint for custom speech models')
- extraFfmpegArgs = Param(parent='undefined', name='extraFfmpegArgs', doc='extra arguments to for ffmpeg output decoding')
- fileType = Param(parent='undefined', name='fileType', doc='ServiceParam: The file type of the sound files, supported types: wav, ogg, mp3')
- format = Param(parent='undefined', name='format', doc='ServiceParam: Specifies the result format. Accepted values are simple and detailed. Default is simple. ')
- getAudioDataCol()[source]
- Returns
Column holding audio data, must be either ByteArrays or Strings representing file URIs
- Return type
audioDataCol
- getExtraFfmpegArgs()[source]
- Returns
extra arguments for ffmpeg output decoding
- Return type
extraFfmpegArgs
- getFileType()[source]
- Returns
The file type of the sound files, supported types: wav, ogg, mp3
- Return type
fileType
- getFormat()[source]
- Returns
Specifies the result format. Accepted values are simple and detailed. Default is simple.
- Return type
format
- getLanguage()[source]
- Returns
Identifies the spoken language that is being recognized.
- Return type
language
- getParticipantsJson()[source]
- Returns
a json representation of a list of conversation participants (email, language, user)
- Return type
participantsJson
- getProfanity()[source]
- Returns
Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- Return type
profanity
- getRecordAudioData()[source]
- Returns
Whether to record audio data to a file location, for use only with m3u8 streams
- Return type
recordAudioData
- getRecordedFileNameCol()[source]
- Returns
Column holding file names to write audio data to if recordAudioData is set to true
- Return type
recordedFileNameCol
- getStreamIntermediateResults()[source]
- Returns
Whether to return intermediate results immediately or group them in a sequence
- Return type
streamIntermediateResults
- language = Param(parent='undefined', name='language', doc='ServiceParam: Identifies the spoken language that is being recognized. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- participantsJson = Param(parent='undefined', name='participantsJson', doc='ServiceParam: a json representation of a list of conversation participants (email, language, user)')
- profanity = Param(parent='undefined', name='profanity', doc='ServiceParam: Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which remove all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked. ')
- recordAudioData = Param(parent='undefined', name='recordAudioData', doc='Whether to record audio data to a file location, for use only with m3u8 streams')
- recordedFileNameCol = Param(parent='undefined', name='recordedFileNameCol', doc="Column holding file names to write audio data to if ``recordAudioData'' is set to true")
- setAudioDataCol(value)[source]
- Parameters
audioDataCol¶ – Column holding audio data, must be either ByteArrays or Strings representing file URIs
- setExtraFfmpegArgs(value)[source]
- Parameters
extraFfmpegArgs¶ – extra arguments for ffmpeg output decoding
- setFileType(value)[source]
- Parameters
fileType¶ – The file type of the sound files, supported types: wav, ogg, mp3
- setFileTypeCol(value)[source]
- Parameters
fileType¶ – The file type of the sound files, supported types: wav, ogg, mp3
- setFormat(value)[source]
- Parameters
format¶ – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setFormatCol(value)[source]
- Parameters
format¶ – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setLanguage(value)[source]
- Parameters
language¶ – Identifies the spoken language that is being recognized.
- setLanguageCol(value)[source]
- Parameters
language¶ – Identifies the spoken language that is being recognized.
- setParams(audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Set the (keyword only) parameters
- setParticipantsJson(value)[source]
- Parameters
participantsJson¶ – a json representation of a list of conversation participants (email, language, user)
- setParticipantsJsonCol(value)[source]
- Parameters
participantsJson¶ – a json representation of a list of conversation participants (email, language, user)
- setProfanity(value)[source]
- Parameters
profanity¶ – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setProfanityCol(value)[source]
- Parameters
profanity¶ – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setRecordAudioData(value)[source]
- Parameters
recordAudioData¶ – Whether to record audio data to a file location, for use only with m3u8 streams
- setStreamIntermediateResults(value)[source]
- Parameters
streamIntermediateResults¶ – Whether to return intermediate results immediately or group them in a sequence
- streamIntermediateResults = Param(parent='undefined', name='streamIntermediateResults', doc='Whether or not to immediately return itermediate results, or group in a sequence')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- url = Param(parent='undefined', name='url', doc='Url of the service')
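The participantsJson parameter expects a JSON representation of a list of conversation participants. A minimal sketch of building such a string with the standard library follows. The field names email, language and user are taken from the parameter description; the helper name build_participants_json and the sample values are hypothetical, and the exact schema the service accepts should be checked against the service documentation.

```python
import json


def build_participants_json(participants):
    """Serialize a list of participant dicts to a JSON string.

    Each participant must carry the three fields named in the parameter
    description: email, language and user (assumed field names).
    """
    required = ("email", "language", "user")
    for i, p in enumerate(participants):
        missing = [k for k in required if k not in p]
        if missing:
            raise ValueError(f"participant {i} is missing fields: {missing}")
    return json.dumps(participants)


# Hypothetical sample values, for illustration only.
participants_json = build_participants_json([
    {"email": "alice@example.com", "language": "en-US", "user": "alice"},
    {"email": "bob@example.com", "language": "en-US", "user": "bob"},
])
```

The resulting string could then be passed to setParticipantsJson(participants_json).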
synapse.ml.cognitive.DescribeImage module
- class synapse.ml.cognitive.DescribeImage.DescribeImage(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DescribeImage_d51a96e5ee4b_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxCandidates=None, maxCandidatesCol=None, outputCol='DescribeImage_d51a96e5ee4b_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getMaxCandidates()[source]
- Returns
Maximum candidate descriptions to return
- Return type
maxCandidates
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Language of image description')
- maxCandidates = Param(parent='undefined', name='maxCandidates', doc='ServiceParam: Maximum candidate descriptions to return')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setMaxCandidates(value)[source]
- Parameters
maxCandidates¶ – Maximum candidate descriptions to return
- setMaxCandidatesCol(value)[source]
- Parameters
maxCandidates¶ – Maximum candidate descriptions to return
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DescribeImage_d51a96e5ee4b_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxCandidates=None, maxCandidatesCol=None, outputCol='DescribeImage_d51a96e5ee4b_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
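Several parameters of this class come in pairs, such as imageUrl/imageUrlCol and maxCandidates/maxCandidatesCol. As the naming suggests, one form supplies a single value and the Col form names a DataFrame column that holds per-row values. The sketch below is a plain-Python helper (hypothetical, not part of the library) that builds a keyword-argument dict and rejects pairs where both forms are given. Treating that combination as an error is an assumption of this example, not documented library behavior.

```python
def build_service_kwargs(**kwargs):
    """Drop unset (None) arguments and reject value/column conflicts.

    For every name X, passing both X and XCol is treated as an error.
    This strictness is an assumption of this sketch.
    """
    given = {k: v for k, v in kwargs.items() if v is not None}
    for name in given:
        if name + "Col" in given:
            raise ValueError(f"set either {name} or {name}Col, not both")
    return given


kwargs = build_service_kwargs(
    imageUrlCol="url",   # hypothetical column name holding image URLs
    maxCandidates=3,     # same value for every row
    language=None,       # unset, so it is dropped
    timeout=60.0,
)
```

In an environment where the package is installed, the dict could then be unpacked into the constructor, for example DescribeImage(**kwargs).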
synapse.ml.cognitive.Detect module
- class synapse.ml.cognitive.Detect.Detect(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='Detect_0de4739bc560_error', handler=None, outputCol='Detect_0de4739bc560_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='Detect_0de4739bc560_error', handler=None, outputCol='Detect_0de4739bc560_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.DetectAnomalies module
- class synapse.ml.cognitive.DetectAnomalies.DetectAnomalies(java_obj=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectAnomalies_6c33fc6b14ae_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectAnomalies_6c33fc6b14ae_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval¶ (object) – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
granularity¶ (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler¶ (object) – Which strategy to use when handling requests
imputeFixedValue¶ (object) – Optional argument, fixed value to use when imputeMode is set to “fixed”
imputeMode¶ (object) – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
maxAnomalyRatio¶ (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
period¶ (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity¶ (object) – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
series¶ (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
timeout¶ (float) – number of seconds to wait before closing the connection
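The relationship between granularity and customInterval can be sketched in plain Python. The helper below (hypothetical, not part of the library) turns a fixed sampling interval in seconds into a pair of values, following the example in the description where a 5-minute series becomes granularity=minutely, customInterval=5. Only minutely, hourly, daily and weekly are handled, since monthly and yearly do not have a fixed length in seconds. Picking the largest unit that divides the interval evenly is a design choice of this sketch.

```python
# Unit lengths in seconds, largest first. Monthly and yearly are omitted
# because they do not correspond to a fixed number of seconds.
_UNITS = [
    ("weekly", 7 * 24 * 3600),
    ("daily", 24 * 3600),
    ("hourly", 3600),
    ("minutely", 60),
]


def interval_to_granularity(seconds):
    """Return (granularity, customInterval) for a fixed sampling interval.

    Picks the largest unit that divides the interval evenly.
    """
    if seconds <= 0:
        raise ValueError("interval must be positive")
    for name, unit in _UNITS:
        if seconds % unit == 0:
            return name, seconds // unit
    raise ValueError("interval is not a whole number of minutes")


# The 5-minute case from the parameter description.
granularity, custom_interval = interval_to_granularity(5 * 60)
```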
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc='ServiceParam: Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomInterval()[source]
- Returns
Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getImputeFixedValue()[source]
- Returns
Optional argument, fixed value to use when imputeMode is set to “fixed”
- Return type
imputeFixedValue
- getImputeMode()[source]
- Returns
Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- Return type
imputeMode
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- granularity = Param(parent='undefined', name='granularity', doc='ServiceParam: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imputeFixedValue = Param(parent='undefined', name='imputeFixedValue', doc='ServiceParam: Optional argument, fixed value to use when imputeMode is set to "fixed" ')
- imputeMode = Param(parent='undefined', name='imputeMode', doc='ServiceParam: Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill ')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc='ServiceParam: Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc='ServiceParam: Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc='ServiceParam: Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc='ServiceParam: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setCustomInterval(value)[source]
- Parameters
customInterval¶ – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval¶ – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity¶ – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity¶ – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setImputeFixedValue(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeFixedValueCol(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeMode(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setImputeModeCol(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectAnomalies_6c33fc6b14ae_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectAnomalies_6c33fc6b14ae_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setPeriodCol(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- setSensitivityCol(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- setSeries(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
- setSeriesCol(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
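The series description requires points sorted by timestamp in ascending order with no duplicated timestamps. A minimal sketch of a pre-check that could run before the data is sent follows. The helper name prepare_series and the representation of each point as a dict with timestamp and value keys are assumptions of this example. The timestamps are also assumed to be ISO 8601 strings in one uniform format, so that string order matches time order.

```python
def prepare_series(points):
    """Sort points by timestamp and fail fast on duplicated timestamps.

    points: iterable of (timestamp, value) pairs; timestamps are assumed
    to be ISO 8601 strings in one uniform format.
    """
    ordered = sorted(points, key=lambda p: p[0])
    # After sorting, any duplicated timestamps are adjacent.
    for (prev_ts, _), (next_ts, _) in zip(ordered, ordered[1:]):
        if prev_ts == next_ts:
            raise ValueError(f"duplicated timestamp: {next_ts}")
    return [{"timestamp": ts, "value": float(v)} for ts, v in ordered]


# Hypothetical sample data, deliberately out of order.
series = prepare_series([
    ("2021-01-03T00:00:00Z", 12.5),
    ("2021-01-01T00:00:00Z", 10.0),
    ("2021-01-02T00:00:00Z", 11.0),
])
```

Checking locally surfaces ordering problems before a request is made, rather than through the error message the description says the API returns.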
synapse.ml.cognitive.DetectFace module
- class synapse.ml.cognitive.DetectFace.DetectFace(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DetectFace_635eb791c3ac_error', handler=None, imageUrl=None, imageUrlCol=None, outputCol='DetectFace_635eb791c3ac_output', returnFaceAttributes=None, returnFaceAttributesCol=None, returnFaceId=None, returnFaceIdCol=None, returnFaceLandmarks=None, returnFaceLandmarksCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
handler¶ (object) – Which strategy to use when handling requests
returnFaceAttributes¶ (object) – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
returnFaceId¶ (object) – Return faceIds of the detected faces or not. The default value is true
returnFaceLandmarks¶ (object) – Return face landmarks of the detected faces or not. The default value is false.
timeout¶ (float) – number of seconds to wait before closing the connection
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getReturnFaceAttributes()[source]
- Returns
Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- Return type
returnFaceAttributes
- getReturnFaceId()[source]
- Returns
Return faceIds of the detected faces or not. The default value is true
- Return type
returnFaceId
- getReturnFaceLandmarks()[source]
- Returns
Return face landmarks of the detected faces or not. The default value is false.
- Return type
returnFaceLandmarks
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- returnFaceAttributes = Param(parent='undefined', name='returnFaceAttributes', doc='ServiceParam: Analyze and return the one or more specified face attributes Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.')
- returnFaceId = Param(parent='undefined', name='returnFaceId', doc='ServiceParam: Return faceIds of the detected faces or not. The default value is true')
- returnFaceLandmarks = Param(parent='undefined', name='returnFaceLandmarks', doc='ServiceParam: Return face landmarks of the detected faces or not. The default value is false.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DetectFace_635eb791c3ac_error', handler=None, imageUrl=None, imageUrlCol=None, outputCol='DetectFace_635eb791c3ac_output', returnFaceAttributes=None, returnFaceAttributesCol=None, returnFaceId=None, returnFaceIdCol=None, returnFaceLandmarks=None, returnFaceLandmarksCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setReturnFaceAttributes(value)[source]
- Parameters
returnFaceAttributes¶ – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- setReturnFaceAttributesCol(value)[source]
- Parameters
returnFaceAttributes¶ – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- setReturnFaceId(value)[source]
- Parameters
returnFaceId¶ – Return faceIds of the detected faces or not. The default value is true
- setReturnFaceIdCol(value)[source]
- Parameters
returnFaceId¶ – Return faceIds of the detected faces or not. The default value is true
- setReturnFaceLandmarks(value)[source]
- Parameters
returnFaceLandmarks¶ – Return face landmarks of the detected faces or not. The default value is false.
- setReturnFaceLandmarksCol(value)[source]
- Parameters
returnFaceLandmarks¶ – Return face landmarks of the detected faces or not. The default value is false.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
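Because returnFaceAttributes accepts only the attribute names listed in its description, a request can be checked locally before it is sent. The sketch below is plain Python; the helper name check_face_attributes is hypothetical, and the supported set is copied from the parameter description rather than queried from the service.

```python
# Attribute names copied from the returnFaceAttributes description.
SUPPORTED_FACE_ATTRIBUTES = frozenset([
    "age", "gender", "headPose", "smile", "facialHair", "glasses",
    "emotion", "hair", "makeup", "occlusion", "accessories", "blur",
    "exposure", "noise",
])


def check_face_attributes(requested):
    """Return the requested attributes, de-duplicated in original order.

    Raises ValueError if any name is not in the supported set.
    """
    unknown = [a for a in requested if a not in SUPPORTED_FACE_ATTRIBUTES]
    if unknown:
        raise ValueError(f"unsupported face attributes: {unknown}")
    # dict.fromkeys keeps first occurrences and preserves order.
    return list(dict.fromkeys(requested))


attributes = check_face_attributes(["age", "smile", "age", "glasses"])
```

Since the description notes that attribute analysis has additional computational and time cost, requesting only the attributes that are needed keeps calls cheaper.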
synapse.ml.cognitive.DetectLastAnomaly module
- class synapse.ml.cognitive.DetectLastAnomaly.DetectLastAnomaly(java_obj=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectLastAnomaly_e9573e7a3b25_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectLastAnomaly_e9573e7a3b25_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval¶ (object) – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
granularity¶ (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler¶ (object) – Which strategy to use when handling requests
imputeFixedValue¶ (object) – Optional argument, fixed value to use when imputeMode is set to “fixed”
imputeMode¶ (object) – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
maxAnomalyRatio¶ (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
period¶ (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity¶ (object) – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
series¶ (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
timeout¶ (float) – number of seconds to wait before closing the connection
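The imputeMode and imputeFixedValue parameters interact: the fixed value is only used when the mode is "fixed". A small plain-Python check of that combination is sketched below. The helper name check_impute_settings is hypothetical, and requiring a fixed value whenever the mode is "fixed" is an assumption of this sketch rather than documented behavior.

```python
# Mode names copied from the imputeMode description.
IMPUTE_MODES = ("auto", "previous", "linear", "fixed", "zero", "notFill")


def check_impute_settings(impute_mode=None, impute_fixed_value=None):
    """Validate an imputeMode / imputeFixedValue combination.

    Rules enforced here (the second is an assumption of this sketch):
      * imputeMode, if given, must be one of the documented values;
      * imputeMode == "fixed" requires imputeFixedValue.
    Returns a dict holding only the settings that were provided.
    """
    if impute_mode is not None and impute_mode not in IMPUTE_MODES:
        raise ValueError(f"unknown imputeMode: {impute_mode!r}")
    if impute_mode == "fixed" and impute_fixed_value is None:
        raise ValueError('imputeMode "fixed" needs imputeFixedValue')
    pairs = (
        ("imputeMode", impute_mode),
        ("imputeFixedValue", impute_fixed_value),
    )
    return {k: v for k, v in pairs if v is not None}


settings = check_impute_settings("fixed", 0.0)
```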
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc='ServiceParam: Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomInterval()[source]
- Returns
Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getImputeFixedValue()[source]
- Returns
Optional argument, fixed value to use when imputeMode is set to “fixed”
- Return type
imputeFixedValue
- getImputeMode()[source]
- Returns
Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- Return type
imputeMode
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- Return type
sensitivity
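The documented value ranges for imputeMode and sensitivity can be checked locally before a request is built. This is a minimal sketch: check_options is a hypothetical helper, and treating the 0 to 99 range as inclusive at both ends is an assumption.

```python
# Values listed in the imputeMode description above.
IMPUTE_MODES = {"auto", "previous", "linear", "fixed", "zero", "notFill"}


def check_options(imputeMode=None, sensitivity=None):
    """Raise ValueError for values outside the documented ranges."""
    if imputeMode is not None and imputeMode not in IMPUTE_MODES:
        raise ValueError(f"unknown imputeMode: {imputeMode!r}")
    # Assumption: both 0 and 99 are accepted.
    if sensitivity is not None and not 0 <= sensitivity <= 99:
        raise ValueError("sensitivity must be between 0 and 99")
    # Return only the options that were set, ready to use as keyword arguments.
    options = {"imputeMode": imputeMode, "sensitivity": sensitivity}
    return {k: v for k, v in options.items() if v is not None}


print(check_options(imputeMode="linear", sensitivity=95))
```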
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- granularity = Param(parent='undefined', name='granularity', doc='ServiceParam: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imputeFixedValue = Param(parent='undefined', name='imputeFixedValue', doc='ServiceParam: Optional argument, fixed value to use when imputeMode is set to "fixed" ')
- imputeMode = Param(parent='undefined', name='imputeMode', doc='ServiceParam: Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill ')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc='ServiceParam: Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc='ServiceParam: Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc='ServiceParam: Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc='ServiceParam: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setCustomInterval(value)[source]
- Parameters
customInterval¶ – Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval¶ – Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity¶ – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity¶ – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setImputeFixedValue(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeFixedValueCol(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeMode(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setImputeModeCol(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectLastAnomaly_e9573e7a3b25_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectLastAnomaly_e9573e7a3b25_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setPeriodCol(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- setSensitivityCol(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- setSeries(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- setSeriesCol(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
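Because the series must be sorted by timestamp and free of duplicate timestamps, it can help to normalize the points before they reach the transformer. The sketch below is one way to do that. prepare_series and make_detector are hypothetical helpers, the column names are placeholders, and the import path is inferred from the default column names in the setParams signature above rather than confirmed by this page. Only keyword arguments that appear in that signature are used.

```python
from datetime import datetime


def _parse(ts):
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def prepare_series(points):
    """Sort (timestamp, value) pairs ascending and reject duplicate timestamps."""
    ordered = sorted(points, key=lambda p: _parse(p[0]))
    for (prev, _), (cur, _) in zip(ordered, ordered[1:]):
        if _parse(prev) == _parse(cur):
            raise ValueError(f"duplicate timestamp: {cur}")
    return [{"timestamp": ts, "value": float(v)} for ts, v in ordered]


def make_detector(key, url):
    # Imported lazily so prepare_series works without SynapseML installed.
    # Assumption: the class lives at this module path.
    from synapse.ml.cognitive.DetectLastAnomaly import DetectLastAnomaly

    return DetectLastAnomaly(
        subscriptionKey=key,
        url=url,
        seriesCol="series",        # placeholder column holding the points
        granularity="daily",
        outputCol="anomaly",       # placeholder output column
        errorCol="anomaly_error",  # placeholder error column
    )


series = prepare_series([
    ("2021-01-02T00:00:00Z", 2.0),
    ("2021-01-01T00:00:00Z", 1.0),
])
print([p["timestamp"] for p in series])
```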
synapse.ml.cognitive.DetectMultivariateAnomaly module
- class synapse.ml.cognitive.DetectMultivariateAnomaly.DetectMultivariateAnomaly(java_obj=None, backoffs=[100, 500, 1000], diagnosticsInfo=None, endTime=None, errorCol='DetectMultivariateAnomaly_296061a0cc62_error', initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, modelId=None, outputCol='DetectMultivariateAnomaly_296061a0cc62_output', pollingDelay=300, startTime=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timestampCol='timestamp', url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaModel
- Parameters
diagnosticsInfo¶ (object) – diagnosticsInfo for training a multivariate anomaly detection model
endTime¶ (str) – Required. End time of the data used for detection or for generating a multivariate anomaly detection model; must be a date-time value.
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
intermediateSaveDir¶ (str) – Blob storage location in HDFS where intermediate data is saved while training.
pollingDelay¶ (int) – number of milliseconds to wait between polling
startTime¶ (str) – Required. Start time of the data used for detection or for generating a multivariate anomaly detection model; must be a date-time value.
suppressMaxRetriesException¶ (bool) – set true to suppress the maximum retries exception and report it in the error column
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- diagnosticsInfo = Param(parent='undefined', name='diagnosticsInfo', doc='diagnosticsInfo for training a multivariate anomaly detection model')
- endTime = Param(parent='undefined', name='endTime', doc='A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getDiagnosticsInfo()[source]
- Returns
diagnosticsInfo for training a multivariate anomaly detection model
- Return type
diagnosticsInfo
- getEndTime()[source]
- Returns
Required. End time of the data used for detection or for generating a multivariate anomaly detection model; must be a date-time value.
- Return type
endTime
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getIntermediateSaveDir()[source]
- Returns
Blob storage location in HDFS where intermediate data is saved while training.
- Return type
intermediateSaveDir
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getStartTime()[source]
- Returns
Required. Start time of the data used for detection or for generating a multivariate anomaly detection model; must be a date-time value.
- Return type
startTime
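The startTime and endTime descriptions only say the value "should be date-time". A minimal sketch of producing such strings follows, assuming an ISO 8601 UTC timestamp is what the service expects. to_service_time is a hypothetical helper, not a SynapseML function.

```python
from datetime import datetime, timezone


def to_service_time(dt):
    """Format a datetime as an ISO 8601 UTC string, e.g. 2021-01-01T00:00:00Z."""
    if dt.tzinfo is None:
        # Assumption: naive datetimes are already expressed in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


start = to_service_time(datetime(2021, 1, 1))
end = to_service_time(datetime(2021, 1, 2, 12, 30))
print(start, end)  # 2021-01-01T00:00:00Z 2021-01-02T12:30:00Z
```

The resulting strings could be passed to setStartTime and setEndTime.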
- getSuppressMaxRetriesException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- inputCols = Param(parent='undefined', name='inputCols', doc='The names of the input columns')
- intermediateSaveDir = Param(parent='undefined', name='intermediateSaveDir', doc='Blob storage location in HDFS where intermediate data is saved while training.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelId = Param(parent='undefined', name='modelId', doc='Format - uuid. Model identifier.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setDiagnosticsInfo(value)[source]
- Parameters
diagnosticsInfo¶ – diagnosticsInfo for training a multivariate anomaly detection model
- setEndTime(value)[source]
- Parameters
endTime¶ – Required. End time of the data used for detection or for generating a multivariate anomaly detection model; must be a date-time value.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setIntermediateSaveDir(value)[source]
- Parameters
intermediateSaveDir¶ – Blob storage location in HDFS where intermediate data is saved while training.
- setParams(backoffs=[100, 500, 1000], diagnosticsInfo=None, endTime=None, errorCol='DetectMultivariateAnomaly_296061a0cc62_error', initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, modelId=None, outputCol='DetectMultivariateAnomaly_296061a0cc62_output', pollingDelay=300, startTime=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timestampCol='timestamp', url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setStartTime(value)[source]
- Parameters
startTime¶ – Required. Start time of the data used for detection or for generating a multivariate anomaly detection model; must be a date-time value.
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set true to suppress the maximum retries exception and report it in the error column
- startTime = Param(parent='undefined', name='startTime', doc='A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timestampCol = Param(parent='undefined', name='timestampCol', doc='Timestamp column name')
- url = Param(parent='undefined', name='url', doc='Url of the service')
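The polling parameters together bound how long a long-running call may wait for a result. The arithmetic below is a rough estimate only. It assumes one initial delay followed by one pollingDelay per poll, and it ignores request latency and the backoffs used by the handler. The function name is hypothetical, and the library's actual schedule may differ.

```python
def max_polling_wait_ms(initialPollingDelay=300, pollingDelay=300,
                        maxPollingRetries=1000):
    """Approximate upper bound on time spent waiting between polls, in ms."""
    return initialPollingDelay + pollingDelay * maxPollingRetries


# With the defaults shown in the constructor signature:
print(max_polling_wait_ms() / 1000)  # 300.3 (seconds)
```

Under these assumptions, the defaults allow roughly five minutes of polling before the maximum retries exception is raised or, with suppressMaxRetriesException set, reported in the error column.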
synapse.ml.cognitive.DictionaryExamples module
- class synapse.ml.cognitive.DictionaryExamples.DictionaryExamples(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryExamples_d2c42078669d_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryExamples_d2c42078669d_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, textAndTranslation=None, textAndTranslationCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
fromLanguage¶ (object) – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
handler¶ (object) – Which strategy to use when handling requests
textAndTranslation¶ (object) – A string specifying the translated text previously returned by the Dictionary lookup operation.
timeout¶ (float) – number of seconds to wait before closing the connection
toLanguage¶ (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- Return type
fromLanguage
- getTextAndTranslation()[source]
- Returns
A string specifying the translated text previously returned by the Dictionary lookup operation.
- Return type
textAndTranslation
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage¶ – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguage¶ – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DictionaryExamples_d2c42078669d_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryExamples_d2c42078669d_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, textAndTranslation=None, textAndTranslationCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Set the (keyword only) parameters
- setTextAndTranslation(value)[source]
- Parameters
textAndTranslation¶ – A string specifying the translated text previously returned by the Dictionary lookup operation.
- setTextAndTranslationCol(value)[source]
- Parameters
textAndTranslation¶ – A string specifying the translated text previously returned by the Dictionary lookup operation.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage¶ – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- setToLanguageCol(value)[source]
- Parameters
toLanguage¶ – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- textAndTranslation = Param(parent='undefined', name='textAndTranslation', doc='ServiceParam: A string specifying the translated text previously returned by the Dictionary lookup operation.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc='ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
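DictionaryExamples works on source text paired with a translation previously returned by a dictionary lookup. The sketch below shows one way to assemble such pairs and configure the transformer with keyword arguments taken from the signature above. to_pairs and make_examples_stage are hypothetical helpers, the column names are placeholders, and representing each input as a (text, translation) pair is an assumption about the expected column shape.

```python
def to_pairs(lookup_results):
    """Flatten {source_text: [translations]} into (text, translation) pairs."""
    return [(text, tr)
            for text, translations in lookup_results.items()
            for tr in translations]


def make_examples_stage(key, region):
    # Imported lazily so to_pairs works without SynapseML installed.
    from synapse.ml.cognitive.DictionaryExamples import DictionaryExamples

    return DictionaryExamples(
        subscriptionKey=key,
        subscriptionRegion=region,
        fromLanguage="en",
        toLanguage="es",
        textAndTranslationCol="textAndTranslation",  # placeholder column name
        outputCol="examples",                        # placeholder column name
    )


pairs = to_pairs({"fly": ["volar", "mosca"]})
print(pairs)  # [('fly', 'volar'), ('fly', 'mosca')]
```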
synapse.ml.cognitive.DictionaryLookup module
- class synapse.ml.cognitive.DictionaryLookup.DictionaryLookup(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryLookup_3d747351baed_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryLookup_3d747351baed_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
fromLanguage¶ (object) – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
handler¶ (object) – Which strategy to use when handling requests
timeout¶ (float) – number of seconds to wait before closing the connection
toLanguage¶ (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- Return type
fromLanguage
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage¶ – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguage¶ – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DictionaryLookup_3d747351baed_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryLookup_3d747351baed_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage¶ – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- setToLanguageCol(value)[source]
- Parameters
toLanguage¶ – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc='ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
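Several parameters in this class come in pairs, such as fromLanguage and fromLanguageCol or text and textCol. The sketch below assumes the two forms are alternatives: the plain name sets one constant for every row, while the Col variant names a column that supplies a value per row. service_param is a hypothetical helper that builds the matching keyword argument and rejects setting both forms at once.

```python
def service_param(name, value=None, col=None):
    """Return {name: value} or {nameCol: col}; exactly one must be given."""
    if (value is None) == (col is None):
        raise ValueError(f"set exactly one of {name} or {name}Col")
    return {name: value} if value is not None else {name + "Col": col}


kwargs = {}
kwargs.update(service_param("fromLanguage", value="en"))
kwargs.update(service_param("toLanguage", col="target_lang"))
kwargs.update(service_param("text", col="word"))
print(kwargs)
# {'fromLanguage': 'en', 'toLanguageCol': 'target_lang', 'textCol': 'word'}
```

The resulting dictionary could be expanded into the DictionaryLookup constructor or setParams, since all three keys appear in the signature above.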
synapse.ml.cognitive.DocumentTranslator module
- class synapse.ml.cognitive.DocumentTranslator.DocumentTranslator(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='DocumentTranslator_ecb27ec36f86_error', filterPrefix=None, filterPrefixCol=None, filterSuffix=None, filterSuffixCol=None, initialPollingDelay=300, maxPollingRetries=1000, outputCol='DocumentTranslator_ecb27ec36f86_output', pollingDelay=300, serviceName=None, sourceLanguage=None, sourceLanguageCol=None, sourceStorageSource=None, sourceStorageSourceCol=None, sourceUrl=None, sourceUrlCol=None, storageType=None, storageTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, targets=None, targetsCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
filterPrefix¶ (object) – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
filterSuffix¶ (object) – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
pollingDelay¶ (int) – number of milliseconds to wait between polling
sourceLanguage¶ (object) – Language code. If none is specified, the service auto-detects the language of the document.
sourceStorageSource¶ (object) – Storage source of source input.
sourceUrl¶ (object) – Location of the folder / container or single file with your documents.
storageType¶ (object) – Storage type of the input documents source string. Required for single document translation only.
suppressMaxRetriesException¶ (bool) – set true to suppress the maximum retries exception and report it in the error column
targets¶ (object) – Destination for the finished translated documents.
timeout¶ (float) – number of seconds to wait before closing the connection
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- filterPrefix = Param(parent='undefined', name='filterPrefix', doc='ServiceParam: A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.')
- filterSuffix = Param(parent='undefined', name='filterSuffix', doc='ServiceParam: A case-sensitive suffix string to filter documents in the source path for translation. This is most often use for file extensions.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFilterPrefix()[source]
- Returns
A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
- Return type
filterPrefix
- getFilterSuffix()[source]
- Returns
A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- Return type
filterSuffix
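To preview which documents a filterPrefix and filterSuffix pair would select, the case-sensitive matching can be approximated locally. This is only a sketch: matches is a hypothetical helper, the blob names are made up, and it assumes the prefix is compared against the blob name relative to the source container.

```python
def matches(blob_name, filterPrefix=None, filterSuffix=None):
    """Case-sensitive prefix/suffix test approximating the service filters."""
    if filterPrefix is not None and not blob_name.startswith(filterPrefix):
        return False
    if filterSuffix is not None and not blob_name.endswith(filterSuffix):
        return False
    return True


names = ["reports/2021/q1.docx", "reports/2021/q1.DOCX", "drafts/q1.docx"]
selected = [n for n in names if matches(n, "reports/", ".docx")]
print(selected)  # ['reports/2021/q1.docx']
```

The upper-case .DOCX file is excluded because the comparison is case-sensitive.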
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSourceLanguage()[source]
- Returns
Language code. If none is specified, the service auto-detects the language of the document.
- Return type
sourceLanguage
- getSourceStorageSource()[source]
- Returns
Storage source of source input.
- Return type
sourceStorageSource
- getSourceUrl()[source]
- Returns
Location of the folder / container or single file with your documents.
- Return type
sourceUrl
- getStorageType()[source]
- Returns
Storage type of the input documents source string. Required for single document translation only.
- Return type
storageType
- getSuppressMaxRetriesException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTargets()[source]
- Returns
Destination for the finished translated documents.
- Return type
targets
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- serviceName = Param(parent='undefined', name='serviceName', doc='')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setFilterPrefix(value)[source]
- Parameters
filterPrefix¶ – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
- setFilterPrefixCol(value)[source]
- Parameters
filterPrefix¶ – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
- setFilterSuffix(value)[source]
- Parameters
filterSuffix¶ – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- setFilterSuffixCol(value)[source]
- Parameters
filterSuffix¶ – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='DocumentTranslator_ecb27ec36f86_error', filterPrefix=None, filterPrefixCol=None, filterSuffix=None, filterSuffixCol=None, initialPollingDelay=300, maxPollingRetries=1000, outputCol='DocumentTranslator_ecb27ec36f86_output', pollingDelay=300, serviceName=None, sourceLanguage=None, sourceLanguageCol=None, sourceStorageSource=None, sourceStorageSourceCol=None, sourceUrl=None, sourceUrlCol=None, storageType=None, storageTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, targets=None, targetsCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setSourceLanguage(value)[source]
- Parameters
sourceLanguage¶ – Language code. If none is specified, the service auto-detects the language of the document.
- setSourceLanguageCol(value)[source]
- Parameters
sourceLanguage¶ – Language code. If none is specified, the service auto-detects the language of the document.
- setSourceStorageSource(value)[source]
- Parameters
sourceStorageSource¶ – Storage source of source input.
- setSourceStorageSourceCol(value)[source]
- Parameters
sourceStorageSource¶ – Storage source of source input.
- setSourceUrl(value)[source]
- Parameters
sourceUrl¶ – Location of the folder / container or single file with your documents.
- setSourceUrlCol(value)[source]
- Parameters
sourceUrl¶ – Location of the folder / container or single file with your documents.
- setStorageType(value)[source]
- Parameters
storageType¶ – Storage type of the input documents source string. Required for single document translation only.
- setStorageTypeCol(value)[source]
- Parameters
storageType¶ – Storage type of the input documents source string. Required for single document translation only.
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set true to suppress the maximum retries exception and report it in the error column
- setTargetsCol(value)[source]
- Parameters
targets¶ – Destination for the finished translated documents.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- sourceLanguage = Param(parent='undefined', name='sourceLanguage', doc='ServiceParam: Language code. If none is specified, we will perform auto detect on the document.')
- sourceStorageSource = Param(parent='undefined', name='sourceStorageSource', doc='ServiceParam: Storage source of source input.')
- sourceUrl = Param(parent='undefined', name='sourceUrl', doc='ServiceParam: Location of the folder / container or single file with your documents.')
- storageType = Param(parent='undefined', name='storageType', doc='ServiceParam: Storage type of the input documents source string. Required for single document translation only.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- targets = Param(parent='undefined', name='targets', doc='ServiceParam: Destination for the finished translated documents.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
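The polling parameters above (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesException) describe a wait-then-poll loop. The following is a minimal plain-Python sketch of such a loop, not the library's implementation: `poll_until_done`, its `check` callback, and the injectable `sleep` are hypothetical names, and the exact retry accounting in the service client may differ.

```python
import time


def poll_until_done(check, initial_polling_delay=300, polling_delay=300,
                    max_polling_retries=1000,
                    suppress_max_retries_exception=False, sleep=time.sleep):
    """Poll `check` until it returns a non-None result.

    Delays are in milliseconds, matching the documented parameters.
    Returns an (output, error) pair, loosely mirroring outputCol / errorCol.
    """
    sleep(initial_polling_delay / 1000.0)  # wait before the first poll
    for attempt in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
        if attempt < max_polling_retries - 1:
            sleep(polling_delay / 1000.0)  # wait between polls
    message = "no result after %d polls" % max_polling_retries
    if suppress_max_retries_exception:
        return None, message  # report the failure as a value instead of raising
    raise TimeoutError(message)


# Usage with a fake job that finishes on the third poll and a recording sleep.
responses = iter([None, None, "Succeeded"])
waits = []
output, error = poll_until_done(lambda: next(responses), sleep=waits.append)
print(output, error, waits)
```

Passing `sleep` as an argument keeps the sketch testable without real waiting; with `suppress_max_retries_exception=True` the caller gets an error value back rather than an exception, which is the behavior the parameter description suggests.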
synapse.ml.cognitive.EntityDetector module
- class synapse.ml.cognitive.EntityDetector.EntityDetector(java_obj=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='EntityDetector_962b706c1057_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_962b706c1057_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language code of the text (optional for some services)
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout¶ (float) – number of seconds to wait before closing the connection
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setParams(batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='EntityDetector_962b706c1057_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_962b706c1057_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
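The stringIndexType parameter matters because the same substring can have different offsets depending on the unit used to count characters. The sketch below contrasts two common units, Unicode code points (what Python's `str` uses) and UTF-16 code units (what Java, .NET, and JavaScript strings use). `utf16_offset` is a hypothetical helper written for this illustration; grapheme counting needs a third-party library and is not shown, and the accepted parameter values are described on the linked page rather than here.

```python
def utf16_offset(text, code_point_offset):
    """Convert an offset counted in code points to UTF-16 code units."""
    prefix = text[:code_point_offset]
    # Each UTF-16 code unit is two bytes in the little-endian encoding.
    return len(prefix.encode("utf-16-le")) // 2


text = "\U0001F600 Seattle"  # the emoji lies outside the Basic Multilingual Plane
cp_offset = text.index("Seattle")           # Python indexes by code point
u16_offset = utf16_offset(text, cp_offset)  # the emoji takes two UTF-16 units
print(cp_offset, u16_offset)  # prints: 2 3
```

When slicing the original text with offsets returned by the service, the counting unit on both sides has to agree, otherwise entity spans drift after the first character outside the Basic Multilingual Plane.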
synapse.ml.cognitive.FindSimilarFace module
- class synapse.ml.cognitive.FindSimilarFace.FindSimilarFace(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='FindSimilarFace_f3b804dc56c6_error', faceId=None, faceIdCol=None, faceIds=None, faceIdsCol=None, faceListId=None, faceListIdCol=None, handler=None, largeFaceListId=None, largeFaceListIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, mode=None, modeCol=None, outputCol='FindSimilarFace_f3b804dc56c6_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
faceId¶ (object) – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
faceIds¶ (object) – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
faceListId¶ (object) – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
handler¶ (object) – Which strategy to use when handling requests
largeFaceListId¶ (object) – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
maxNumOfCandidatesReturned¶ (object) – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
mode¶ (object) – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
timeout¶ (float) – number of seconds to wait before closing the connection
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceId = Param(parent='undefined', name='faceId', doc='ServiceParam: faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.')
- faceIds = Param(parent='undefined', name='faceIds', doc='ServiceParam: An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.')
- faceListId = Param(parent='undefined', name='faceListId', doc='ServiceParam: An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFaceId()[source]
- Returns
faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
- Return type
faceId
- getFaceIds()[source]
- Returns
An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
faceIds
- getFaceListId()[source]
- Returns
An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
faceListId
- getLargeFaceListId()[source]
- Returns
An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
largeFaceListId
- getMaxNumOfCandidatesReturned()[source]
- Returns
Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
- Return type
maxNumOfCandidatesReturned
- getMode()[source]
- Returns
Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
- Return type
mode
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- largeFaceListId = Param(parent='undefined', name='largeFaceListId', doc='ServiceParam: An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.')
- maxNumOfCandidatesReturned = Param(parent='undefined', name='maxNumOfCandidatesReturned', doc='ServiceParam: Optional parameter. The number of top similar faces returned. The valid range is [1, 1000].It defaults to 20.')
- mode = Param(parent='undefined', name='mode', doc="ServiceParam: Optional parameter. Similar face searching mode. It can be 'matchPerson' or 'matchFace'. It defaults to 'matchPerson'.")
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setFaceId(value)[source]
- Parameters
faceId¶ – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
- setFaceIdCol(value)[source]
- Parameters
faceId¶ – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
- setFaceIds(value)[source]
- Parameters
faceIds¶ – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setFaceIdsCol(value)[source]
- Parameters
faceIds¶ – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setFaceListId(value)[source]
- Parameters
faceListId¶ – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setFaceListIdCol(value)[source]
- Parameters
faceListId¶ – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setLargeFaceListId(value)[source]
- Parameters
largeFaceListId¶ – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setLargeFaceListIdCol(value)[source]
- Parameters
largeFaceListId¶ – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setMaxNumOfCandidatesReturned(value)[source]
- Parameters
maxNumOfCandidatesReturned¶ – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
- setMaxNumOfCandidatesReturnedCol(value)[source]
- Parameters
maxNumOfCandidatesReturned¶ – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
- setMode(value)[source]
- Parameters
mode¶ – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
- setModeCol(value)[source]
- Parameters
mode¶ – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='FindSimilarFace_f3b804dc56c6_error', faceId=None, faceIdCol=None, faceIds=None, faceIdsCol=None, faceListId=None, faceListIdCol=None, handler=None, largeFaceListId=None, largeFaceListIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, mode=None, modeCol=None, outputCol='FindSimilarFace_f3b804dc56c6_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
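The FindSimilarFace parameter descriptions state several constraints: faceIds, faceListId, and largeFaceListId should not be provided together, faceIds holds at most 1000 entries, maxNumOfCandidatesReturned must lie in [1, 1000], and mode is 'matchPerson' or 'matchFace'. A minimal sketch of checking them up front follows. `find_similar_face_kwargs` is a hypothetical helper, not part of synapse.ml, and the face IDs in the usage lines are placeholders.

```python
def find_similar_face_kwargs(faceId, faceIds=None, faceListId=None,
                             largeFaceListId=None,
                             maxNumOfCandidatesReturned=None, mode=None):
    """Validate the documented constraints and return keyword arguments."""
    sources = {"faceIds": faceIds, "faceListId": faceListId,
               "largeFaceListId": largeFaceListId}
    provided = [name for name, value in sources.items() if value is not None]
    if len(provided) > 1:
        raise ValueError("provide only one candidate source, got %s" % provided)
    if faceIds is not None and len(faceIds) > 1000:
        raise ValueError("faceIds is limited to 1000 entries")
    if (maxNumOfCandidatesReturned is not None
            and not 1 <= maxNumOfCandidatesReturned <= 1000):
        raise ValueError("maxNumOfCandidatesReturned must be in [1, 1000]")
    if mode is not None and mode not in ("matchPerson", "matchFace"):
        raise ValueError("mode must be 'matchPerson' or 'matchFace'")
    kwargs = {"faceId": faceId,
              "maxNumOfCandidatesReturned": maxNumOfCandidatesReturned,
              "mode": mode}
    kwargs.update(sources)
    # Drop unset entries so the service-side defaults (20, 'matchPerson') apply.
    return {key: value for key, value in kwargs.items() if value is not None}


kwargs = find_similar_face_kwargs("query-face-id", faceListId="my-face-list",
                                  mode="matchFace")
print(kwargs)
```

Because the keys match the constructor's keyword names shown in the class signature, the resulting dictionary could be unpacked into the constructor in an environment where Spark and synapse.ml are available.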
synapse.ml.cognitive.FitMultivariateAnomaly module
- class synapse.ml.cognitive.FitMultivariateAnomaly.FitMultivariateAnomaly(java_obj=None, alignMode=None, backoffs=[100, 500, 1000], displayName=None, endTime=None, errorCol='FitMultivariateAnomaly_468c56ea11dd_error', fillNAMethod=None, initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, outputCol='FitMultivariateAnomaly_468c56ea11dd_output', paddingValue=None, pollingDelay=300, slidingWindow=None, startTime=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timestampCol='timestamp', url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaEstimator
- Parameters
alignMode¶ (str) – An optional field that indicates how different variables are aligned to the same time range, which the model requires. One of {Inner, Outer}.
endTime¶ (str) – A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
fillNAMethod¶ (str) – An optional field that indicates how missing values are filled. Cannot be set to NotFill when alignMode is Outer. One of {Previous, Subsequent, Linear, Zero, Fixed}.
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
intermediateSaveDir¶ (str) – Blob storage location in HDFS where intermediate data is saved while training.
paddingValue¶ (int) – An optional field, only used if fillNAMethod is set to Fixed.
pollingDelay¶ (int) – number of milliseconds to wait between polling
slidingWindow¶ (int) – An optional field, indicates how many history points will be used to determine the anomaly score of one subsequent point.
startTime¶ (str) – A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
- alignMode = Param(parent='undefined', name='alignMode', doc='An optional field, indicates how we align different variables into the same time-range which is required by the model.{Inner, Outer}')
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- displayName = Param(parent='undefined', name='displayName', doc='optional field, name of the model')
- endTime = Param(parent='undefined', name='endTime', doc='A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fillNAMethod = Param(parent='undefined', name='fillNAMethod', doc='An optional field, indicates how missed values will be filled with. Can not be set to NotFill, when alignMode is Outer.{Previous, Subsequent, Linear, Zero, Fixed}')
- getAlignMode()[source]
- Returns
An optional field that indicates how different variables are aligned to the same time range, which the model requires. One of {Inner, Outer}.
- Return type
alignMode
- getEndTime()[source]
- Returns
A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
- Return type
endTime
- getFillNAMethod()[source]
- Returns
An optional field that indicates how missing values are filled. Cannot be set to NotFill when alignMode is Outer. One of {Previous, Subsequent, Linear, Zero, Fixed}.
- Return type
fillNAMethod
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getIntermediateSaveDir()[source]
- Returns
Blob storage location in HDFS where intermediate data is saved while training.
- Return type
intermediateSaveDir
- getPaddingValue()[source]
- Returns
An optional field, only used if fillNAMethod is set to Fixed.
- Return type
paddingValue
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSlidingWindow()[source]
- Returns
An optional field, indicates how many history points will be used to determine the anomaly score of one subsequent point.
- Return type
slidingWindow
- getStartTime()[source]
- Returns
A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
- Return type
startTime
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- inputCols = Param(parent='undefined', name='inputCols', doc='The names of the input columns')
- intermediateSaveDir = Param(parent='undefined', name='intermediateSaveDir', doc='Blob storage location in HDFS where intermediate data is saved while training.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- paddingValue = Param(parent='undefined', name='paddingValue', doc='optional field, is only useful if FillNAMethod is set to Fixed.')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setAlignMode(value)[source]
- Parameters
alignMode¶ – An optional field that indicates how different variables are aligned to the same time range, which the model requires. One of {Inner, Outer}.
- setEndTime(value)[source]
- Parameters
endTime¶ – A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
- setFillNAMethod(value)[source]
- Parameters
fillNAMethod¶ – An optional field that indicates how missing values are filled. Cannot be set to NotFill when alignMode is Outer. One of {Previous, Subsequent, Linear, Zero, Fixed}.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setIntermediateSaveDir(value)[source]
- Parameters
intermediateSaveDir¶ – Blob storage location in HDFS where intermediate data is saved while training.
- setPaddingValue(value)[source]
- Parameters
paddingValue¶ – An optional field, only used if fillNAMethod is set to Fixed.
- setParams(alignMode=None, backoffs=[100, 500, 1000], displayName=None, endTime=None, errorCol='FitMultivariateAnomaly_468c56ea11dd_error', fillNAMethod=None, initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, outputCol='FitMultivariateAnomaly_468c56ea11dd_output', paddingValue=None, pollingDelay=300, slidingWindow=None, startTime=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timestampCol='timestamp', url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setSlidingWindow(value)[source]
- Parameters
slidingWindow¶ – An optional field, indicates how many history points will be used to determine the anomaly score of one subsequent point.
- setStartTime(value)[source]
- Parameters
startTime¶ – A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- slidingWindow = Param(parent='undefined', name='slidingWindow', doc='An optional field, indicates how many history points will be used to determine the anomaly score of one subsequent point.')
- startTime = Param(parent='undefined', name='startTime', doc='A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timestampCol = Param(parent='undefined', name='timestampCol', doc='Timestamp column name')
- url = Param(parent='undefined', name='url', doc='Url of the service')
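Several FitMultivariateAnomaly parameters interact: startTime and endTime are required date-time values, fillNAMethod cannot be NotFill when alignMode is Outer, and paddingValue only matters when fillNAMethod is Fixed. The sketch below collects those rules into a pre-flight check. `check_fit_params` is a hypothetical helper, the `TIME_FORMAT` string is an assumed date-time layout, and the requirement that startTime precede endTime is an assumption of this sketch rather than something the reference states.

```python
from datetime import datetime

ALIGN_MODES = {"Inner", "Outer"}
FILL_NA_METHODS = {"Previous", "Subsequent", "Linear", "Zero", "Fixed", "NotFill"}
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed layout, e.g. 2021-01-01T00:00:00Z


def check_fit_params(startTime, endTime, alignMode=None, fillNAMethod=None,
                     paddingValue=None):
    """Return a list of problems found in the given parameter values."""
    problems = []
    try:
        start = datetime.strptime(startTime, TIME_FORMAT)
        end = datetime.strptime(endTime, TIME_FORMAT)
        if start >= end:  # assumption of this sketch
            problems.append("startTime must be earlier than endTime")
    except (TypeError, ValueError):
        problems.append("startTime and endTime are required date-time strings")
    if alignMode is not None and alignMode not in ALIGN_MODES:
        problems.append("alignMode must be Inner or Outer")
    if fillNAMethod is not None and fillNAMethod not in FILL_NA_METHODS:
        problems.append("unknown fillNAMethod: %s" % fillNAMethod)
    if alignMode == "Outer" and fillNAMethod == "NotFill":
        problems.append("fillNAMethod cannot be NotFill when alignMode is Outer")
    if paddingValue is not None and fillNAMethod != "Fixed":
        problems.append("paddingValue is only used when fillNAMethod is Fixed")
    return problems


problems = check_fit_params("2021-01-01T00:00:00Z", "2021-01-02T00:00:00Z",
                            alignMode="Outer", fillNAMethod="NotFill",
                            paddingValue=0)
print(problems)
```

Returning a list instead of raising on the first problem lets a caller report every conflicting setting at once before starting a training job.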
synapse.ml.cognitive.FormOntologyLearner module
- class synapse.ml.cognitive.FormOntologyLearner.FormOntologyLearner(java_obj=None, inputCol=None, outputCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaEstimator
- inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
synapse.ml.cognitive.FormOntologyTransformer module
- class synapse.ml.cognitive.FormOntologyTransformer.FormOntologyTransformer(java_obj=None, inputCol=None, ontology=None, outputCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaModel
- inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
- ontology = Param(parent='undefined', name='ontology', doc='The ontology to cast values to')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
synapse.ml.cognitive.GenerateThumbnails module
- class synapse.ml.cognitive.GenerateThumbnails.GenerateThumbnails(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='GenerateThumbnails_9f1b4132fded_error', handler=None, height=None, heightCol=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, outputCol='GenerateThumbnails_9f1b4132fded_output', smartCropping=None, smartCroppingCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, width=None, widthCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getSmartCropping()[source]
- Returns
whether to intelligently crop the image
- Return type
smartCropping
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- height = Param(parent='undefined', name='height', doc='ServiceParam: the desired height of the image')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='GenerateThumbnails_9f1b4132fded_error', handler=None, height=None, heightCol=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, outputCol='GenerateThumbnails_9f1b4132fded_output', smartCropping=None, smartCroppingCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, width=None, widthCol=None)[source]
Set the (keyword only) parameters
- setSmartCropping(value)[source]
- Parameters
smartCropping¶ – whether to intelligently crop the image
- setSmartCroppingCol(value)[source]
- Parameters
smartCropping¶ – whether to intelligently crop the image
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- smartCropping = Param(parent='undefined', name='smartCropping', doc='ServiceParam: whether to intelligently crop the image')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- width = Param(parent='undefined', name='width', doc='ServiceParam: the desired width of the image')
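Service parameters such as smartCropping come with paired setters (setSmartCropping and setSmartCroppingCol), which suggests that a parameter can be supplied either as one constant for every row or as the name of a column holding per-row values. The plain-Python sketch below illustrates that idea with dictionaries standing in for DataFrame rows. `resolve_service_param` is a hypothetical helper, the URLs are placeholders, and rejecting the case where both forms are set is a choice made for this sketch, not documented library behavior.

```python
def resolve_service_param(row, value=None, col=None):
    """Return the constant `value`, or the entry of `row` named by `col`."""
    if value is not None and col is not None:
        # Sketch-level choice: treat the two forms as alternatives.
        raise ValueError("set either a constant value or a column name")
    if col is not None:
        return row[col]
    return value


rows = [
    {"url": "https://example.com/a.jpg", "w": 64},
    {"url": "https://example.com/b.jpg", "w": 128},
]
request_params = [
    {
        "imageUrl": resolve_service_param(row, col="url"),        # per-row value
        "width": resolve_service_param(row, col="w"),             # per-row value
        "height": resolve_service_param(row, value=50),           # constant
        "smartCropping": resolve_service_param(row, value=True),  # constant
    }
    for row in rows
]
print(request_params)
```

Here the image URL and width vary by row while height and smartCropping are shared, which corresponds to mixing the plain and Col variants of the constructor arguments in one transformer.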
synapse.ml.cognitive.GetCustomModel module
- class synapse.ml.cognitive.GetCustomModel.GetCustomModel(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='GetCustomModel_e1706bab7c80_error', handler=None, includeKeys=None, includeKeysCol=None, modelId=None, modelIdCol=None, outputCol='GetCustomModel_e1706bab7c80_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
handler¶ (object) – Which strategy to use when handling requests
includeKeys¶ (object) – Include list of extracted keys in model information.
timeout¶ (float) – number of seconds to wait before closing the connection
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeKeys()[source]
- Returns
Include list of extracted keys in model information.
- Return type
includeKeys
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- includeKeys = Param(parent='undefined', name='includeKeys', doc='ServiceParam: Include list of extracted keys in model information.')
- modelId = Param(parent='undefined', name='modelId', doc='ServiceParam: Model identifier.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setIncludeKeys(value)[source]
- Parameters
includeKeys¶ – Include list of extracted keys in model information.
- setIncludeKeysCol(value)[source]
- Parameters
includeKeys¶ – Include list of extracted keys in model information.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='GetCustomModel_e1706bab7c80_error', handler=None, includeKeys=None, includeKeysCol=None, modelId=None, modelIdCol=None, outputCol='GetCustomModel_e1706bab7c80_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
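The concurrency, concurrentTimeout and errorCol parameters above describe a bounded number of in-flight calls, a cap on how long to wait for each pending result, and a place to record failures. The following is a minimal standard-library sketch of that pattern, not SynapseML's implementation; `call_all`, `slow` and their arguments are hypothetical names used only for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def call_all(tasks, concurrency=1, concurrent_timeout=None):
    """Run zero-argument callables with at most `concurrency` workers.

    Returns one (result, error) pair per task, in input order, so a failed
    call is recorded next to the successful ones instead of being raised.
    """
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(task) for task in tasks]
        for future in futures:
            try:
                # Wait at most `concurrent_timeout` seconds (None = no limit).
                results.append((future.result(timeout=concurrent_timeout), None))
            except FutureTimeout:
                results.append((None, "timed out"))
            except Exception as exc:
                results.append((None, str(exc)))
    return results


def slow():
    """Stand-in for a request that takes longer than the allowed wait."""
    time.sleep(0.5)
    return 2


print(call_all([lambda: 1, slow], concurrency=2, concurrent_timeout=0.1))
```

The sketch records a timeout as an error entry next to the successful result. How SynapseML itself surfaces a timed-out future is not stated in this reference.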
synapse.ml.cognitive.GroupFaces module
- class synapse.ml.cognitive.GroupFaces.GroupFaces(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='GroupFaces_e02dff3ff1fa_error', faceIds=None, faceIdsCol=None, handler=None, outputCol='GroupFaces_e02dff3ff1fa_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
faceIds¶ (object) – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
handler¶ (object) – Which strategy to use when handling requests
timeout¶ (float) – number of seconds to wait before closing the connection
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceIds = Param(parent='undefined', name='faceIds', doc='ServiceParam: Array of candidate faceId created by Face - Detect. The maximum is 1000 faces.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFaceIds()[source]
- Returns
Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
- Return type
faceIds
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setFaceIds(value)[source]
- Parameters
faceIds¶ – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
- setFaceIdsCol(value)[source]
- Parameters
faceIds¶ – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='GroupFaces_e02dff3ff1fa_error', faceIds=None, faceIdsCol=None, handler=None, outputCol='GroupFaces_e02dff3ff1fa_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
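Because faceIds accepts at most 1000 entries per call, a longer list has to be split before it is passed to setFaceIds or placed in the column named by setFaceIdsCol. A minimal sketch of that split follows; `chunk_face_ids` is a hypothetical helper, not part of SynapseML.

```python
def chunk_face_ids(face_ids, max_per_call=1000):
    """Split a list of faceIds into consecutive chunks of at most
    `max_per_call` entries (1000 is the documented GroupFaces limit)."""
    if max_per_call < 1:
        raise ValueError("max_per_call must be at least 1")
    return [
        face_ids[start:start + max_per_call]
        for start in range(0, len(face_ids), max_per_call)
    ]


ids = [f"face-{i}" for i in range(2500)]
print([len(chunk) for chunk in chunk_face_ids(ids)])
```

Faces that land in different chunks are grouped independently of each other, so this is only a reasonable workaround when grouping across chunks is not required.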
synapse.ml.cognitive.IdentifyFaces module
- class synapse.ml.cognitive.IdentifyFaces.IdentifyFaces(java_obj=None, concurrency=1, concurrentTimeout=None, confidenceThreshold=None, confidenceThresholdCol=None, errorCol='IdentifyFaces_6a78fb2b4255_error', faceIds=None, faceIdsCol=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, outputCol='IdentifyFaces_6a78fb2b4255_output', personGroupId=None, personGroupIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
confidenceThreshold¶ (object) – Optional parameter. Customized identification confidence threshold, in the range [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note that there is no guarantee this threshold value will work on other data or after algorithm updates.
faceIds¶ (object) – Array of faceIds of the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be between 1 and 10.
handler¶ (object) – Which strategy to use when handling requests
largePersonGroupId¶ (object) – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
maxNumOfCandidatesReturned¶ (object) – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
personGroupId¶ (object) – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
timeout¶ (float) – number of seconds to wait before closing the connection
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- confidenceThreshold = Param(parent='undefined', name='confidenceThreshold', doc='ServiceParam: Optional parameter.Customized identification confidence threshold, in the range of [0, 1].Advanced user can tweak this value to override defaultinternal threshold for better precision on their scenario data.Note there is no guarantee of this threshold value workingon other data and after algorithm updates.')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceIds = Param(parent='undefined', name='faceIds', doc='ServiceParam: Array of query faces faceIds, created by the Face - Detect. Each of the faces are identified independently. The valid number of faceIds is between [1, 10]. ')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getConfidenceThreshold()[source]
- Returns
Optional parameter. Customized identification confidence threshold, in the range [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note that there is no guarantee this threshold value will work on other data or after algorithm updates.
- Return type
confidenceThreshold
- getFaceIds()[source]
- Returns
Array of faceIds of the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be between 1 and 10.
- Return type
faceIds
- getLargePersonGroupId()[source]
- Returns
largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
largePersonGroupId
- getMaxNumOfCandidatesReturned()[source]
- Returns
The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
- Return type
maxNumOfCandidatesReturned
- getPersonGroupId()[source]
- Returns
personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
personGroupId
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- largePersonGroupId = Param(parent='undefined', name='largePersonGroupId', doc='ServiceParam: largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- maxNumOfCandidatesReturned = Param(parent='undefined', name='maxNumOfCandidatesReturned', doc='ServiceParam: The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- personGroupId = Param(parent='undefined', name='personGroupId', doc='ServiceParam: personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setConfidenceThreshold(value)[source]
- Parameters
confidenceThreshold¶ – Optional parameter. Customized identification confidence threshold, in the range [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note that there is no guarantee this threshold value will work on other data or after algorithm updates.
- setConfidenceThresholdCol(value)[source]
- Parameters
confidenceThreshold¶ – Optional parameter. Customized identification confidence threshold, in the range [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note that there is no guarantee this threshold value will work on other data or after algorithm updates.
- setFaceIds(value)[source]
- Parameters
faceIds¶ – Array of faceIds of the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be between 1 and 10.
- setFaceIdsCol(value)[source]
- Parameters
faceIds¶ – Array of faceIds of the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be between 1 and 10.
- setLargePersonGroupId(value)[source]
- Parameters
largePersonGroupId¶ – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setLargePersonGroupIdCol(value)[source]
- Parameters
largePersonGroupId¶ – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setMaxNumOfCandidatesReturned(value)[source]
- Parameters
maxNumOfCandidatesReturned¶ – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
- setMaxNumOfCandidatesReturnedCol(value)[source]
- Parameters
maxNumOfCandidatesReturned¶ – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
- setParams(concurrency=1, concurrentTimeout=None, confidenceThreshold=None, confidenceThresholdCol=None, errorCol='IdentifyFaces_6a78fb2b4255_error', faceIds=None, faceIdsCol=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, outputCol='IdentifyFaces_6a78fb2b4255_output', personGroupId=None, personGroupIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPersonGroupId(value)[source]
- Parameters
personGroupId¶ – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setPersonGroupIdCol(value)[source]
- Parameters
personGroupId¶ – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
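The IdentifyFaces parameters carry several documented constraints: 1 to 10 faceIds, a confidenceThreshold in [0, 1], maxNumOfCandidatesReturned between 1 and 100, and personGroupId and largePersonGroupId never set together. A minimal sketch of checking these before configuring the transformer follows. `check_identify_args` and `build_identify_faces` are hypothetical helpers; the builder uses only setters listed above and imports the class lazily so the checks can run without Spark installed.

```python
def check_identify_args(face_ids, person_group_id=None, large_person_group_id=None,
                        confidence_threshold=None, max_candidates=None):
    """Return a list of problems; an empty list means the arguments satisfy
    the constraints stated in the parameter descriptions."""
    problems = []
    if not 1 <= len(face_ids) <= 10:
        problems.append("faceIds must contain between 1 and 10 ids")
    if person_group_id is not None and large_person_group_id is not None:
        problems.append("personGroupId and largePersonGroupId are mutually exclusive")
    if confidence_threshold is not None and not 0.0 <= confidence_threshold <= 1.0:
        problems.append("confidenceThreshold must be in [0, 1]")
    if max_candidates is not None and not 1 <= max_candidates <= 100:
        problems.append("maxNumOfCandidatesReturned must be between 1 and 100")
    return problems


def build_identify_faces(person_group_id, face_ids_col="faceIds",
                         confidence_threshold=None, max_candidates=None):
    """Configure an IdentifyFaces stage that reads faceIds from a column.
    The column name `faceIds` is an assumption made for this sketch."""
    from synapse.ml.cognitive.IdentifyFaces import IdentifyFaces  # lazy import

    stage = IdentifyFaces()
    stage.setFaceIdsCol(face_ids_col)
    stage.setPersonGroupId(person_group_id)
    if confidence_threshold is not None:
        stage.setConfidenceThreshold(confidence_threshold)
    if max_candidates is not None:
        stage.setMaxNumOfCandidatesReturned(max_candidates)
    return stage


print(check_identify_args(["id-1"], person_group_id="staff"))
```

The subscription key and service URL still have to be supplied before the stage can call the service; they are left out here because their setters are not listed in this section.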
synapse.ml.cognitive.KeyPhraseExtractor module
- class synapse.ml.cognitive.KeyPhraseExtractor.KeyPhraseExtractor(java_obj=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='KeyPhraseExtractor_73a273e57ccc_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_73a273e57ccc_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language code of the text (optional for some services)
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout¶ (float) – number of seconds to wait before closing the connection
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setParams(batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='KeyPhraseExtractor_73a273e57ccc_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_73a273e57ccc_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
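batchSize (default 10) is documented only as the max size of the buffer. Assuming it caps how many rows are buffered into a single service request, which this reference does not state explicitly, the number of requests for a given row count can be estimated as below. `requests_needed` is a hypothetical helper.

```python
import math


def requests_needed(n_rows, batch_size=10):
    """Estimate how many batched requests cover `n_rows` input rows,
    assuming each request carries at most `batch_size` rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(n_rows / batch_size)


print(requests_needed(25))                  # default batchSize of 10
print(requests_needed(25, batch_size=5))
```

Under that assumption, raising batchSize lowers the request count but makes each request larger, so it trades call volume against payload size.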
synapse.ml.cognitive.LanguageDetector module
- class synapse.ml.cognitive.LanguageDetector.LanguageDetector(java_obj=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='LanguageDetector_735d92a332a6_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_735d92a332a6_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language code of the text (optional for some services)
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout¶ (float) – number of seconds to wait before closing the connection
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setParams(batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='LanguageDetector_735d92a332a6_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_735d92a332a6_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
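Service parameters in this module come with paired setters, for example setLanguage and setLanguageCol, or setShowStats and setShowStatsCol. In SynapseML the plain setter fixes one value for every row, while the Col variant names a DataFrame column from which a per-row value is read. The sketch below applies a configuration through that naming convention. `setter_name`, `configure` and `RecordingStage` are hypothetical and stand in for a real transformer so the example runs without Spark.

```python
def setter_name(param, from_column=False):
    """Map a parameter name to its setter, e.g. language -> setLanguage,
    or setLanguageCol when the value should come from a column."""
    return "set" + param[0].upper() + param[1:] + ("Col" if from_column else "")


def configure(stage, values=None, columns=None):
    """Apply fixed values and column bindings to a stage exposing such setters."""
    for name, value in (values or {}).items():
        getattr(stage, setter_name(name))(value)
    for name, column in (columns or {}).items():
        getattr(stage, setter_name(name, from_column=True))(column)
    return stage


class RecordingStage:
    """Stand-in for a transformer: records setter calls instead of using Spark."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, attr):
        # Only reached for attributes that do not exist, i.e. the setters.
        if not attr.startswith("set"):
            raise AttributeError(attr)

        def setter(value):
            self.calls.append((attr, value))
            return self

        return setter


stage = configure(RecordingStage(),
                  values={"showStats": True},
                  columns={"language": "lang"})
print(stage.calls)
```

With a real LanguageDetector in place of the stub, the same call would fix showStats for all rows and read the language hint from a column named `lang`, a column name assumed for this example.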
synapse.ml.cognitive.ListCustomModels module
- class synapse.ml.cognitive.ListCustomModels.ListCustomModels(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='ListCustomModels_5786f592e492_error', handler=None, op=None, opCol=None, outputCol='ListCustomModels_5786f592e492_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- op = Param(parent='undefined', name='op', doc='ServiceParam: Specify whether to return summary or full list of models.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='ListCustomModels_5786f592e492_error', handler=None, op=None, opCol=None, outputCol='ListCustomModels_5786f592e492_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.NER module
- class synapse.ml.cognitive.NER.NER(java_obj=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='NER_406eff705a20_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_406eff705a20_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language code of the text (optional for some services)
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout¶ (float) – number of seconds to wait before closing the connection
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setParams(batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='NER_406eff705a20_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_406eff705a20_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
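The stringIndexType parameter above controls the unit in which the service reports character offsets. The plain-Python sketch below illustrates why the choice matters: the same substring sits at a different offset depending on whether positions are counted in Unicode code points or in UTF-16 code units. The helper names are hypothetical and nothing here calls SynapseML or the service; grapheme counting (the documented default) needs a third-party library and is left out.

```python
def offset_in_code_points(text: str, needle: str) -> int:
    """Offset of needle counted in Unicode code points (Python's native str indexing)."""
    return text.index(needle)


def offset_in_utf16_units(text: str, needle: str) -> int:
    """Offset of needle counted in UTF-16 code units."""
    prefix = text[: text.index(needle)]
    # Each UTF-16 code unit is two bytes; characters outside the BMP take two units.
    return len(prefix.encode("utf-16-le")) // 2


# U+1F44D (thumbs up) lies outside the Basic Multilingual Plane.
sample = "I \U0001F44D Spark"
code_point_offset = offset_in_code_points(sample, "Spark")
utf16_offset = offset_in_utf16_units(sample, "Spark")
```

The two offsets differ by one because the emoji occupies a single code point but two UTF-16 code units, so a consumer slicing the text must use the same unit the request asked for.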
synapse.ml.cognitive.OCR module
- class synapse.ml.cognitive.OCR.OCR(java_obj=None, concurrency=1, concurrentTimeout=None, detectOrientation=None, detectOrientationCol=None, errorCol='OCR_3559c70e9ef8_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='OCR_3559c70e9ef8_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
detectOrientation¶ (object) – whether to detect image orientation prior to processing
handler¶ (object) – Which strategy to use when handling requests
timeout¶ (float) – number of seconds to wait before closing the connection
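The concurrency and concurrentTimeout parameters bound how many requests are in flight at once and how long to wait for each pending result. As a rough analogy only, the Python sketch below does the same with a thread pool. The function and argument names are hypothetical, and SynapseML's own JVM-side implementation is not shown in this reference and may differ.

```python
from concurrent.futures import ThreadPoolExecutor


def call_all(requests, send, concurrency=1, concurrent_timeout=None):
    """Send every request with at most `concurrency` calls running at once.

    `send` is any callable taking one request. Waiting on a result for longer
    than `concurrent_timeout` seconds raises concurrent.futures.TimeoutError;
    None means wait indefinitely.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send, request) for request in requests]
        # Results are collected in submission order, not completion order.
        return [future.result(timeout=concurrent_timeout) for future in futures]


doubled = call_all([1, 2, 3], lambda x: x * 2, concurrency=2, concurrent_timeout=5.0)
```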
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- detectOrientation = Param(parent='undefined', name='detectOrientation', doc='ServiceParam: whether to detect image orientation prior to processing')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getDetectOrientation()[source]
- Returns
whether to detect image orientation prior to processing
- Return type
detectOrientation
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language to use')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setDetectOrientation(value)[source]
- Parameters
detectOrientation¶ – whether to detect image orientation prior to processing
- setDetectOrientationCol(value)[source]
- Parameters
detectOrientation¶ – whether to detect image orientation prior to processing
- setParams(concurrency=1, concurrentTimeout=None, detectOrientation=None, detectOrientationCol=None, errorCol='OCR_3559c70e9ef8_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='OCR_3559c70e9ef8_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.OpenAICompletion module
- class synapse.ml.cognitive.OpenAICompletion.OpenAICompletion(java_obj=None, apiVersion=None, apiVersionCol=None, batchIndexPrompt=None, batchIndexPromptCol=None, batchPrompt=None, batchPromptCol=None, bestOf=None, bestOfCol=None, cacheLevel=None, cacheLevelCol=None, concurrency=1, concurrentTimeout=None, deploymentName=None, deploymentNameCol=None, echo=None, echoCol=None, errorCol='OpenAPICompletion_b1144dc56e90_error', frequencyPenalty=None, frequencyPenaltyCol=None, handler=None, indexPrompt=None, indexPromptCol=None, logProbs=None, logProbsCol=None, maxTokens=None, maxTokensCol=None, model=None, modelCol=None, n=None, nCol=None, outputCol='OpenAPICompletion_b1144dc56e90_output', presencePenalty=None, presencePenaltyCol=None, prompt=None, promptCol=None, stop=None, stopCol=None, subscriptionKey=None, subscriptionKeyCol=None, temperature=None, temperatureCol=None, timeout=60.0, topP=None, topPCol=None, url=None, user=None, userCol=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
batchIndexPrompt¶ (object) – Sequence of index sequences to complete
bestOf¶ (object) – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
cacheLevel¶ (object) – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
echo¶ (object) – Echo back the prompt in addition to the completion
frequencyPenalty¶ (object) – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model repeating the same line verbatim.
handler¶ (object) – Which strategy to use when handling requests
logProbs¶ (object) – Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. For example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
maxTokens¶ (object) – The maximum number of tokens to generate. Has minimum of 0.
n¶ (object) – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
presencePenalty¶ (object) – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model talking about new topics. Has minimum of -2 and maximum of 2.
stop¶ (object) – A sequence which indicates the end of the current document.
temperature¶ (object) – What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
timeout¶ (float) – number of seconds to wait before closing the connection
topP¶ (object) – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
user¶ (object) – The ID of the end-user, for use in tracking and rate-limiting.
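The temperature and topP descriptions above can be made concrete with a small, library-free sketch. It is an illustration of the general techniques (temperature-scaled softmax and nucleus filtering), not the service's implementation; the function names and the toy logits are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature 0 is treated as argmax."""
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


probs = softmax_with_temperature([2.0, 1.0, 0.0], 1.0)
kept = nucleus_filter(probs, 0.9)  # token indices that survive the top_p cut
```

Raising the temperature flattens the distribution before sampling, while lowering top_p discards the unlikely tail outright, which is one way to read the advice to adjust one of the two rather than both.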
- apiVersion = Param(parent='undefined', name='apiVersion', doc='ServiceParam: version of the api')
- batchIndexPrompt = Param(parent='undefined', name='batchIndexPrompt', doc='ServiceParam: Sequence of index sequences to complete')
- batchPrompt = Param(parent='undefined', name='batchPrompt', doc='ServiceParam: Sequence of prompts to complete')
- bestOf = Param(parent='undefined', name='bestOf', doc='ServiceParam: How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.')
- cacheLevel = Param(parent='undefined', name='cacheLevel', doc='ServiceParam: can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- deploymentName = Param(parent='undefined', name='deploymentName', doc='ServiceParam: The name of the deployment')
- echo = Param(parent='undefined', name='echo', doc='ServiceParam: Echo back the prompt in addition to the completion')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- frequencyPenalty = Param(parent='undefined', name='frequencyPenalty', doc='ServiceParam: How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model repeating the same line verbatim.')
- getBatchIndexPrompt()[source]
- Returns
Sequence of index sequences to complete
- Return type
batchIndexPrompt
- getBestOf()[source]
- Returns
How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- Return type
bestOf
- getCacheLevel()[source]
- Returns
can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- Return type
cacheLevel
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFrequencyPenalty()[source]
- Returns
How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model repeating the same line verbatim.
- Return type
frequencyPenalty
- getLogProbs()[source]
- Returns
Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. For example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- Return type
logProbs
- getMaxTokens()[source]
- Returns
The maximum number of tokens to generate. Has minimum of 0.
- Return type
maxTokens
- getN()[source]
- Returns
How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- Return type
n
- getPresencePenalty()[source]
- Returns
How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model talking about new topics. Has minimum of -2 and maximum of 2.
- Return type
presencePenalty
- getStop()[source]
- Returns
A sequence which indicates the end of the current document.
- Return type
stop
- getTemperature()[source]
- Returns
What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- Return type
temperature
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getTopP()[source]
- Returns
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- Return type
topP
- getUser()[source]
- Returns
The ID of the end-user, for use in tracking and rate-limiting.
- Return type
user
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- indexPrompt = Param(parent='undefined', name='indexPrompt', doc='ServiceParam: Sequence of indexes to complete')
- logProbs = Param(parent='undefined', name='logProbs', doc='ServiceParam: Include the log probabilities on the `logprobs` most likely tokens, as well as the chosen tokens. For example, if `logprobs` is 10, the API will return a list of the 10 most likely tokens. If `logprobs` is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.')
- maxTokens = Param(parent='undefined', name='maxTokens', doc='ServiceParam: The maximum number of tokens to generate. Has minimum of 0.')
- model = Param(parent='undefined', name='model', doc='ServiceParam: The name of the model to use')
- n = Param(parent='undefined', name='n', doc='ServiceParam: How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- presencePenalty = Param(parent='undefined', name='presencePenalty', doc='ServiceParam: How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model talking about new topics. Has minimum of -2 and maximum of 2.')
- prompt = Param(parent='undefined', name='prompt', doc='ServiceParam: The text to complete')
- setBatchIndexPrompt(value)[source]
- Parameters
batchIndexPrompt¶ – Sequence of index sequences to complete
- setBatchIndexPromptCol(value)[source]
- Parameters
batchIndexPrompt¶ – Sequence of index sequences to complete
- setBestOf(value)[source]
- Parameters
bestOf¶ – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- setBestOfCol(value)[source]
- Parameters
bestOf¶ – How many generations to create server side, and display only the best. Will not stream intermediate progress if best_of > 1. Has maximum value of 128.
- setCacheLevel(value)[source]
- Parameters
cacheLevel¶ – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- setCacheLevelCol(value)[source]
- Parameters
cacheLevel¶ – can be used to disable any server-side caching, 0=no cache, 1=prompt prefix enabled, 2=full cache
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setFrequencyPenalty(value)[source]
- Parameters
frequencyPenalty¶ – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model repeating the same line verbatim.
- setFrequencyPenaltyCol(value)[source]
- Parameters
frequencyPenalty¶ – How much to penalize new tokens based on their existing frequency in the text so far. Decreases the likelihood of the model repeating the same line verbatim.
- setLogProbs(value)[source]
- Parameters
logProbs¶ – Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. For example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- setLogProbsCol(value)[source]
- Parameters
logProbs¶ – Include the log probabilities on the logprobs most likely tokens, as well as the chosen tokens. For example, if logprobs is 10, the API will return a list of the 10 most likely tokens. If logprobs is 0, only the chosen tokens will have logprobs returned. Minimum of 0 and maximum of 100 allowed.
- setMaxTokens(value)[source]
- Parameters
maxTokens¶ – The maximum number of tokens to generate. Has minimum of 0.
- setMaxTokensCol(value)[source]
- Parameters
maxTokens¶ – The maximum number of tokens to generate. Has minimum of 0.
- setN(value)[source]
- Parameters
n¶ – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- setNCol(value)[source]
- Parameters
n¶ – How many snippets to generate for each prompt. Minimum of 1 and maximum of 128 allowed.
- setParams(apiVersion=None, apiVersionCol=None, batchIndexPrompt=None, batchIndexPromptCol=None, batchPrompt=None, batchPromptCol=None, bestOf=None, bestOfCol=None, cacheLevel=None, cacheLevelCol=None, concurrency=1, concurrentTimeout=None, deploymentName=None, deploymentNameCol=None, echo=None, echoCol=None, errorCol='OpenAPICompletion_b1144dc56e90_error', frequencyPenalty=None, frequencyPenaltyCol=None, handler=None, indexPrompt=None, indexPromptCol=None, logProbs=None, logProbsCol=None, maxTokens=None, maxTokensCol=None, model=None, modelCol=None, n=None, nCol=None, outputCol='OpenAPICompletion_b1144dc56e90_output', presencePenalty=None, presencePenaltyCol=None, prompt=None, promptCol=None, stop=None, stopCol=None, subscriptionKey=None, subscriptionKeyCol=None, temperature=None, temperatureCol=None, timeout=60.0, topP=None, topPCol=None, url=None, user=None, userCol=None)[source]
Set the (keyword only) parameters
- setPresencePenalty(value)[source]
- Parameters
presencePenalty¶ – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model talking about new topics. Has minimum of -2 and maximum of 2.
- setPresencePenaltyCol(value)[source]
- Parameters
presencePenalty¶ – How much to penalize new tokens based on whether they appear in the text so far. Increases the likelihood of the model talking about new topics. Has minimum of -2 and maximum of 2.
- setStop(value)[source]
- Parameters
stop¶ – A sequence which indicates the end of the current document.
- setStopCol(value)[source]
- Parameters
stop¶ – A sequence which indicates the end of the current document.
- setTemperature(value)[source]
- Parameters
temperature¶ – What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- setTemperatureCol(value)[source]
- Parameters
temperature¶ – What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or top_p but not both. Minimum of 0 and maximum of 2 allowed.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- setTopP(value)[source]
- Parameters
topP¶ – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- setTopPCol(value)[source]
- Parameters
topP¶ – An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or temperature but not both. Minimum of 0 and maximum of 1 allowed.
- setUser(value)[source]
- Parameters
user¶ – The ID of the end-user, for use in tracking and rate-limiting.
- setUserCol(value)[source]
- Parameters
user¶ – The ID of the end-user, for use in tracking and rate-limiting.
- stop = Param(parent='undefined', name='stop', doc='ServiceParam: A sequence which indicates the end of the current document.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- temperature = Param(parent='undefined', name='temperature', doc='ServiceParam: What sampling temperature to use. Higher values mean the model will take more risks. Try 0.9 for more creative applications, and 0 (argmax sampling) for ones with a well-defined answer. We generally recommend using this or `top_p` but not both. Minimum of 0 and maximum of 2 allowed.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- topP = Param(parent='undefined', name='topP', doc='ServiceParam: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10 percent probability mass are considered. We generally recommend using this or `temperature` but not both. Minimum of 0 and maximum of 1 allowed.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- user = Param(parent='undefined', name='user', doc='ServiceParam: The ID of the end-user, for use in tracking and rate-limiting.')
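The frequencyPenalty and presencePenalty parameters of OpenAICompletion both lower the scores of tokens that have already been generated, but in different ways. The sketch below follows the adjustment formula given in OpenAI's public parameter notes: the frequency penalty is subtracted once per prior occurrence, while the presence penalty is subtracted once if the token has occurred at all. Whether the deployment behind this transformer applies exactly this formula is an assumption, and the function name and toy values are hypothetical.

```python
from collections import Counter


def apply_penalties(logits, generated, frequency_penalty=0.0, presence_penalty=0.0):
    """Return a copy of `logits` (token -> score) adjusted for tokens in `generated`."""
    counts = Counter(generated)
    adjusted = {}
    for token, logit in logits.items():
        seen = counts[token]  # 0 for tokens that have not appeared yet
        adjusted[token] = (
            logit
            - seen * frequency_penalty                      # grows with each repeat
            - (1.0 if seen > 0 else 0.0) * presence_penalty  # flat, one-time cost
        )
    return adjusted


scores = apply_penalties(
    {"a": 1.0, "b": 1.0, "c": 1.0},
    generated=["a", "a", "b"],
    frequency_penalty=0.5,
    presence_penalty=0.25,
)
```

In this toy run the token seen twice is pushed down the most, the token seen once less so, and the unseen token is untouched.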
synapse.ml.cognitive.PII module
- class synapse.ml.cognitive.PII.PII(java_obj=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, domain=None, domainCol=None, errorCol='PII_7590c0d8ae00_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_7590c0d8ae00_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
domain¶ (object) – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language code of the text (optional for some services)
piiCategories¶ (object) – describes the PII categories to return
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout¶ (float) – number of seconds to wait before closing the connection
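Most service parameters of PII come in pairs: a plain keyword (for example text) for a fixed value and a Col keyword (for example textCol) naming a DataFrame column. The sketch below validates a set of keyword arguments against the names in the constructor signature above before building the transformer. The helper names are hypothetical, and rejecting a value together with its column is a choice made by this sketch, not documented library behavior. The import is deferred so the checks run without SynapseML installed.

```python
# Keyword names copied from the PII constructor signature documented above.
PII_KEYWORDS = frozenset({
    "batchSize", "concurrency", "concurrentTimeout",
    "disableServiceLogs", "disableServiceLogsCol", "domain", "domainCol",
    "errorCol", "handler", "language", "languageCol",
    "modelVersion", "modelVersionCol", "outputCol",
    "piiCategories", "piiCategoriesCol", "showStats", "showStatsCol",
    "stringIndexType", "stringIndexTypeCol",
    "subscriptionKey", "subscriptionKeyCol", "text", "textCol",
    "timeout", "url",
})


def check_pii_kwargs(**kwargs):
    """Reject undocumented keywords and value/column pairs given together."""
    unknown = sorted(set(kwargs) - PII_KEYWORDS)
    if unknown:
        raise TypeError(f"not a documented PII keyword: {unknown}")
    clashes = sorted(name for name in kwargs if name + "Col" in kwargs)
    if clashes:
        raise ValueError(f"set either the value or the column, not both: {clashes}")
    return kwargs


def make_pii(**kwargs):
    """Build a PII transformer; requires SynapseML and a running Spark session."""
    from synapse.ml.cognitive.PII import PII  # deferred import
    return PII(**check_pii_kwargs(**kwargs))


settings = check_pii_kwargs(textCol="text", language="en", domain="PHI", outputCol="pii")
```

With SynapseML available, make_pii(**settings) would pass the same keywords to the constructor; the subscription key and service URL still have to be supplied.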
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- domain = Param(parent='undefined', name='domain', doc="ServiceParam: if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: 'PHI', 'none'.")
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getDomain()[source]
- Returns
if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- Return type
domain
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getPiiCategories()[source]
- Returns
describes the PII categories to return
- Return type
piiCategories
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- piiCategories = Param(parent='undefined', name='piiCategories', doc='ServiceParam: describes the PII categories to return')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setDomain(value)[source]
- Parameters
domain¶ – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- setDomainCol(value)[source]
- Parameters
domain¶ – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- setLanguage(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setParams(batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, domain=None, domainCol=None, errorCol='PII_7590c0d8ae00_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_7590c0d8ae00_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPiiCategoriesCol(value)[source]
- Parameters
piiCategories¶ – describes the PII categories to return
- setShowStats(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.ReadImage module
- class synapse.ml.cognitive.ReadImage.ReadImage(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='ReadImage_1656c93135c9_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, outputCol='ReadImage_1656c93135c9_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
language¶ (object) – The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
pollingDelay¶ (int) – number of milliseconds to wait between polling
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout¶ (float) – number of seconds to wait before closing the connection
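The polling parameters of ReadImage together determine how long a row can wait for an asynchronous result. The arithmetic below gives a rough upper bound, assuming one wait of pollingDelay per retry after the initial delay and ignoring request latency; the exact scheduling inside SynapseML is not described in this reference, so treat the formula and the function name as assumptions. Defaults are taken from the constructor signature above.

```python
def max_polling_wait_ms(initial_polling_delay=300, polling_delay=300, max_polling_retries=1000):
    """Rough upper bound, in milliseconds, on time spent waiting between polls."""
    return initial_polling_delay + max_polling_retries * polling_delay


default_budget_ms = max_polling_wait_ms()
default_budget_s = default_budget_ms / 1000  # convert to seconds for comparison with timeout
```

Under these assumptions the defaults allow roughly five minutes of polling per row, which is a useful figure to compare against job-level time limits when lowering maxPollingRetries or setting suppressMaxRetriesException.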
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getLanguage()[source]
- Returns
The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- Return type
language
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- language = Param(parent='undefined', name='language', doc='ServiceParam: The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setLanguage(value)[source]
- Parameters
language¶ – The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- setLanguageCol(value)[source]
- Parameters
language¶ – The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='ReadImage_1656c93135c9_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, language=None, languageCol=None, maxPollingRetries=1000, outputCol='ReadImage_1656c93135c9_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
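As a minimal sketch of the language setting described above, the helper below checks a code against the seven languages listed in the parameter description and returns None when no language should be forced, so that auto language identification stays in effect. The names resolve_read_language and SUPPORTED_READ_LANGUAGES are hypothetical and are not part of synapse.ml.

```python
from typing import Optional

# Language codes listed in the `language` parameter description above.
SUPPORTED_READ_LANGUAGES = {"en", "nl", "fr", "de", "it", "pt", "es"}


def resolve_read_language(code: Optional[str]) -> Optional[str]:
    """Return a normalized code to force, or None to keep auto-identification.

    Hypothetical helper for illustration only; not part of synapse.ml.
    """
    if code is None or not code.strip():
        return None  # nothing to force, let the service identify the language
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_READ_LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return normalized
```

When the result is not None, it could be passed to setLanguage(value); otherwise the parameter can simply be left unset.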
synapse.ml.cognitive.RecognizeDomainSpecificContent module
- class synapse.ml.cognitive.RecognizeDomainSpecificContent.RecognizeDomainSpecificContent(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='RecognizeDomainSpecificContent_0e978f84f80f_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, model=None, modelCol=None, outputCol='RecognizeDomainSpecificContent_0e978f84f80f_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- model = Param(parent='undefined', name='model', doc='ServiceParam: the domain specific model: celebrities, landmarks')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='RecognizeDomainSpecificContent_0e978f84f80f_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, model=None, modelCol=None, outputCol='RecognizeDomainSpecificContent_0e978f84f80f_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.RecognizeText module
- class synapse.ml.cognitive.RecognizeText.RecognizeText(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='RecognizeText_02d7c99a7c0d_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, maxPollingRetries=1000, mode=None, modeCol=None, outputCol='RecognizeText_02d7c99a7c0d_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
mode¶ (object) – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
pollingDelay¶ (int) – number of milliseconds to wait between polling
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
timeout¶ (float) – number of seconds to wait before closing the connection
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getMode()[source]
- Returns
If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- Return type
mode
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- mode = Param(parent='undefined', name='mode', doc="ServiceParam: If this parameter is set to 'Printed', printed text recognition is performed. If 'Handwritten' is specified, handwriting recognition is performed")
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setMode(value)[source]
- Parameters
mode¶ – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- setModeCol(value)[source]
- Parameters
mode¶ – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='RecognizeText_02d7c99a7c0d_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, initialPollingDelay=300, maxPollingRetries=1000, mode=None, modeCol=None, outputCol='RecognizeText_02d7c99a7c0d_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
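The polling parameters above (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesException) can be read as describing a loop like the one sketched below. This is an illustration of one plausible reading, written in plain Python with the defaults taken from the class signature; the function poll_for_result and its arguments are hypothetical, and the actual handler in synapse.ml may behave differently.

```python
import time
from typing import Callable, Optional, Tuple


def poll_for_result(
    check: Callable[[], Optional[str]],
    initial_polling_delay_ms: int = 300,
    polling_delay_ms: int = 300,
    max_polling_retries: int = 1000,
    suppress_max_retries_exception: bool = False,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Optional[str], Optional[str]]:
    """Poll `check` until it returns a result or the retries run out.

    Returns an (output, error) pair, mirroring separate output and error columns.
    Hypothetical sketch; not the synapse.ml implementation.
    """
    sleep(initial_polling_delay_ms / 1000.0)  # wait before the first poll
    for attempt in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
        if attempt < max_polling_retries - 1:
            sleep(polling_delay_ms / 1000.0)  # wait between polls
    message = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exception:
        return None, message  # report the failure instead of raising
    raise TimeoutError(message)
```

Passing a custom sleep callable keeps the sketch easy to exercise without real waiting.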
synapse.ml.cognitive.SimpleDetectAnomalies module
- class synapse.ml.cognitive.SimpleDetectAnomalies.SimpleDetectAnomalies(java_obj=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='SimpleDetectAnomalies_5af36418b784_error', granularity=None, granularityCol=None, groupbyCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='SimpleDetectAnomalies_5af36418b784_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, timestampCol='timestamp', url=None, valueCol='value')[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval¶ (object) – Custom Interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
granularity¶ (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler¶ (object) – Which strategy to use when handling requests
imputeFixedValue¶ (object) – Optional argument, fixed value to use when imputeMode is set to “fixed”
imputeMode¶ (object) – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
maxAnomalyRatio¶ (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
period¶ (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity¶ (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted
series¶ (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message will be returned.
timeout¶ (float) – number of seconds to wait before closing the connection
timestampCol¶ (str) – column representing the time of the series
valueCol¶ (str) – column representing the value of the series
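To make the relationship between granularity and customInterval concrete, the sketch below computes the expected spacing between points and lists the places where a series deviates from it. It assumes, following the 5-minute example above, that customInterval multiplies the granularity unit. The names expected_step and find_gaps are hypothetical, and monthly and yearly are omitted because their lengths vary.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Fixed-length granularities only; monthly and yearly are left out of this sketch.
GRANULARITY_STEP = {
    "minutely": timedelta(minutes=1),
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def expected_step(granularity: str, custom_interval: Optional[int] = None) -> timedelta:
    """Spacing implied by granularity, scaled by customInterval when given."""
    return GRANULARITY_STEP[granularity] * (custom_interval or 1)


def find_gaps(
    timestamps: List[datetime], granularity: str, custom_interval: Optional[int] = None
) -> List[Tuple[datetime, datetime]]:
    """Return consecutive pairs whose spacing differs from the expected step."""
    step = expected_step(granularity, custom_interval)
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a != step]
```

A non-empty result does not necessarily mean the request will fail, since missing points may be handled through imputeMode; the check only shows where the series departs from the declared interval.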
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc='ServiceParam: Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomInterval()[source]
- Returns
Custom Interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getImputeFixedValue()[source]
- Returns
Optional argument, fixed value to use when imputeMode is set to “fixed”
- Return type
imputeFixedValue
- getImputeMode()[source]
- Returns
Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- Return type
imputeMode
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message will be returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getTimestampCol()[source]
- Returns
column representing the time of the series
- Return type
timestampCol
- granularity = Param(parent='undefined', name='granularity', doc='ServiceParam: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- groupbyCol = Param(parent='undefined', name='groupbyCol', doc='column that groups the series')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imputeFixedValue = Param(parent='undefined', name='imputeFixedValue', doc='ServiceParam: Optional argument, fixed value to use when imputeMode is set to "fixed" ')
- imputeMode = Param(parent='undefined', name='imputeMode', doc='ServiceParam: Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill ')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc='ServiceParam: Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc='ServiceParam: Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc='ServiceParam: Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc='ServiceParam: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setCustomInterval(value)[source]
- Parameters
customInterval¶ – Custom Interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval¶ – Custom Interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity¶ – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity¶ – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setImputeFixedValue(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeFixedValueCol(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeMode(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setImputeModeCol(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='SimpleDetectAnomalies_5af36418b784_error', granularity=None, granularityCol=None, groupbyCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='SimpleDetectAnomalies_5af36418b784_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, timestampCol='timestamp', url=None, valueCol='value')[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setPeriodCol(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted
- setSensitivityCol(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted
- setSeries(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message will be returned.
- setSeriesCol(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message will be returned.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- setTimestampCol(value)[source]
- Parameters
timestampCol¶ – column representing the time of the series
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- timestampCol = Param(parent='undefined', name='timestampCol', doc='column representing the time of the series')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- valueCol = Param(parent='undefined', name='valueCol', doc='column representing the value of the series')
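Because the series description requires points sorted by timestamp in ascending order with no duplicated timestamps, it can be useful to check this locally before calling the service. The sketch below is a hypothetical helper, not part of synapse.ml. It assumes each point is a dict keyed by the default column names 'timestamp' and 'value', and that timestamps are ISO-8601 strings in one consistent format and time zone, so that string order matches time order.

```python
from typing import Any, Dict, List


def validate_series(points: List[Dict[str, Any]], timestamp_key: str = "timestamp") -> None:
    """Raise ValueError unless timestamps are strictly ascending.

    Hypothetical pre-check for illustration; the service performs its own validation.
    """
    stamps = [point[timestamp_key] for point in points]
    for previous, current in zip(stamps, stamps[1:]):
        if current == previous:
            raise ValueError(f"duplicated timestamp: {current}")
        if current < previous:
            raise ValueError(f"timestamps not in ascending order: {previous} then {current}")
```

Sorting the input by timestamp and dropping duplicates before building the series would be one way to satisfy this check.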
synapse.ml.cognitive.SpeakerEmotionInference module
- class synapse.ml.cognitive.SpeakerEmotionInference.SpeakerEmotionInference(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='SpeakerEmotionInference_8449441ed5f4_error', handler=None, locale=None, localeCol=None, outputCol='SpeakerEmotionInference_8449441ed5f4_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None, voiceName=None, voiceNameCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: The locale of the input text')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='SpeakerEmotionInference_8449441ed5f4_error', handler=None, locale=None, localeCol=None, outputCol='SpeakerEmotionInference_8449441ed5f4_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None, voiceName=None, voiceNameCol=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: The text to annotate with inferred emotion')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- voiceName = Param(parent='undefined', name='voiceName', doc='ServiceParam: The name of the voice used for synthesis')
synapse.ml.cognitive.SpeechToText module
- class synapse.ml.cognitive.SpeechToText.SpeechToText(java_obj=None, audioData=None, audioDataCol=None, concurrency=1, concurrentTimeout=None, errorCol='SpeechToText_9074b1ccd9fe_error', format=None, formatCol=None, handler=None, language=None, languageCol=None, outputCol='SpeechToText_9074b1ccd9fe_output', profanity=None, profanityCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
audioData¶ (object) – The data sent to the service must be a .wav file
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
format¶ (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – Identifies the spoken language that is being recognized.
profanity¶ (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
timeout¶ (float) – number of seconds to wait before closing the connection
- audioData = Param(parent='undefined', name='audioData', doc='ServiceParam: The data sent to the service must be a .wav files ')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- format = Param(parent='undefined', name='format', doc='ServiceParam: Specifies the result format. Accepted values are simple and detailed. Default is simple. ')
- getAudioData()[source]
- Returns
The data sent to the service must be a .wav file
- Return type
audioData
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFormat()[source]
- Returns
Specifies the result format. Accepted values are simple and detailed. Default is simple.
- Return type
format
- getLanguage()[source]
- Returns
Identifies the spoken language that is being recognized.
- Return type
language
- getProfanity()[source]
- Returns
Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- Return type
profanity
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Identifies the spoken language that is being recognized. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- profanity = Param(parent='undefined', name='profanity', doc='ServiceParam: Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which remove all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked. ')
- setAudioData(value)[source]
- Parameters
audioData¶ – The data sent to the service must be a .wav file
- setAudioDataCol(value)[source]
- Parameters
audioData¶ – The data sent to the service must be a .wav file
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setFormat(value)[source]
- Parameters
format¶ – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setFormatCol(value)[source]
- Parameters
format¶ – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setLanguage(value)[source]
- Parameters
language¶ – Identifies the spoken language that is being recognized.
- setLanguageCol(value)[source]
- Parameters
language¶ – Identifies the spoken language that is being recognized.
- setParams(audioData=None, audioDataCol=None, concurrency=1, concurrentTimeout=None, errorCol='SpeechToText_9074b1ccd9fe_error', format=None, formatCol=None, handler=None, language=None, languageCol=None, outputCol='SpeechToText_9074b1ccd9fe_output', profanity=None, profanityCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setProfanity(value)[source]
- Parameters
profanity¶ – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setProfanityCol(value)[source]
- Parameters
profanity¶ – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
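The three profanity settings described above (masked, removed, raw) can be illustrated with a small local sketch. This is not how the speech service is implemented; the function apply_profanity_mode, the flagged word set, and the choice of one asterisk per character are all assumptions made for the example.

```python
from typing import Set


def apply_profanity_mode(text: str, flagged: Set[str], mode: str = "masked") -> str:
    """Illustrate masked / removed / raw on a whitespace-tokenized transcript.

    Hypothetical sketch; the service applies its own profanity handling.
    """
    if mode == "raw":
        return text  # profanity is left in the result
    if mode not in ("masked", "removed"):
        raise ValueError(f"unknown profanity mode: {mode!r}")
    words = []
    for word in text.split():
        if word.lower() in flagged:
            if mode == "masked":
                words.append("*" * len(word))  # replace with asterisks
            # mode == "removed": drop the word entirely
        else:
            words.append(word)
    return " ".join(words)
```

With flagged = {"darn"}, the default masked mode turns "that darn cat" into "that **** cat", while removed yields "that cat".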
synapse.ml.cognitive.SpeechToTextSDK module
- class synapse.ml.cognitive.SpeechToTextSDK.SpeechToTextSDK(java_obj=None, audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
audioDataCol¶ (str) – Column holding audio data, must be either ByteArrays or Strings representing file URIs
extraFfmpegArgs¶ (list) – extra arguments for ffmpeg output decoding
fileType¶ (object) – The file type of the sound files, supported types: wav, ogg, mp3
format¶ (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
language¶ (object) – Identifies the spoken language that is being recognized.
participantsJson¶ (object) – a json representation of a list of conversation participants (email, language, user)
profanity¶ (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
recordAudioData¶ (bool) – Whether to record audio data to a file location, for use only with m3u8 streams
recordedFileNameCol¶ (str) – Column holding file names to write audio data to if recordAudioData is set to true
streamIntermediateResults¶ (bool) – Whether or not to immediately return intermediate results, or group in a sequence
- audioDataCol = Param(parent='undefined', name='audioDataCol', doc='Column holding audio data, must be either ByteArrays or Strings representing file URIs')
- endpointId = Param(parent='undefined', name='endpointId', doc='endpoint for custom speech models')
- extraFfmpegArgs = Param(parent='undefined', name='extraFfmpegArgs', doc='extra arguments for ffmpeg output decoding')
- fileType = Param(parent='undefined', name='fileType', doc='ServiceParam: The file type of the sound files, supported types: wav, ogg, mp3')
- format = Param(parent='undefined', name='format', doc='ServiceParam: Specifies the result format. Accepted values are simple and detailed. Default is simple. ')
- getAudioDataCol()[source]
- Returns
Column holding audio data, must be either ByteArrays or Strings representing file URIs
- Return type
audioDataCol
- getExtraFfmpegArgs()[source]
- Returns
extra arguments for ffmpeg output decoding
- Return type
extraFfmpegArgs
- getFileType()[source]
- Returns
The file type of the sound files, supported types: wav, ogg, mp3
- Return type
fileType
- getFormat()[source]
- Returns
Specifies the result format. Accepted values are simple and detailed. Default is simple.
- Return type
format
- getLanguage()[source]
- Returns
Identifies the spoken language that is being recognized.
- Return type
language
- getParticipantsJson()[source]
- Returns
a json representation of a list of conversation participants (email, language, user)
- Return type
participantsJson
- getProfanity()[source]
- Returns
Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- Return type
profanity
- getRecordAudioData()[source]
- Returns
Whether to record audio data to a file location, for use only with m3u8 streams
- Return type
recordAudioData
- getRecordedFileNameCol()[source]
- Returns
Column holding file names to write audio data to if recordAudioData is set to true
- Return type
recordedFileNameCol
- getStreamIntermediateResults()[source]
- Returns
Whether or not to immediately return intermediate results, or group in a sequence
- Return type
streamIntermediateResults
- language = Param(parent='undefined', name='language', doc='ServiceParam: Identifies the spoken language that is being recognized. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- participantsJson = Param(parent='undefined', name='participantsJson', doc='ServiceParam: a json representation of a list of conversation participants (email, language, user)')
- profanity = Param(parent='undefined', name='profanity', doc='ServiceParam: Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked. ')
- recordAudioData = Param(parent='undefined', name='recordAudioData', doc='Whether to record audio data to a file location, for use only with m3u8 streams')
- recordedFileNameCol = Param(parent='undefined', name='recordedFileNameCol', doc='Column holding file names to write audio data to if recordAudioData is set to true')
- setAudioDataCol(value)[source]
- Parameters
audioDataCol¶ – Column holding audio data, must be either ByteArrays or Strings representing file URIs
- setExtraFfmpegArgs(value)[source]
- Parameters
extraFfmpegArgs¶ – extra arguments for ffmpeg output decoding
- setFileType(value)[source]
- Parameters
fileType¶ – The file type of the sound files, supported types: wav, ogg, mp3
- setFileTypeCol(value)[source]
- Parameters
fileType¶ – The file type of the sound files, supported types: wav, ogg, mp3
- setFormat(value)[source]
- Parameters
format¶ – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setFormatCol(value)[source]
- Parameters
format¶ – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setLanguage(value)[source]
- Parameters
language¶ – Identifies the spoken language that is being recognized.
- setLanguageCol(value)[source]
- Parameters
language¶ – Identifies the spoken language that is being recognized.
- setParams(audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Set the (keyword only) parameters
- setParticipantsJson(value)[source]
- Parameters
participantsJson¶ – a json representation of a list of conversation participants (email, language, user)
- setParticipantsJsonCol(value)[source]
- Parameters
participantsJson¶ – a json representation of a list of conversation participants (email, language, user)
- setProfanity(value)[source]
- Parameters
profanity¶ – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- setProfanityCol(value)[source]
- Parameters
profanity¶ – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- setRecordAudioData(value)[source]
- Parameters
recordAudioData¶ – Whether to record audio data to a file location, for use only with m3u8 streams
- setStreamIntermediateResults(value)[source]
- Parameters
streamIntermediateResults¶ – Whether or not to immediately return intermediate results, or group in a sequence
- streamIntermediateResults = Param(parent='undefined', name='streamIntermediateResults', doc='Whether or not to immediately return intermediate results, or group in a sequence')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- url = Param(parent='undefined', name='url', doc='Url of the service')
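As a rough sketch of how the SpeechToTextSDK parameters above might be assembled, the snippet below builds a participantsJson string and a keyword dictionary using only the standard library. The participant values, column names, and language codes are placeholders chosen for illustration, and the exact JSON shape the service expects is an assumption based solely on the documented "(email, language, user)" description.

```python
import json

# Placeholder participants; only the three documented keys are used.
participants = [
    {"email": "alice@example.com", "language": "en-US", "user": "alice"},
    {"email": "bob@example.com", "language": "de-DE", "user": "bob"},
]
participants_json = json.dumps(participants)

# Keyword names mirror the documented constructor signature.
stt_kwargs = {
    "audioDataCol": "audio",        # hypothetical input column name
    "fileType": "wav",              # documented types: wav, ogg, mp3
    "format": "detailed",           # documented values: simple, detailed
    "language": "en-US",
    "participantsJson": participants_json,
    "profanity": "masked",          # documented values: masked, removed, raw
    "streamIntermediateResults": False,
    "outputCol": "transcript",      # hypothetical output column name
}

# With Spark and SynapseML installed, these keywords would be passed to
# synapse.ml.cognitive.SpeechToTextSDK.SpeechToTextSDK(**stt_kwargs),
# together with a subscriptionKey and url for the target service.
```

Each ServiceParam also has a Col variant (for example languageCol), which takes the name of a column supplying the value per row instead of a single constant.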
synapse.ml.cognitive.TagImage module
- class synapse.ml.cognitive.TagImage.TagImage(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='TagImage_93f6c9d77c73_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='TagImage_93f6c9d77c73_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='ServiceParam: bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='ServiceParam: the url of the image to use')
- language = Param(parent='undefined', name='language', doc='ServiceParam: The desired language for output generation.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='TagImage_93f6c9d77c73_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='TagImage_93f6c9d77c73_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
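The relationship between concurrency ("max number of concurrent calls") and concurrentTimeout (the wait on futures) can be sketched with Python's standard library. This is an analogy under stated assumptions, not the transformer's implementation: run_concurrently and its arguments are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(call, items, concurrency=1, concurrent_timeout=None):
    """Run call(item) with at most `concurrency` calls in flight.

    Each result is awaited for up to `concurrent_timeout` seconds;
    None means wait indefinitely. A slow call raises
    concurrent.futures.TimeoutError from Future.result.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(call, item) for item in items]
        # Results come back in submission order, not completion order.
        return [f.result(timeout=concurrent_timeout) for f in futures]


doubled = run_concurrently(lambda x: x * 2, [1, 2, 3],
                           concurrency=2, concurrent_timeout=5.0)
```

The separate timeout parameter is documented as the number of seconds to wait before closing the connection, so it bounds a single request rather than the wait on a future.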
synapse.ml.cognitive.TextAnalyze module
- class synapse.ml.cognitive.TextAnalyze.TextAnalyze(java_obj=None, backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, entityLinkingParams={'model-version': 'latest'}, entityRecognitionParams={'model-version': 'latest'}, errorCol='TextAnalyze_cfb4c4b8ce79_error', includeEntityLinking=True, includeEntityRecognition=True, includeKeyPhraseExtraction=True, includePii=True, includeSentimentAnalysis=True, initialPollingDelay=300, keyPhraseExtractionParams={'model-version': 'latest'}, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='TextAnalyze_cfb4c4b8ce79_output', piiParams={'model-version': 'latest'}, pollingDelay=300, sentimentAnalysisParams={'model-version': 'latest'}, showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
entityLinkingParams¶ (dict) – the parameters to pass to the entityLinking model
entityRecognitionParams¶ (dict) – the parameters to pass to the entity recognition model
includeEntityLinking¶ (bool) – Whether to perform EntityLinking
includeEntityRecognition¶ (bool) – Whether to perform entity recognition
includeKeyPhraseExtraction¶ (bool) – Whether to perform key phrase extraction
includeSentimentAnalysis¶ (bool) – Whether to perform SentimentAnalysis
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
keyPhraseExtractionParams¶ (dict) – the parameters to pass to the keyPhraseExtraction model
language¶ (object) – the language code of the text (optional for some services)
pollingDelay¶ (int) – number of milliseconds to wait between polling
sentimentAnalysisParams¶ (dict) – the parameters to pass to the sentimentAnalysis model
showStats¶ (object) – Whether to include detailed statistics in the response
suppressMaxRetriesException¶ (bool) – set true to suppress the maximum retries exception and report it in the error column
timeout¶ (float) – number of seconds to wait before closing the connection
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- entityLinkingParams = Param(parent='undefined', name='entityLinkingParams', doc='the parameters to pass to the entityLinking model')
- entityRecognitionParams = Param(parent='undefined', name='entityRecognitionParams', doc='the parameters to pass to the entity recognition model')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getEntityLinkingParams()[source]
- Returns
the parameters to pass to the entityLinking model
- Return type
entityLinkingParams
- getEntityRecognitionParams()[source]
- Returns
the parameters to pass to the entity recognition model
- Return type
entityRecognitionParams
- getIncludeEntityLinking()[source]
- Returns
Whether to perform EntityLinking
- Return type
includeEntityLinking
- getIncludeEntityRecognition()[source]
- Returns
Whether to perform entity recognition
- Return type
includeEntityRecognition
- getIncludeKeyPhraseExtraction()[source]
- Returns
Whether to perform key phrase extraction
- Return type
includeKeyPhraseExtraction
- getIncludeSentimentAnalysis()[source]
- Returns
Whether to perform SentimentAnalysis
- Return type
includeSentimentAnalysis
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getKeyPhraseExtractionParams()[source]
- Returns
the parameters to pass to the keyPhraseExtraction model
- Return type
keyPhraseExtractionParams
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSentimentAnalysisParams()[source]
- Returns
the parameters to pass to the sentimentAnalysis model
- Return type
sentimentAnalysisParams
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getSuppressMaxRetriesException()[source]
- Returns
set true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- includeEntityLinking = Param(parent='undefined', name='includeEntityLinking', doc='Whether to perform EntityLinking')
- includeEntityRecognition = Param(parent='undefined', name='includeEntityRecognition', doc='Whether to perform entity recognition')
- includeKeyPhraseExtraction = Param(parent='undefined', name='includeKeyPhraseExtraction', doc='Whether to perform key phrase extraction')
- includePii = Param(parent='undefined', name='includePii', doc='Whether to perform PII Detection')
- includeSentimentAnalysis = Param(parent='undefined', name='includeSentimentAnalysis', doc='Whether to perform SentimentAnalysis')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- keyPhraseExtractionParams = Param(parent='undefined', name='keyPhraseExtractionParams', doc='the parameters to pass to the keyPhraseExtraction model')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- piiParams = Param(parent='undefined', name='piiParams', doc='the parameters to pass to the PII model')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- sentimentAnalysisParams = Param(parent='undefined', name='sentimentAnalysisParams', doc='the parameters to pass to the sentimentAnalysis model')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setEntityLinkingParams(value)[source]
- Parameters
entityLinkingParams¶ – the parameters to pass to the entityLinking model
- setEntityRecognitionParams(value)[source]
- Parameters
entityRecognitionParams¶ – the parameters to pass to the entity recognition model
- setIncludeEntityLinking(value)[source]
- Parameters
includeEntityLinking¶ – Whether to perform EntityLinking
- setIncludeEntityRecognition(value)[source]
- Parameters
includeEntityRecognition¶ – Whether to perform entity recognition
- setIncludeKeyPhraseExtraction(value)[source]
- Parameters
includeKeyPhraseExtraction¶ – Whether to perform key phrase extraction
- setIncludeSentimentAnalysis(value)[source]
- Parameters
includeSentimentAnalysis¶ – Whether to perform SentimentAnalysis
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setKeyPhraseExtractionParams(value)[source]
- Parameters
keyPhraseExtractionParams¶ – the parameters to pass to the keyPhraseExtraction model
- setLanguage(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setParams(backoffs=[100, 500, 1000], batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, entityLinkingParams={'model-version': 'latest'}, entityRecognitionParams={'model-version': 'latest'}, errorCol='TextAnalyze_cfb4c4b8ce79_error', includeEntityLinking=True, includeEntityRecognition=True, includeKeyPhraseExtraction=True, includePii=True, includeSentimentAnalysis=True, initialPollingDelay=300, keyPhraseExtractionParams={'model-version': 'latest'}, language=None, languageCol=None, maxPollingRetries=1000, modelVersion=None, modelVersionCol=None, outputCol='TextAnalyze_cfb4c4b8ce79_output', piiParams={'model-version': 'latest'}, pollingDelay=300, sentimentAnalysisParams={'model-version': 'latest'}, showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setSentimentAnalysisParams(value)[source]
- Parameters
sentimentAnalysisParams¶ – the parameters to pass to the sentimentAnalysis model
- setShowStats(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set true to suppress the maximum retries exception and report it in the error column
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maximum retries exception and report it in the error column')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
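The polling parameters of TextAnalyze (initialPollingDelay, pollingDelay, maxPollingRetries, suppressMaxRetriesException) suggest a loop along the following lines. This is a sketch of one plausible reading, not the library's code: poll_for_result, MaxRetriesExceeded, and the check callback are hypothetical, and the defaults simply copy the documented ones.

```python
import time


class MaxRetriesExceeded(Exception):
    """Raised when no result arrives within the allowed number of polls."""


def poll_for_result(check, initial_polling_delay=300, polling_delay=300,
                    max_polling_retries=1000,
                    suppress_max_retries_exception=False,
                    sleep=time.sleep):
    """Poll check() until it returns a non-None result.

    Delays are in milliseconds, matching the documented units.
    Returns a (result, error) pair.
    """
    # Wait once before the first poll.
    sleep(initial_polling_delay / 1000.0)
    for _ in range(max_polling_retries):
        result = check()
        if result is not None:
            return result, None
        # Wait between consecutive polls.
        sleep(polling_delay / 1000.0)
    error = f"no result after {max_polling_retries} polls"
    if suppress_max_retries_exception:
        # Report the failure as data instead of raising.
        return None, error
    raise MaxRetriesExceeded(error)
```

With the documented defaults (300 ms delays, 1000 retries), such a loop would give up after roughly five minutes of waiting, not counting the time spent in each poll.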
synapse.ml.cognitive.TextSentiment module
- class synapse.ml.cognitive.TextSentiment.TextSentiment(java_obj=None, batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='TextSentiment_8f444c8e3e82_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_8f444c8e3e82_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – the language code of the text (optional for some services)
opinionMining¶ (object) – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
showStats¶ (object) – Whether to include detailed statistics in the response
stringIndexType¶ (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
timeout¶ (float) – number of seconds to wait before closing the connection
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number of seconds to wait on futures if concurrency >= 1')
- disableServiceLogs = Param(parent='undefined', name='disableServiceLogs', doc='ServiceParam: disableServiceLogs option')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getOpinionMining()[source]
- Returns
if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- Return type
opinionMining
- getShowStats()[source]
- Returns
Whether to include detailed statistics in the response
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='ServiceParam: Version of the model')
- opinionMining = Param(parent='undefined', name='opinionMining', doc='ServiceParam: if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language¶ – the language code of the text (optional for some services)
- setOpinionMining(value)[source]
- Parameters
opinionMining¶ – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- setOpinionMiningCol(value)[source]
- Parameters
opinionMining¶ – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- setParams(batchSize=10, concurrency=1, concurrentTimeout=None, disableServiceLogs=None, disableServiceLogsCol=None, errorCol='TextSentiment_8f444c8e3e82_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_8f444c8e3e82_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setShowStatsCol(value)[source]
- Parameters
showStats¶ – Whether to include detailed statistics in the response
- setStringIndexType(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType¶ – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='ServiceParam: Whether to include detailed statistics in the response')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='ServiceParam: Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
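batchSize is documented only as "The max size of the buffer". Assuming it caps how many rows are grouped together before being sent, the grouping could look like the sketch below; the batched helper is hypothetical and the interpretation is an assumption, not something the reference states.

```python
def batched(items, batch_size=10):
    """Yield consecutive slices of `items`, each holding at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


texts = [f"review {i}" for i in range(23)]
batch_lengths = [len(batch) for batch in batched(texts, batch_size=10)]
```

Under this reading, 23 input rows with the default batchSize of 10 would be split into three buffers, the last one partially filled.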
synapse.ml.cognitive.TextToSpeech module
- class synapse.ml.cognitive.TextToSpeech.TextToSpeech(java_obj=None, errorCol='TextToSpeech_8f61fd3216eb_errors', language=None, languageCol=None, locale=None, localeCol=None, outputFileCol=None, outputFormat=None, outputFormatCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, url=None, useSSML=None, useSSMLCol=None, voiceName=None, voiceNameCol=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
language¶ (object) – The name of the language used for synthesis
outputFileCol¶ (str) – The location of the saved file as an HDFS compliant URI
outputFormat¶ (object) – The format for the output audio. Can be one of: Raw8Khz8BitMonoMULaw, Riff16Khz16KbpsMonoSiren, Audio16Khz16KbpsMonoSiren, Audio16Khz32KBitRateMonoMp3, Audio16Khz128KBitRateMonoMp3, Audio16Khz64KBitRateMonoMp3, Audio24Khz48KBitRateMonoMp3, Audio24Khz96KBitRateMonoMp3, Audio24Khz160KBitRateMonoMp3, Raw16Khz16BitMonoTrueSilk, Riff16Khz16BitMonoPcm, Riff8Khz16BitMonoPcm, Riff24Khz16BitMonoPcm, Riff8Khz8BitMonoMULaw, Raw16Khz16BitMonoPcm, Raw24Khz16BitMonoPcm, Raw8Khz16BitMonoPcm, Ogg16Khz16BitMonoOpus, Ogg24Khz16BitMonoOpus.
useSSML¶ (object) – whether to interpret the provided text input as SSML (Speech Synthesis Markup Language). The default value is false.
voiceName¶ (object) – The name of the voice used for synthesis
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getOutputFileCol()[source]
- Returns
The location of the saved file as an HDFS compliant URI
- Return type
outputFileCol
- getOutputFormat()[source]
- Returns
The format for the output audio. Can be one of: Raw8Khz8BitMonoMULaw, Riff16Khz16KbpsMonoSiren, Audio16Khz16KbpsMonoSiren, Audio16Khz32KBitRateMonoMp3, Audio16Khz128KBitRateMonoMp3, Audio16Khz64KBitRateMonoMp3, Audio24Khz48KBitRateMonoMp3, Audio24Khz96KBitRateMonoMp3, Audio24Khz160KBitRateMonoMp3, Raw16Khz16BitMonoTrueSilk, Riff16Khz16BitMonoPcm, Riff8Khz16BitMonoPcm, Riff24Khz16BitMonoPcm, Riff8Khz8BitMonoMULaw, Raw16Khz16BitMonoPcm, Raw24Khz16BitMonoPcm, Raw8Khz16BitMonoPcm, Ogg16Khz16BitMonoOpus, Ogg24Khz16BitMonoOpus.
- Return type
outputFormat
- getUseSSML()[source]
- Returns
whether to interpret the provided text input as SSML (Speech Synthesis Markup Language). The default value is false.
- Return type
useSSML
- language = Param(parent='undefined', name='language', doc='ServiceParam: The name of the language used for synthesis')
- locale = Param(parent='undefined', name='locale', doc='ServiceParam: The locale of the input text')
- outputFileCol = Param(parent='undefined', name='outputFileCol', doc='The location of the saved file as an HDFS compliant URI')
- outputFormat = Param(parent='undefined', name='outputFormat', doc='ServiceParam: The format for the output audio can be one of ArraySeq(Raw8Khz8BitMonoMULaw, Riff16Khz16KbpsMonoSiren, Audio16Khz16KbpsMonoSiren, Audio16Khz32KBitRateMonoMp3, Audio16Khz128KBitRateMonoMp3, Audio16Khz64KBitRateMonoMp3, Audio24Khz48KBitRateMonoMp3, Audio24Khz96KBitRateMonoMp3, Audio24Khz160KBitRateMonoMp3, Raw16Khz16BitMonoTrueSilk, Riff16Khz16BitMonoPcm, Riff8Khz16BitMonoPcm, Riff24Khz16BitMonoPcm, Riff8Khz8BitMonoMULaw, Raw16Khz16BitMonoPcm, Raw24Khz16BitMonoPcm, Raw8Khz16BitMonoPcm, Ogg16Khz16BitMonoOpus, Ogg24Khz16BitMonoOpus)')
- setOutputFileCol(value)[source]
- Parameters
outputFileCol¶ – The location of the saved file as an HDFS compliant URI
- setOutputFormat(value)[source]
- Parameters
outputFormat¶ – The format for the output audio. Can be one of: Raw8Khz8BitMonoMULaw, Riff16Khz16KbpsMonoSiren, Audio16Khz16KbpsMonoSiren, Audio16Khz32KBitRateMonoMp3, Audio16Khz128KBitRateMonoMp3, Audio16Khz64KBitRateMonoMp3, Audio24Khz48KBitRateMonoMp3, Audio24Khz96KBitRateMonoMp3, Audio24Khz160KBitRateMonoMp3, Raw16Khz16BitMonoTrueSilk, Riff16Khz16BitMonoPcm, Riff8Khz16BitMonoPcm, Riff24Khz16BitMonoPcm, Riff8Khz8BitMonoMULaw, Raw16Khz16BitMonoPcm, Raw24Khz16BitMonoPcm, Raw8Khz16BitMonoPcm, Ogg16Khz16BitMonoOpus, Ogg24Khz16BitMonoOpus.
- setOutputFormatCol(value)[source]
- Parameters
outputFormat¶ – The format for the output audio. Can be one of: Raw8Khz8BitMonoMULaw, Riff16Khz16KbpsMonoSiren, Audio16Khz16KbpsMonoSiren, Audio16Khz32KBitRateMonoMp3, Audio16Khz128KBitRateMonoMp3, Audio16Khz64KBitRateMonoMp3, Audio24Khz48KBitRateMonoMp3, Audio24Khz96KBitRateMonoMp3, Audio24Khz160KBitRateMonoMp3, Raw16Khz16BitMonoTrueSilk, Riff16Khz16BitMonoPcm, Riff8Khz16BitMonoPcm, Riff24Khz16BitMonoPcm, Riff8Khz8BitMonoMULaw, Raw16Khz16BitMonoPcm, Raw24Khz16BitMonoPcm, Raw8Khz16BitMonoPcm, Ogg16Khz16BitMonoOpus, Ogg24Khz16BitMonoOpus.
- setParams(errorCol='TextToSpeech_8f61fd3216eb_errors', language=None, languageCol=None, locale=None, localeCol=None, outputFileCol=None, outputFormat=None, outputFormatCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, url=None, useSSML=None, useSSMLCol=None, voiceName=None, voiceNameCol=None)[source]
Set the (keyword only) parameters
- setUseSSML(value)[source]
- Parameters
useSSML¶ – whether to interpret the provided text input as SSML (Speech Synthesis Markup Language). The default value is false.
- setUseSSMLCol(value)[source]
- Parameters
useSSML¶ – whether to interpret the provided text input as SSML (Speech Synthesis Markup Language). The default value is false.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: The text to synthesize')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- useSSML = Param(parent='undefined', name='useSSML', doc='ServiceParam: whether to interpret the provided text input as SSML (Speech Synthesis Markup Language). The default value is false.')
- voiceName = Param(parent='undefined', name='voiceName', doc='ServiceParam: The name of the voice used for synthesis')
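A value intended for setOutputFormat can be checked against the documented format names before it is used. The following is a minimal sketch, not part of SynapseML: the names OUTPUT_FORMATS and check_output_format are hypothetical, and the set simply copies the names from the outputFormat documentation above.

```python
# Format names copied from the outputFormat parameter documentation.
OUTPUT_FORMATS = frozenset({
    "Raw8Khz8BitMonoMULaw", "Riff16Khz16KbpsMonoSiren",
    "Audio16Khz16KbpsMonoSiren", "Audio16Khz32KBitRateMonoMp3",
    "Audio16Khz128KBitRateMonoMp3", "Audio16Khz64KBitRateMonoMp3",
    "Audio24Khz48KBitRateMonoMp3", "Audio24Khz96KBitRateMonoMp3",
    "Audio24Khz160KBitRateMonoMp3", "Raw16Khz16BitMonoTrueSilk",
    "Riff16Khz16BitMonoPcm", "Riff8Khz16BitMonoPcm",
    "Riff24Khz16BitMonoPcm", "Riff8Khz8BitMonoMULaw",
    "Raw16Khz16BitMonoPcm", "Raw24Khz16BitMonoPcm",
    "Raw8Khz16BitMonoPcm", "Ogg16Khz16BitMonoOpus",
    "Ogg24Khz16BitMonoOpus",
})


def check_output_format(name):
    """Return name unchanged if it is a documented format, else raise ValueError."""
    if name not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {name!r}")
    return name


# Hypothetical helper in use: passes a documented name through unchanged.
print(check_output_format("Riff24Khz16BitMonoPcm"))
```

Checking the name locally turns a typo into an immediate error instead of a failed service call recorded in the error column.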
synapse.ml.cognitive.Translate module
- class synapse.ml.cognitive.Translate.Translate(java_obj=None, allowFallback=None, allowFallbackCol=None, category=None, categoryCol=None, concurrency=1, concurrentTimeout=None, errorCol='Translate_11069b99ca00_error', fromLanguage=None, fromLanguageCol=None, fromScript=None, fromScriptCol=None, handler=None, includeAlignment=None, includeAlignmentCol=None, includeSentenceLength=None, includeSentenceLengthCol=None, outputCol='Translate_11069b99ca00_output', profanityAction=None, profanityActionCol=None, profanityMarker=None, profanityMarkerCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, suggestedFrom=None, suggestedFromCol=None, text=None, textCol=None, textType=None, textTypeCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, toScript=None, toScriptCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
allowFallback¶ (object) – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
category¶ (object) – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
fromLanguage¶ (object) – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
fromScript¶ (object) – Specifies the script of the input text.
handler¶ (object) – Which strategy to use when handling requests
includeAlignment¶ (object) – Specifies whether to include alignment projection from source text to translated text.
includeSentenceLength¶ (object) – Specifies whether to include sentence boundaries for the input text and the translated text.
profanityAction¶ (object) – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
profanityMarker¶ (object) – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
suggestedFrom¶ (object) – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
textType¶ (object) – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
timeout¶ (float) – number of seconds to wait before closing the connection
toLanguage¶ (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
toScript¶ (object) – Specifies the script of the translated text.
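The toLanguage description mentions repeating the to parameter in the query string. A minimal sketch of what such a query string looks like, using only the Python standard library. The function name build_translate_query is hypothetical, and how SynapseML itself assembles the request is not shown in this reference.

```python
from urllib.parse import urlencode


def build_translate_query(to_languages, from_language=None):
    """Build a query string that repeats `to` once per target language."""
    params = {}
    if from_language is not None:
        # Leaving out `from` lets the service detect the source language.
        params["from"] = from_language
    params["to"] = list(to_languages)
    # doseq=True expands the list into repeated key=value pairs.
    return urlencode(params, doseq=True)


print(build_translate_query(["de", "it"]))                # to=de&to=it
print(build_translate_query(["de"], from_language="en"))  # from=en&to=de
```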
- allowFallback = Param(parent='undefined', name='allowFallback', doc='ServiceParam: Specifies that the service is allowed to fall back to a general system when a custom system does not exist. ')
- category = Param(parent='undefined', name='category', doc='ServiceParam: A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='ServiceParam: Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.')
- fromScript = Param(parent='undefined', name='fromScript', doc='ServiceParam: Specifies the script of the input text.')
- getAllowFallback()[source]
- Returns
Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- Return type
allowFallback
- getCategory()[source]
- Returns
A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- Return type
category
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- Return type
fromLanguage
- getIncludeAlignment()[source]
- Returns
Specifies whether to include alignment projection from source text to translated text.
- Return type
includeAlignment
- getIncludeSentenceLength()[source]
- Returns
Specifies whether to include sentence boundaries for the input text and the translated text.
- Return type
includeSentenceLength
- getProfanityAction()[source]
- Returns
Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- Return type
profanityAction
- getProfanityMarker()[source]
- Returns
Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- Return type
profanityMarker
- getSuggestedFrom()[source]
- Returns
Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- Return type
suggestedFrom
- getTextType()[source]
- Returns
Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- Return type
textType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- includeAlignment = Param(parent='undefined', name='includeAlignment', doc='ServiceParam: Specifies whether to include alignment projection from source text to translated text.')
- includeSentenceLength = Param(parent='undefined', name='includeSentenceLength', doc='ServiceParam: Specifies whether to include sentence boundaries for the input text and the translated text. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- profanityAction = Param(parent='undefined', name='profanityAction', doc='ServiceParam: Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted. ')
- profanityMarker = Param(parent='undefined', name='profanityMarker', doc='ServiceParam: Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.')
- setAllowFallback(value)[source]
- Parameters
allowFallback¶ – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- setAllowFallbackCol(value)[source]
- Parameters
allowFallback¶ – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- setCategory(value)[source]
- Parameters
category¶ – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- setCategoryCol(value)[source]
- Parameters
category¶ – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage¶ – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguage¶ – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- setIncludeAlignment(value)[source]
- Parameters
includeAlignment¶ – Specifies whether to include alignment projection from source text to translated text.
- setIncludeAlignmentCol(value)[source]
- Parameters
includeAlignment¶ – Specifies whether to include alignment projection from source text to translated text.
- setIncludeSentenceLength(value)[source]
- Parameters
includeSentenceLength¶ – Specifies whether to include sentence boundaries for the input text and the translated text.
- setIncludeSentenceLengthCol(value)[source]
- Parameters
includeSentenceLength¶ – Specifies whether to include sentence boundaries for the input text and the translated text.
- setParams(allowFallback=None, allowFallbackCol=None, category=None, categoryCol=None, concurrency=1, concurrentTimeout=None, errorCol='Translate_11069b99ca00_error', fromLanguage=None, fromLanguageCol=None, fromScript=None, fromScriptCol=None, handler=None, includeAlignment=None, includeAlignmentCol=None, includeSentenceLength=None, includeSentenceLengthCol=None, outputCol='Translate_11069b99ca00_output', profanityAction=None, profanityActionCol=None, profanityMarker=None, profanityMarkerCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, suggestedFrom=None, suggestedFromCol=None, text=None, textCol=None, textType=None, textTypeCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, toScript=None, toScriptCol=None, url=None)[source]
Set the (keyword only) parameters
- setProfanityAction(value)[source]
- Parameters
profanityAction¶ – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- setProfanityActionCol(value)[source]
- Parameters
profanityAction¶ – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- setProfanityMarker(value)[source]
- Parameters
profanityMarker¶ – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- setProfanityMarkerCol(value)[source]
- Parameters
profanityMarker¶ – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- setSuggestedFrom(value)[source]
- Parameters
suggestedFrom¶ – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- setSuggestedFromCol(value)[source]
- Parameters
suggestedFrom¶ – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- setTextType(value)[source]
- Parameters
textType¶ – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- setTextTypeCol(value)[source]
- Parameters
textType¶ – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage¶ – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
- setToLanguageCol(value)[source]
- Parameters
toLanguage¶ – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- suggestedFrom = Param(parent='undefined', name='suggestedFrom', doc="ServiceParam: Specifies a fallback language if the language of the input text can't be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.")
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- textType = Param(parent='undefined', name='textType', doc='ServiceParam: Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc="ServiceParam: Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It's possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de and to=it to translate to German and Italian.")
- toScript = Param(parent='undefined', name='toScript', doc='ServiceParam: Specifies the script of the translated text.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
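Several Translate parameters accept only a small set of documented values (profanityAction, profanityMarker, textType). A minimal sketch of checking them up front is shown here. The names ALLOWED and validate_options are hypothetical, and exact, case-sensitive matching is an assumption rather than documented service behavior.

```python
# Allowed values as listed in the Translate parameter documentation.
ALLOWED = {
    "profanityAction": ("NoAction", "Marked", "Deleted"),
    "profanityMarker": ("Asterisk", "Tag"),
    "textType": ("plain", "html"),
}


def validate_options(**options):
    """Raise ValueError for any option whose value is not a documented choice."""
    for name, value in options.items():
        choices = ALLOWED.get(name)
        # Options without a documented value list are passed through unchecked.
        if choices is not None and value not in choices:
            raise ValueError(f"{name} must be one of {choices}, got {value!r}")
    return options


print(validate_options(profanityAction="Marked", profanityMarker="Tag"))
```

The validated dictionary could then be unpacked into setParams, which accepts these names as keyword arguments.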
synapse.ml.cognitive.Transliterate module
- class synapse.ml.cognitive.Transliterate.Transliterate(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='Transliterate_3bf47f57f310_error', fromScript=None, fromScriptCol=None, handler=None, language=None, languageCol=None, outputCol='Transliterate_3bf47f57f310_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toScript=None, toScriptCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
fromScript¶ (object) – Specifies the script of the input text.
handler¶ (object) – Which strategy to use when handling requests
language¶ (object) – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
timeout¶ (float) – number of seconds to wait before closing the connection
toScript¶ (object) – Specifies the script of the translated text.
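The concurrency and concurrentTimeout parameters describe a bounded number of in-flight calls and a cap on how long to wait for each pending result. The sketch below illustrates that idea with Python's concurrent.futures. It is not SynapseML's implementation (the classes here wrap JVM transformers), and call_all and fake_send are hypothetical names. Each result is an (output, error) pair, loosely mirroring outputCol and errorCol.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError


def call_all(payloads, send, concurrency=1, concurrent_timeout=None):
    """Run send(payload) for each payload; return (output, error) pairs in order."""
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send, p) for p in payloads]
        for future in futures:
            try:
                # Wait at most concurrent_timeout seconds for this future.
                results.append((future.result(timeout=concurrent_timeout), None))
            except FutureTimeoutError:
                results.append((None, "timed out"))
            except Exception as exc:
                # A failed call is recorded instead of aborting the batch.
                results.append((None, str(exc)))
    return results


def fake_send(text):
    """Stand-in for a web service call (hypothetical)."""
    if not text:
        raise ValueError("empty request")
    return text.upper()


print(call_all(["a", ""], fake_send, concurrency=2))
# [('A', None), (None, 'empty request')]
```

In this sketch the timeout applies to each wait separately, and a concurrent_timeout of None waits indefinitely.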
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromScript = Param(parent='undefined', name='fromScript', doc='ServiceParam: Specifies the script of the input text.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='ServiceParam: Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language¶ – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setLanguageCol(value)[source]
- Parameters
language¶ – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='Transliterate_3bf47f57f310_error', fromScript=None, fromScriptCol=None, handler=None, language=None, languageCol=None, outputCol='Transliterate_3bf47f57f310_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toScript=None, toScriptCol=None, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='ServiceParam: the API region to use')
- text = Param(parent='undefined', name='text', doc='ServiceParam: the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toScript = Param(parent='undefined', name='toScript', doc='ServiceParam: Specifies the script of the translated text.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
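This reference gives setLanguage and setLanguageCol the same description. One reading of the naming convention, stated here as an assumption, is that the plain setter supplies one constant for every row while the Col setter names a DataFrame column to read the value from per row. The sketch below models that distinction with plain dictionaries. The name resolve_service_param is hypothetical and no Spark is involved.

```python
def resolve_service_param(row, value=None, col=None):
    """Return row[col] when a column name is given, otherwise the constant value."""
    if col is not None:
        return row[col]
    return value


rows = [
    {"text": "こんにちは", "lang": "ja"},
    {"text": "नमस्ते", "lang": "hi"},
]

for row in rows:
    # In the spirit of setToScript("Latn"): the same constant for every row.
    to_script = resolve_service_param(row, value="Latn")
    # In the spirit of setLanguageCol("lang"): the value comes from each row.
    language = resolve_service_param(row, col="lang")
    print(language, to_script)
```

Under this reading, a per-row column is the natural choice when a single DataFrame mixes inputs in several languages.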
synapse.ml.cognitive.VerifyFaces module
- class synapse.ml.cognitive.VerifyFaces.VerifyFaces(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='VerifyFaces_0b6fc2860d94_error', faceId=None, faceIdCol=None, faceId1=None, faceId1Col=None, faceId2=None, faceId2Col=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, outputCol='VerifyFaces_0b6fc2860d94_output', personGroupId=None, personGroupIdCol=None, personId=None, personIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
faceId¶ (object) – faceId of the face, comes from Face - Detect.
faceId1¶ (object) – faceId of one face, comes from Face - Detect.
faceId2¶ (object) – faceId of another face, comes from Face - Detect.
handler¶ (object) – Which strategy to use when handling requests
largePersonGroupId¶ (object) – Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
personGroupId¶ (object) – Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
personId¶ (object) – Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
timeout¶ (float) – number of seconds to wait before closing the connection
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceId = Param(parent='undefined', name='faceId', doc='ServiceParam: faceId of the face, comes from Face - Detect.')
- faceId1 = Param(parent='undefined', name='faceId1', doc='ServiceParam: faceId of one face, comes from Face - Detect.')
- faceId2 = Param(parent='undefined', name='faceId2', doc='ServiceParam: faceId of another face, comes from Face - Detect.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLargePersonGroupId()[source]
- Returns
Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
largePersonGroupId
- getPersonGroupId()[source]
- Returns
Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
personGroupId
- getPersonId()[source]
- Returns
Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
- Return type
personId
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- largePersonGroupId = Param(parent='undefined', name='largePersonGroupId', doc='ServiceParam: Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- personGroupId = Param(parent='undefined', name='personGroupId', doc='ServiceParam: Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- personId = Param(parent='undefined', name='personId', doc='ServiceParam: Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setFaceId2Col(value)[source]
- Parameters
faceId2¶ – faceId of another face, comes from Face - Detect.
- setLargePersonGroupId(value)[source]
- Parameters
largePersonGroupId¶ – Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setLargePersonGroupIdCol(value)[source]
- Parameters
largePersonGroupId¶ – Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='VerifyFaces_0b6fc2860d94_error', faceId=None, faceIdCol=None, faceId1=None, faceId1Col=None, faceId2=None, faceId2Col=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, outputCol='VerifyFaces_0b6fc2860d94_output', personGroupId=None, personGroupIdCol=None, personId=None, personIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPersonGroupId(value)[source]
- Parameters
personGroupId¶ – Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setPersonGroupIdCol(value)[source]
- Parameters
personGroupId¶ – Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setPersonId(value)[source]
- Parameters
personId¶ – Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
- setPersonIdCol(value)[source]
- Parameters
personId¶ – Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
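The VerifyFaces parameters suggest two ways to call the service: comparing two faces (faceId1 and faceId2), or comparing one face against a person (faceId and personId, plus exactly one of personGroupId or largePersonGroupId). That two-mode split is a reading of the parameter descriptions above, not something this reference states outright. A minimal sketch that enforces it follows; build_verify_body is a hypothetical helper and the IDs in the usage lines are placeholders.

```python
def build_verify_body(face_id1=None, face_id2=None, face_id=None,
                      person_id=None, person_group_id=None,
                      large_person_group_id=None):
    """Return a request-body dict for one of the two verification modes."""
    # The documentation says these two must not be provided together.
    if person_group_id is not None and large_person_group_id is not None:
        raise ValueError("provide personGroupId or largePersonGroupId, not both")

    # Mode 1: face-to-face.
    if face_id1 is not None and face_id2 is not None:
        return {"faceId1": face_id1, "faceId2": face_id2}

    # Mode 2: face-to-person, which needs a group to look the person up in.
    if face_id is not None and person_id is not None:
        body = {"faceId": face_id, "personId": person_id}
        if person_group_id is not None:
            body["personGroupId"] = person_group_id
        elif large_person_group_id is not None:
            body["largePersonGroupId"] = large_person_group_id
        else:
            raise ValueError("face-to-person verification needs a group id")
        return body

    raise ValueError("need faceId1 and faceId2, or faceId and personId")


print(build_verify_body(face_id1="face-a", face_id2="face-b"))
print(build_verify_body(face_id="face-a", person_id="person-1",
                        person_group_id="group-1"))
```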
Module contents
SynapseML is an ecosystem of tools aimed at expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.
SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services, backed by your Spark cluster.
SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.