synapse.ml.cognitive package
Submodules
synapse.ml.cognitive.AddDocuments module
- class synapse.ml.cognitive.AddDocuments.AddDocuments(java_obj=None, actionCol='@search.action', batchSize=100, concurrency=1, concurrentTimeout=None, errorCol='AddDocuments_563f8874fe32_error', handler=None, indexName=None, outputCol='AddDocuments_563f8874fe32_output', serviceName=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
actionCol (object) – You can combine actions, such as an upload and a delete, in the same batch.
    upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
    merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
    mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
    delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
batchSize (int) – The max size of the buffer
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
indexName (object) –
outputCol (object) – The name of the output column
serviceName (object) –
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
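The four action values described for actionCol can be illustrated with a small in-memory model. This is only a sketch of the semantics stated above, not the service's or the library's implementation: the function `apply_batch`, the key field name `id`, and the sample hotel documents are hypothetical. Two simplifications are assumptions of the sketch: a failed merge raises an exception, and deleting a missing key is a no-op.

```python
from typing import Any, Dict, List

Index = Dict[str, Dict[str, Any]]


def apply_batch(index: Index, batch: List[Dict[str, Any]], key_field: str = "id") -> Index:
    """Apply a batch of actions to an in-memory stand-in for a search index."""
    for doc in batch:
        action = doc["@search.action"]
        key = doc[key_field]
        fields = {k: v for k, v in doc.items() if k != "@search.action"}
        if action == "upload":
            # Insert, or replace the whole document: fields not supplied are dropped.
            index[key] = fields
        elif action == "merge":
            if key not in index:
                raise KeyError(f"merge failed: no document with key {key!r}")
            # Supplied fields overwrite existing ones; collections are replaced, not appended.
            index[key].update(fields)
        elif action == "mergeOrUpload":
            if key in index:
                index[key].update(fields)
            else:
                index[key] = fields
        elif action == "delete":
            # Only the key matters; any other supplied field is ignored.
            index.pop(key, None)
        else:
            raise ValueError(f"unknown action {action!r}")
    return index


index = {"1": {"id": "1", "tags": ["budget"], "name": "Hotel A"}}
apply_batch(index, [
    {"@search.action": "merge", "id": "1", "tags": ["economy", "pool"]},
    {"@search.action": "mergeOrUpload", "id": "2", "name": "Hotel B"},
    {"@search.action": "upload", "id": "3", "name": "Hotel C"},
    {"@search.action": "delete", "id": "3", "name": "ignored"},
])
print(index["1"]["tags"])  # ['economy', 'pool']
```

The merge on document 1 replaces the ‘tags’ collection and leaves ‘name’ untouched, which matches the ‘budget’ example in the parameter description.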
- actionCol = Param(parent='undefined', name='actionCol', doc=" You can combine actions, such as an upload and a delete, in the same batch. upload: An upload action is similar to an 'upsert' where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case. merge: Merge updates an existing document with the specified fields. If the document doesn't exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field 'tags' with value ['budget'] and you execute a merge with value ['economy', 'pool'] for 'tags', the final value of the 'tags' field will be ['economy', 'pool']. It will not be ['budget', 'economy', 'pool']. mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document. delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null. ")
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getActionCol()[source]
- Returns
You can combine actions, such as an upload and a delete, in the same batch.
    upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
    merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
    mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
    delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
- Return type
actionCol
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- indexName = Param(parent='undefined', name='indexName', doc='')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- serviceName = Param(parent='undefined', name='serviceName', doc='')
- setActionCol(value)[source]
- Parameters
actionCol – You can combine actions, such as an upload and a delete, in the same batch.
    upload: An upload action is similar to an ‘upsert’ where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
    merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
    mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
    delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(actionCol='@search.action', batchSize=100, concurrency=1, concurrentTimeout=None, errorCol='AddDocuments_563f8874fe32_error', handler=None, indexName=None, outputCol='AddDocuments_563f8874fe32_output', serviceName=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeBusinessCards module
- class synapse.ml.cognitive.AnalyzeBusinessCards.AnalyzeBusinessCards(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeBusinessCards_05788d7888b8_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeBusinessCards_05788d7888b8_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
maxPollingRetries (int) – number of times to poll
outputCol (object) – The name of the output column
pages (object) – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
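The page-selection syntax accepted by pages can be made concrete with a small parser. This is a sketch built only from the description above, not the service's actual parsing code: the function name `select_pages` and its `page_count` argument are hypothetical, and raising ValueError when no page is covered is an assumption about how a rejected request would surface.

```python
from typing import Optional, Set


def select_pages(selection: Optional[str], page_count: int) -> Set[int]:
    """Expand a page-selection string into the set of 1-based pages it covers."""
    if selection is None or not selection.strip():
        # No page range provided: the entire document is processed.
        return set(range(1, page_count + 1))
    pages: Set[int] = set()
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An omitted bound makes the range open-ended on that side.
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        # Keep only the pages that actually exist in the document.
        pages.update(range(max(start, 1), min(end, page_count) + 1))
    if not pages:
        raise ValueError("selection does not cover any page of the document")
    return pages


print(sorted(select_pages("-5, 1, 3, 5-10", page_count=12)))  # [1, 2, ..., 10]
print(sorted(select_pages("5-100", page_count=5)))            # [5]
```

Overlapping entries collapse naturally because the result is a set.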
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getPages()[source]
- Returns
Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='Include text lines and element references in the result.')
- locale = Param(parent='undefined', name='locale', doc='Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed & e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setPages(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeBusinessCards_05788d7888b8_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeBusinessCards_05788d7888b8_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeCustomModel module
- class synapse.ml.cognitive.AnalyzeCustomModel.AnalyzeCustomModel(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeCustomModel_693f2f3c57bc_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, maxPollingRetries=1000, modelId=None, modelIdCol=None, outputCol='AnalyzeCustomModel_693f2f3c57bc_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
maxPollingRetries (int) – number of times to poll
modelId (object) – Model identifier.
outputCol (object) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
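The interplay of pollingDelay and maxPollingRetries can be pictured as a bounded polling loop. The following is a sketch of that general pattern, not the library's implementation: `poll_until_done`, the `check` callback, and the injectable `sleep` are hypothetical names, and raising TimeoutError once the retry budget is spent is an assumption.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    check: Callable[[], Optional[dict]],
    polling_delay_ms: int = 300,
    max_polling_retries: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `check` until it returns a result or the retry budget is spent."""
    for _ in range(max_polling_retries):
        result = check()
        if result is not None:
            return result
        # The delay is given in milliseconds; time.sleep expects seconds.
        sleep(polling_delay_ms / 1000.0)
    raise TimeoutError(f"no result after {max_polling_retries} polls")


# Simulated service: not ready on the first two polls, ready on the third.
responses = iter([None, None, {"status": "succeeded"}])
waits = []
result = poll_until_done(lambda: next(responses), sleep=waits.append)
print(result, waits)
```

Under this model, the default values (1000 polls, 300 ms apart) would bound the total time spent sleeping at 1000 × 0.3 s = 300 s, excluding the time taken by the requests themselves.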
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='Include text lines and element references in the result.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelId = Param(parent='undefined', name='modelId', doc='Model identifier.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeCustomModel_693f2f3c57bc_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, maxPollingRetries=1000, modelId=None, modelIdCol=None, outputCol='AnalyzeCustomModel_693f2f3c57bc_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeIDDocuments module
- class synapse.ml.cognitive.AnalyzeIDDocuments.AnalyzeIDDocuments(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeIDDocuments_ff604c3cf01c_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, maxPollingRetries=1000, outputCol='AnalyzeIDDocuments_ff604c3cf01c_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
maxPollingRetries (int) – number of times to poll
outputCol (object) – The name of the output column
pages (object) – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
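The backoffs list (default [100, 500, 1000]) suggests a retry schedule. The sketch below shows one common way such a schedule is consumed; it is not taken from the library. Treating the values as milliseconds, retrying only on ConnectionError, and making one final attempt after the schedule is exhausted are all assumptions, and `call_with_backoffs` is a hypothetical name.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def call_with_backoffs(
    request: Callable[[], T],
    backoffs_ms: Sequence[int] = (100, 500, 1000),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try `request`; after each failure wait for the next backoff, then retry."""
    for delay_ms in backoffs_ms:
        try:
            return request()
        except ConnectionError:
            sleep(delay_ms / 1000.0)
    # Schedule exhausted: one last attempt whose error, if any, propagates.
    return request()


attempts = {"count": 0}


def flaky_request() -> str:
    """Simulated call that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


waits = []
outcome = call_with_backoffs(flaky_request, sleep=waits.append)
print(outcome, waits)
```

With three backoff values, this sketch makes at most four attempts in total.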
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getPages()[source]
- Returns
Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='Include text lines and element references in the result.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed & e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setPages(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeIDDocuments_ff604c3cf01c_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, maxPollingRetries=1000, outputCol='AnalyzeIDDocuments_ff604c3cf01c_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeImage module
- class synapse.ml.cognitive.AnalyzeImage.AnalyzeImage(java_obj=None, concurrency=1, concurrentTimeout=None, details=None, detailsCol=None, errorCol='AnalyzeImage_eb41c458b46d_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='AnalyzeImage_eb41c458b46d_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, visualFeatures=None, visualFeaturesCol=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
details (object) – what visual feature types to return
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – the language of the response (en if none given)
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
visualFeatures (object) – what visual feature types to return
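The concurrency and concurrentTimeout parameters describe a cap on in-flight calls and a limit on how long to wait on their futures. The sketch below models that idea with Python's standard concurrent.futures; it is an analogy rather than the library's JVM-side implementation, and `map_concurrently` and the stand-in `fake_call` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def map_concurrently(
    call: Callable[[A], B],
    rows: Iterable[A],
    concurrency: int = 1,
    concurrent_timeout: Optional[float] = None,
) -> List[B]:
    """Apply `call` to each row with at most `concurrency` calls in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(call, row) for row in rows]
        # result(timeout=None) waits indefinitely; with a number it raises
        # concurrent.futures.TimeoutError if the future is not done in time.
        return [future.result(timeout=concurrent_timeout) for future in futures]


def fake_call(image_url: str) -> int:
    """Stand-in for a service request; returns the length of the URL."""
    return len(image_url)


lengths = map_concurrently(fake_call, ["a", "bb", "ccc"], concurrency=2, concurrent_timeout=5.0)
print(lengths)  # [1, 2, 3]
```

Results come back in input order because the futures are collected in the order they were submitted.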
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- details = Param(parent='undefined', name='details', doc='what visual feature types to return')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- language = Param(parent='undefined', name='language', doc='the language of the response (en if none given)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguageCol(value)[source]
- Parameters
language – the language of the response (en if none given)
- setParams(concurrency=1, concurrentTimeout=None, details=None, detailsCol=None, errorCol='AnalyzeImage_eb41c458b46d_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='AnalyzeImage_eb41c458b46d_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, visualFeatures=None, visualFeaturesCol=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setVisualFeaturesCol(value)[source]
- Parameters
visualFeatures – what visual feature types to return
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- visualFeatures = Param(parent='undefined', name='visualFeatures', doc='what visual feature types to return')
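As a rough usage sketch, the keyword arguments below use the parameter names from the AnalyzeImage signature listed earlier. The column names ("image_url", "analysis", "analysis_error"), the feature names, and the helper functions are assumptions chosen for illustration, and the endpoint URL and key must come from your own service. The import is deferred so the helper can be loaded without SynapseML or Spark present.

```python
def analyze_image_params(subscription_key, endpoint_url):
    """Collect keyword arguments named as in the AnalyzeImage signature."""
    return {
        "subscriptionKey": subscription_key,
        "url": endpoint_url,
        "imageUrlCol": "image_url",                # assumed input column name
        "visualFeatures": ["Categories", "Tags"],  # assumed feature selection
        "outputCol": "analysis",                   # assumed output column name
        "errorCol": "analysis_error",              # assumed error column name
        "concurrency": 4,
        "timeout": 30.0,
    }


def build_analyze_image(params):
    """Construct the transformer; requires SynapseML and a running Spark session."""
    # Imported lazily so that importing this file does not need Spark.
    from synapse.ml.cognitive.AnalyzeImage import AnalyzeImage
    return AnalyzeImage(**params)


# With SynapseML installed and a DataFrame `df` holding an "image_url" column:
# analyzed = build_analyze_image(analyze_image_params(key, endpoint)).transform(df)
```

Passing everything through the constructor keeps the configuration in one dictionary; the setter methods listed for the class are an alternative way to set individual values.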
synapse.ml.cognitive.AnalyzeInvoices module
- class synapse.ml.cognitive.AnalyzeInvoices.AnalyzeInvoices(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeInvoices_9951e78e002e_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeInvoices_9951e78e002e_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
maxPollingRetries (int) – number of times to poll
outputCol (object) – The name of the output column
pages (object) – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
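The page-selection rules for the pages parameter can be illustrated with a small resolver. This is a sketch of the documented semantics only, not the service's parser: `resolve_pages` is a hypothetical helper, and raising ValueError when no page matches is an assumption about how a rejected request might be surfaced.

```python
def resolve_pages(selection, page_count):
    """Return the sorted page numbers that a selection string covers.

    Supports comma-separated single pages ('1, 2'), closed ranges ('2-5'),
    and open-ended ranges ('5-', '-10'). Overlaps are merged and pages
    beyond the end of the document are dropped.
    """
    if not selection or not selection.strip():
        return list(range(1, page_count + 1))  # no range given: whole document
    pages = set()
    for part in selection.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        # Clamp the range to the pages that actually exist.
        pages.update(range(max(start, 1), min(end, page_count) + 1))
    if not pages:
        # Assumption: a selection that matches no page is treated as an error.
        raise ValueError("selection matches no page of the document")
    return sorted(pages)


print(resolve_pages("-5, 1, 3, 5-10", 12))
print(resolve_pages("5-100", 5))
```

The two calls mirror the examples in the parameter description: the overlapping selection resolves to pages 1 through 10, and ‘5-100’ on a 5-page document resolves to page 5 alone.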
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getPages()[source]
- Returns
Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='Include text lines and element references in the result.')
- locale = Param(parent='undefined', name='locale', doc='Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed & e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setPages(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeInvoices_9951e78e002e_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeInvoices_9951e78e002e_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeLayout module
- class synapse.ml.cognitive.AnalyzeLayout.AnalyzeLayout(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeLayout_dd2187eeebe3_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxPollingRetries=1000, outputCol='AnalyzeLayout_dd2187eeebe3_output', pages=None, pagesCol=None, pollingDelay=300, readingOrder=None, readingOrderCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – The BCP-47 language code of the text in the document. Layout supports automatic language identification and multilanguage documents, so provide a language code only if you want to force the document to be processed as that specific language.
maxPollingRetries (int) – number of times to poll
outputCol (object) – The name of the output column
pages (object) – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay (int) – number of milliseconds to wait between polling
readingOrder (object) – Optional parameter specifying which reading order algorithm to apply when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
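The maxPollingRetries and pollingDelay parameters describe a bounded polling loop. The sketch below shows that pattern in plain Python under stated assumptions: `poll_until_done`, `make_fake_status`, and the `(done, result)` return shape are hypothetical, and returning None once the retry budget is spent is an assumption rather than documented behavior.

```python
import time


def poll_until_done(check_status, max_polling_retries=1000, polling_delay_ms=300):
    """Call check_status() until it reports completion or retries run out.

    check_status is a hypothetical callable returning (done, result).
    Returns the result, or None if the retry budget is exhausted.
    """
    for _ in range(max_polling_retries):
        done, result = check_status()
        if done:
            return result
        time.sleep(polling_delay_ms / 1000.0)  # the delay is given in milliseconds
    return None


def make_fake_status(ready_after):
    """Build a fake status check that succeeds on call number `ready_after`."""
    calls = {"n": 0}

    def check():
        calls["n"] += 1
        if calls["n"] >= ready_after:
            return True, {"status": "succeeded"}
        return False, None

    return check


print(poll_until_done(make_fake_status(3), max_polling_retries=5, polling_delay_ms=1))
```

Under this sketch, the defaults listed above (1000 retries, 300 ms delay) would bound the total waiting time at 1000 × 300 ms = 300 s, not counting the time spent in each status request.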
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
The BCP-47 language code of the text in the document. Layout supports automatic language identification and multilanguage documents, so provide a language code only if you want to force the document to be processed as that specific language.
- Return type
language
- getPages()[source]
- Returns
Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getReadingOrder()[source]
- Returns
Optional parameter specifying which reading order algorithm to apply when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
- Return type
readingOrder
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- language = Param(parent='undefined', name='language', doc='The BCP-47 language code of the text in the document. Layout supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the documented to be processed as that specific language.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed & e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- readingOrder = Param(parent='undefined', name='readingOrder', doc="Optional parameter to specify which reading order algorithm should be applied when ordering the extract text elements. Can be either 'basic' or 'natural'. Will default to basic if not specified")
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – The BCP-47 language code of the text in the document. Layout supports automatic language identification and multilanguage documents, so provide a language code only if you want to force the document to be processed as that specific language.
- setLanguageCol(value)[source]
- Parameters
language – The BCP-47 language code of the text in the document. Layout supports automatic language identification and multilanguage documents, so provide a language code only if you want to force the document to be processed as that specific language.
- setPages(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeLayout_dd2187eeebe3_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxPollingRetries=1000, outputCol='AnalyzeLayout_dd2187eeebe3_output', pages=None, pagesCol=None, pollingDelay=300, readingOrder=None, readingOrderCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setReadingOrder(value)[source]
- Parameters
readingOrder – Optional parameter specifying which reading order algorithm to apply when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
- setReadingOrderCol(value)[source]
- Parameters
readingOrder – Optional parameter specifying which reading order algorithm to apply when ordering the extracted text elements. Can be either ‘basic’ or ‘natural’. Defaults to ‘basic’ if not specified.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AnalyzeReceipts module
- class synapse.ml.cognitive.AnalyzeReceipts.AnalyzeReceipts(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeReceipts_89a2a29b1819_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeReceipts_89a2a29b1819_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
includeTextDetails (object) – Include text lines and element references in the result.
locale (object) – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
maxPollingRetries (int) – number of times to poll
outputCol (object) – The name of the output column
pages (object) – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
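The backoffs parameter is described only as an array of backoffs used in the handler. One plausible reading, sketched below, is a retry loop that waits the next listed delay after each failed attempt. Treating the values as milliseconds, retrying on any exception, and the names `call_with_backoffs` and `request` are all assumptions made for illustration; the library's handler may differ.

```python
import time


def call_with_backoffs(request, backoffs_ms=(100, 500, 1000), sleep=time.sleep):
    """Try request(); after each failure wait the next backoff, then retry.

    request is a hypothetical callable that raises on a transient failure.
    With N backoffs there are at most N + 1 attempts; if all of them fail,
    the last error is re-raised.
    """
    last_error = None
    # The leading 0 stands for the first attempt, which happens immediately.
    for wait_ms in (0, *backoffs_ms):
        if wait_ms:
            sleep(wait_ms / 1000.0)  # assumption: backoffs are milliseconds
        try:
            return request()
        except Exception as err:
            last_error = err
    raise last_error


# Demonstration with a request that fails twice before succeeding.
attempts = {"n": 0}
waits = []


def flaky_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


print(call_with_backoffs(flaky_request, sleep=waits.append), waits)
```

Passing `sleep=waits.append` records the delays instead of sleeping, which makes the retry schedule easy to inspect.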
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeTextDetails()[source]
- Returns
Include text lines and element references in the result.
- Return type
includeTextDetails
- getLocale()[source]
- Returns
Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- Return type
locale
- getPages()[source]
- Returns
Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- Return type
pages
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- includeTextDetails = Param(parent='undefined', name='includeTextDetails', doc='Include text lines and element references in the result.')
- locale = Param(parent='undefined', name='locale', doc='Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pages = Param(parent='undefined', name='pages', doc="The page selection only leveraged for multi-page PDF and TIFF documents. Accepted input include single pages (e.g.'1, 2' -> pages 1 and 2 will be processed), finite (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all the pages from page 5 will be processed & e.g. '-10' -> pages 1 to 10 will be processed). All of these can be mixed together and ranges are allowed to overlap (eg. '-5, 1, 3, 5-10' - pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. using '5-100' on a 5 page document is a valid input where page 5 will be processed). If no page range is provided, the entire document will be processed.")
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeTextDetails(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setIncludeTextDetailsCol(value)[source]
- Parameters
includeTextDetails – Include text lines and element references in the result.
- setLocale(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setLocaleCol(value)[source]
- Parameters
locale – Locale of the receipt. Supported locales: en-AU, en-CA, en-GB, en-IN, en-US.
- setPages(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- setPagesCol(value)[source]
- Parameters
pages – Page selection, used only for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. ‘1, 2’ -> pages 1 and 2 will be processed), finite ranges (e.g. ‘2-5’ -> pages 2 to 5 will be processed), and open-ended ranges (e.g. ‘5-’ -> all pages from page 5 onward will be processed; ‘-10’ -> pages 1 to 10 will be processed). These can be mixed, and ranges may overlap (e.g. ‘-5, 1, 3, 5-10’ -> pages 1 to 10 will be processed). The service accepts the request if it can process at least one page of the document (e.g. ‘5-100’ on a 5-page document is valid input; page 5 will be processed). If no page range is provided, the entire document is processed.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='AnalyzeReceipts_89a2a29b1819_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, includeTextDetails=None, includeTextDetailsCol=None, locale=None, localeCol=None, maxPollingRetries=1000, outputCol='AnalyzeReceipts_89a2a29b1819_output', pages=None, pagesCol=None, pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.AzureSearchWriter module
synapse.ml.cognitive.BingImageSearch module
- class synapse.ml.cognitive.BingImageSearch.BingImageSearch(java_obj=None, aspect=None, aspectCol=None, color=None, colorCol=None, concurrency=1, concurrentTimeout=None, count=None, countCol=None, errorCol='BingImageSearch_f4e87eb5757a_error', freshness=None, freshnessCol=None, handler=None, height=None, heightCol=None, imageContent=None, imageContentCol=None, imageType=None, imageTypeCol=None, license=None, licenseCol=None, maxFileSize=None, maxFileSizeCol=None, maxHeight=None, maxHeightCol=None, maxWidth=None, maxWidthCol=None, minFileSize=None, minFileSizeCol=None, minHeight=None, minHeightCol=None, minWidth=None, minWidthCol=None, mkt=None, mktCol=None, offset=None, offsetCol=None, outputCol='BingImageSearch_f4e87eb5757a_output', q=None, qCol=None, size=None, sizeCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url='https://api.bing.microsoft.com/v7.0/images/search', width=None, widthCol=None)[source]
Bases: synapse.ml.cognitive._BingImageSearch._BingImageSearch
synapse.ml.cognitive.BreakSentence module
- class synapse.ml.cognitive.BreakSentence.BreakSentence(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_c8ee7040de78_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_c8ee7040de78_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
outputCol (object) – The name of the output column
script (object) – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
text (object) – the string to break into sentences
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
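The concurrency and concurrentTimeout parameters describe a bounded number of in-flight calls whose results are awaited as futures. The sketch below is not SynapseML's implementation; it only illustrates that idea with the Python standard library. The names fake_service_call and run_with_concurrency are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_service_call(text):
    """Hypothetical stand-in for one HTTP request to the service."""
    return len(text)


def run_with_concurrency(texts, concurrency=1, concurrent_timeout=None):
    """Run at most `concurrency` calls at a time and wait up to
    `concurrent_timeout` seconds for each future (None waits indefinitely)."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(fake_service_call, t) for t in texts]
        # Results are collected in input order; a call that exceeds the
        # timeout raises concurrent.futures.TimeoutError.
        return [f.result(timeout=concurrent_timeout) for f in futures]


results = run_with_concurrency(
    ["Hello world.", "Hi."], concurrency=2, concurrent_timeout=5.0
)
```

Raising concurrency allows more calls to overlap, while concurrentTimeout bounds how long each pending result is awaited.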
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- Return type
language
- getScript()[source]
- Returns
Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- Return type
script
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- script = Param(parent='undefined', name='script', doc='Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setLanguageCol(value)[source]
- Parameters
language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='BreakSentence_c8ee7040de78_error', handler=None, language=None, languageCol=None, outputCol='BreakSentence_c8ee7040de78_output', script=None, scriptCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setScript(value)[source]
- Parameters
script – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- setScriptCol(value)[source]
- Parameters
script – Script tag identifying the script used by the input text. If a script is not specified, the default script of the language will be assumed.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='the API region to use')
- text = Param(parent='undefined', name='text', doc='the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.ConversationTranscription module
- class synapse.ml.cognitive.ConversationTranscription.ConversationTranscription(java_obj=None, audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
audioDataCol (object) – Column holding audio data, must be either ByteArrays or Strings representing file URIs
endpointId (object) – endpoint for custom speech models
extraFfmpegArgs (list) – extra arguments for ffmpeg output decoding
fileType (object) – The file type of the sound files, supported types: wav, ogg, mp3
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
language (object) – Identifies the spoken language that is being recognized.
outputCol (object) – The name of the output column
participantsJson (object) – a json representation of a list of conversation participants (email, language, user)
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
recordAudioData (bool) – Whether to record audio data to a file location, for use only with m3u8 streams
recordedFileNameCol (object) – Column holding file names to write audio data to if recordAudioData is set to true
streamIntermediateResults (bool) – Whether to return intermediate results immediately or group them in a sequence
subscriptionKey (object) – the API key to use
url (object) – Url of the service
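Since audioDataCol may hold file URIs and fileType accepts only wav, ogg, or mp3, it can help to derive and validate the file type before configuring the transformer. The following is a minimal sketch using only the standard library. The helper name file_type_from_uri is hypothetical, and whether the transformer infers the type on its own is not stated here.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Supported types as listed in the fileType description.
SUPPORTED_FILE_TYPES = {"wav", "ogg", "mp3"}


def file_type_from_uri(uri):
    """Return the lower-case file extension of a URI if it is supported."""
    # urlparse drops the query string, so "?sig=..." does not affect the suffix.
    suffix = PurePosixPath(urlparse(uri).path).suffix
    file_type = suffix.lstrip(".").lower()
    if file_type not in SUPPORTED_FILE_TYPES:
        raise ValueError(f"unsupported file type: {file_type!r}")
    return file_type
```

The returned string could then be passed as fileType, or written to a column referenced by fileTypeCol.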
- audioDataCol = Param(parent='undefined', name='audioDataCol', doc='Column holding audio data, must be either ByteArrays or Strings representing file URIs')
- endpointId = Param(parent='undefined', name='endpointId', doc='endpoint for custom speech models')
- extraFfmpegArgs = Param(parent='undefined', name='extraFfmpegArgs', doc='extra arguments to for ffmpeg output decoding')
- fileType = Param(parent='undefined', name='fileType', doc='The file type of the sound files, supported types: wav, ogg, mp3')
- format = Param(parent='undefined', name='format', doc=' Specifies the result format. Accepted values are simple and detailed. Default is simple. ')
- getAudioDataCol()[source]
- Returns
Column holding audio data, must be either ByteArrays or Strings representing file URIs
- Return type
audioDataCol
- getExtraFfmpegArgs()[source]
- Returns
extra arguments for ffmpeg output decoding
- Return type
extraFfmpegArgs
- getFileType()[source]
- Returns
The file type of the sound files, supported types: wav, ogg, mp3
- Return type
fileType
- getFormat()[source]
- Returns
Specifies the result format. Accepted values are simple and detailed. Default is simple.
- Return type
format
- getLanguage()[source]
- Returns
Identifies the spoken language that is being recognized.
- Return type
language
- getParticipantsJson()[source]
- Returns
a json representation of a list of conversation participants (email, language, user)
- Return type
participantsJson
- getProfanity()[source]
- Returns
Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- Return type
profanity
- getRecordAudioData()[source]
- Returns
Whether to record audio data to a file location, for use only with m3u8 streams
- Return type
recordAudioData
- getRecordedFileNameCol()[source]
- Returns
Column holding file names to write audio data to if recordAudioData is set to true
- Return type
recordedFileNameCol
- getStreamIntermediateResults()[source]
- Returns
Whether to return intermediate results immediately or group them in a sequence
- Return type
streamIntermediateResults
- language = Param(parent='undefined', name='language', doc=' Identifies the spoken language that is being recognized. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- participantsJson = Param(parent='undefined', name='participantsJson', doc='a json representation of a list of conversation participants (email, language, user)')
- profanity = Param(parent='undefined', name='profanity', doc=' Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which remove all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked. ')
- recordAudioData = Param(parent='undefined', name='recordAudioData', doc='Whether to record audio data to a file location, for use only with m3u8 streams')
- recordedFileNameCol = Param(parent='undefined', name='recordedFileNameCol', doc="Column holding file names to write audio data to if ``recordAudioData'' is set to true")
- setAudioDataCol(value)[source]
- Parameters
audioDataCol – Column holding audio data, must be either ByteArrays or Strings representing file URIs
- setExtraFfmpegArgs(value)[source]
- Parameters
extraFfmpegArgs – extra arguments for ffmpeg output decoding
- setFileType(value)[source]
- Parameters
fileType – The file type of the sound files, supported types: wav, ogg, mp3
- setFileTypeCol(value)[source]
- Parameters
fileType – The file type of the sound files, supported types: wav, ogg, mp3
- setFormat(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setFormatCol(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setLanguage(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setLanguageCol(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setParams(audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Set the (keyword only) parameters
- setParticipantsJson(value)[source]
- Parameters
participantsJson – a json representation of a list of conversation participants (email, language, user)
- setParticipantsJsonCol(value)[source]
- Parameters
participantsJson – a json representation of a list of conversation participants (email, language, user)
- setProfanity(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- setProfanityCol(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- setRecordAudioData(value)[source]
- Parameters
recordAudioData – Whether to record audio data to a file location, for use only with m3u8 streams
- setRecordedFileNameCol(value)[source]
- Parameters
recordedFileNameCol – Column holding file names to write audio data to if recordAudioData is set to true
- setStreamIntermediateResults(value)[source]
- Parameters
streamIntermediateResults – Whether to return intermediate results immediately or group them in a sequence
- streamIntermediateResults = Param(parent='undefined', name='streamIntermediateResults', doc='Whether or not to immediately return itermediate results, or group in a sequence')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.DescribeImage module
- class synapse.ml.cognitive.DescribeImage.DescribeImage(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DescribeImage_6418dfc660e0_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxCandidates=None, maxCandidatesCol=None, outputCol='DescribeImage_6418dfc660e0_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – Language of image description
maxCandidates (object) – Maximum candidate descriptions to return
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getMaxCandidates()[source]
- Returns
Maximum candidate descriptions to return
- Return type
maxCandidates
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- language = Param(parent='undefined', name='language', doc='Language of image description')
- maxCandidates = Param(parent='undefined', name='maxCandidates', doc='Maximum candidate descriptions to return')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setMaxCandidates(value)[source]
- Parameters
maxCandidates – Maximum candidate descriptions to return
- setMaxCandidatesCol(value)[source]
- Parameters
maxCandidates – Maximum candidate descriptions to return
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DescribeImage_6418dfc660e0_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxCandidates=None, maxCandidatesCol=None, outputCol='DescribeImage_6418dfc660e0_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.Detect module
- class synapse.ml.cognitive.Detect.Detect(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='Detect_94af685c6a76_error', handler=None, outputCol='Detect_94af685c6a76_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
text (object) – the string whose language is to be detected
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='Detect_94af685c6a76_error', handler=None, outputCol='Detect_94af685c6a76_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='the API region to use')
- text = Param(parent='undefined', name='text', doc='the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.DetectAnomalies module
- class synapse.ml.cognitive.DetectAnomalies.DetectAnomalies(java_obj=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectAnomalies_200b200eb1de_error', granularity=None, granularityCol=None, handler=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectAnomalies_200b200eb1de_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
errorCol (object) – column to hold http errors
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler (object) – Which strategy to use when handling requests
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (object) – The name of the output column
period (object) – Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin will be, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
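The relationship between granularity and customInterval can be sketched as follows: a constant step of 5 minutes maps to granularity=minutely with customInterval=5. The helper granularity_for_step is hypothetical and covers only fixed-length units. Whether the service expects customInterval to be omitted when it equals 1 is not stated here.

```python
# Fixed-length granularities in seconds, largest first. Monthly and yearly
# are left out because their length varies.
UNIT_SECONDS = [
    ("weekly", 604800),
    ("daily", 86400),
    ("hourly", 3600),
    ("minutely", 60),
]


def granularity_for_step(step_seconds):
    """Return (granularity, customInterval) for a constant step between points."""
    for name, seconds in UNIT_SECONDS:
        # Pick the largest unit that divides the step evenly.
        if step_seconds >= seconds and step_seconds % seconds == 0:
            return name, step_seconds // seconds
    raise ValueError("step must be a positive whole number of minutes")
```

For instance, a series sampled every 300 seconds yields the pair from the description above, minutely and 5.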
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc=' Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomInterval()[source]
- Returns
Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin will be, which means fewer anomalies will be accepted.
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- granularity = Param(parent='undefined', name='granularity', doc=' Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc=' Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc=' Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc=' Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc=' Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomInterval(value)[source]
- Parameters
customInterval – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectAnomalies_200b200eb1de_error', granularity=None, granularityCol=None, handler=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectAnomalies_200b200eb1de_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
- setPeriodCol(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin will be, which means fewer anomalies will be accepted.
- setSensitivityCol(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin will be, which means fewer anomalies will be accepted.
- setSeries(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
- setSeriesCol(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
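Because the series must be sorted by timestamp in ascending order and must not contain duplicated timestamps, a small client-side check can catch bad input before a request is made. The sketch below uses only the standard library. The helper prepare_series, the "timestamp" and "value" keys, and the UTC timestamp format are assumptions for illustration.

```python
from datetime import datetime

# Assumed timestamp format (UTC, second resolution).
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def prepare_series(points):
    """Sort points by timestamp and reject duplicated timestamps.

    Each point is assumed to be a dict with "timestamp" and "value" keys.
    """
    ordered = sorted(
        points,
        key=lambda p: datetime.strptime(p["timestamp"], TIMESTAMP_FORMAT),
    )
    # After sorting, any duplicates are adjacent.
    for prev, cur in zip(ordered, ordered[1:]):
        if prev["timestamp"] == cur["timestamp"]:
            raise ValueError(f"duplicated timestamp: {cur['timestamp']}")
    return ordered
```

Calling prepare_series on each series before it is placed in the column referenced by seriesCol keeps ordering errors on the client side.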
synapse.ml.cognitive.DetectFace module
- class synapse.ml.cognitive.DetectFace.DetectFace(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DetectFace_76d9693a80e9_error', handler=None, imageUrl=None, imageUrlCol=None, outputCol='DetectFace_76d9693a80e9_output', returnFaceAttributes=None, returnFaceAttributesCol=None, returnFaceId=None, returnFaceIdCol=None, returnFaceLandmarks=None, returnFaceLandmarksCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageUrl (object) – the url of the image to use
outputCol (object) – The name of the output column
returnFaceAttributes (object) – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
returnFaceId (object) – Return faceIds of the detected faces or not. The default value is true.
returnFaceLandmarks (object) – Return face landmarks of the detected faces or not. The default value is false.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
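Since returnFaceAttributes accepts only the attribute names listed above, a request can be checked against that list before it is sent. The following is a minimal sketch. The helper check_face_attributes is hypothetical, and whether the transformer expects a list or a comma-separated string is not stated here.

```python
# Attribute names copied from the returnFaceAttributes description.
SUPPORTED_FACE_ATTRIBUTES = {
    "age", "gender", "headPose", "smile", "facialHair", "glasses", "emotion",
    "hair", "makeup", "occlusion", "accessories", "blur", "exposure", "noise",
}


def check_face_attributes(requested):
    """Return the requested attributes as a list, or raise on unknown names."""
    unknown = [a for a in requested if a not in SUPPORTED_FACE_ATTRIBUTES]
    if unknown:
        raise ValueError(f"unsupported face attributes: {unknown}")
    return list(requested)
```

Requesting only the attributes that are needed also limits the additional computational and time cost noted in the description.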
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getReturnFaceAttributes()[source]
- Returns
Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- Return type
returnFaceAttributes
- getReturnFaceId()[source]
- Returns
Return faceIds of the detected faces or not. The default value is true.
- Return type
returnFaceId
- getReturnFaceLandmarks()[source]
- Returns
Return face landmarks of the detected faces or not. The default value is false.
- Return type
returnFaceLandmarks
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- returnFaceAttributes = Param(parent='undefined', name='returnFaceAttributes', doc='Analyze and return the one or more specified face attributes Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.')
- returnFaceId = Param(parent='undefined', name='returnFaceId', doc='Return faceIds of the detected faces or not. The default value is true')
- returnFaceLandmarks = Param(parent='undefined', name='returnFaceLandmarks', doc='Return face landmarks of the detected faces or not. The default value is false.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DetectFace_76d9693a80e9_error', handler=None, imageUrl=None, imageUrlCol=None, outputCol='DetectFace_76d9693a80e9_output', returnFaceAttributes=None, returnFaceAttributesCol=None, returnFaceId=None, returnFaceIdCol=None, returnFaceLandmarks=None, returnFaceLandmarksCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setReturnFaceAttributes(value)[source]
- Parameters
returnFaceAttributes – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- setReturnFaceAttributesCol(value)[source]
- Parameters
returnFaceAttributes – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- setReturnFaceId(value)[source]
- Parameters
returnFaceId – Whether to return faceIds of the detected faces. The default value is true.
- setReturnFaceIdCol(value)[source]
- Parameters
returnFaceId – Whether to return faceIds of the detected faces. The default value is true.
- setReturnFaceLandmarks(value)[source]
- Parameters
returnFaceLandmarks – Whether to return face landmarks of the detected faces. The default value is false.
- setReturnFaceLandmarksCol(value)[source]
- Parameters
returnFaceLandmarks – Whether to return face landmarks of the detected faces. The default value is false.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
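The returnFaceAttributes description above lists the supported attribute names in prose. As a minimal sketch, a requested list can be checked against those names before it is passed on. The helper name check_face_attributes is hypothetical and not part of synapse.ml; the set of names is copied from the description.

```python
# Attribute names copied from the returnFaceAttributes description above.
SUPPORTED_FACE_ATTRIBUTES = frozenset({
    "age", "gender", "headPose", "smile", "facialHair", "glasses",
    "emotion", "hair", "makeup", "occlusion", "accessories",
    "blur", "exposure", "noise",
})


def check_face_attributes(requested):
    """Return `requested` without duplicates, or raise on unsupported names.

    Hypothetical helper for illustration; it is not part of synapse.ml.
    """
    unknown = [name for name in requested if name not in SUPPORTED_FACE_ATTRIBUTES]
    if unknown:
        raise ValueError(f"unsupported face attributes: {unknown}")
    # dict.fromkeys keeps first-seen order while dropping repeats.
    return list(dict.fromkeys(requested))


attributes = check_face_attributes(["age", "smile", "age"])
print(attributes)  # ['age', 'smile']
```

The comparison in this sketch is case-sensitive, so "headpose" would be rejected. Because attribute analysis adds computational and time cost, it is reasonable to request only the attributes that are actually needed.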
synapse.ml.cognitive.DetectLastAnomaly module
- class synapse.ml.cognitive.DetectLastAnomaly.DetectLastAnomaly(java_obj=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectLastAnomaly_b8965dda5a43_error', granularity=None, granularityCol=None, handler=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectLastAnomaly_b8965dda5a43_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval (object) – Sets a non-standard time interval. For example, if the series has one point every 5 minutes, the request can be set as granularity=minutely, customInterval=5.
errorCol (object) – column to hold http errors
granularity (object) – Must be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler (object) – Which strategy to use when handling requests
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (object) – The name of the output column
period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc=' Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomInterval()[source]
- Returns
Sets a non-standard time interval. For example, if the series has one point every 5 minutes, the request can be set as granularity=minutely, customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Must be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- granularity = Param(parent='undefined', name='granularity', doc=' Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc=' Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc=' Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc=' Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc=' Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomInterval(value)[source]
- Parameters
customInterval – Sets a non-standard time interval. For example, if the series has one point every 5 minutes, the request can be set as granularity=minutely, customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval – Sets a non-standard time interval. For example, if the series has one point every 5 minutes, the request can be set as granularity=minutely, customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity – Must be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity – Must be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectLastAnomaly_b8965dda5a43_error', granularity=None, granularityCol=None, handler=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectLastAnomaly_b8965dda5a43_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setPeriodCol(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- setSensitivityCol(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- setSeries(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- setSeriesCol(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
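The series description requires points sorted by ascending timestamp with no duplicated timestamps. The following is a minimal sketch of a local check that could run before a series is sent. The function name series_problems and the point layout (a dict with "timestamp" and "value" keys, ISO 8601 timestamps) are assumptions made for illustration, not part of synapse.ml.

```python
from datetime import datetime


def series_problems(points):
    """List ordering problems in a series of {"timestamp", "value"} points.

    Hypothetical pre-flight check; the point layout is an assumption.
    An empty list means the timestamps are strictly ascending.
    """
    stamps = [datetime.fromisoformat(p["timestamp"]) for p in points]
    problems = []
    # Compare each point with the one before it; index counts from 0.
    for index, (earlier, later) in enumerate(zip(stamps, stamps[1:]), start=1):
        if later == earlier:
            problems.append(f"point {index} duplicates the previous timestamp")
        elif later < earlier:
            problems.append(f"point {index} is earlier than the previous point")
    return problems


# Illustrative data: one point every 5 minutes, which the customInterval
# description pairs with granularity=minutely, customInterval=5.
good = [
    {"timestamp": "2021-01-01T00:00:00", "value": 1.0},
    {"timestamp": "2021-01-01T00:05:00", "value": 1.2},
    {"timestamp": "2021-01-01T00:10:00", "value": 0.9},
]
bad = [good[0], good[2], good[1], good[1]]

print(series_problems(good))  # []
print(series_problems(bad))
```

Checking only adjacent pairs is enough here: if every point is strictly later than the one before it, the whole series is sorted and free of duplicates.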
synapse.ml.cognitive.DictionaryExamples module
- class synapse.ml.cognitive.DictionaryExamples.DictionaryExamples(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryExamples_1942097a4398_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryExamples_1942097a4398_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, textAndTranslation=None, textAndTranslationCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
fromLanguage (object) – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
handler (object) – Which strategy to use when handling requests
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
textAndTranslation (object) – A string specifying the translated text previously returned by the Dictionary lookup operation.
timeout (float) – number of seconds to wait before closing the connection
toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- Return type
fromLanguage
- getTextAndTranslation()[source]
- Returns
A string specifying the translated text previously returned by the Dictionary lookup operation.
- Return type
textAndTranslation
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DictionaryExamples_1942097a4398_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryExamples_1942097a4398_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, textAndTranslation=None, textAndTranslationCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Set the (keyword only) parameters
- setTextAndTranslation(value)[source]
- Parameters
textAndTranslation – A string specifying the translated text previously returned by the Dictionary lookup operation.
- setTextAndTranslationCol(value)[source]
- Parameters
textAndTranslation – A string specifying the translated text previously returned by the Dictionary lookup operation.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- setToLanguageCol(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='the API region to use')
- textAndTranslation = Param(parent='undefined', name='textAndTranslation', doc=' A string specifying the translated text previously returned by the Dictionary lookup operation.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc='Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
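The textAndTranslation description says the translation should be one previously returned by the Dictionary lookup operation. A minimal sketch of that hand-off follows, assuming that each input pairs a source term with one of its translations and that a lookup result carries normalizedSource and translations[].normalizedTarget fields. Both the field names and the helper example_pairs are assumptions for illustration; check them against the actual output column before relying on them.

```python
def example_pairs(lookup_entry):
    """Turn one dictionary-lookup result into (text, translation) pairs.

    The field names are an assumption about the lookup result layout.
    """
    source = lookup_entry["normalizedSource"]
    return [(source, t["normalizedTarget"]) for t in lookup_entry["translations"]]


# Made-up lookup result, for illustration only.
entry = {
    "normalizedSource": "fly",
    "translations": [
        {"normalizedTarget": "volar"},
        {"normalizedTarget": "mosca"},
    ],
}

pairs = example_pairs(entry)
print(pairs)  # [('fly', 'volar'), ('fly', 'mosca')]
```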
synapse.ml.cognitive.DictionaryLookup module
- class synapse.ml.cognitive.DictionaryLookup.DictionaryLookup(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='DictionaryLookup_401f9f0b371b_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryLookup_401f9f0b371b_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
fromLanguage (object) – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
handler (object) – Which strategy to use when handling requests
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
text (object) – the string to translate
timeout (float) – number of seconds to wait before closing the connection
toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- Return type
fromLanguage
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. The source language must be one of the supported languages included in the dictionary scope.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='DictionaryLookup_401f9f0b371b_error', fromLanguage=None, fromLanguageCol=None, handler=None, outputCol='DictionaryLookup_401f9f0b371b_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- setToLanguageCol(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='the API region to use')
- text = Param(parent='undefined', name='text', doc='the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc='Specifies the language of the output text. The target language must be one of the supported languages included in the dictionary scope.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
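Several parameters in the signature above come in two forms, for example text and textCol, or toLanguage and toLanguageCol. As a sketch, assuming the plain name supplies one value for every row and the Col name refers to a DataFrame column, the keyword arguments can be assembled as below. The helper service_param is hypothetical, and treating "both set" or "neither set" as an error is a convention of this sketch rather than documented library behavior.

```python
def service_param(name, value=None, col=None):
    """Return the single keyword argument for a value-or-column parameter.

    Hypothetical helper: exactly one of `value` or `col` must be given.
    """
    if (value is None) == (col is None):
        raise ValueError(f"set exactly one of {name} or {name}Col")
    return {name: value} if col is None else {f"{name}Col": col}


kwargs = {}
kwargs.update(service_param("fromLanguage", value="en"))  # same for all rows
kwargs.update(service_param("toLanguage", value="es"))    # same for all rows
kwargs.update(service_param("text", col="word"))          # read per row

print(kwargs)
```

The resulting keys match keyword names in the DictionaryLookup signature shown above, so the dictionary could be expanded into the constructor or setParams alongside subscriptionKey and the other settings.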
synapse.ml.cognitive.DocumentTranslator module
- class synapse.ml.cognitive.DocumentTranslator.DocumentTranslator(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='DocumentTranslator_94499a55d24f_error', filterPrefix=None, filterPrefixCol=None, filterSuffix=None, filterSuffixCol=None, maxPollingRetries=1000, outputCol='DocumentTranslator_94499a55d24f_output', pollingDelay=300, serviceName=None, sourceLanguage=None, sourceLanguageCol=None, sourceStorageSource=None, sourceStorageSourceCol=None, sourceUrl=None, sourceUrlCol=None, storageType=None, storageTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, targets=None, targetsCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
filterPrefix (object) – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
filterSuffix (object) – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
maxPollingRetries (int) – number of times to poll
outputCol (object) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
serviceName (object) –
sourceLanguage (object) – Language code. If none is specified, we will perform auto detect on the document.
sourceStorageSource (object) – Storage source of source input.
sourceUrl (object) – Location of the folder / container or single file with your documents.
storageType (object) – Storage type of the input documents source string. Required for single document translation only.
subscriptionKey (object) – the API key to use
targets (object) – Destination for the finished translated documents.
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- filterPrefix = Param(parent='undefined', name='filterPrefix', doc='A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.')
- filterSuffix = Param(parent='undefined', name='filterSuffix', doc='A case-sensitive suffix string to filter documents in the source path for translation. This is most often use for file extensions.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFilterPrefix()[source]
- Returns
A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
- Return type
filterPrefix
- getFilterSuffix()[source]
- Returns
A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- Return type
filterSuffix
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSourceLanguage()[source]
- Returns
Language code. If none is specified, we will perform auto detect on the document.
- Return type
sourceLanguage
- getSourceStorageSource()[source]
- Returns
Storage source of source input.
- Return type
sourceStorageSource
- getSourceUrl()[source]
- Returns
Location of the folder / container or single file with your documents.
- Return type
sourceUrl
- getStorageType()[source]
- Returns
Storage type of the input documents source string. Required for single document translation only.
- Return type
storageType
- getTargets()[source]
- Returns
Destination for the finished translated documents.
- Return type
targets
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- serviceName = Param(parent='undefined', name='serviceName', doc='')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFilterPrefix(value)[source]
- Parameters
filterPrefix – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
- setFilterPrefixCol(value)[source]
- Parameters
filterPrefix – A case-sensitive prefix string to filter documents in the source path for translation. For example, when using an Azure storage blob Uri, use the prefix to restrict sub folders for translation.
- setFilterSuffix(value)[source]
- Parameters
filterSuffix – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- setFilterSuffixCol(value)[source]
- Parameters
filterSuffix – A case-sensitive suffix string to filter documents in the source path for translation. This is most often used for file extensions.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='DocumentTranslator_94499a55d24f_error', filterPrefix=None, filterPrefixCol=None, filterSuffix=None, filterSuffixCol=None, maxPollingRetries=1000, outputCol='DocumentTranslator_94499a55d24f_output', pollingDelay=300, serviceName=None, sourceLanguage=None, sourceLanguageCol=None, sourceStorageSource=None, sourceStorageSourceCol=None, sourceUrl=None, sourceUrlCol=None, storageType=None, storageTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, targets=None, targetsCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setSourceLanguage(value)[source]
- Parameters
sourceLanguage – Language code. If none is specified, we will perform auto detect on the document.
- setSourceLanguageCol(value)[source]
- Parameters
sourceLanguage – Language code. If none is specified, we will perform auto detect on the document.
- setSourceStorageSource(value)[source]
- Parameters
sourceStorageSource – Storage source of source input.
- setSourceStorageSourceCol(value)[source]
- Parameters
sourceStorageSource – Storage source of source input.
- setSourceUrl(value)[source]
- Parameters
sourceUrl – Location of the folder / container or single file with your documents.
- setSourceUrlCol(value)[source]
- Parameters
sourceUrl – Location of the folder / container or single file with your documents.
- setStorageType(value)[source]
- Parameters
storageType – Storage type of the input documents source string. Required for single document translation only.
- setStorageTypeCol(value)[source]
- Parameters
storageType – Storage type of the input documents source string. Required for single document translation only.
- setTargetsCol(value)[source]
- Parameters
targets – Destination for the finished translated documents.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- sourceLanguage = Param(parent='undefined', name='sourceLanguage', doc='Language code. If none is specified, we will perform auto detect on the document.')
- sourceStorageSource = Param(parent='undefined', name='sourceStorageSource', doc='Storage source of source input.')
- sourceUrl = Param(parent='undefined', name='sourceUrl', doc='Location of the folder / container or single file with your documents.')
- storageType = Param(parent='undefined', name='storageType', doc='Storage type of the input documents source string. Required for single document translation only.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- targets = Param(parent='undefined', name='targets', doc='Destination for the finished translated documents.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
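maxPollingRetries and pollingDelay together bound how long a translation job is polled. As a rough sketch, assuming a fixed delay between polls and ignoring the time each request takes, the waiting budget is their product. The function polling_budget_seconds is hypothetical; its defaults are the ones from the signature above.

```python
def polling_budget_seconds(max_polling_retries=1000, polling_delay_ms=300):
    """Upper bound on time spent waiting between polls, in seconds.

    Assumes a fixed delay between polls and ignores the time each request
    takes; both are simplifications made for this sketch.
    """
    return max_polling_retries * polling_delay_ms / 1000


default_budget = polling_budget_seconds()
print(default_budget)  # 300.0
```

Under these assumptions the defaults allow about five minutes of polling, so a large batch of documents may need a higher maxPollingRetries or a longer pollingDelay.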
synapse.ml.cognitive.EntityDetector module
- class synapse.ml.cognitive.EntityDetector.EntityDetector(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='EntityDetector_9555940e457c_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_9555940e457c_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (object) – The name of the output column
showStats (object) – if set to true, response will contain input and document level statistics.
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='EntityDetector_9555940e457c_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='EntityDetector_9555940e457c_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='if set to true, response will contain input and document level statistics.')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- text = Param(parent='undefined', name='text', doc='the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
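The stringIndexType parameter matters because the same substring can sit at different offsets depending on how characters are counted. The sketch below uses only the Python standard library and does not call the service; the helper names are hypothetical. It compares offsets counted in Unicode code points with offsets counted in UTF-16 code units. Grapheme (text element) counting, the documented default, needs a third-party segmentation library and is not shown.

```python
def code_point_offset(text, needle):
    """Offset of needle counted in Unicode code points (Python's native str indexing)."""
    return text.index(needle)


def utf16_offset(text, needle):
    """Offset of needle counted in UTF-16 code units."""
    prefix = text[:text.index(needle)]
    # Every UTF-16 code unit is two bytes; characters outside the
    # Basic Multilingual Plane take two code units (a surrogate pair).
    return len(prefix.encode("utf-16-le")) // 2


sample = "\U0001F600 Seattle"  # an emoji, a space, then the entity text
print(code_point_offset(sample, "Seattle"))
print(utf16_offset(sample, "Seattle"))
```

The two helpers disagree as soon as the text before the entity contains a character outside the Basic Multilingual Plane, which is why offsets should be interpreted with the same index type that was requested.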
synapse.ml.cognitive.EntityDetectorV2 module
- class synapse.ml.cognitive.EntityDetectorV2.EntityDetectorV2(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='EntityDetectorV2_8afcc7527fa4_error', handler=None, language=None, languageCol=None, outputCol='EntityDetectorV2_8afcc7527fa4_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='the language code of the text (optional for some services)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(concurrency=1, concurrentTimeout=None, errorCol='EntityDetectorV2_8afcc7527fa4_error', handler=None, language=None, languageCol=None, outputCol='EntityDetectorV2_8afcc7527fa4_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- text = Param(parent='undefined', name='text', doc='the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
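Several parameters in this package come in pairs, such as language and languageCol: one takes a single value, the other names a column that supplies a value per row. The sketch below is a conceptual model in plain Python, not the library's implementation. The function resolve_param and the dictionary-based rows are hypothetical, and treating "both set" as an error is an assumption of the sketch, not documented behaviour.

```python
def resolve_param(row, value=None, value_col=None):
    """Return the effective parameter value for one row.

    value     -- a constant applied to every row (e.g. language="en")
    value_col -- the name of a column holding a per-row value (e.g. languageCol="lang")
    """
    if value is not None and value_col is not None:
        # Assumption of this sketch: refuse ambiguous configuration.
        raise ValueError("set either the scalar or the column variant, not both")
    if value_col is not None:
        return row[value_col]
    return value


rows = [{"text": "Bonjour", "lang": "fr"}, {"text": "Hello", "lang": "en"}]
print([resolve_param(r, value_col="lang") for r in rows])
print([resolve_param(r, value="en") for r in rows])
```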
synapse.ml.cognitive.FindSimilarFace module
- class synapse.ml.cognitive.FindSimilarFace.FindSimilarFace(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='FindSimilarFace_3b132647f6dc_error', faceId=None, faceIdCol=None, faceIds=None, faceIdsCol=None, faceListId=None, faceListIdCol=None, handler=None, largeFaceListId=None, largeFaceListIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, mode=None, modeCol=None, outputCol='FindSimilarFace_3b132647f6dc_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
faceId (object) – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
faceIds (object) – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
faceListId (object) – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
handler (object) – Which strategy to use when handling requests
largeFaceListId (object) – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
maxNumOfCandidatesReturned (object) – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
mode (object) – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
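Because a faceId expires 24 hours after the detection call, a pipeline that stores faceIds may want to check their age before using them. The following is a minimal sketch using only the standard library; the helper face_id_is_usable and the idea of recording a detection timestamp alongside each faceId are assumptions for illustration, not part of this API.

```python
from datetime import datetime, timedelta, timezone

# Lifetime taken from the faceId description above.
FACE_ID_LIFETIME = timedelta(hours=24)


def face_id_is_usable(detected_at, now=None):
    """Return True if a faceId detected at `detected_at` has not yet expired."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - detected_at < FACE_ID_LIFETIME


detected = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(face_id_is_usable(detected, now=detected + timedelta(hours=23)))
print(face_id_is_usable(detected, now=detected + timedelta(hours=25)))
```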
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceId = Param(parent='undefined', name='faceId', doc='faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.')
- faceIds = Param(parent='undefined', name='faceIds', doc=' An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.')
- faceListId = Param(parent='undefined', name='faceListId', doc=' An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFaceId()[source]
- Returns
faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
- Return type
faceId
- getFaceIds()[source]
- Returns
An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
faceIds
- getFaceListId()[source]
- Returns
An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
faceListId
- getLargeFaceListId()[source]
- Returns
An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
largeFaceListId
- getMaxNumOfCandidatesReturned()[source]
- Returns
Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
- Return type
maxNumOfCandidatesReturned
- getMode()[source]
- Returns
Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
- Return type
mode
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- largeFaceListId = Param(parent='undefined', name='largeFaceListId', doc=' An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.')
- maxNumOfCandidatesReturned = Param(parent='undefined', name='maxNumOfCandidatesReturned', doc=' Optional parameter. The number of top similar faces returned. The valid range is [1, 1000].It defaults to 20.')
- mode = Param(parent='undefined', name='mode', doc=" Optional parameter. Similar face searching mode. It can be 'matchPerson' or 'matchFace'. It defaults to 'matchPerson'.")
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFaceId(value)[source]
- Parameters
faceId – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
- setFaceIdCol(value)[source]
- Parameters
faceId – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
- setFaceIds(value)[source]
- Parameters
faceIds – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setFaceIdsCol(value)[source]
- Parameters
faceIds – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setFaceListId(value)[source]
- Parameters
faceListId – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setFaceListIdCol(value)[source]
- Parameters
faceListId – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setLargeFaceListId(value)[source]
- Parameters
largeFaceListId – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setLargeFaceListIdCol(value)[source]
- Parameters
largeFaceListId – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- setMaxNumOfCandidatesReturned(value)[source]
- Parameters
maxNumOfCandidatesReturned – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
- setMaxNumOfCandidatesReturnedCol(value)[source]
- Parameters
maxNumOfCandidatesReturned – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
- setMode(value)[source]
- Parameters
mode – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
- setModeCol(value)[source]
- Parameters
mode – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='FindSimilarFace_3b132647f6dc_error', faceId=None, faceIdCol=None, faceIds=None, faceIdsCol=None, faceListId=None, faceListIdCol=None, handler=None, largeFaceListId=None, largeFaceListIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, mode=None, modeCol=None, outputCol='FindSimilarFace_3b132647f6dc_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
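The FindSimilarFace constraints described above (only one candidate source, at most 1000 faceIds, a candidate count in [1, 1000], and two allowed modes) can be checked before a request is sent. The sketch below is plain Python and does not import the library; check_find_similar_args is a hypothetical helper, and its keyword names simply mirror the parameter names documented for this class.

```python
VALID_MODES = {"matchPerson", "matchFace"}


def check_find_similar_args(faceIds=None, faceListId=None, largeFaceListId=None,
                            maxNumOfCandidatesReturned=None, mode=None):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    candidates = (("faceIds", faceIds),
                  ("faceListId", faceListId),
                  ("largeFaceListId", largeFaceListId))
    provided = [name for name, value in candidates if value is not None]
    if len(provided) > 1:
        problems.append("provide only one of: " + ", ".join(provided))
    if faceIds is not None and len(faceIds) > 1000:
        problems.append("faceIds is limited to 1000 entries")
    if (maxNumOfCandidatesReturned is not None
            and not 1 <= maxNumOfCandidatesReturned <= 1000):
        problems.append("maxNumOfCandidatesReturned must be in [1, 1000]")
    if mode is not None and mode not in VALID_MODES:
        problems.append("mode must be 'matchPerson' or 'matchFace'")
    return problems


print(check_find_similar_args(faceListId="my-list", mode="matchFace"))
print(check_find_similar_args(faceIds=["a"], faceListId="x", maxNumOfCandidatesReturned=0))
```

Returning a list instead of raising on the first problem lets a caller report every misconfiguration at once.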
synapse.ml.cognitive.FormOntologyLearner module
- class synapse.ml.cognitive.FormOntologyLearner.FormOntologyLearner(java_obj=None, inputCol=None, outputCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaEstimator
- Parameters
inputCol – The name of the input column
outputCol – The name of the output column
- inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
synapse.ml.cognitive.FormOntologyTransformer module
- class synapse.ml.cognitive.FormOntologyTransformer.FormOntologyTransformer(java_obj=None, inputCol=None, ontology=None, outputCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaModel
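- Parameters
inputCol – The name of the input column
ontology – The ontology to cast values to
outputCol – The name of the output column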
- inputCol = Param(parent='undefined', name='inputCol', doc='The name of the input column')
- ontology = Param(parent='undefined', name='ontology', doc='The ontology to cast values to')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
synapse.ml.cognitive.GenerateThumbnails module
- class synapse.ml.cognitive.GenerateThumbnails.GenerateThumbnails(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='GenerateThumbnails_516f8b91b638_error', handler=None, height=None, heightCol=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, outputCol='GenerateThumbnails_516f8b91b638_output', smartCropping=None, smartCroppingCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, width=None, widthCol=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
height (object) – the desired height of the image
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
outputCol (object) – The name of the output column
smartCropping (object) – whether to intelligently crop the image
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
width (object) – the desired width of the image
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getSmartCropping()[source]
- Returns
whether to intelligently crop the image
- Return type
smartCropping
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- height = Param(parent='undefined', name='height', doc='the desired height of the image')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='GenerateThumbnails_516f8b91b638_error', handler=None, height=None, heightCol=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, outputCol='GenerateThumbnails_516f8b91b638_output', smartCropping=None, smartCroppingCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None, width=None, widthCol=None)[source]
Set the (keyword only) parameters
- setSmartCroppingCol(value)[source]
- Parameters
smartCropping – whether to intelligently crop the image
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- smartCropping = Param(parent='undefined', name='smartCropping', doc='whether to intelligently crop the image')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- width = Param(parent='undefined', name='width', doc='the desired width of the image')
synapse.ml.cognitive.GetCustomModel module
- class synapse.ml.cognitive.GetCustomModel.GetCustomModel(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='GetCustomModel_527240c02013_error', handler=None, includeKeys=None, includeKeysCol=None, modelId=None, modelIdCol=None, outputCol='GetCustomModel_527240c02013_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
includeKeys (object) – Include list of extracted keys in model information.
modelId (object) – Model identifier.
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getIncludeKeys()[source]
- Returns
Include list of extracted keys in model information.
- Return type
includeKeys
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- includeKeys = Param(parent='undefined', name='includeKeys', doc='Include list of extracted keys in model information.')
- modelId = Param(parent='undefined', name='modelId', doc='Model identifier.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setIncludeKeys(value)[source]
- Parameters
includeKeys – Include list of extracted keys in model information.
- setIncludeKeysCol(value)[source]
- Parameters
includeKeys – Include list of extracted keys in model information.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='GetCustomModel_527240c02013_error', handler=None, includeKeys=None, includeKeysCol=None, modelId=None, modelIdCol=None, outputCol='GetCustomModel_527240c02013_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
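The concurrency and concurrentTimeout parameters, shared by the transformers in this package, describe a bounded pool of in-flight calls and a cap on how long to wait for each pending result. As a rough mental model only, the same idea can be sketched with Python's concurrent.futures. This is not how the library implements it; call_all and send are hypothetical names, and recording None for a timed-out call is a choice made for the sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def call_all(requests, send, concurrency=1, concurrent_timeout=None):
    """Run send(request) for every request with at most `concurrency` calls in flight.

    Each result is awaited for at most `concurrent_timeout` seconds
    (None means wait indefinitely); timed-out calls are recorded as None.
    """
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send, request) for request in requests]
        for future in futures:
            try:
                results.append(future.result(timeout=concurrent_timeout))
            except FutureTimeout:
                results.append(None)
    return results


def slow_echo(value):
    time.sleep(0.2)  # stands in for a slow service call
    return value


print(call_all([1, 2, 3], lambda x: x * 2, concurrency=2, concurrent_timeout=5))
print(call_all(["a"], slow_echo, concurrency=1, concurrent_timeout=0.01))
```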
synapse.ml.cognitive.GroupFaces module
- class synapse.ml.cognitive.GroupFaces.GroupFaces(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='GroupFaces_2ebfa1532290_error', faceIds=None, faceIdsCol=None, handler=None, outputCol='GroupFaces_2ebfa1532290_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
faceIds (object) – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
handler (object) – Which strategy to use when handling requests
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceIds = Param(parent='undefined', name='faceIds', doc='Array of candidate faceId created by Face - Detect. The maximum is 1000 faces.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFaceIds()[source]
- Returns
Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
- Return type
faceIds
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFaceIds(value)[source]
- Parameters
faceIds – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
- setFaceIdsCol(value)[source]
- Parameters
faceIds – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='GroupFaces_2ebfa1532290_error', faceIds=None, faceIdsCol=None, handler=None, outputCol='GroupFaces_2ebfa1532290_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.IdentifyFaces module
- class synapse.ml.cognitive.IdentifyFaces.IdentifyFaces(java_obj=None, concurrency=1, concurrentTimeout=None, confidenceThreshold=None, confidenceThresholdCol=None, errorCol='IdentifyFaces_adb21870ba92_error', faceIds=None, faceIdsCol=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, outputCol='IdentifyFaces_adb21870ba92_output', personGroupId=None, personGroupIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
confidenceThreshold (object) – Optional parameter. Customized identification confidence threshold, in the range [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note that there is no guarantee this threshold value will work on other data or after algorithm updates.
errorCol (object) – column to hold http errors
faceIds (object) – Array of query face faceIds, created by Face - Detect. Each face is identified independently. The number of faceIds must be in the range [1, 10].
handler (object) – Which strategy to use when handling requests
largePersonGroupId (object) – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
maxNumOfCandidatesReturned (object) – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
outputCol (object) – The name of the output column
personGroupId (object) – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
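The constraints listed in the parameters above (1 to 10 faceIds, one group identifier at a time, 1 to 100 candidates, a threshold in [0, 1]) can be checked before any request is made. The following is a minimal sketch under that reading; `check_identify_args` is a hypothetical helper, not a synapse.ml function, and it only restates the documented ranges.

```python
from typing import List, Optional


def check_identify_args(
    face_ids: List[str],
    person_group_id: Optional[str] = None,
    large_person_group_id: Optional[str] = None,
    max_candidates: Optional[int] = None,
    confidence_threshold: Optional[float] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    if not 1 <= len(face_ids) <= 10:
        problems.append("faceIds must contain between 1 and 10 IDs")
    if person_group_id is not None and large_person_group_id is not None:
        problems.append("personGroupId and largePersonGroupId should not both be provided")
    if max_candidates is not None and not 1 <= max_candidates <= 100:
        problems.append("maxNumOfCandidatesReturned must be between 1 and 100")
    if confidence_threshold is not None and not 0.0 <= confidence_threshold <= 1.0:
        problems.append("confidenceThreshold must be in the range [0, 1]")
    return problems


# Placeholder IDs and group names, for illustration only.
print(check_identify_args(["face-1"], person_group_id="employees", max_candidates=10))
print(check_identify_args([], "employees", "all-staff", 0, 1.5))
```

Returning a list of messages rather than raising on the first problem lets a caller report every violated constraint at once.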
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- confidenceThreshold = Param(parent='undefined', name='confidenceThreshold', doc='Optional parameter.Customized identification confidence threshold, in the range of [0, 1].Advanced user can tweak this value to override defaultinternal threshold for better precision on their scenario data.Note there is no guarantee of this threshold value workingon other data and after algorithm updates.')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceIds = Param(parent='undefined', name='faceIds', doc='Array of query faces faceIds, created by the Face - Detect. Each of the faces are identified independently. The valid number of faceIds is between [1, 10]. ')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getConfidenceThreshold()[source]
- Returns
Optional parameter. Customized identification confidence threshold, in the range [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note that there is no guarantee this threshold value will work on other data or after algorithm updates.
- Return type
confidenceThreshold
- getFaceIds()[source]
- Returns
Array of query face faceIds, created by Face - Detect. Each face is identified independently. The number of faceIds must be in the range [1, 10].
- Return type
faceIds
- getLargePersonGroupId()[source]
- Returns
largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
largePersonGroupId
- getMaxNumOfCandidatesReturned()[source]
- Returns
The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
- Return type
maxNumOfCandidatesReturned
- getPersonGroupId()[source]
- Returns
personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
personGroupId
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- largePersonGroupId = Param(parent='undefined', name='largePersonGroupId', doc='largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- maxNumOfCandidatesReturned = Param(parent='undefined', name='maxNumOfCandidatesReturned', doc='The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- personGroupId = Param(parent='undefined', name='personGroupId', doc='personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setConfidenceThreshold(value)[source]
- Parameters
confidenceThreshold – Optional parameter. Customized identification confidence threshold, in the range [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note that there is no guarantee this threshold value will work on other data or after algorithm updates.
- setConfidenceThresholdCol(value)[source]
- Parameters
confidenceThreshold – name of a column supplying this parameter per row. Optional parameter. Customized identification confidence threshold, in the range [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note that there is no guarantee this threshold value will work on other data or after algorithm updates.
- setFaceIds(value)[source]
- Parameters
faceIds – Array of query face faceIds, created by Face - Detect. Each face is identified independently. The number of faceIds must be in the range [1, 10].
- setFaceIdsCol(value)[source]
- Parameters
faceIds – name of a column supplying this parameter per row. Array of query face faceIds, created by Face - Detect. Each face is identified independently. The number of faceIds must be in the range [1, 10].
- setLargePersonGroupId(value)[source]
- Parameters
largePersonGroupId – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setLargePersonGroupIdCol(value)[source]
- Parameters
largePersonGroupId – name of a column supplying this parameter per row: largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setMaxNumOfCandidatesReturned(value)[source]
- Parameters
maxNumOfCandidatesReturned – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
- setMaxNumOfCandidatesReturnedCol(value)[source]
- Parameters
maxNumOfCandidatesReturned – name of a column supplying this parameter per row. The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
- setParams(concurrency=1, concurrentTimeout=None, confidenceThreshold=None, confidenceThresholdCol=None, errorCol='IdentifyFaces_adb21870ba92_error', faceIds=None, faceIdsCol=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, maxNumOfCandidatesReturned=None, maxNumOfCandidatesReturnedCol=None, outputCol='IdentifyFaces_adb21870ba92_output', personGroupId=None, personGroupIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPersonGroupId(value)[source]
- Parameters
personGroupId – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setPersonGroupIdCol(value)[source]
- Parameters
personGroupId – name of a column supplying this parameter per row: personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.KeyPhraseExtractor module
- class synapse.ml.cognitive.KeyPhraseExtractor.KeyPhraseExtractor(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='KeyPhraseExtractor_4d1f2700218e_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_4d1f2700218e_output', showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (object) – The name of the output column
showStats (object) – if set to true, response will contain input and document level statistics.
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
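To make the roles of text, language, modelVersion and showStats more concrete, here is a rough sketch of how they could map onto a Text Analytics style request: a JSON body with a `documents` array plus `model-version` and `showStats` query parameters. The function `build_request` is hypothetical, and the mapping is an assumption about the REST service rather than a description of what the transformer does internally.

```python
import json
from typing import Dict, List, Optional, Tuple


def build_request(
    texts: List[str],
    language: Optional[str] = None,
    model_version: Optional[str] = None,
    show_stats: Optional[bool] = None,
) -> Tuple[Dict[str, str], str]:
    """Return (query parameters, JSON body) for one batch of documents."""
    documents = []
    for i, text in enumerate(texts):
        doc = {"id": str(i), "text": text}
        if language is not None:  # the language code is optional for some services
            doc["language"] = language
        documents.append(doc)

    query: Dict[str, str] = {}
    if model_version is not None:
        query["model-version"] = model_version
    if show_stats is not None:
        query["showStats"] = "true" if show_stats else "false"
    return query, json.dumps({"documents": documents})


query, body = build_request(["Spark makes big data simple."], language="en", show_stats=True)
print(query)
print(body)
```

Leaving `model_version` unset keeps it out of the query entirely, which matches the documented behaviour of defaulting to the latest non-preview version.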
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – name of a column supplying this parameter per row: the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – name of a column supplying this parameter per row. This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='KeyPhraseExtractor_4d1f2700218e_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='KeyPhraseExtractor_4d1f2700218e_output', showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStats – name of a column supplying this parameter per row: if set to true, response will contain input and document level statistics.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='if set to true, response will contain input and document level statistics.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- text = Param(parent='undefined', name='text', doc='the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.KeyPhraseExtractorV2 module
- class synapse.ml.cognitive.KeyPhraseExtractorV2.KeyPhraseExtractorV2(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='KeyPhraseExtractorV2_4105fe2b3b07_error', handler=None, language=None, languageCol=None, outputCol='KeyPhraseExtractorV2_4105fe2b3b07_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
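The concurrency and concurrentTimeout parameters describe a bounded number of in-flight calls and a cap on how long to wait for each pending result. The sketch below illustrates that idea with Python's standard `concurrent.futures`; it is an analogy only, since the transformer itself runs on the JVM. `call_concurrently` and `fake_service` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional


def call_concurrently(
    handler: Callable[[str], str],
    requests: List[str],
    concurrency: int = 1,
    concurrent_timeout: Optional[float] = None,
) -> List[str]:
    """Run handler over requests with at most `concurrency` calls in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(handler, r) for r in requests]
        # result() raises concurrent.futures.TimeoutError if the result is not
        # available within concurrent_timeout seconds of the wait starting;
        # None means wait without a limit.
        return [f.result(timeout=concurrent_timeout) for f in futures]


def fake_service(text: str) -> str:
    """Stand-in for a remote call; it simply upper-cases its input."""
    return text.upper()


results = call_concurrently(fake_service, ["a", "b", "c"], concurrency=2, concurrent_timeout=5.0)
print(results)
```

Results are collected in submission order, so the output lines up with the input even when calls finish out of order.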
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='the language code of the text (optional for some services)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – name of a column supplying this parameter per row: the language code of the text (optional for some services)
- setParams(concurrency=1, concurrentTimeout=None, errorCol='KeyPhraseExtractorV2_4105fe2b3b07_error', handler=None, language=None, languageCol=None, outputCol='KeyPhraseExtractorV2_4105fe2b3b07_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- text = Param(parent='undefined', name='text', doc='the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.LanguageDetector module
- class synapse.ml.cognitive.LanguageDetector.LanguageDetector(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='LanguageDetector_9a89e32d8e2b_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_9a89e32d8e2b_output', showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (object) – The name of the output column
showStats (object) – if set to true, response will contain input and document level statistics.
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
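The constructor pairs most service parameters with a `...Col` variant (for example `language` and `languageCol`): one sets a single value for every row, the other names a column to read the value from. A minimal sketch of that distinction, using plain dictionaries as rows; `resolve_param` is a hypothetical helper, and rejecting both at once is an assumption of this sketch rather than documented behaviour.

```python
from typing import Any, Dict, Optional


def resolve_param(row: Dict[str, Any], scalar: Optional[Any] = None, col: Optional[str] = None) -> Any:
    """Return the per-row value of a service parameter.

    `scalar` is a constant applied to every row; `col` names a column to read from.
    """
    if scalar is not None and col is not None:
        # Assumption of this sketch: treat setting both as a caller error.
        raise ValueError("set either the scalar value or the column name, not both")
    if col is not None:
        return row[col]
    return scalar


rows = [
    {"text": "Bonjour tout le monde", "lang": "fr"},
    {"text": "Hello world", "lang": "en"},
]
print([resolve_param(r, col="lang") for r in rows])
print([resolve_param(r, scalar="en") for r in rows])
```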
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – name of a column supplying this parameter per row: the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – name of a column supplying this parameter per row. This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='LanguageDetector_9a89e32d8e2b_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='LanguageDetector_9a89e32d8e2b_output', showStats=None, showStatsCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStats – name of a column supplying this parameter per row: if set to true, response will contain input and document level statistics.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='if set to true, response will contain input and document level statistics.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- text = Param(parent='undefined', name='text', doc='the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.LanguageDetectorV2 module
- class synapse.ml.cognitive.LanguageDetectorV2.LanguageDetectorV2(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='LanguageDetectorV2_b13686d08326_error', handler=None, language=None, languageCol=None, outputCol='LanguageDetectorV2_b13686d08326_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='the language code of the text (optional for some services)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – name of a column supplying this parameter per row: the language code of the text (optional for some services)
- setParams(concurrency=1, concurrentTimeout=None, errorCol='LanguageDetectorV2_b13686d08326_error', handler=None, language=None, languageCol=None, outputCol='LanguageDetectorV2_b13686d08326_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- text = Param(parent='undefined', name='text', doc='the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.ListCustomModels module
- class synapse.ml.cognitive.ListCustomModels.ListCustomModels(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='ListCustomModels_e7006bf90e98_error', handler=None, op=None, opCol=None, outputCol='ListCustomModels_e7006bf90e98_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
op (object) – Specify whether to return summary or full list of models.
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- op = Param(parent='undefined', name='op', doc='Specify whether to return summary or full list of models.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='ListCustomModels_e7006bf90e98_error', handler=None, op=None, opCol=None, outputCol='ListCustomModels_e7006bf90e98_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.NER module
- class synapse.ml.cognitive.NER.NER(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='NER_8f7553b38eb5_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_8f7553b38eb5_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (object) – The name of the output column
showStats (object) – if set to true, response will contain input and document level statistics.
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
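The stringIndexType parameter matters because the same substring can start at different offsets depending on what is counted. The sketch below compares two common counting schemes, Unicode code points and UTF-16 code units, using only the standard library; the helper `offsets` is hypothetical. Grapheme (text element) counting, the documented default, needs a third-party library and is not computed here.

```python
def offsets(text: str, needle: str) -> dict:
    """Return the start offset of needle in text under two counting schemes."""
    start = text.index(needle)  # Python str indices count Unicode code points
    prefix = text[:start]
    return {
        "code_points": len(prefix),
        # UTF-16 uses two code units for characters outside the Basic Multilingual Plane
        "utf16_code_units": len(prefix.encode("utf-16-le")) // 2,
    }


# "\U0001F600" is a single emoji code point that occupies two UTF-16 code units.
print(offsets("\U0001F600 Seattle", "Seattle"))
```

For the sample string the entity starts at offset 2 when counting code points and at offset 3 when counting UTF-16 code units, which is why the offsets returned by the service should be read with the index type that was requested.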
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='NER_8f7553b38eb5_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='NER_8f7553b38eb5_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='if set to true, response will contain input and document level statistics.')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- text = Param(parent='undefined', name='text', doc='the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
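The stringIndexType parameter decides the unit in which string offsets are counted. The sketch below is plain Python rather than SynapseML code, and it only illustrates why the unit matters: the same word sits at a different offset when counted in Unicode code points than when counted in UTF-16 code units. The sample string and the helper name are illustrative assumptions.

```python
# Illustrative only: compare an offset measured in Unicode code points
# with the same position measured in UTF-16 code units.

def utf16_offset(text: str, codepoint_offset: int) -> int:
    """Convert a code-point offset into a UTF-16 code-unit offset."""
    prefix = text[:codepoint_offset]
    # Each UTF-16 code unit is 2 bytes; characters outside the BMP use 2 units.
    return len(prefix.encode("utf-16-le")) // 2

sample = "Hi \U0001F600 Bob"       # contains one non-BMP emoji
cp_offset = sample.index("Bob")    # Python strings are indexed by code point
u16_offset = utf16_offset(sample, cp_offset)
print(cp_offset, u16_offset)
```

If the offsets returned by the service look shifted after an emoji or other supplementary character, a unit mismatch of this kind is a likely cause.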
synapse.ml.cognitive.NERV2 module
- class synapse.ml.cognitive.NERV2.NERV2(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='NERV2_737e951eaa22_error', handler=None, language=None, languageCol=None, outputCol='NERV2_737e951eaa22_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='the language code of the text (optional for some services)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setParams(concurrency=1, concurrentTimeout=None, errorCol='NERV2_737e951eaa22_error', handler=None, language=None, languageCol=None, outputCol='NERV2_737e951eaa22_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- text = Param(parent='undefined', name='text', doc='the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
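As a minimal sketch of how the NERV2 constructor signature above might be used, the block below collects keyword arguments whose names all come from that signature. The key, URL, and column names are placeholders chosen for illustration, and the builder function is hypothetical; it imports SynapseML lazily because constructing the transformer needs SynapseML installed and an active Spark session.

```python
# Keyword arguments named after the NERV2 constructor signature.
# The key, URL, and column names are placeholders, not real values.
nerv2_params = {
    "subscriptionKey": "<your-api-key>",
    "url": "https://<your-endpoint>/",  # hypothetical placeholder
    "textCol": "text",                  # assumed input column name
    "language": "en",
    "outputCol": "entities",
    "errorCol": "errors",
    "concurrency": 2,
    "timeout": 30.0,
}

def build_nerv2(params):
    """Create the transformer; needs SynapseML and an active Spark session."""
    from synapse.ml.cognitive.NERV2 import NERV2  # imported lazily on purpose
    return NERV2(**params)

# Typical use (not executed here): build_nerv2(nerv2_params).transform(df)
```

Setting textCol reads the text from a DataFrame column per row, while text would apply one fixed value to every row; the sketch assumes the column form is what most pipelines want.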
synapse.ml.cognitive.OCR module
- class synapse.ml.cognitive.OCR.OCR(java_obj=None, concurrency=1, concurrentTimeout=None, detectOrientation=None, detectOrientationCol=None, errorCol='OCR_a56adf21dbf1_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='OCR_a56adf21dbf1_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
detectOrientation (object) – whether to detect image orientation prior to processing
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – the language to use
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- detectOrientation = Param(parent='undefined', name='detectOrientation', doc='whether to detect image orientation prior to processing')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getDetectOrientation()[source]
- Returns
whether to detect image orientation prior to processing
- Return type
detectOrientation
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- language = Param(parent='undefined', name='language', doc='the language to use')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setDetectOrientation(value)[source]
- Parameters
detectOrientation – whether to detect image orientation prior to processing
- setDetectOrientationCol(value)[source]
- Parameters
detectOrientation – whether to detect image orientation prior to processing
- setParams(concurrency=1, concurrentTimeout=None, detectOrientation=None, detectOrientationCol=None, errorCol='OCR_a56adf21dbf1_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='OCR_a56adf21dbf1_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
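The concurrency and concurrentTimeout parameters describe a cap on in-flight calls and a limit on how long to wait for each pending result. The library runs this logic on the JVM, so the block below is only a Python analogy built on the standard library, not the actual implementation; the function name and its arguments are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def run_concurrently(tasks, concurrency=1, concurrent_timeout=None):
    """Run zero-argument callables with at most `concurrency` in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(task) for task in tasks]
        # result(timeout=...) raises concurrent.futures.TimeoutError
        # when a task does not finish within the allowed time.
        return [f.result(timeout=concurrent_timeout) for f in futures]

results = run_concurrently([lambda: 1, lambda: 2, lambda: 3], concurrency=2)
```

Results come back in submission order, which mirrors the expectation that each output row lines up with its input row.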
synapse.ml.cognitive.PII module
- class synapse.ml.cognitive.PII.PII(java_obj=None, concurrency=1, concurrentTimeout=None, domain=None, domainCol=None, errorCol='PII_860b945b98ad_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_860b945b98ad_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
domain (object) – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
outputCol (object) – The name of the output column
piiCategories (object) – describes the PII categories to return
showStats (object) – if set to true, response will contain input and document level statistics.
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- domain = Param(parent='undefined', name='domain', doc="if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: 'PHI', 'none'.")
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getDomain()[source]
- Returns
if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- Return type
domain
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getPiiCategories()[source]
- Returns
describes the PII categories to return
- Return type
piiCategories
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- piiCategories = Param(parent='undefined', name='piiCategories', doc='describes the PII categories to return')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setDomain(value)[source]
- Parameters
domain – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- setDomainCol(value)[source]
- Parameters
domain – if specified, will set the PII domain to include only a subset of the entity categories. Possible values include: ‘PHI’, ‘none’.
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setParams(concurrency=1, concurrentTimeout=None, domain=None, domainCol=None, errorCol='PII_860b945b98ad_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, outputCol='PII_860b945b98ad_output', piiCategories=None, piiCategoriesCol=None, showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPiiCategoriesCol(value)[source]
- Parameters
piiCategories – describes the PII categories to return
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='if set to true, response will contain input and document level statistics.')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- text = Param(parent='undefined', name='text', doc='the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
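The domain description above lists 'PHI' and 'none' as possible values. A small, hypothetical helper like the one below can catch a mistyped domain before any request is sent. It is stricter than the documentation, which says the possible values "include" these two, so treat the check as an assumption and relax it if the service accepts more.

```python
# Values listed in the `domain` parameter description above.
DOCUMENTED_PII_DOMAINS = ("PHI", "none")

def pii_kwargs(text_col, domain="none", show_stats=False):
    """Build keyword arguments named after the PII constructor signature."""
    if domain not in DOCUMENTED_PII_DOMAINS:
        raise ValueError(
            f"domain must be one of {DOCUMENTED_PII_DOMAINS}, got {domain!r}"
        )
    return {"textCol": text_col, "domain": domain, "showStats": show_stats}

kwargs = pii_kwargs("text", domain="PHI")
# With SynapseML installed and Spark running, PII(**kwargs) would take these.
```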
synapse.ml.cognitive.ReadImage module
- class synapse.ml.cognitive.ReadImage.ReadImage(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='ReadImage_ee0a3d11c90f_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxPollingRetries=1000, outputCol='ReadImage_ee0a3d11c90f_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
maxPollingRetries (int) – number of times to poll
outputCol (object) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- Return type
language
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- language = Param(parent='undefined', name='language', doc='The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- setLanguageCol(value)[source]
- Parameters
language – The BCP-47 language code of the text in the document. Currently, only English (en), Dutch (nl), French (fr), German (de), Italian (it), Portuguese (pt), and Spanish (es) are supported. Read supports auto language identification and multilanguage documents, so only provide a language code if you would like to force the document to be processed as that specific language.
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='ReadImage_ee0a3d11c90f_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, maxPollingRetries=1000, outputCol='ReadImage_ee0a3d11c90f_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
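ReadImage exposes pollingDelay (milliseconds between polls) and maxPollingRetries (number of polls), which suggests a submit-then-poll pattern. The block below is a generic Python sketch of such a loop under that assumption, not the library's implementation; the function name, the check callback, and the injectable sleep argument are all hypothetical.

```python
import time

def poll_until_done(check, polling_delay_ms=300, max_polling_retries=1000,
                    sleep=time.sleep):
    """Call check() until it returns a non-None result or retries run out."""
    for _ in range(max_polling_retries):
        result = check()
        if result is not None:
            return result
        sleep(polling_delay_ms / 1000.0)  # pollingDelay is in milliseconds
    raise TimeoutError("operation did not finish within the polling budget")

# Upper bound on time spent sleeping with the documented defaults, in seconds.
max_wait_seconds = 1000 * 300 / 1000.0
```

Under these assumptions the default values allow roughly five minutes of waiting per image, so lowering maxPollingRetries is one way to fail faster on stuck requests.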
synapse.ml.cognitive.RecognizeDomainSpecificContent module
- class synapse.ml.cognitive.RecognizeDomainSpecificContent.RecognizeDomainSpecificContent(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='RecognizeDomainSpecificContent_ff840cfd4398_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, model=None, modelCol=None, outputCol='RecognizeDomainSpecificContent_ff840cfd4398_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
model (object) – the domain specific model: celebrities, landmarks
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- model = Param(parent='undefined', name='model', doc='the domain specific model: celebrities, landmarks')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='RecognizeDomainSpecificContent_ff840cfd4398_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, model=None, modelCol=None, outputCol='RecognizeDomainSpecificContent_ff840cfd4398_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.RecognizeText module
- class synapse.ml.cognitive.RecognizeText.RecognizeText(java_obj=None, backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='RecognizeText_930f597860ef_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, maxPollingRetries=1000, mode=None, modeCol=None, outputCol='RecognizeText_930f597860ef_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases:
synapse.ml.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
maxPollingRetries (int) – number of times to poll
mode (object) – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
outputCol (object) – The name of the output column
pollingDelay (int) – number of milliseconds to wait between polling
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getMode()[source]
- Returns
If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- Return type
mode
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- mode = Param(parent='undefined', name='mode', doc="If this parameter is set to 'Printed', printed text recognition is performed. If 'Handwritten' is specified, handwriting recognition is performed")
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setMode(value)[source]
- Parameters
mode – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- setModeCol(value)[source]
- Parameters
mode – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- setParams(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=None, errorCol='RecognizeText_930f597860ef_error', imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, maxPollingRetries=1000, mode=None, modeCol=None, outputCol='RecognizeText_930f597860ef_output', pollingDelay=300, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay – number of milliseconds to wait between polling
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
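The backoffs parameter is described only as an "array of backoffs to use in the handler", with a default of [100, 500, 1000]. One plausible reading, sketched below in plain Python, is a retry loop that waits for each listed delay in turn before trying again. The millisecond unit, the one-retry-per-entry behavior, and the use of ConnectionError as the retryable failure are all assumptions, and the function name is hypothetical.

```python
import time

def call_with_backoffs(send, backoffs_ms=(100, 500, 1000), sleep=time.sleep):
    """Try send() once, then once more after each backoff delay."""
    last_error = None
    for delay_ms in (0, *backoffs_ms):
        if delay_ms:
            sleep(delay_ms / 1000.0)  # assumed unit: milliseconds
        try:
            return send()
        except ConnectionError as err:  # stand-in for a retryable failure
            last_error = err
    # Every attempt failed, so surface the most recent error.
    raise last_error
```

With three backoff entries this makes at most four attempts, so a longer list trades latency for resilience.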
synapse.ml.cognitive.SimpleDetectAnomalies module
- class synapse.ml.cognitive.SimpleDetectAnomalies.SimpleDetectAnomalies(java_obj=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='SimpleDetectAnomalies_ea50daf23243_error', granularity=None, granularityCol=None, groupbyCol=None, handler=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='SimpleDetectAnomalies_ea50daf23243_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, timestampCol='timestamp', url=None, valueCol='value')[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval (object) – Sets a non-standard time interval. For example, if the series has a 5-minute interval, the request can set granularity=minutely and customInterval=5.
errorCol (object) – column to hold http errors
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
groupbyCol (object) – column that groups the series
handler (object) – Which strategy to use when handling requests
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (object) – The name of the output column
period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
timestampCol (object) – column representing the time of the series
url (object) – Url of the service
valueCol (object) – column representing the value of the series
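Because the series parameter requires points sorted by ascending timestamp with no duplicates, it can help to check the data before calling the service. The sketch below is a hypothetical pre-flight check, not part of SynapseML: validate_series is an invented name, and it assumes each point is a (timestamp, value) pair whose timestamps compare correctly with < (for example datetime objects, or ISO 8601 strings in one consistent format).

```python
from typing import Any, Iterable, Tuple


def validate_series(points: Iterable[Tuple[Any, float]]) -> None:
    """Raise ValueError unless timestamps are strictly ascending.

    Hypothetical helper: mirrors the two failure cases named in the
    series description (unsorted data, duplicated timestamp).
    """
    previous = None
    for timestamp, _value in points:
        if previous is not None:
            if timestamp == previous:
                raise ValueError(f"duplicated timestamp: {timestamp}")
            if timestamp < previous:
                raise ValueError(f"timestamp out of order: {timestamp}")
        previous = timestamp


# A well-formed daily series passes silently.
validate_series([("2021-01-01", 1.0), ("2021-01-02", 1.5), ("2021-01-03", 0.9)])
```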
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc=' Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomInterval()[source]
- Returns
Sets a non-standard time interval. For example, if the series has a 5-minute interval, the request can set granularity=minutely and customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin, which means fewer anomalies will be accepted.
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getTimestampCol()[source]
- Returns
column representing the time of the series
- Return type
timestampCol
- granularity = Param(parent='undefined', name='granularity', doc=' Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- groupbyCol = Param(parent='undefined', name='groupbyCol', doc='column that groups the series')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc=' Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc=' Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc=' Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc=' Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setCustomInterval(value)[source]
- Parameters
customInterval – Sets a non-standard time interval. For example, if the series has a 5-minute interval, the request can set granularity=minutely and customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval – Sets a non-standard time interval. For example, if the series has a 5-minute interval, the request can set granularity=minutely and customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='SimpleDetectAnomalies_ea50daf23243_error', granularity=None, granularityCol=None, groupbyCol=None, handler=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='SimpleDetectAnomalies_ea50daf23243_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, timestampCol='timestamp', url=None, valueCol='value')[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setPeriodCol(value)[source]
- Parameters
period – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin, which means fewer anomalies will be accepted.
- setSensitivityCol(value)[source]
- Parameters
sensitivity – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin, which means fewer anomalies will be accepted.
- setSeries(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- setSeriesCol(value)[source]
- Parameters
series – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains a duplicated timestamp, the API will not work and an error message will be returned.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setTimestampCol(value)[source]
- Parameters
timestampCol – column representing the time of the series
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- timestampCol = Param(parent='undefined', name='timestampCol', doc='column representing the time of the series')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- valueCol = Param(parent='undefined', name='valueCol', doc='column representing the value of the series')
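The customInterval description for this class gives one example: a 5-minute series maps to granularity=minutely with customInterval=5. The sketch below generalizes that example under stated assumptions and is not part of SynapseML: granularity_for_step is a hypothetical helper, it only covers the fixed-length units (weekly, daily, hourly, minutely), and treating other steps by analogy (for instance a 2-hour series as hourly with customInterval=2) is an assumption rather than documented behavior.

```python
from typing import Tuple

# Fixed-length granularities, largest first. monthly and yearly are left
# out because their length in seconds is not constant.
_UNIT_SECONDS = [
    ("weekly", 7 * 24 * 3600),
    ("daily", 24 * 3600),
    ("hourly", 3600),
    ("minutely", 60),
]


def granularity_for_step(step_seconds: int) -> Tuple[str, int]:
    """Return (granularity, customInterval) for a fixed step in seconds.

    Hypothetical helper: picks the largest unit that divides the step.
    """
    if step_seconds <= 0:
        raise ValueError("step must be positive")
    for name, unit in _UNIT_SECONDS:
        if step_seconds % unit == 0:
            return name, step_seconds // unit
    raise ValueError("step is not a whole number of minutes")


# The documented example: a 5-minute series.
print(granularity_for_step(5 * 60))  # ('minutely', 5)
```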
synapse.ml.cognitive.SpeechToText module
- class synapse.ml.cognitive.SpeechToText.SpeechToText(java_obj=None, audioData=None, audioDataCol=None, concurrency=1, concurrentTimeout=None, errorCol='SpeechToText_8b2fd21f0b36_error', format=None, formatCol=None, handler=None, language=None, languageCol=None, outputCol='SpeechToText_8b2fd21f0b36_output', profanity=None, profanityCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
audioData (object) – The data sent to the service must be a .wav file
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
handler (object) – Which strategy to use when handling requests
language (object) – Identifies the spoken language that is being recognized.
outputCol (object) – The name of the output column
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
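The format and profanity parameters each accept a small fixed set of values with a documented default. A minimal sketch of checking them before building a transformer follows; resolve_speech_options is a hypothetical helper, not a SynapseML function, and the accepted values and defaults are copied from the parameter descriptions above.

```python
from typing import Dict, Optional

FORMATS = {"simple", "detailed"}
PROFANITY_MODES = {"masked", "removed", "raw"}


def resolve_speech_options(
    format: Optional[str] = None, profanity: Optional[str] = None
) -> Dict[str, str]:
    """Apply the documented defaults and reject unknown values.

    Hypothetical helper for illustration only.
    """
    fmt = "simple" if format is None else format
    mode = "masked" if profanity is None else profanity
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {sorted(FORMATS)}, got {fmt!r}")
    if mode not in PROFANITY_MODES:
        raise ValueError(
            f"profanity must be one of {sorted(PROFANITY_MODES)}, got {mode!r}"
        )
    return {"format": fmt, "profanity": mode}


print(resolve_speech_options(profanity="raw"))
```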
- audioData = Param(parent='undefined', name='audioData', doc=' The data sent to the service must be a .wav files ')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- format = Param(parent='undefined', name='format', doc=' Specifies the result format. Accepted values are simple and detailed. Default is simple. ')
- getAudioData()[source]
- Returns
The data sent to the service must be a .wav file
- Return type
audioData
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFormat()[source]
- Returns
Specifies the result format. Accepted values are simple and detailed. Default is simple.
- Return type
format
- getLanguage()[source]
- Returns
Identifies the spoken language that is being recognized.
- Return type
language
- getProfanity()[source]
- Returns
Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- Return type
profanity
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc=' Identifies the spoken language that is being recognized. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- profanity = Param(parent='undefined', name='profanity', doc=' Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which remove all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked. ')
- setAudioData(value)[source]
- Parameters
audioData – The data sent to the service must be a .wav file
- setAudioDataCol(value)[source]
- Parameters
audioData – The data sent to the service must be a .wav file
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFormat(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setFormatCol(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setLanguage(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setLanguageCol(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setParams(audioData=None, audioDataCol=None, concurrency=1, concurrentTimeout=None, errorCol='SpeechToText_8b2fd21f0b36_error', format=None, formatCol=None, handler=None, language=None, languageCol=None, outputCol='SpeechToText_8b2fd21f0b36_output', profanity=None, profanityCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setProfanity(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setProfanityCol(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.SpeechToTextSDK module
- class synapse.ml.cognitive.SpeechToTextSDK.SpeechToTextSDK(java_obj=None, audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
audioDataCol (object) – Column holding audio data, must be either ByteArrays or Strings representing file URIs
endpointId (object) – endpoint for custom speech models
extraFfmpegArgs (list) – extra arguments for ffmpeg output decoding
fileType (object) – The file type of the sound files, supported types: wav, ogg, mp3
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
language (object) – Identifies the spoken language that is being recognized.
outputCol (object) – The name of the output column
participantsJson (object) – a json representation of a list of conversation participants (email, language, user)
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
recordAudioData (bool) – Whether to record audio data to a file location, for use only with m3u8 streams
recordedFileNameCol (object) – Column holding file names to write audio data to if recordAudioData is set to true
streamIntermediateResults (bool) – Whether to return intermediate results immediately, or group them in a sequence
subscriptionKey (object) – the API key to use
url (object) – Url of the service
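The participantsJson parameter expects a JSON string describing a list of conversation participants. The sketch below shows one way to build such a string with the standard json module. It is an illustration under assumptions: build_participants_json is a hypothetical helper, and the key names email, language and user are taken from the parenthetical in the parameter description, so the exact schema the service expects should be confirmed separately.

```python
import json
from typing import Dict, Iterable

# Assumed key names, based on "(email, language, user)" in the description.
REQUIRED_KEYS = ("email", "language", "user")


def build_participants_json(participants: Iterable[Dict[str, str]]) -> str:
    """Serialize participants to a JSON list, keeping only the assumed keys."""
    cleaned = []
    for index, participant in enumerate(participants):
        missing = [key for key in REQUIRED_KEYS if key not in participant]
        if missing:
            raise ValueError(f"participant {index} is missing {missing}")
        cleaned.append({key: participant[key] for key in REQUIRED_KEYS})
    return json.dumps(cleaned)


# Example values are placeholders, not real accounts.
print(build_participants_json(
    [{"email": "a@example.com", "language": "en-US", "user": "speaker-a"}]
))
```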
- audioDataCol = Param(parent='undefined', name='audioDataCol', doc='Column holding audio data, must be either ByteArrays or Strings representing file URIs')
- endpointId = Param(parent='undefined', name='endpointId', doc='endpoint for custom speech models')
- extraFfmpegArgs = Param(parent='undefined', name='extraFfmpegArgs', doc='extra arguments to for ffmpeg output decoding')
- fileType = Param(parent='undefined', name='fileType', doc='The file type of the sound files, supported types: wav, ogg, mp3')
- format = Param(parent='undefined', name='format', doc=' Specifies the result format. Accepted values are simple and detailed. Default is simple. ')
- getAudioDataCol()[source]
- Returns
Column holding audio data, must be either ByteArrays or Strings representing file URIs
- Return type
audioDataCol
- getExtraFfmpegArgs()[source]
- Returns
extra arguments for ffmpeg output decoding
- Return type
extraFfmpegArgs
- getFileType()[source]
- Returns
The file type of the sound files, supported types: wav, ogg, mp3
- Return type
fileType
- getFormat()[source]
- Returns
Specifies the result format. Accepted values are simple and detailed. Default is simple.
- Return type
format
- getLanguage()[source]
- Returns
Identifies the spoken language that is being recognized.
- Return type
language
- getParticipantsJson()[source]
- Returns
a json representation of a list of conversation participants (email, language, user)
- Return type
participantsJson
- getProfanity()[source]
- Returns
Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- Return type
profanity
- getRecordAudioData()[source]
- Returns
Whether to record audio data to a file location, for use only with m3u8 streams
- Return type
recordAudioData
- getRecordedFileNameCol()[source]
- Returns
Column holding file names to write audio data to if recordAudioData is set to true
- Return type
recordedFileNameCol
- getStreamIntermediateResults()[source]
- Returns
Whether to return intermediate results immediately, or group them in a sequence
- Return type
streamIntermediateResults
- language = Param(parent='undefined', name='language', doc=' Identifies the spoken language that is being recognized. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- participantsJson = Param(parent='undefined', name='participantsJson', doc='a json representation of a list of conversation participants (email, language, user)')
- profanity = Param(parent='undefined', name='profanity', doc=' Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which remove all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked. ')
- recordAudioData = Param(parent='undefined', name='recordAudioData', doc='Whether to record audio data to a file location, for use only with m3u8 streams')
- recordedFileNameCol = Param(parent='undefined', name='recordedFileNameCol', doc="Column holding file names to write audio data to if ``recordAudioData'' is set to true")
- setAudioDataCol(value)[source]
- Parameters
audioDataCol – Column holding audio data, must be either ByteArrays or Strings representing file URIs
- setExtraFfmpegArgs(value)[source]
- Parameters
extraFfmpegArgs – extra arguments for ffmpeg output decoding
- setFileType(value)[source]
- Parameters
fileType – The file type of the sound files, supported types: wav, ogg, mp3
- setFileTypeCol(value)[source]
- Parameters
fileType – The file type of the sound files, supported types: wav, ogg, mp3
- setFormat(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setFormatCol(value)[source]
- Parameters
format – Specifies the result format. Accepted values are simple and detailed. Default is simple.
- setLanguage(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setLanguageCol(value)[source]
- Parameters
language – Identifies the spoken language that is being recognized.
- setParams(audioDataCol=None, endpointId=None, extraFfmpegArgs=[], fileType=None, fileTypeCol=None, format=None, formatCol=None, language=None, languageCol=None, outputCol=None, participantsJson=None, participantsJsonCol=None, profanity=None, profanityCol=None, recordAudioData=False, recordedFileNameCol=None, streamIntermediateResults=True, subscriptionKey=None, subscriptionKeyCol=None, url=None)[source]
Set the (keyword only) parameters
- setParticipantsJson(value)[source]
- Parameters
participantsJson – a json representation of a list of conversation participants (email, language, user)
- setParticipantsJsonCol(value)[source]
- Parameters
participantsJson – a json representation of a list of conversation participants (email, language, user)
- setProfanity(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setProfanityCol(value)[source]
- Parameters
profanity – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks; removed, which removes all profanity from the result; or raw, which includes the profanity in the result. The default setting is masked.
- setRecordAudioData(value)[source]
- Parameters
recordAudioData – Whether to record audio data to a file location, for use only with m3u8 streams
- setRecordedFileNameCol(value)[source]
- Parameters
recordedFileNameCol – Column holding file names to write audio data to if recordAudioData is set to true
- setStreamIntermediateResults(value)[source]
- Parameters
streamIntermediateResults – Whether to return intermediate results immediately, or group them in a sequence
- streamIntermediateResults = Param(parent='undefined', name='streamIntermediateResults', doc='Whether or not to immediately return itermediate results, or group in a sequence')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.TagImage module
- class synapse.ml.cognitive.TagImage.TagImage(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='TagImage_6dc9b2588e33_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='TagImage_6dc9b2588e33_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – The desired language for output generation.
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imageBytes = Param(parent='undefined', name='imageBytes', doc='bytestream of the image to use')
- imageUrl = Param(parent='undefined', name='imageUrl', doc='the url of the image to use')
- language = Param(parent='undefined', name='language', doc='The desired language for output generation.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setParams(concurrency=1, concurrentTimeout=None, errorCol='TagImage_6dc9b2588e33_error', handler=None, imageBytes=None, imageBytesCol=None, imageUrl=None, imageUrlCol=None, language=None, languageCol=None, outputCol='TagImage_6dc9b2588e33_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
synapse.ml.cognitive.TextSentiment module
- class synapse.ml.cognitive.TextSentiment.TextSentiment(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='TextSentiment_71e3ce683c70_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_71e3ce683c70_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold http errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
modelVersion (object) – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
opinionMining (object) – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
outputCol (object) – The name of the output column
showStats (object) – if set to true, response will contain input and document level statistics.
stringIndexType (object) – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (object) – Url of the service
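The stringIndexType parameter matters because the same text has different lengths, and therefore different offsets, depending on the unit used to count characters. The sketch below compares two common units, Unicode code points and UTF-16 code units, using only the Python standard library. The helper names are hypothetical, and counting grapheme-based text elements (the documented default) is not shown because it needs a third-party library.

```python
def code_point_length(text: str) -> int:
    """Length in Unicode code points (what Python's len() counts)."""
    return len(text)


def utf16_code_unit_length(text: str) -> int:
    """Length in UTF-16 code units (two bytes per unit, no byte-order mark)."""
    return len(text.encode("utf-16-le")) // 2


def utf16_offset(text: str, code_point_offset: int) -> int:
    """Convert a code point offset into the matching UTF-16 code unit offset."""
    return utf16_code_unit_length(text[:code_point_offset])


sample = "I \U0001F600 it"  # contains one emoji outside the Basic Multilingual Plane
start = sample.index("it")  # offset of "it" in code points
print(code_point_length(sample), utf16_code_unit_length(sample))
print(start, utf16_offset(sample, start))
```

The emoji occupies one code point but two UTF-16 code units, so every offset after it shifts by one between the two schemes.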
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getModelVersion()[source]
- Returns
This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- Return type
modelVersion
- getOpinionMining()[source]
- Returns
if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- Return type
opinionMining
- getShowStats()[source]
- Returns
if set to true, response will contain input and document level statistics.
- Return type
showStats
- getStringIndexType()[source]
- Returns
Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- Return type
stringIndexType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='the language code of the text (optional for some services)')
- modelVersion = Param(parent='undefined', name='modelVersion', doc='This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.')
- opinionMining = Param(parent='undefined', name='opinionMining', doc='if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setModelVersion(value)[source]
- Parameters
modelVersion – This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setModelVersionCol(value)[source]
- Parameters
modelVersionCol – name of the column that supplies modelVersion for each row. This value indicates which model will be used for scoring. If a model-version is not specified, the API should default to the latest, non-preview version.
- setOpinionMining(value)[source]
- Parameters
opinionMining – if set to true, response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- setOpinionMiningCol(value)[source]
- Parameters
opinionMiningCol – name of the column that supplies opinionMining for each row. If set to true, the response will contain not only sentiment prediction but also opinion mining (aspect-based sentiment analysis) results.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='TextSentiment_71e3ce683c70_error', handler=None, language=None, languageCol=None, modelVersion=None, modelVersionCol=None, opinionMining=None, opinionMiningCol=None, outputCol='TextSentiment_71e3ce683c70_output', showStats=None, showStatsCol=None, stringIndexType=None, stringIndexTypeCol=None, subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setShowStats(value)[source]
- Parameters
showStats – if set to true, response will contain input and document level statistics.
- setShowStatsCol(value)[source]
- Parameters
showStatsCol – name of the column that supplies showStats for each row. If set to true, the response will contain input and document level statistics.
- setStringIndexType(value)[source]
- Parameters
stringIndexType – Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setStringIndexTypeCol(value)[source]
- Parameters
stringIndexTypeCol – name of the column that supplies stringIndexType for each row. Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- showStats = Param(parent='undefined', name='showStats', doc='if set to true, response will contain input and document level statistics.')
- stringIndexType = Param(parent='undefined', name='stringIndexType', doc='Specifies the method used to interpret string offsets. Defaults to Text Elements (Graphemes) according to Unicode v8.0.0. For additional information see https://aka.ms/text-analytics-offsets')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- text = Param(parent='undefined', name='text', doc='the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
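The stringIndexType parameter matters whenever the text contains characters outside the Basic Multilingual Plane, because the same substring then sits at a different offset depending on the unit used for counting. The sketch below is plain Python and not part of synapse.ml; the helper name offsets_of is hypothetical, and it compares only Unicode code points with UTF-16 code units (counting graphemes would need a third-party library, so it is left out).

```python
def offsets_of(text: str, needle: str) -> dict:
    """Return the start offset of `needle` in `text` under two counting schemes."""
    # Python strings are indexed by Unicode code point.
    code_point_offset = text.index(needle)
    prefix = text[:code_point_offset]
    # Each UTF-16 code unit is 2 bytes; characters above U+FFFF need two units.
    utf16_offset = len(prefix.encode("utf-16-le")) // 2
    return {"code_point": code_point_offset, "utf16_code_unit": utf16_offset}


# U+1F600 is a single code point but two UTF-16 code units.
sample = "I \U0001F600 this phone"
result = offsets_of(sample, "phone")
print(result)  # {'code_point': 9, 'utf16_code_unit': 10}
```

If the offsets returned by the service are used to slice Python strings, a code-point based setting avoids the off-by-one seen here.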
synapse.ml.cognitive.TextSentimentV2 module
- class synapse.ml.cognitive.TextSentimentV2.TextSentimentV2(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='TextSentimentV2_1a77b3dc7ee7_error', handler=None, language=None, languageCol=None, outputCol='TextSentimentV2_1a77b3dc7ee7_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold HTTP errors
handler (object) – Which strategy to use when handling requests
language (object) – the language code of the text (optional for some services)
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
text (object) – the text in the request body
timeout (float) – number of seconds to wait before closing the connection
url (object) – URL of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
the language code of the text (optional for some services)
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='the language code of the text (optional for some services)')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – the language code of the text (optional for some services)
- setLanguageCol(value)[source]
- Parameters
languageCol – name of the column that supplies the language code of the text for each row (optional for some services)
- setParams(concurrency=1, concurrentTimeout=None, errorCol='TextSentimentV2_1a77b3dc7ee7_error', handler=None, language=None, languageCol=None, outputCol='TextSentimentV2_1a77b3dc7ee7_output', subscriptionKey=None, subscriptionKeyCol=None, text=None, textCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- text = Param(parent='undefined', name='text', doc='the text in the request body')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
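Several parameters come in pairs, such as setLanguage and setLanguageCol. As a rough mental model (an assumption about the naming convention, not the library's implementation), the plain setter fixes one value for every row, while the Col setter names a column whose value is read row by row. The helper resolve_param below is hypothetical and uses plain dicts in place of Spark rows.

```python
from typing import Any, Mapping, Optional


def resolve_param(row: Mapping[str, Any],
                  value: Optional[Any] = None,
                  col: Optional[str] = None) -> Any:
    """Pick the parameter value to use for one row.

    If a column name is given, read the value from that row; otherwise fall
    back to the fixed value (None means the parameter is not set). Preferring
    the column when both are given is a choice made for this sketch only.
    """
    if col is not None:
        return row[col]
    return value


rows = [
    {"text": "Great service", "lang": "en"},
    {"text": "Service impeccable", "lang": "fr"},
]

fixed = [resolve_param(r, value="en") for r in rows]    # like setLanguage("en")
per_row = [resolve_param(r, col="lang") for r in rows]  # like setLanguageCol("lang")
print(fixed, per_row)
```

The column form is the one to reach for when a single DataFrame mixes inputs that need different settings.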
synapse.ml.cognitive.Translate module
- class synapse.ml.cognitive.Translate.Translate(java_obj=None, allowFallback=None, allowFallbackCol=None, category=None, categoryCol=None, concurrency=1, concurrentTimeout=None, errorCol='Translate_e4ba15333d36_error', fromLanguage=None, fromLanguageCol=None, fromScript=None, fromScriptCol=None, handler=None, includeAlignment=None, includeAlignmentCol=None, includeSentenceLength=None, includeSentenceLengthCol=None, outputCol='Translate_e4ba15333d36_output', profanityAction=None, profanityActionCol=None, profanityMarker=None, profanityMarkerCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, suggestedFrom=None, suggestedFromCol=None, text=None, textCol=None, textType=None, textTypeCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, toScript=None, toScriptCol=None, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
allowFallback (object) – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
category (object) – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold HTTP errors
fromLanguage (object) – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
fromScript (object) – Specifies the script of the input text.
handler (object) – Which strategy to use when handling requests
includeAlignment (object) – Specifies whether to include alignment projection from source text to translated text.
includeSentenceLength (object) – Specifies whether to include sentence boundaries for the input text and the translated text.
outputCol (object) – The name of the output column
profanityAction (object) – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
profanityMarker (object) – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
suggestedFrom (object) – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
text (object) – the string to translate
textType (object) – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
timeout (float) – number of seconds to wait before closing the connection
toLanguage (object) – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de&to=it to translate to German and Italian.
toScript (object) – Specifies the script of the translated text.
url (object) – URL of the service
- allowFallback = Param(parent='undefined', name='allowFallback', doc='Specifies that the service is allowed to fall back to a general system when a custom system does not exist. ')
- category = Param(parent='undefined', name='category', doc='A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromLanguage = Param(parent='undefined', name='fromLanguage', doc='Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.')
- fromScript = Param(parent='undefined', name='fromScript', doc='Specifies the script of the input text.')
- getAllowFallback()[source]
- Returns
Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- Return type
allowFallback
- getCategory()[source]
- Returns
A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- Return type
category
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getFromLanguage()[source]
- Returns
Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- Return type
fromLanguage
- getIncludeAlignment()[source]
- Returns
Specifies whether to include alignment projection from source text to translated text.
- Return type
includeAlignment
- getIncludeSentenceLength()[source]
- Returns
Specifies whether to include sentence boundaries for the input text and the translated text.
- Return type
includeSentenceLength
- getProfanityAction()[source]
- Returns
Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- Return type
profanityAction
- getProfanityMarker()[source]
- Returns
Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- Return type
profanityMarker
- getSuggestedFrom()[source]
- Returns
Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- Return type
suggestedFrom
- getTextType()[source]
- Returns
Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- Return type
textType
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getToLanguage()[source]
- Returns
Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de&to=it to translate to German and Italian.
- Return type
toLanguage
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- includeAlignment = Param(parent='undefined', name='includeAlignment', doc='Specifies whether to include alignment projection from source text to translated text.')
- includeSentenceLength = Param(parent='undefined', name='includeSentenceLength', doc='Specifies whether to include sentence boundaries for the input text and the translated text. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- profanityAction = Param(parent='undefined', name='profanityAction', doc='Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted. ')
- profanityMarker = Param(parent='undefined', name='profanityMarker', doc='Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.')
- setAllowFallback(value)[source]
- Parameters
allowFallback – Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- setAllowFallbackCol(value)[source]
- Parameters
allowFallbackCol – name of the column that supplies allowFallback for each row. Specifies that the service is allowed to fall back to a general system when a custom system does not exist.
- setCategory(value)[source]
- Parameters
category – A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- setCategoryCol(value)[source]
- Parameters
categoryCol – name of the column that supplies category for each row. A string specifying the category (domain) of the translation. This parameter is used to get translations from a customized system built with Custom Translator. Add the Category ID from your Custom Translator project details to this parameter to use your deployed customized system. Default value is: general.
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFromLanguage(value)[source]
- Parameters
fromLanguage – Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- setFromLanguageCol(value)[source]
- Parameters
fromLanguageCol – name of the column that supplies fromLanguage for each row. Specifies the language of the input text. Find which languages are available to translate from by looking up supported languages using the translation scope. If the from parameter is not specified, automatic language detection is applied to determine the source language. You must use the from parameter rather than autodetection when using the dynamic dictionary feature.
- setIncludeAlignment(value)[source]
- Parameters
includeAlignment – Specifies whether to include alignment projection from source text to translated text.
- setIncludeAlignmentCol(value)[source]
- Parameters
includeAlignmentCol – name of the column that supplies includeAlignment for each row. Specifies whether to include alignment projection from source text to translated text.
- setIncludeSentenceLength(value)[source]
- Parameters
includeSentenceLength – Specifies whether to include sentence boundaries for the input text and the translated text.
- setIncludeSentenceLengthCol(value)[source]
- Parameters
includeSentenceLengthCol – name of the column that supplies includeSentenceLength for each row. Specifies whether to include sentence boundaries for the input text and the translated text.
- setParams(allowFallback=None, allowFallbackCol=None, category=None, categoryCol=None, concurrency=1, concurrentTimeout=None, errorCol='Translate_e4ba15333d36_error', fromLanguage=None, fromLanguageCol=None, fromScript=None, fromScriptCol=None, handler=None, includeAlignment=None, includeAlignmentCol=None, includeSentenceLength=None, includeSentenceLengthCol=None, outputCol='Translate_e4ba15333d36_output', profanityAction=None, profanityActionCol=None, profanityMarker=None, profanityMarkerCol=None, subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, suggestedFrom=None, suggestedFromCol=None, text=None, textCol=None, textType=None, textTypeCol=None, timeout=60.0, toLanguage=None, toLanguageCol=None, toScript=None, toScriptCol=None, url=None)[source]
Set the (keyword only) parameters
- setProfanityAction(value)[source]
- Parameters
profanityAction – Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- setProfanityActionCol(value)[source]
- Parameters
profanityActionCol – name of the column that supplies profanityAction for each row. Specifies how profanities should be treated in translations. Possible values are: NoAction (default), Marked or Deleted.
- setProfanityMarker(value)[source]
- Parameters
profanityMarker – Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- setProfanityMarkerCol(value)[source]
- Parameters
profanityMarkerCol – name of the column that supplies profanityMarker for each row. Specifies how profanities should be marked in translations. Possible values are: Asterisk (default) or Tag.
- setSuggestedFrom(value)[source]
- Parameters
suggestedFrom – Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- setSuggestedFromCol(value)[source]
- Parameters
suggestedFromCol – name of the column that supplies suggestedFrom for each row. Specifies a fallback language if the language of the input text can’t be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.
- setTextType(value)[source]
- Parameters
textType – Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- setTextTypeCol(value)[source]
- Parameters
textTypeCol – name of the column that supplies textType for each row. Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- setToLanguage(value)[source]
- Parameters
toLanguage – Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de&to=it to translate to German and Italian.
- setToLanguageCol(value)[source]
- Parameters
toLanguageCol – name of the column that supplies toLanguage for each row. Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It’s possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de&to=it to translate to German and Italian.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='the API region to use')
- suggestedFrom = Param(parent='undefined', name='suggestedFrom', doc="Specifies a fallback language if the language of the input text can't be identified. Language autodetection is applied when the from parameter is omitted. If detection fails, the suggestedFrom language will be assumed.")
- text = Param(parent='undefined', name='text', doc='the string to translate')
- textType = Param(parent='undefined', name='textType', doc='Defines whether the text being translated is plain text or HTML text. Any HTML needs to be a well-formed, complete element. Possible values are: plain (default) or html.')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toLanguage = Param(parent='undefined', name='toLanguage', doc="Specifies the language of the output text. The target language must be one of the supported languages included in the translation scope. For example, use to=de to translate to German. It's possible to translate to multiple languages simultaneously by repeating the parameter in the query string. For example, use to=de&to=it to translate to German and Italian.")
- toScript = Param(parent='undefined', name='toScript', doc='Specifies the script of the translated text.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
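The toLanguage description mentions repeating the to parameter in the query string (to=de&to=it). The following standard-library sketch shows how such a query string can be produced; it only illustrates the wire format described above and does not call synapse.ml or the service. The function name translate_query is hypothetical.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode


def translate_query(to_languages: Iterable[str],
                    from_language: Optional[str] = None) -> str:
    """Build a query string in which `to` is repeated once per target language."""
    params = {}
    if from_language is not None:
        # Omitting `from` leaves source-language detection to the service.
        params["from"] = from_language
    params["to"] = list(to_languages)
    # doseq=True expands the list into repeated key=value pairs.
    return urlencode(params, doseq=True)


print(translate_query(["de", "it"]))                      # to=de&to=it
print(translate_query(["de", "it"], from_language="en"))  # from=en&to=de&to=it
```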
synapse.ml.cognitive.Transliterate module
- class synapse.ml.cognitive.Transliterate.Transliterate(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='Transliterate_af2462fbb766_error', fromScript=None, fromScriptCol=None, handler=None, language=None, languageCol=None, outputCol='Transliterate_af2462fbb766_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toScript=None, toScriptCol=None, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold HTTP errors
fromScript (object) – Specifies the script of the input text.
handler (object) – Which strategy to use when handling requests
language (object) – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
outputCol (object) – The name of the output column
subscriptionKey (object) – the API key to use
subscriptionRegion (object) – the API region to use
text (object) – the string to transliterate
timeout (float) – number of seconds to wait before closing the connection
toScript (object) – Specifies the script of the output text.
url (object) – URL of the service
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fromScript = Param(parent='undefined', name='fromScript', doc='Specifies the script of the input text.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLanguage()[source]
- Returns
Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- Return type
language
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- language = Param(parent='undefined', name='language', doc='Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setLanguage(value)[source]
- Parameters
language – Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setLanguageCol(value)[source]
- Parameters
languageCol – name of the column that supplies language for each row. Language tag identifying the language of the input text. If a code is not specified, automatic language detection will be applied.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='Transliterate_af2462fbb766_error', fromScript=None, fromScriptCol=None, handler=None, language=None, languageCol=None, outputCol='Transliterate_af2462fbb766_output', subscriptionKey=None, subscriptionKeyCol=None, subscriptionRegion=None, subscriptionRegionCol=None, text=None, textCol=None, timeout=60.0, toScript=None, toScriptCol=None, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- subscriptionRegion = Param(parent='undefined', name='subscriptionRegion', doc='the API region to use')
- text = Param(parent='undefined', name='text', doc='the string to translate')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- toScript = Param(parent='undefined', name='toScript', doc='Specifies the script of the translated text.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
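The concurrency and concurrentTimeout parameters shared by these transformers describe a bounded pool of in-flight calls and a cap on how long to wait for each pending result. The sketch below illustrates that idea with Python's concurrent.futures; it is not how synapse.ml implements it, and fake_service and call_all are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence


def fake_service(text: str) -> str:
    """Stand-in for a remote call; a real transformer would issue an HTTP request."""
    return text.upper()


def call_all(func: Callable[[str], str],
             items: Sequence[str],
             concurrency: int = 1,
             concurrent_timeout: Optional[float] = None) -> List[str]:
    """Run func over items with at most `concurrency` calls in flight.

    Each result is awaited for at most `concurrent_timeout` seconds
    (None waits indefinitely); a slow call raises TimeoutError.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(func, item) for item in items]
        # Collect in submission order so results line up with inputs.
        return [f.result(timeout=concurrent_timeout) for f in futures]


results = call_all(fake_service, ["konnichiwa", "namaste"],
                   concurrency=2, concurrent_timeout=5.0)
print(results)
```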
synapse.ml.cognitive.VerifyFaces module
- class synapse.ml.cognitive.VerifyFaces.VerifyFaces(java_obj=None, concurrency=1, concurrentTimeout=None, errorCol='VerifyFaces_db2641d91c47_error', faceId=None, faceIdCol=None, faceId1=None, faceId1Col=None, faceId2=None, faceId2Col=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, outputCol='VerifyFaces_db2641d91c47_output', personGroupId=None, personGroupIdCol=None, personId=None, personIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls
concurrentTimeout (float) – max number of seconds to wait on futures if concurrency >= 1
errorCol (object) – column to hold HTTP errors
faceId (object) – faceId of the face, comes from Face - Detect.
faceId1 (object) – faceId of one face, comes from Face - Detect.
faceId2 (object) – faceId of another face, comes from Face - Detect.
handler (object) – Which strategy to use when handling requests
largePersonGroupId (object) – Uses an existing largePersonGroupId and personId to quickly load a specified person. largePersonGroupId is created in LargePersonGroup - Create. The personGroupId and largePersonGroupId parameters should not be provided at the same time.
outputCol (object) – The name of the output column
personGroupId (object) – Uses an existing personGroupId and personId to quickly load a specified person. personGroupId is created in PersonGroup - Create. The personGroupId and largePersonGroupId parameters should not be provided at the same time.
personId (object) – Specifies a person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
subscriptionKey (object) – the API key to use
timeout (float) – number of seconds to wait before closing the connection
url (object) – URL of the service
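The parameter list states that personGroupId and largePersonGroupId should not be provided together. A caller can guard against that before configuring the transformer; the check below is a minimal sketch in plain Python, and the function name pick_group_param is hypothetical rather than part of synapse.ml.

```python
from typing import Optional


def pick_group_param(person_group_id: Optional[str] = None,
                     large_person_group_id: Optional[str] = None) -> Optional[str]:
    """Return the name of the group parameter in use, or None if neither is set.

    Raises ValueError when both are supplied, mirroring the documented rule
    that the two parameters are mutually exclusive.
    """
    if person_group_id is not None and large_person_group_id is not None:
        raise ValueError(
            "provide either personGroupId or largePersonGroupId, not both"
        )
    if person_group_id is not None:
        return "personGroupId"
    if large_person_group_id is not None:
        return "largePersonGroupId"
    return None


print(pick_group_param(person_group_id="staff"))          # personGroupId
print(pick_group_param(large_person_group_id="visitors")) # largePersonGroupId
```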
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- faceId = Param(parent='undefined', name='faceId', doc='faceId of the face, comes from Face - Detect.')
- faceId1 = Param(parent='undefined', name='faceId1', doc='faceId of one face, comes from Face - Detect.')
- faceId2 = Param(parent='undefined', name='faceId2', doc='faceId of another face, comes from Face - Detect.')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getLargePersonGroupId()[source]
- Returns
Uses an existing largePersonGroupId and personId to quickly load a specified person. largePersonGroupId is created in LargePersonGroup - Create. The personGroupId and largePersonGroupId parameters should not be provided at the same time.
- Return type
largePersonGroupId
- getPersonGroupId()[source]
- Returns
Uses an existing personGroupId and personId to quickly load a specified person. personGroupId is created in PersonGroup - Create. The personGroupId and largePersonGroupId parameters should not be provided at the same time.
- Return type
personGroupId
- getPersonId()[source]
- Returns
Specifies a person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
- Return type
personId
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- largePersonGroupId = Param(parent='undefined', name='largePersonGroupId', doc='Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- personGroupId = Param(parent='undefined', name='personGroupId', doc='Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.')
- personId = Param(parent='undefined', name='personId', doc='Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout – max number of seconds to wait on futures if concurrency >= 1
- setFaceId2Col(value)[source]
- Parameters
faceId2 – faceId of another face, comes from Face - Detect.
- setLargePersonGroupId(value)[source]
- Parameters
largePersonGroupId – Use an existing largePersonGroupId and personId to quickly add a specified person. largePersonGroupId is created in LargePersonGroup - Create. The personGroupId and largePersonGroupId parameters should not be provided at the same time.
- setLargePersonGroupIdCol(value)[source]
- Parameters
largePersonGroupId – Use an existing largePersonGroupId and personId to quickly add a specified person. largePersonGroupId is created in LargePersonGroup - Create. The personGroupId and largePersonGroupId parameters should not be provided at the same time.
- setParams(concurrency=1, concurrentTimeout=None, errorCol='VerifyFaces_db2641d91c47_error', faceId=None, faceIdCol=None, faceId1=None, faceId1Col=None, faceId2=None, faceId2Col=None, handler=None, largePersonGroupId=None, largePersonGroupIdCol=None, outputCol='VerifyFaces_db2641d91c47_output', personGroupId=None, personGroupIdCol=None, personId=None, personIdCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPersonGroupId(value)[source]
- Parameters
personGroupId – Use an existing personGroupId and personId to quickly load a specified person. personGroupId is created in PersonGroup - Create. The personGroupId and largePersonGroupId parameters should not be provided at the same time.
- setPersonGroupIdCol(value)[source]
- Parameters
personGroupId – Use an existing personGroupId and personId to quickly load a specified person. personGroupId is created in PersonGroup - Create. The personGroupId and largePersonGroupId parameters should not be provided at the same time.
- setPersonId(value)[source]
- Parameters
personId – Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
- setPersonIdCol(value)[source]
- Parameters
personId – Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
- setTimeout(value)[source]
- Parameters
timeout – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
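The parameter descriptions above state that personGroupId and largePersonGroupId should not be provided at the same time. The following is a minimal sketch of how a caller might check a parameter combination before passing it to setParams. The helper name build_verify_params is hypothetical and not part of SynapseML, and the split into a face-to-face mode (faceId1, faceId2) and a face-to-person mode (faceId, personId, plus one group id) is an assumption based on the Face - Verify service, not something this page spells out.

```python
from typing import Dict, Optional


def build_verify_params(
    face_id: Optional[str] = None,
    face_id1: Optional[str] = None,
    face_id2: Optional[str] = None,
    person_id: Optional[str] = None,
    person_group_id: Optional[str] = None,
    large_person_group_id: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper (not a SynapseML API): validate a combination of
    VerifyFaces parameters and return keyword arguments named after the
    documented params."""
    # Documented rule: the two group ids are mutually exclusive.
    if person_group_id is not None and large_person_group_id is not None:
        raise ValueError("Provide personGroupId or largePersonGroupId, not both.")

    group_given = person_group_id is not None or large_person_group_id is not None

    if person_id is not None or group_given:
        # Assumed face-to-person mode: faceId + personId + exactly one group id.
        if face_id is None or person_id is None or not group_given:
            raise ValueError("Face-to-person needs faceId, personId and a group id.")
        if face_id1 is not None or face_id2 is not None:
            raise ValueError("Do not mix faceId1/faceId2 with person parameters.")
        params = {"faceId": face_id, "personId": person_id}
        if person_group_id is not None:
            params["personGroupId"] = person_group_id
        else:
            params["largePersonGroupId"] = large_person_group_id
        return params

    # Assumed face-to-face mode: both face ids are required.
    if face_id1 is None or face_id2 is None:
        raise ValueError("Face-to-face needs both faceId1 and faceId2.")
    return {"faceId1": face_id1, "faceId2": face_id2}
```

The returned dictionary uses the same keyword names as the setParams signature above, so in a Spark session it could be unpacked as setParams(**params) on a VerifyFaces instance. Values that vary per row would instead go through the corresponding Col setters, such as setPersonIdCol.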
Module contents
SynapseML is an ecosystem of tools aimed at expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.
SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services, backed by your Spark cluster.
SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.
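As a rough illustration of the version requirements above, the sketch below compares version strings against the stated minimums. The function names parse_version and meets_requirements are hypothetical helpers, not part of SynapseML, and the check assumes plain dotted version strings such as '3.2.1'.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.1.2' into (3, 1, 2). Within each dotted piece, only the
    leading digits are kept, so '3.0.0-preview' becomes (3, 0, 0)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_requirements(python_version: str, spark_version: str, scala_version: str) -> bool:
    """Check the stated requirements: Scala 2.12, Spark 3.0+, Python 3.6+."""
    return (
        parse_version(python_version) >= (3, 6)
        and parse_version(spark_version) >= (3, 0)
        and parse_version(scala_version)[:2] == (2, 12)
    )
```

In a running session, the Python version string could come from platform.python_version() and the Spark version from the session's version attribute. How the Scala version is obtained depends on the cluster setup and is left to the caller.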