mmlspark.cognitive package
Submodules
mmlspark.cognitive.AddDocuments module
-
class
mmlspark.cognitive.AddDocuments.
AddDocuments
(actionCol='@search.action', batchSize=100, concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, indexName=None, outputCol=None, serviceName=None, subscriptionKey=None, timeout=60.0, url=None)
Bases: mmlspark.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
actionCol (str) – You can combine actions, such as an upload and a delete, in the same batch. (default: @search.action)
  upload: An upload action is similar to an ‘upsert’, where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
  merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
  mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
  delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
batchSize (int) – The max size of the buffer (default: 100)
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
indexName (str) –
outputCol (str) – The name of the output column (default: [self.uid]_output)
serviceName (str) –
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – URL of the service
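The four actions described for actionCol can be illustrated with a small in-memory model. This is only a sketch of the documented semantics, not how the search service or AddDocuments is implemented; the `apply_batch` helper, the `id` key field, and the status strings are hypothetical names chosen for the example.

```python
from typing import Any, Dict, List

Index = Dict[str, Dict[str, Any]]


def apply_batch(index: Index, batch: List[Dict[str, Any]],
                key_field: str = "id",
                action_field: str = "@search.action") -> List[str]:
    """Apply a batch of mixed actions to an in-memory index (illustrative only)."""
    statuses = []
    for row in batch:
        action = row[action_field]
        key = row[key_field]
        fields = {k: v for k, v in row.items() if k != action_field}
        if action == "upload":
            # Upsert: every field of an existing document is replaced.
            index[key] = fields
            statuses.append("ok")
        elif action in ("merge", "mergeOrUpload"):
            if key in index:
                # Specified fields replace existing ones; collections are not appended.
                index[key].update(fields)
                statuses.append("ok")
            elif action == "mergeOrUpload":
                index[key] = fields
                statuses.append("ok")
            else:
                statuses.append("failed: document not found")
        elif action == "delete":
            # Fields other than the key are ignored.
            index.pop(key, None)
            statuses.append("ok")
        else:
            statuses.append("failed: unknown action")
    return statuses


index: Index = {"1": {"id": "1", "tags": ["budget"], "name": "Inn"}}
statuses = apply_batch(index, [
    {"@search.action": "merge", "id": "1", "tags": ["economy", "pool"]},
    {"@search.action": "merge", "id": "2", "tags": ["spa"]},
    {"@search.action": "mergeOrUpload", "id": "3", "name": "Lodge"},
    {"@search.action": "delete", "id": "3", "name": "ignored"},
])
```

After the batch, document 1 keeps its `name` and has its `tags` replaced, the merge on the missing document 2 fails, and document 3 is created and then deleted.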
-
getActionCol
()
- Returns
You can combine actions, such as an upload and a delete, in the same batch. (default: @search.action)
  upload: An upload action is similar to an ‘upsert’, where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
  merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
  mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
  delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
- Return type
-
getConcurrentTimeout
()
- Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()
- Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getOutputCol
()
- Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getTimeout
()
- Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setActionCol
(value)
- Parameters
actionCol (str) – You can combine actions, such as an upload and a delete, in the same batch. (default: @search.action)
  upload: An upload action is similar to an ‘upsert’, where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
  merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
  mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
  delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
-
setConcurrency
(value)
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)
- Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)
- Parameters
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
-
setHandler
(value)
- Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setOutputCol
(value)
- Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(actionCol='@search.action', batchSize=100, concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, indexName=None, outputCol=None, serviceName=None, subscriptionKey=None, timeout=60.0, url=None)
Set the (keyword-only) parameters.
- Parameters
actionCol (str) – You can combine actions, such as an upload and a delete, in the same batch. (default: @search.action)
  upload: An upload action is similar to an ‘upsert’, where the document will be inserted if it is new and updated/replaced if it exists. Note that all fields are replaced in the update case.
  merge: Merge updates an existing document with the specified fields. If the document doesn’t exist, the merge will fail. Any field you specify in a merge will replace the existing field in the document. This includes fields of type Collection(Edm.String). For example, if the document contains a field ‘tags’ with value [‘budget’] and you execute a merge with value [‘economy’, ‘pool’] for ‘tags’, the final value of the ‘tags’ field will be [‘economy’, ‘pool’]. It will not be [‘budget’, ‘economy’, ‘pool’].
  mergeOrUpload: This action behaves like merge if a document with the given key already exists in the index. If the document does not exist, it behaves like upload with a new document.
  delete: Delete removes the specified document from the index. Note that any field you specify in a delete operation, other than the key field, will be ignored. If you want to remove an individual field from a document, use merge instead and simply set the field explicitly to null.
batchSize (int) – The max size of the buffer (default: 100)
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
indexName (str) –
outputCol (str) – The name of the output column (default: [self.uid]_output)
serviceName (str) –
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – URL of the service
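The effect of batchSize can be pictured as grouping documents into request bodies of at most that many rows. The sketch below is an assumption about the general shape, not the library's buffering code: the `batched_bodies` helper is hypothetical, and the `{"value": [...]}` envelope is assumed from the Azure Search indexing REST format.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List


def batched_bodies(rows: Iterable[Dict[str, Any]],
                   batch_size: int = 100) -> Iterator[Dict[str, List[Dict[str, Any]]]]:
    """Yield request bodies holding at most batch_size documents each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        # Assumed envelope: documents are listed under a "value" key.
        yield {"value": chunk}


docs = [{"@search.action": "upload", "id": str(i)} for i in range(250)]
bodies = list(batched_bodies(docs, batch_size=100))
sizes = [len(body["value"]) for body in bodies]
```

With 250 documents and the default batch size of 100, this produces three bodies, the last one partially filled.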
mmlspark.cognitive.AnalyzeImage module
-
class
mmlspark.cognitive.AnalyzeImage.
AnalyzeImage
(concurrency=1, concurrentTimeout=100.0, details=None, errorCol=None, handler=None, imageBytes=None, imageUrl=None, language=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None, visualFeatures=None)
Bases: mmlspark.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
details (object) – what domain-specific details to return
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the URL of the image to use
language (object) – the language of the response (en if none given) (default: ServiceParamData(None,Some(en)))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – URL of the service
visualFeatures (object) – what visual feature types to return
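As a rough illustration of how visualFeatures, details, and language might end up on a request, the sketch below builds a query string with the standard library. The `analyze_query` helper is hypothetical, and the comma-separated encoding and the example values (`Categories`, `Tags`, `Landmarks`) are assumptions based on the Computer Vision REST API rather than anything stated on this page.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def analyze_query(visual_features: Optional[Sequence[str]] = None,
                  details: Optional[Sequence[str]] = None,
                  language: str = "en") -> str:
    """Build a query string for an image-analysis call (illustrative only)."""
    params = {}
    if visual_features:
        params["visualFeatures"] = ",".join(visual_features)
    if details:
        params["details"] = ",".join(details)
    # "en" is used when no language is given, matching the documented default.
    params["language"] = language
    # Keep commas readable instead of percent-encoding them.
    return urlencode(params, safe=",")


query = analyze_query(visual_features=["Categories", "Tags"], details=["Landmarks"])
default_query = analyze_query()
```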
-
getConcurrentTimeout
()
- Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()
- Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getLanguage
()
- Returns
the language of the response (en if none given) (default: ServiceParamData(None,Some(en)))
- Return type
-
getOutputCol
()
- Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getTimeout
()
- Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)
- Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)
- Parameters
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
-
setHandler
(value)
- Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLanguage
(value)
- Parameters
language (object) – the language of the response (en if none given) (default: ServiceParamData(None,Some(en)))
-
setLanguageCol
(value)
- Parameters
language (object) – the language of the response (en if none given) (default: ServiceParamData(None,Some(en)))
-
setOutputCol
(value)
- Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, details=None, errorCol=None, handler=None, imageBytes=None, imageUrl=None, language=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None, visualFeatures=None)
Set the (keyword-only) parameters.
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
details (object) – what domain-specific details to return
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the URL of the image to use
language (object) – the language of the response (en if none given) (default: ServiceParamData(None,Some(en)))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – URL of the service
visualFeatures (object) – what visual feature types to return
-
setTimeout
(value)
- Parameters
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
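The interplay of concurrency, concurrentTimeout, outputCol, and errorCol can be approximated with a thread pool: several calls run at once, each result is awaited for a bounded time, and every input yields either an output or an error. This is a generic sketch using `concurrent.futures`, not the transformer's actual implementation; `call_all` and `fake_service` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Any, Callable, Iterable, List, Optional, Tuple


def call_all(fn: Callable[[Any], Any], inputs: Iterable[Any],
             concurrency: int = 1,
             concurrent_timeout: float = 100.0) -> List[Tuple[Any, Optional[str]]]:
    """Run fn over inputs with bounded parallelism; return (output, error) pairs."""
    results = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(fn, item) for item in inputs]
        for future in futures:
            try:
                # Wait at most concurrent_timeout seconds for each future.
                results.append((future.result(timeout=concurrent_timeout), None))
            except FutureTimeout:
                results.append((None, "timed out"))
            except Exception as exc:
                # Failures land in the error slot instead of aborting the batch.
                results.append((None, str(exc)))
    return results


def fake_service(x: int) -> int:
    """Stand-in for a remote call; rejects negative inputs."""
    if x < 0:
        raise ValueError("bad input")
    return x * 2


pairs = call_all(fake_service, [1, -1, 3], concurrency=2)
```

Each pair mirrors one row: the first element plays the role of the output column and the second the error column.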
mmlspark.cognitive.AzureSearchWriter module
mmlspark.cognitive.BingImageSearch module
-
class
mmlspark.cognitive.BingImageSearch.
BingImageSearch
(aspect=None, color=None, concurrency=1, concurrentTimeout=100.0, count=None, errorCol=None, freshness=None, handler=None, height=None, imageContent=None, imageType=None, license=None, maxFileSize=None, maxHeight=None, maxWidth=None, minFileSize=None, minHeight=None, minWidth=None, mkt=None, offset=None, outputCol=None, q=None, size=None, subscriptionKey=None, timeout=60.0, url='https://api.cognitive.microsoft.com/bing/v7.0/images/search', width=None)
Bases: mmlspark.cognitive._BingImageSearch._BingImageSearch
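The constructor lists search options such as q, count, offset, and mkt alongside the default url. A plausible reading is that these become query parameters on that URL; the sketch below assumes exactly that and is not taken from the library. `bing_request_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

# Default endpoint copied from the constructor signature above.
BING_URL = "https://api.cognitive.microsoft.com/bing/v7.0/images/search"


def bing_request_url(q: str, count: Optional[int] = None,
                     offset: Optional[int] = None, mkt: Optional[str] = None,
                     url: str = BING_URL) -> str:
    """Append the options that were set as query parameters (illustrative only)."""
    params = {"q": q}
    for name, value in (("count", count), ("offset", offset), ("mkt", mkt)):
        if value is not None:
            params[name] = value
    return url + "?" + urlencode(params)


request_url = bing_request_url("snow leopard", count=10, offset=20, mkt="en-US")
```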
mmlspark.cognitive.DescribeImage module
-
class
mmlspark.cognitive.DescribeImage.
DescribeImage
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, imageBytes=None, imageUrl=None, language=None, maxCandidates=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)
Bases: mmlspark.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the URL of the image to use
language (object) – Language of image description (default: ServiceParamData(None,Some(en)))
maxCandidates (object) – Maximum candidate descriptions to return (default: ServiceParamData(None,Some(1)))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – URL of the service
-
getConcurrentTimeout
()
- Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()
- Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getLanguage
()
- Returns
Language of image description (default: ServiceParamData(None,Some(en)))
- Return type
-
getMaxCandidates
()
- Returns
Maximum candidate descriptions to return (default: ServiceParamData(None,Some(1)))
- Return type
-
getOutputCol
()
- Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getTimeout
()
- Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)
- Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)
- Parameters
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
-
setHandler
(value)
- Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLanguage
(value)
- Parameters
language (object) – Language of image description (default: ServiceParamData(None,Some(en)))
-
setLanguageCol
(value)
- Parameters
language (object) – Language of image description (default: ServiceParamData(None,Some(en)))
-
setMaxCandidates
(value)
- Parameters
maxCandidates (object) – Maximum candidate descriptions to return (default: ServiceParamData(None,Some(1)))
-
setMaxCandidatesCol
(value)
- Parameters
maxCandidates (object) – Maximum candidate descriptions to return (default: ServiceParamData(None,Some(1)))
-
setOutputCol
(value)
- Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, imageBytes=None, imageUrl=None, language=None, maxCandidates=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)
Set the (keyword-only) parameters.
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the URL of the image to use
language (object) – Language of image description (default: ServiceParamData(None,Some(en)))
maxCandidates (object) – Maximum candidate descriptions to return (default: ServiceParamData(None,Some(1)))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – URL of the service
mmlspark.cognitive.DetectAnomalies module
-
class
mmlspark.cognitive.DetectAnomalies.
DetectAnomalies
(concurrency=1, concurrentTimeout=100.0, customInterval=None, errorCol=None, granularity=None, handler=None, maxAnomalyRatio=None, outputCol=None, period=None, sensitivity=None, series=None, subscriptionKey=None, timeout=60.0, url=None)
Bases: mmlspark.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has 5-minute intervals, the request can be set as granularity=minutely, customInterval=5.
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (str) – The name of the output column (default: [self.uid]_output)
period (object) – Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0-99. The lower the value is, the larger the margin value will be, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there are duplicated timestamps, the API will not work. In such a case, an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – URL of the service
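Because the service rejects unsorted series and duplicated timestamps, it can help to check a series locally before sending it. The following is a minimal sketch of such a check. `validate_series` is a hypothetical helper, and the point layout (`timestamp` as an ISO-8601 UTC string plus `value`) is an assumption about the input format, not something this page specifies.

```python
from datetime import datetime
from typing import Any, Dict, List


def validate_series(series: List[Dict[str, Any]]) -> None:
    """Raise ValueError unless timestamps are strictly ascending."""
    stamps = [datetime.strptime(p["timestamp"], "%Y-%m-%dT%H:%M:%SZ") for p in series]
    for prev, cur in zip(stamps, stamps[1:]):
        if cur == prev:
            raise ValueError("duplicated timestamp: " + cur.isoformat())
        if cur < prev:
            raise ValueError("series is not sorted by timestamp")


good = [
    {"timestamp": "2020-01-01T00:00:00Z", "value": 1.0},
    {"timestamp": "2020-01-01T00:05:00Z", "value": 1.2},
    {"timestamp": "2020-01-01T00:10:00Z", "value": 0.9},
]
unsorted_series = [good[1], good[0], good[2]]
duplicated = [good[0], good[0], good[1]]

validate_series(good)  # passes silently
# Sorting by timestamp string works here because the format is fixed-width UTC.
repaired = sorted(unsorted_series, key=lambda p: p["timestamp"])
validate_series(repaired)
```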
-
getConcurrentTimeout
()
- Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getCustomInterval
()
- Returns
Custom interval is used to set a non-standard time interval. For example, if the series has 5-minute intervals, the request can be set as granularity=minutely, customInterval=5.
- Return type
-
getGranularity
()
- Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
-
getHandler
()
- Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getMaxAnomalyRatio
()
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
-
getOutputCol
()
- Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getPeriod
()
- Returns
Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
- Return type
-
getSensitivity
()
- Returns
Optional argument, advanced model parameter, between 0-99. The lower the value is, the larger the margin value will be, which means fewer anomalies will be accepted.
- Return type
-
getSeries
()
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there are duplicated timestamps, the API will not work. In such a case, an error message will be returned.
- Return type
-
getTimeout
()
- Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)
- Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setCustomInterval
(value)
- Parameters
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has 5-minute intervals, the request can be set as granularity=minutely, customInterval=5.
-
setCustomIntervalCol
(value)
- Parameters
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has 5-minute intervals, the request can be set as granularity=minutely, customInterval=5.
-
setErrorCol
(value)
- Parameters
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
-
setGranularity
(value)
- Parameters
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
-
setGranularityCol
(value)
- Parameters
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
-
setHandler
(value)
- Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setMaxAnomalyRatio
(value)
- Parameters
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
-
setMaxAnomalyRatioCol
(value)
- Parameters
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
-
setOutputCol
(value)
- Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, customInterval=None, errorCol=None, granularity=None, handler=None, maxAnomalyRatio=None, outputCol=None, period=None, sensitivity=None, series=None, subscriptionKey=None, timeout=60.0, url=None)
Set the (keyword-only) parameters.
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has 5-minute intervals, the request can be set as granularity=minutely, customInterval=5.
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (str) – The name of the output column (default: [self.uid]_output)
period (object) – Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0-99. The lower the value is, the larger the margin value will be, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there are duplicated timestamps, the API will not work. In such a case, an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – URL of the service
-
setPeriod
(value)
- Parameters
period (object) – Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
-
setPeriodCol
(value)
- Parameters
period (object) – Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
-
setSensitivity
(value)
- Parameters
sensitivity (object) – Optional argument, advanced model parameter, between 0-99. The lower the value is, the larger the margin value will be, which means fewer anomalies will be accepted.
-
setSensitivityCol
(value)
- Parameters
sensitivity (object) – Optional argument, advanced model parameter, between 0-99. The lower the value is, the larger the margin value will be, which means fewer anomalies will be accepted.
-
setSeries
(value)
- Parameters
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there are duplicated timestamps, the API will not work. In such a case, an error message will be returned.
-
setSeriesCol
(value)
- Parameters
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there are duplicated timestamps, the API will not work. In such a case, an error message will be returned.
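The pairing of granularity and customInterval (for example, minutely with 5 for a 5-minute series) can be derived from the spacing between points. The sketch below picks the largest fixed-length unit that divides the step evenly. `granularity_for` is a hypothetical helper, and it deliberately skips monthly and yearly, whose lengths vary.

```python
from datetime import timedelta
from typing import Tuple

# Fixed-length units only, smallest first; monthly and yearly are left out.
_UNITS = [("minutely", 60), ("hourly", 3600), ("daily", 86400), ("weekly", 604800)]


def granularity_for(step: timedelta) -> Tuple[str, int]:
    """Return (granularity, customInterval) for a constant spacing between points."""
    seconds = int(step.total_seconds())
    best = None
    for name, unit in _UNITS:
        if seconds >= unit and seconds % unit == 0:
            best = (name, seconds // unit)  # later (larger) units win
    if best is None:
        raise ValueError("step is not a whole number of minutes")
    return best


five_minutes = granularity_for(timedelta(minutes=5))
one_hour = granularity_for(timedelta(hours=1))
ninety_minutes = granularity_for(timedelta(minutes=90))
```

When the second element is 1, the standard granularity already describes the series and customInterval can presumably be left unset.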
mmlspark.cognitive.DetectFace module
-
class
mmlspark.cognitive.DetectFace.
DetectFace
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, imageUrl=None, outputCol=None, returnFaceAttributes=None, returnFaceId=None, returnFaceLandmarks=None, subscriptionKey=None, timeout=60.0, url=None)
Bases: mmlspark.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageUrl (object) – the URL of the image to use
outputCol (str) – The name of the output column (default: [self.uid]_output)
returnFaceAttributes (object) – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
returnFaceId (object) – Whether to return faceIds of the detected faces. The default value is true.
returnFaceLandmarks (object) – Whether to return face landmarks of the detected faces. The default value is false.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – URL of the service
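The three return* options and the list of supported attributes can be checked and combined before a request is made. The sketch below validates attribute names against the list given above and builds a query string. `detect_face_query` is a hypothetical helper, and the comma-separated encoding with lowercase booleans is an assumption about the REST call rather than documented behavior of this class.

```python
from typing import Sequence
from urllib.parse import urlencode

# Attribute names copied from the returnFaceAttributes description.
SUPPORTED_ATTRIBUTES = {
    "age", "gender", "headPose", "smile", "facialHair", "glasses", "emotion",
    "hair", "makeup", "occlusion", "accessories", "blur", "exposure", "noise",
}


def detect_face_query(return_face_id: bool = True,
                      return_face_landmarks: bool = False,
                      return_face_attributes: Sequence[str] = ()) -> str:
    """Build a query string for a face-detection call (illustrative only)."""
    unknown = [a for a in return_face_attributes if a not in SUPPORTED_ATTRIBUTES]
    if unknown:
        raise ValueError("unsupported face attributes: " + ", ".join(unknown))
    params = {
        "returnFaceId": str(return_face_id).lower(),
        "returnFaceLandmarks": str(return_face_landmarks).lower(),
    }
    if return_face_attributes:
        # Each extra attribute adds computational and time cost, so send only what is needed.
        params["returnFaceAttributes"] = ",".join(return_face_attributes)
    return urlencode(params, safe=",")


face_query = detect_face_query(return_face_attributes=["age", "emotion"])
```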
-
getConcurrentTimeout
()
- Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()
- Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getOutputCol
()
- Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getReturnFaceAttributes
()
- Returns
Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
- Return type
-
getReturnFaceId
()
- Returns
Whether to return faceIds of the detected faces. The default value is true.
- Return type
-
getReturnFaceLandmarks
()
- Returns
Whether to return face landmarks of the detected faces. The default value is false.
- Return type
-
getTimeout
()
- Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)
- Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)
- Parameters
errorCol (str) – column to hold HTTP errors (default: [self.uid]_error)
-
setHandler
(value)
- Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setOutputCol
(value)
- Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, imageUrl=None, outputCol=None, returnFaceAttributes=None, returnFaceId=None, returnFaceLandmarks=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageUrl (object) – the url of the image to use
outputCol (str) – The name of the output column (default: [self.uid]_output)
returnFaceAttributes (object) – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
returnFaceId (object) – Whether to return faceIds of the detected faces. The default value is true.
returnFaceLandmarks (object) – Whether to return face landmarks of the detected faces. The default value is false.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
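The supported attribute names listed for returnFaceAttributes can be checked on the client before a request is sent. The following is a minimal sketch in plain Python; `SUPPORTED_FACE_ATTRIBUTES` and `unsupported_attributes` are hypothetical names used for illustration and are not part of mmlspark.

```python
from typing import Iterable, List

# Attribute names copied from the returnFaceAttributes description above.
SUPPORTED_FACE_ATTRIBUTES = frozenset({
    "age", "gender", "headPose", "smile", "facialHair", "glasses", "emotion",
    "hair", "makeup", "occlusion", "accessories", "blur", "exposure", "noise",
})


def unsupported_attributes(requested: Iterable[str]) -> List[str]:
    """Return the requested names that are not in the supported set, in input order."""
    return [name for name in requested if name not in SUPPORTED_FACE_ATTRIBUTES]


if __name__ == "__main__":
    # "beard" is not a supported attribute name, so it is reported back.
    print(unsupported_attributes(["age", "beard", "smile"]))
```

Requesting only the attributes that are actually needed also limits the additional computational and time cost mentioned above.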
-
setReturnFaceAttributes
(value)[source]¶ - Parameters
returnFaceAttributes (object) – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
-
setReturnFaceAttributesCol
(value)[source]¶ - Parameters
returnFaceAttributes (object) – Analyze and return one or more specified face attributes. Supported face attributes include: age, gender, headPose, smile, facialHair, glasses, emotion, hair, makeup, occlusion, accessories, blur, exposure and noise. Face attribute analysis has additional computational and time cost.
-
setReturnFaceId
(value)[source]¶ - Parameters
returnFaceId (object) – Whether to return faceIds of the detected faces. The default value is true.
-
setReturnFaceIdCol
(value)[source]¶ - Parameters
returnFaceId (object) – Whether to return faceIds of the detected faces. The default value is true.
-
setReturnFaceLandmarks
(value)[source]¶ - Parameters
returnFaceLandmarks (object) – Whether to return face landmarks of the detected faces. The default value is false.
-
setReturnFaceLandmarksCol
(value)[source]¶ - Parameters
returnFaceLandmarks (object) – Whether to return face landmarks of the detected faces. The default value is false.
mmlspark.cognitive.DetectLastAnomaly module¶
-
class
mmlspark.cognitive.DetectLastAnomaly.
DetectLastAnomaly
(concurrency=1, concurrentTimeout=100.0, customInterval=None, errorCol=None, granularity=None, handler=None, maxAnomalyRatio=None, outputCol=None, period=None, sensitivity=None, series=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
errorCol (str) – column to hold http errors (default: [self.uid]_error)
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (str) – The name of the output column (default: [self.uid]_output)
period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
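Because the service rejects a series that is out of order or contains a duplicated timestamp, it can help to check the points locally first. The sketch below is plain Python and makes two assumptions that are not stated on this page: each point is a dict with a `timestamp` key, and timestamps are ISO 8601 strings without a trailing `Z`. The function name `check_series` is hypothetical.

```python
from datetime import datetime
from typing import Dict, List, Optional


def check_series(points: List[Dict[str, object]]) -> Optional[str]:
    """Return None when timestamps are strictly ascending.

    Otherwise return a short description of the first problem found.
    """
    previous = None
    for index, point in enumerate(points):
        # Assumed point layout: {"timestamp": "<ISO 8601 string>", "value": <number>}
        current = datetime.fromisoformat(str(point["timestamp"]))
        if previous is not None:
            if current == previous:
                return f"duplicate timestamp at index {index}"
            if current < previous:
                return f"timestamp out of order at index {index}"
        previous = current
    return None


if __name__ == "__main__":
    series = [
        {"timestamp": "2019-01-01T00:00:00", "value": 1.0},
        {"timestamp": "2019-01-01T00:05:00", "value": 1.2},
        {"timestamp": "2019-01-01T00:05:00", "value": 1.1},
    ]
    print(check_series(series))
```

Running a check like this on the driver or inside a UDF surfaces bad input before any request is made, instead of relying on the error column.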
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getCustomInterval
()[source]¶ - Returns
Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- Return type
-
getGranularity
()[source]¶ - Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getMaxAnomalyRatio
()[source]¶ - Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getPeriod
()[source]¶ - Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
-
getSensitivity
()[source]¶ - Returns
Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- Return type
-
getSeries
()[source]¶ - Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setCustomInterval
(value)[source]¶ - Parameters
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
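The relationship between the spacing of the series and the granularity/customInterval pair can be sketched as follows. This is an illustration in plain Python, not mmlspark code; `minutely_settings` is a hypothetical helper, and it covers only the minutely case described above.

```python
from datetime import timedelta
from typing import Dict, Union


def minutely_settings(step: timedelta) -> Dict[str, Union[str, int]]:
    """Express a fixed spacing between points as granularity plus customInterval."""
    seconds = step.total_seconds()
    if seconds <= 0 or seconds % 60 != 0:
        raise ValueError("step must be a positive whole number of minutes")
    return {"granularity": "minutely", "customInterval": int(seconds // 60)}


if __name__ == "__main__":
    # A series with one point every 5 minutes.
    print(minutely_settings(timedelta(minutes=5)))
```

The resulting values would then be passed to setGranularity and setCustomInterval.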
-
setCustomIntervalCol
(value)[source]¶ - Parameters
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setGranularity
(value)[source]¶ - Parameters
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
-
setGranularityCol
(value)[source]¶ - Parameters
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setMaxAnomalyRatio
(value)[source]¶ - Parameters
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
-
setMaxAnomalyRatioCol
(value)[source]¶ - Parameters
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, customInterval=None, errorCol=None, granularity=None, handler=None, maxAnomalyRatio=None, outputCol=None, period=None, sensitivity=None, series=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
errorCol (str) – column to hold http errors (default: [self.uid]_error)
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (str) – The name of the output column (default: [self.uid]_output)
period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
setPeriod
(value)[source]¶ - Parameters
period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
-
setPeriodCol
(value)[source]¶ - Parameters
period (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
-
setSensitivity
(value)[source]¶ - Parameters
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
-
setSensitivityCol
(value)[source]¶ - Parameters
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
-
setSeries
(value)[source]¶ - Parameters
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
-
setSeriesCol
(value)[source]¶ - Parameters
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
mmlspark.cognitive.EntityDetector module¶
-
class
mmlspark.cognitive.EntityDetector.
EntityDetector
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getLanguage
()[source]¶ - Returns
the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getText
()[source]¶ - Returns
the text in the request body (default: ServiceParamData(Some(Right(text)),None))
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLanguage
(value)[source]¶ - Parameters
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
-
setLanguageCol
(value)[source]¶ - Parameters
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
setText
(value)[source]¶ - Parameters
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
-
setTextCol
(value)[source]¶ - Parameters
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
mmlspark.cognitive.FindSimilarFace module¶
-
class
mmlspark.cognitive.FindSimilarFace.
FindSimilarFace
(concurrency=1, concurrentTimeout=100.0, errorCol=None, faceId=None, faceIds=None, faceListId=None, handler=None, largeFaceListId=None, maxNumOfCandidatesReturned=None, mode=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
faceId (object) – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
faceIds (object) – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
faceListId (object) – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
largeFaceListId (object) – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
maxNumOfCandidatesReturned (object) – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
mode (object) – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
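The constraints described for these parameters (only one candidate source at a time, at most 1000 faceIds, a candidate count in [1, 1000], and two allowed modes) can be collected into a single client-side check. This is a minimal sketch in plain Python; `find_similar_problems` and its argument names are hypothetical and not part of mmlspark.

```python
from typing import List, Optional, Sequence


def find_similar_problems(
    face_ids: Optional[Sequence[str]] = None,
    face_list_id: Optional[str] = None,
    large_face_list_id: Optional[str] = None,
    max_candidates: Optional[int] = None,
    mode: Optional[str] = None,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    sources = (
        ("faceIds", face_ids),
        ("faceListId", face_list_id),
        ("largeFaceListId", large_face_list_id),
    )
    provided = [name for name, value in sources if value is not None]
    if len(provided) > 1:
        problems.append("provide only one of: " + ", ".join(provided))
    if face_ids is not None and len(face_ids) > 1000:
        problems.append("faceIds is limited to 1000 entries")
    if max_candidates is not None and not 1 <= max_candidates <= 1000:
        problems.append("maxNumOfCandidatesReturned must be in [1, 1000]")
    if mode is not None and mode not in ("matchPerson", "matchFace"):
        problems.append("mode must be 'matchPerson' or 'matchFace'")
    return problems


if __name__ == "__main__":
    print(find_similar_problems(face_ids=["a", "b"], face_list_id="my-list"))
```

The check only reports conflicts named on this page. It does not verify that a faceId is still within its 24-hour lifetime, which only the service can confirm.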
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getFaceId
()[source]¶ - Returns
faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
- Return type
-
getFaceIds
()[source]¶ - Returns
An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
-
getFaceListId
()[source]¶ - Returns
An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getLargeFaceListId
()[source]¶ - Returns
An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
- Return type
-
getMaxNumOfCandidatesReturned
()[source]¶ - Returns
Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
- Return type
-
getMode
()[source]¶ - Returns
Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setFaceId
(value)[source]¶ - Parameters
faceId (object) – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
-
setFaceIdCol
(value)[source]¶ - Parameters
faceId (object) – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
-
setFaceIds
(value)[source]¶ - Parameters
faceIds (object) – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
-
setFaceIdsCol
(value)[source]¶ - Parameters
faceIds (object) – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
-
setFaceListId
(value)[source]¶ - Parameters
faceListId (object) – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
-
setFaceListIdCol
(value)[source]¶ - Parameters
faceListId (object) – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLargeFaceListId
(value)[source]¶ - Parameters
largeFaceListId (object) – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
-
setLargeFaceListIdCol
(value)[source]¶ - Parameters
largeFaceListId (object) – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
-
setMaxNumOfCandidatesReturned
(value)[source]¶ - Parameters
maxNumOfCandidatesReturned (object) – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
-
setMaxNumOfCandidatesReturnedCol
(value)[source]¶ - Parameters
maxNumOfCandidatesReturned (object) – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
-
setMode
(value)[source]¶ - Parameters
mode (object) – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
-
setModeCol
(value)[source]¶ - Parameters
mode (object) – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, faceId=None, faceIds=None, faceListId=None, handler=None, largeFaceListId=None, maxNumOfCandidatesReturned=None, mode=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
faceId (object) – faceId of the query face. User needs to call FaceDetect first to get a valid faceId. Note that this faceId is not persisted and will expire 24 hours after the detection call.
faceIds (object) – An array of candidate faceIds. All of them are created by FaceDetect and the faceIds will expire 24 hours after the detection call. The number of faceIds is limited to 1000. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
faceListId (object) – An existing user-specified unique candidate face list, created in FaceList - Create. Face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
largeFaceListId (object) – An existing user-specified unique candidate large face list, created in LargeFaceList - Create. Large face list contains a set of persistedFaceIds which are persisted and will never expire. Parameter faceListId, largeFaceListId and faceIds should not be provided at the same time.
maxNumOfCandidatesReturned (object) – Optional parameter. The number of top similar faces returned. The valid range is [1, 1000]. It defaults to 20.
mode (object) – Optional parameter. Similar face searching mode. It can be ‘matchPerson’ or ‘matchFace’. It defaults to ‘matchPerson’.
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
mmlspark.cognitive.GenerateThumbnails module¶
-
class
mmlspark.cognitive.GenerateThumbnails.
GenerateThumbnails
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, height=None, imageBytes=None, imageUrl=None, outputCol=None, smartCropping=None, subscriptionKey=None, timeout=60.0, url=None, width=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
height (object) – the desired height of the image
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
outputCol (str) – The name of the output column (default: [self.uid]_output)
smartCropping (object) – whether to intelligently crop the image
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
width (object) – the desired width of the image
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, height=None, imageBytes=None, imageUrl=None, outputCol=None, smartCropping=None, subscriptionKey=None, timeout=60.0, url=None, width=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
height (object) – the desired height of the image
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
outputCol (str) – The name of the output column (default: [self.uid]_output)
smartCropping (object) – whether to intelligently crop the image
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
width (object) – the desired width of the image
-
setSmartCropping
(value)[source]¶ - Parameters
smartCropping (object) – whether to intelligently crop the image
-
setSmartCroppingCol
(value)[source]¶ - Parameters
smartCropping (object) – whether to intelligently crop the image
mmlspark.cognitive.GroupFaces module¶
-
class
mmlspark.cognitive.GroupFaces.
GroupFaces
(concurrency=1, concurrentTimeout=100.0, errorCol=None, faceIds=None, handler=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
faceIds (object) – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
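When more than 1000 faceIds are available, one way to stay within the limit is to split them into batches before building the input rows. The sketch below is plain Python; `batch_face_ids` and `MAX_FACE_IDS` are hypothetical names, not part of mmlspark.

```python
from typing import Iterator, List, Sequence

# Limit taken from the faceIds description above.
MAX_FACE_IDS = 1000


def batch_face_ids(face_ids: Sequence[str], limit: int = MAX_FACE_IDS) -> Iterator[List[str]]:
    """Yield consecutive slices of face_ids, each holding at most `limit` entries."""
    for start in range(0, len(face_ids), limit):
        yield list(face_ids[start:start + limit])


if __name__ == "__main__":
    ids = [f"face-{i}" for i in range(2500)]
    print([len(batch) for batch in batch_face_ids(ids)])
```

Each batch would be grouped independently, so faces that land in different batches are never compared with each other. Batching therefore trades grouping quality for staying under the limit.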
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getFaceIds
()[source]¶ - Returns
Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
- Return type
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setFaceIds
(value)[source]¶ - Parameters
faceIds (object) – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
-
setFaceIdsCol
(value)[source]¶ - Parameters
faceIds (object) – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, faceIds=None, handler=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
faceIds (object) – Array of candidate faceIds created by Face - Detect. The maximum is 1000 faces.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
mmlspark.cognitive.IdentifyFaces module¶
-
class
mmlspark.cognitive.IdentifyFaces.
IdentifyFaces
(concurrency=1, concurrentTimeout=100.0, confidenceThreshold=None, errorCol=None, faceIds=None, handler=None, largePersonGroupId=None, maxNumOfCandidatesReturned=None, outputCol=None, personGroupId=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
confidenceThreshold (object) – Optional parameter. Customized identification confidence threshold, in the range of [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note there is no guarantee that this threshold value will work on other data or after algorithm updates.
errorCol (str) – column to hold http errors (default: [self.uid]_error)
faceIds (object) – Array of faceIds of the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be in the range [1, 10].
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
largePersonGroupId (object) – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
maxNumOfCandidatesReturned (object) – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
outputCol (str) – The name of the output column (default: [self.uid]_output)
personGroupId (object) – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
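The constraints stated in the parameter list above (1 to 10 faceIds, a threshold in [0, 1], 1 to 100 candidates, and never both group ids at once) can be checked before a transformer is built. The following is a minimal sketch in plain Python; identify_faces_kwargs is a hypothetical helper, not part of mmlspark, and only the dictionary keys are taken from the constructor signature.

```python
def identify_faces_kwargs(face_ids, person_group_id=None,
                          large_person_group_id=None,
                          confidence_threshold=None,
                          max_num_of_candidates=None):
    """Validate the documented ranges and return constructor-style kwargs."""
    face_ids = list(face_ids)
    if not 1 <= len(face_ids) <= 10:
        raise ValueError("faceIds must contain between 1 and 10 ids")
    if person_group_id is not None and large_person_group_id is not None:
        raise ValueError("set personGroupId or largePersonGroupId, not both")
    if confidence_threshold is not None and not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidenceThreshold must lie in [0, 1]")
    if max_num_of_candidates is not None and not 1 <= max_num_of_candidates <= 100:
        raise ValueError("maxNumOfCandidatesReturned must lie in 1..100")

    candidates = {
        "faceIds": face_ids,
        "personGroupId": person_group_id,
        "largePersonGroupId": large_person_group_id,
        "confidenceThreshold": confidence_threshold,
        "maxNumOfCandidatesReturned": max_num_of_candidates,
    }
    # Drop unset entries so that only explicit choices are passed on.
    return {name: value for name, value in candidates.items() if value is not None}


kwargs = identify_faces_kwargs(["f1"], person_group_id="g",
                               confidence_threshold=0.5)
```

Whether these values are then supplied as constants or through the corresponding ...Col setters depends on how the input DataFrame is laid out.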
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getConfidenceThreshold
()[source]¶ - Returns
Optional parameter. Customized identification confidence threshold, in the range of [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note there is no guarantee that this threshold value will work on other data or after algorithm updates.
- Return type
-
getFaceIds
()[source]¶ - Returns
Array of faceIds of the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be in the range [1, 10].
- Return type
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getLargePersonGroupId
()[source]¶ - Returns
largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
-
getMaxNumOfCandidatesReturned
()[source]¶ - Returns
The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getPersonGroupId
()[source]¶ - Returns
personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setConfidenceThreshold
(value)[source]¶ - Parameters
confidenceThreshold (object) – Optional parameter. Customized identification confidence threshold, in the range of [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note there is no guarantee that this threshold value will work on other data or after algorithm updates.
-
setConfidenceThresholdCol
(value)[source]¶ - Parameters
confidenceThreshold (object) – Optional parameter. Customized identification confidence threshold, in the range of [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note there is no guarantee that this threshold value will work on other data or after algorithm updates.
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setFaceIds
(value)[source]¶ - Parameters
faceIds (object) – Array of faceIds of the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be in the range [1, 10].
-
setFaceIdsCol
(value)[source]¶ - Parameters
faceIds (object) – Array of faceIds of the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be in the range [1, 10].
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLargePersonGroupId
(value)[source]¶ - Parameters
largePersonGroupId (object) – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
-
setLargePersonGroupIdCol
(value)[source]¶ - Parameters
largePersonGroupId (object) – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
-
setMaxNumOfCandidatesReturned
(value)[source]¶ - Parameters
maxNumOfCandidatesReturned (object) – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
-
setMaxNumOfCandidatesReturnedCol
(value)[source]¶ - Parameters
maxNumOfCandidatesReturned (object) – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, confidenceThreshold=None, errorCol=None, faceIds=None, handler=None, largePersonGroupId=None, maxNumOfCandidatesReturned=None, outputCol=None, personGroupId=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
confidenceThreshold (object) – Optional parameter. Customized identification confidence threshold, in the range of [0, 1]. Advanced users can tweak this value to override the default internal threshold for better precision on their scenario data. Note there is no guarantee that this threshold value will work on other data or after algorithm updates.
errorCol (str) – column to hold http errors (default: [self.uid]_error)
faceIds (object) – Array of faceIds of the query faces, created by Face - Detect. Each face is identified independently. The number of faceIds must be in the range [1, 10].
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
largePersonGroupId (object) – largePersonGroupId of the target large person group, created by LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
maxNumOfCandidatesReturned (object) – The range of maxNumOfCandidatesReturned is between 1 and 100 (default is 10).
outputCol (str) – The name of the output column (default: [self.uid]_output)
personGroupId (object) – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
setPersonGroupId
(value)[source]¶ - Parameters
personGroupId (object) – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
-
setPersonGroupIdCol
(value)[source]¶ - Parameters
personGroupId (object) – personGroupId of the target person group, created by PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
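Because IdentifyFaces accepts at most 10 faceIds per request and identifies each face independently, a longer list can be split into batches without changing the per-face results. A minimal sketch follows; batch_face_ids is a hypothetical helper and the ids are made up for illustration.

```python
def batch_face_ids(face_ids, batch_size=10):
    """Split face ids into consecutive batches of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ids = list(face_ids)
    return [ids[start:start + batch_size]
            for start in range(0, len(ids), batch_size)]


# 23 made-up ids are split into batches of 10, 10 and 3.
example_ids = ["face-%02d" % n for n in range(23)]
batches = batch_face_ids(example_ids)
batch_sizes = [len(batch) for batch in batches]
```

Each batch can then be placed in its own row of the column named via setFaceIdsCol.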
mmlspark.cognitive.KeyPhraseExtractor module¶
-
class
mmlspark.cognitive.KeyPhraseExtractor.
KeyPhraseExtractor
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getLanguage
()[source]¶ - Returns
the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getText
()[source]¶ - Returns
the text in the request body (default: ServiceParamData(Some(Right(text)),None))
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLanguage
(value)[source]¶ - Parameters
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
-
setLanguageCol
(value)[source]¶ - Parameters
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
setText
(value)[source]¶ - Parameters
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
-
setTextCol
(value)[source]¶ - Parameters
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
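The paired setters (setText and setTextCol, setLanguage and setLanguageCol) suggest that each of these parameters can be given either as one constant for all rows or as the name of a column to read per row. The sketch below models that reading in plain Python. It is an assumption about the semantics, not mmlspark code: ServiceParamSketch and its methods are hypothetical, and the language default is simplified to the string "en".

```python
class ServiceParamSketch:
    """Hypothetical model of a parameter that is a constant or a column."""

    def __init__(self, default=None):
        self.default = default
        self.value = None
        self.column = None

    def set(self, value):
        # Plays the role of setText / setLanguage: one value for every row.
        self.value, self.column = value, None
        return self

    def set_col(self, column):
        # Plays the role of setTextCol / setLanguageCol: read from each row.
        self.value, self.column = None, column
        return self

    def resolve(self, row):
        """Return the per-row value: column first, then constant, then default."""
        if self.column is not None:
            return row[self.column]
        if self.value is not None:
            return self.value
        return self.default


row = {"text": "Hello world", "lang": "fr"}

text = ServiceParamSketch().set_col("text")      # text read from a column
language = ServiceParamSketch(default="en")      # nothing set: default applies
resolved = (text.resolve(row), language.resolve(row))

language.set_col("lang")                         # now read per row instead
resolved_from_col = language.resolve(row)
```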
mmlspark.cognitive.LanguageDetector module¶
-
class
mmlspark.cognitive.LanguageDetector.
LanguageDetector
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getLanguage
()[source]¶ - Returns
the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getText
()[source]¶ - Returns
the text in the request body (default: ServiceParamData(Some(Right(text)),None))
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLanguage
(value)[source]¶ - Parameters
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
-
setLanguageCol
(value)[source]¶ - Parameters
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
setText
(value)[source]¶ - Parameters
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
-
setTextCol
(value)[source]¶ - Parameters
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
mmlspark.cognitive.NER module¶
-
class
mmlspark.cognitive.NER.
NER
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getLanguage
()[source]¶ - Returns
the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getText
()[source]¶ - Returns
the text in the request body (default: ServiceParamData(Some(Right(text)),None))
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLanguage
(value)[source]¶ - Parameters
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
-
setLanguageCol
(value)[source]¶ - Parameters
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
setText
(value)[source]¶ - Parameters
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
-
setTextCol
(value)[source]¶ - Parameters
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
mmlspark.cognitive.OCR module¶
-
class
mmlspark.cognitive.OCR.
OCR
(concurrency=1, concurrentTimeout=100.0, detectOrientation=None, errorCol=None, handler=None, imageBytes=None, imageUrl=None, language=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
detectOrientation (object) – whether to detect image orientation prior to processing
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – the language to use
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getDetectOrientation
()[source]¶ - Returns
whether to detect image orientation prior to processing
- Return type
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setDetectOrientation
(value)[source]¶ - Parameters
detectOrientation (object) – whether to detect image orientation prior to processing
-
setDetectOrientationCol
(value)[source]¶ - Parameters
detectOrientation (object) – whether to detect image orientation prior to processing
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, detectOrientation=None, errorCol=None, handler=None, imageBytes=None, imageUrl=None, language=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
detectOrientation (object) – whether to detect image orientation prior to processing
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – the language to use
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
mmlspark.cognitive.RecognizeDomainSpecificContent module¶
-
class
mmlspark.cognitive.RecognizeDomainSpecificContent.
RecognizeDomainSpecificContent
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, imageBytes=None, imageUrl=None, model=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
model (object) – the domain specific model: celebrities, landmarks
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setModel
(value)[source]¶ - Parameters
model (object) – the domain specific model: celebrities, landmarks
-
setModelCol
(value)[source]¶ - Parameters
model (object) – the domain specific model: celebrities, landmarks
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, imageBytes=None, imageUrl=None, model=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
model (object) – the domain specific model: celebrities, landmarks
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
mmlspark.cognitive.RecognizeText module¶
-
class
mmlspark.cognitive.RecognizeText.
RecognizeText
(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=100.0, errorCol=None, imageBytes=None, imageUrl=None, maxPollingRetries=1000, mode=None, outputCol=None, pollingDelay=300, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
backoffs (list) – array of backoffs to use in the handler (default: [100, 500, 1000])
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
maxPollingRetries (int) – number of times to poll (default: 1000)
mode (object) – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
outputCol (str) – The name of the output column (default: [self.uid]_output)
pollingDelay (int) – number of milliseconds to wait between polling (default: 300)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getMode
()[source]¶ - Returns
If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getPollingDelay
()[source]¶ - Returns
number of milliseconds to wait between polling (default: 300)
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setMaxPollingRetries
(value)[source]¶ - Parameters
maxPollingRetries (int) – number of times to poll (default: 1000)
-
setMode
(value)[source]¶ - Parameters
mode (object) – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
-
setModeCol
(value)[source]¶ - Parameters
mode (object) – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(backoffs=[100, 500, 1000], concurrency=1, concurrentTimeout=100.0, errorCol=None, imageBytes=None, imageUrl=None, maxPollingRetries=1000, mode=None, outputCol=None, pollingDelay=300, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
backoffs (list) – array of backoffs to use in the handler (default: [100, 500, 1000])
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
maxPollingRetries (int) – number of times to poll (default: 1000)
mode (object) – If this parameter is set to ‘Printed’, printed text recognition is performed. If ‘Handwritten’ is specified, handwriting recognition is performed
outputCol (str) – The name of the output column (default: [self.uid]_output)
pollingDelay (int) – number of milliseconds to wait between polling (default: 300)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
setPollingDelay
(value)[source]¶ - Parameters
pollingDelay (int) – number of milliseconds to wait between polling (default: 300)
mmlspark.cognitive.SimpleDetectAnomalies module¶
-
class
mmlspark.cognitive.SimpleDetectAnomalies.
SimpleDetectAnomalies
(concurrency=1, concurrentTimeout=100.0, customInterval=None, errorCol=None, granularity=None, groupbyCol=None, handler=None, maxAnomalyRatio=None, outputCol=None, period=None, sensitivity=None, series=None, subscriptionKey=None, timeout=60.0, timestampCol='timestamp', url=None, valueCol='value')[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
errorCol (str) – column to hold http errors (default: [self.uid]_error)
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
groupbyCol (str) – column that groups the series
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (str) – The name of the output column (default: [self.uid]_output)
period (object) – Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
timestampCol (str) – column representing the time of the series (default: timestamp)
url (str) – Url of the service
valueCol (str) – column representing the value of the series (default: value)
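Because the series parameter requires points sorted by ascending timestamp with no duplicates, it can help to check the data before it reaches the service. The sketch below is one possible pre-check in plain Python, not part of the library: `prepare_series` is a hypothetical helper, and it assumes timestamps are ISO 8601 strings in one consistent format and time zone, so that string order matches chronological order.

```python
from typing import List, Tuple


def prepare_series(points: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Sort (timestamp, value) points ascending and reject duplicate timestamps.

    Assumption: timestamps are ISO 8601 strings in a single format and time
    zone, so lexicographic order equals chronological order.
    """
    ordered = sorted(points, key=lambda p: p[0])
    # After sorting, any duplicate timestamps are adjacent.
    for (prev_ts, _), (ts, _) in zip(ordered, ordered[1:]):
        if prev_ts == ts:
            raise ValueError(f"duplicated timestamp: {ts}")
    return ordered


raw = [
    ("2019-01-03T00:00:00Z", 3.0),
    ("2019-01-01T00:00:00Z", 1.0),
    ("2019-01-02T00:00:00Z", 2.0),
]
clean = prepare_series(raw)
print([ts for ts, _ in clean])
```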
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getCustomInterval
()[source]¶ - Returns
Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
- Return type
-
getGranularity
()[source]¶ - Returns
Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getMaxAnomalyRatio
()[source]¶ - Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getPeriod
()[source]¶ - Returns
Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
- Return type
-
getSensitivity
()[source]¶ - Returns
Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
- Return type
-
getSeries
()[source]¶ - Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
getTimestampCol
()[source]¶ - Returns
column representing the time of the series (default: timestamp)
- Return type
-
getValueCol
()[source]¶ - Returns
column representing the value of the series (default: value)
- Return type
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setCustomInterval
(value)[source]¶ - Parameters
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
-
setCustomIntervalCol
(value)[source]¶ - Parameters
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setGranularity
(value)[source]¶ - Parameters
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
-
setGranularityCol
(value)[source]¶ - Parameters
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setMaxAnomalyRatio
(value)[source]¶ - Parameters
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
-
setMaxAnomalyRatioCol
(value)[source]¶ - Parameters
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, customInterval=None, errorCol=None, granularity=None, groupbyCol=None, handler=None, maxAnomalyRatio=None, outputCol=None, period=None, sensitivity=None, series=None, subscriptionKey=None, timeout=60.0, timestampCol='timestamp', url=None, valueCol='value')[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
customInterval (object) – Custom interval is used to set a non-standard time interval. For example, if the series has a 5-minute interval, the request can be set as granularity=minutely, customInterval=5.
errorCol (str) – column to hold http errors (default: [self.uid]_error)
granularity (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
groupbyCol (str) – column that groups the series
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
maxAnomalyRatio (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
outputCol (str) – The name of the output column (default: [self.uid]_output)
period (object) – Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
timestampCol (str) – column representing the time of the series (default: timestamp)
url (str) – Url of the service
valueCol (str) – column representing the value of the series (default: value)
-
setPeriod
(value)[source]¶ - Parameters
period (object) – Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
-
setPeriodCol
(value)[source]¶ - Parameters
period (object) – Optional argument, periodic value of a time series. If the value is null or is not present, the API will determine the period automatically.
-
setSensitivity
(value)[source]¶ - Parameters
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
-
setSensitivityCol
(value)[source]¶ - Parameters
sensitivity (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value will be, which means fewer anomalies will be accepted.
-
setSeries
(value)[source]¶ - Parameters
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
-
setSeriesCol
(value)[source]¶ - Parameters
series (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is a duplicated timestamp, the API will not work. In such a case, an error message will be returned.
-
setTimeout
(value)[source]¶ - Parameters
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
mmlspark.cognitive.SpeechToText module¶
-
class
mmlspark.cognitive.SpeechToText.
SpeechToText
(audioData=None, concurrency=1, concurrentTimeout=100.0, errorCol=None, format=None, handler=None, language=None, outputCol=None, profanity=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
audioData (object) – The data sent to the service must be a .wav file
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – Identifies the spoken language that is being recognized.
outputCol (str) – The name of the output column (default: [self.uid]_output)
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
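The format and profanity parameters each accept a small fixed set of values, so a typo is easy to catch before a request is made. The following is a minimal sketch of such a check in plain Python; `speech_options` is a hypothetical helper and not part of the library, and the accepted values and defaults are taken from the parameter descriptions above.

```python
from typing import Dict

# Accepted values as listed in the parameter descriptions.
ACCEPTED_FORMATS = {"simple", "detailed"}
ACCEPTED_PROFANITY = {"masked", "removed", "raw"}


def speech_options(format: str = "simple", profanity: str = "masked") -> Dict[str, str]:
    """Validate the two enumerated options and return them as a dict."""
    if format not in ACCEPTED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ACCEPTED_FORMATS)}")
    if profanity not in ACCEPTED_PROFANITY:
        raise ValueError(f"profanity must be one of {sorted(ACCEPTED_PROFANITY)}")
    return {"format": format, "profanity": profanity}


options = speech_options(format="detailed", profanity="raw")
print(options)
```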
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getFormat
()[source]¶ - Returns
Specifies the result format. Accepted values are simple and detailed. Default is simple.
- Return type
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getProfanity
()[source]¶ - Returns
Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setAudioData
(value)[source]¶ - Parameters
audioData (object) – The data sent to the service must be a .wav file
-
setAudioDataCol
(value)[source]¶ - Parameters
audioData (object) – The data sent to the service must be a .wav file
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setFormat
(value)[source]¶ - Parameters
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
-
setFormatCol
(value)[source]¶ - Parameters
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLanguage
(value)[source]¶ - Parameters
language (object) – Identifies the spoken language that is being recognized.
-
setLanguageCol
(value)[source]¶ - Parameters
language (object) – Identifies the spoken language that is being recognized.
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(audioData=None, concurrency=1, concurrentTimeout=100.0, errorCol=None, format=None, handler=None, language=None, outputCol=None, profanity=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
audioData (object) – The data sent to the service must be a .wav file
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
format (object) – Specifies the result format. Accepted values are simple and detailed. Default is simple.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – Identifies the spoken language that is being recognized.
outputCol (str) – The name of the output column (default: [self.uid]_output)
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
setProfanity
(value)[source]¶ - Parameters
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
-
setProfanityCol
(value)[source]¶ - Parameters
profanity (object) – Specifies how to handle profanity in recognition results. Accepted values are masked, which replaces profanity with asterisks, removed, which removes all profanity from the result, or raw, which includes the profanity in the result. The default setting is masked.
mmlspark.cognitive.TagImage module¶
-
class
mmlspark.cognitive.TagImage.
TagImage
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, imageBytes=None, imageUrl=None, language=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – The desired language for output generation. (default: ServiceParamData(None,Some(en)))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getLanguage
()[source]¶ - Returns
The desired language for output generation. (default: ServiceParamData(None,Some(en)))
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLanguage
(value)[source]¶ - Parameters
language (object) – The desired language for output generation. (default: ServiceParamData(None,Some(en)))
-
setLanguageCol
(value)[source]¶ - Parameters
language (object) – The desired language for output generation. (default: ServiceParamData(None,Some(en)))
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, imageBytes=None, imageUrl=None, language=None, outputCol=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
imageBytes (object) – bytestream of the image to use
imageUrl (object) – the url of the image to use
language (object) – The desired language for output generation. (default: ServiceParamData(None,Some(en)))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
mmlspark.cognitive.TextSentiment module¶
-
class
mmlspark.cognitive.TextSentiment.
TextSentiment
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
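As a rough usage sketch, the keyword arguments listed above can be collected and checked in plain Python before the transformer is constructed. Everything below is illustrative: `text_sentiment_kwargs` and `build_text_sentiment` are hypothetical helpers, the key and URL are placeholders, and the range checks on concurrency and timeout are assumptions rather than documented rules. The mmlspark import is deferred so the first helper runs without Spark installed.

```python
from typing import Any, Dict


def text_sentiment_kwargs(
    subscription_key: str,
    url: str,
    output_col: str = "sentiment",
    concurrency: int = 1,
    timeout: float = 60.0,
) -> Dict[str, Any]:
    """Collect constructor keywords, using the parameter names listed above."""
    # Assumed sanity checks; the reference does not state these limits.
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if timeout <= 0:
        raise ValueError("timeout must be positive")
    return {
        "subscriptionKey": subscription_key,
        "url": url,
        "outputCol": output_col,
        "concurrency": concurrency,
        "timeout": timeout,
    }


def build_text_sentiment(kwargs: Dict[str, Any]):
    """Construct the transformer; requires mmlspark and a running Spark session."""
    from mmlspark.cognitive.TextSentiment import TextSentiment  # deferred import
    return TextSentiment(**kwargs)


# Placeholder credentials and endpoint, for illustration only.
kwargs = text_sentiment_kwargs("PLACEHOLDER_KEY", "https://example.invalid/sentiment")
print(sorted(kwargs))
```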
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
-
getLanguage
()[source]¶ - Returns
the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
- Return type
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
-
getText
()[source]¶ - Returns
the text in the request body (default: ServiceParamData(Some(Right(text)),None))
- Return type
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLanguage
(value)[source]¶ - Parameters
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
-
setLanguageCol
(value)[source]¶ - Parameters
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, handler=None, language=None, outputCol=None, subscriptionKey=None, text=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
language (object) – the language code of the text (optional for some services) (default: ServiceParamData(None,Some(List(en))))
outputCol (str) – The name of the output column (default: [self.uid]_output)
subscriptionKey (object) – the API key to use
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
setText
(value)[source]¶ - Parameters
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
-
setTextCol
(value)[source]¶ - Parameters
text (object) – the text in the request body (default: ServiceParamData(Some(Right(text)),None))
mmlspark.cognitive.VerifyFaces module¶
-
class
mmlspark.cognitive.VerifyFaces.
VerifyFaces
(concurrency=1, concurrentTimeout=100.0, errorCol=None, faceId=None, faceId1=None, faceId2=None, handler=None, largePersonGroupId=None, outputCol=None, personGroupId=None, personId=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Bases:
mmlspark.core.schema.Utils.ComplexParamsMixin
,pyspark.ml.util.JavaMLReadable
,pyspark.ml.util.JavaMLWritable
,pyspark.ml.wrapper.JavaTransformer
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
faceId (object) – faceId of the face, comes from Face - Detect.
faceId1 (object) – faceId of one face, comes from Face - Detect.
faceId2 (object) – faceId of another face, comes from Face - Detect.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
largePersonGroupId (object) – Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
outputCol (str) – The name of the output column (default: [self.uid]_output)
personGroupId (object) – Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
personId (object) – Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
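The descriptions above state that personGroupId and largePersonGroupId should not be provided at the same time. A minimal sketch of enforcing that rule before building a request follows; `check_person_group_args` is a hypothetical helper written for illustration and is not part of the library.

```python
from typing import Optional


def check_person_group_args(
    person_group_id: Optional[str] = None,
    large_person_group_id: Optional[str] = None,
) -> Optional[str]:
    """Return whichever group id was given, or None if neither was.

    Raises ValueError when both are supplied, mirroring the documented rule
    that the two parameters should not be provided together.
    """
    if person_group_id is not None and large_person_group_id is not None:
        raise ValueError(
            "personGroupId and largePersonGroupId should not both be provided"
        )
    return person_group_id if person_group_id is not None else large_person_group_id


chosen = check_person_group_args(person_group_id="group-a")
print(chosen)
```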
-
getConcurrentTimeout
()[source]¶ - Returns
max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
- Return type
double
-
getHandler
()[source]¶ - Returns
Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
- Return type
object
-
getLargePersonGroupId
()[source]¶ - Returns
Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
object
-
getOutputCol
()[source]¶ - Returns
The name of the output column (default: [self.uid]_output)
- Return type
str
-
getPersonGroupId
()[source]¶ - Returns
Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
- Return type
object
-
getPersonId
()[source]¶ - Returns
Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
- Return type
object
-
getTimeout
()[source]¶ - Returns
number of seconds to wait before closing the connection (default: 60.0)
- Return type
double
-
setConcurrency
(value)[source]¶ - Parameters
concurrency (int) – max number of concurrent calls (default: 1)
-
setConcurrentTimeout
(value)[source]¶ - Parameters
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
-
setErrorCol
(value)[source]¶ - Parameters
errorCol (str) – column to hold http errors (default: [self.uid]_error)
-
setFaceId
(value)[source]¶ - Parameters
faceId (object) – faceId of the face, comes from Face - Detect.
-
setFaceId1
(value)[source]¶ - Parameters
faceId1 (object) – faceId of one face, comes from Face - Detect.
-
setFaceId1Col
(value)[source]¶ - Parameters
faceId1 (str) – name of the column that holds faceId1, the faceId of one face, which comes from Face - Detect.
-
setFaceId2
(value)[source]¶ - Parameters
faceId2 (object) – faceId of another face, comes from Face - Detect.
-
setFaceId2Col
(value)[source]¶ - Parameters
faceId2 (str) – name of the column that holds faceId2, the faceId of another face, which comes from Face - Detect.
-
setFaceIdCol
(value)[source]¶ - Parameters
faceId (str) – name of the column that holds faceId, the faceId of the face, which comes from Face - Detect.
-
setHandler
(value)[source]¶ - Parameters
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
-
setLargePersonGroupId
(value)[source]¶ - Parameters
largePersonGroupId (object) – Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
-
setLargePersonGroupIdCol
(value)[source]¶ - Parameters
largePersonGroupId (str) – name of the column that holds largePersonGroupId. Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
-
setOutputCol
(value)[source]¶ - Parameters
outputCol (str) – The name of the output column (default: [self.uid]_output)
-
setParams
(concurrency=1, concurrentTimeout=100.0, errorCol=None, faceId=None, faceId1=None, faceId2=None, handler=None, largePersonGroupId=None, outputCol=None, personGroupId=None, personId=None, subscriptionKey=None, timeout=60.0, url=None)[source]¶ Set the (keyword only) parameters
- Parameters
concurrency (int) – max number of concurrent calls (default: 1)
concurrentTimeout (double) – max number of seconds to wait on futures if concurrency >= 1 (default: 100.0)
errorCol (str) – column to hold http errors (default: [self.uid]_error)
faceId (object) – faceId of the face, comes from Face - Detect.
faceId1 (object) – faceId of one face, comes from Face - Detect.
faceId2 (object) – faceId of another face, comes from Face - Detect.
handler (object) – Which strategy to use when handling requests (default: UserDefinedFunction(<function2>,StringType,None))
largePersonGroupId (object) – Using existing largePersonGroupId and personId for fast adding a specified person. largePersonGroupId is created in LargePersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
outputCol (str) – The name of the output column (default: [self.uid]_output)
personGroupId (object) – Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
personId (object) – Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
subscriptionKey (object) – the API key to use
timeout (double) – number of seconds to wait before closing the connection (default: 60.0)
url (str) – Url of the service
-
setPersonGroupId
(value)[source]¶ - Parameters
personGroupId (object) – Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
-
setPersonGroupIdCol
(value)[source]¶ - Parameters
personGroupId (str) – name of the column that holds personGroupId. Using existing personGroupId and personId for fast loading a specified person. personGroupId is created in PersonGroup - Create. Parameter personGroupId and largePersonGroupId should not be provided at the same time.
-
setPersonId
(value)[source]¶ - Parameters
personId (object) – Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
-
setPersonIdCol
(value)[source]¶ - Parameters
personId (str) – name of the column that holds personId. Specify a certain person in a person group or a large person group. personId is created in PersonGroup Person - Create or LargePersonGroup Person - Create.
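Several setters above come in pairs, such as setFaceId1 and setFaceId1Col. As far as we understand mmlspark, the plain setter fixes one value for every row, while the Col variant names a DataFrame column to read the value from; treat that reading as an assumption if your version differs. The sketch below wires two columns into a transformer using only setters listed above. `RecordingTransformer` and `configure_face_columns` are stand-ins invented here so the snippet runs without Spark, and the column names are placeholders. In practice you would pass an instance of the class documented in this section.

```python
class RecordingTransformer:
    """Stand-in (not part of mmlspark) that records setter calls."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Only invoked for attributes that are not defined normally,
        # so any set* method is recorded instead of reaching Spark.
        if not name.startswith("set"):
            raise AttributeError(name)

        def setter(value):
            self.calls.append((name, value))
            return self

        return setter


def configure_face_columns(transformer, face1_col, face2_col, output_col):
    """Point the transformer at per-row faceId columns (hypothetical helper)."""
    transformer.setFaceId1Col(face1_col)  # column holding the first faceId
    transformer.setFaceId2Col(face2_col)  # column holding the second faceId
    transformer.setOutputCol(output_col)  # column that receives the response
    transformer.setConcurrency(1)         # the documented default
    return transformer


demo = configure_face_columns(RecordingTransformer(), "id1", "id2", "verified")
for name, value in demo.calls:
    print(name, value)
```

Calling each setter on its own line avoids relying on whether the real setters return the transformer for chaining.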
Module contents¶
MicrosoftML is a library of Python classes that interface with the Microsoft Scala APIs, which use Apache Spark to create distributed machine learning models.
MicrosoftML simplifies training and scoring of classifiers and regressors, and facilitates creating models that use the CNTK library, images, and text.