synapse.ml.services.anomaly package
Submodules
synapse.ml.services.anomaly.DetectAnomalies module
- class synapse.ml.services.anomaly.DetectAnomalies.DetectAnomalies(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectAnomalies_cef1266c81e2_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectAnomalies_cef1266c81e2_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader¶ (object) – A Custom Value for Authorization Header
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval¶ (object) – Sets a non-standard time interval. For example, if the series has a 5-minute interval, set granularity=minutely and customInterval=5.
granularity¶ (object) – Must be one of yearly, monthly, weekly, daily, hourly, or minutely. Granularity is used to verify whether the input series is valid.
handler¶ (object) – Which strategy to use when handling requests
imputeFixedValue¶ (object) – Optional argument, fixed value to use when imputeMode is set to “fixed”
imputeMode¶ (object) – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
maxAnomalyRatio¶ (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
period¶ (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API determines the period automatically.
sensitivity¶ (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
series¶ (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message is returned.
timeout¶ (float) – number of seconds to wait before closing the connection
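The pairing of granularity and customInterval described above can be sketched in plain Python. This is only an illustration: `interval_fields` is a hypothetical helper, not part of synapse.ml, and the rule that customInterval must be a positive integer is an assumption.

```python
# Granularity names listed in the parameter description above.
ALLOWED_GRANULARITIES = ("yearly", "monthly", "weekly", "daily", "hourly", "minutely")


def interval_fields(granularity, custom_interval=None):
    """Return the granularity/customInterval pair as a dict (hypothetical helper)."""
    if granularity not in ALLOWED_GRANULARITIES:
        raise ValueError(f"unsupported granularity: {granularity!r}")
    fields = {"granularity": granularity}
    if custom_interval is not None:
        # Assumption: customInterval is a positive whole number of granularity units.
        is_int = isinstance(custom_interval, int) and not isinstance(custom_interval, bool)
        if not is_int or custom_interval < 1:
            raise ValueError("customInterval must be a positive integer")
        fields["customInterval"] = custom_interval
    return fields


# The 5-minute series from the description: granularity=minutely, customInterval=5.
five_minute = interval_fields("minutely", 5)
print(five_minute)
```

The resulting values would then be passed to setGranularity and setCustomInterval, or to their column-based counterparts.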
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc='ServiceParam: Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getCustomInterval()[source]
- Returns
Sets a non-standard time interval. For example, if the series has a 5-minute interval, set granularity=minutely and customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Must be one of yearly, monthly, weekly, daily, hourly, or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getImputeFixedValue()[source]
- Returns
Optional argument, fixed value to use when imputeMode is set to “fixed”
- Return type
imputeFixedValue
- getImputeMode()[source]
- Returns
Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- Return type
imputeMode
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API determines the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message is returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- granularity = Param(parent='undefined', name='granularity', doc='ServiceParam: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imputeFixedValue = Param(parent='undefined', name='imputeFixedValue', doc='ServiceParam: Optional argument, fixed value to use when imputeMode is set to "fixed" ')
- imputeMode = Param(parent='undefined', name='imputeMode', doc='ServiceParam: Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill ')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc='ServiceParam: Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc='ServiceParam: Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc='ServiceParam: Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc='ServiceParam: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomInterval(value)[source]
- Parameters
customInterval¶ – Sets a non-standard time interval. For example, if the series has a 5-minute interval, set granularity=minutely and customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval¶ – Sets a non-standard time interval. For example, if the series has a 5-minute interval, set granularity=minutely and customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity¶ – Must be one of yearly, monthly, weekly, daily, hourly, or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity¶ – Must be one of yearly, monthly, weekly, daily, hourly, or minutely. Granularity is used to verify whether the input series is valid.
- setImputeFixedValue(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeFixedValueCol(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeMode(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setImputeModeCol(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectAnomalies_cef1266c81e2_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectAnomalies_cef1266c81e2_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API determines the period automatically.
- setPeriodCol(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API determines the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- setSensitivityCol(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- setSeries(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message is returned.
- setSeriesCol(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message is returned.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
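Because the series parameter requires ascending timestamps with no duplicates, it can help to check the points before sending them. The following is a minimal sketch in plain Python, assuming each point is a dict with an ISO-8601 `timestamp` string and a numeric `value`; `prepare_series` and `parse_ts` are hypothetical helpers, not part of synapse.ml.

```python
from datetime import datetime


def parse_ts(ts):
    """Parse an ISO-8601 timestamp; a trailing 'Z' is treated as UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def prepare_series(points):
    """Return the points sorted by timestamp, rejecting duplicate timestamps."""
    ordered = sorted(points, key=lambda p: parse_ts(p["timestamp"]))
    # After sorting, any duplicate timestamps are adjacent.
    for prev, cur in zip(ordered, ordered[1:]):
        if parse_ts(prev["timestamp"]) == parse_ts(cur["timestamp"]):
            raise ValueError(f"duplicate timestamp: {cur['timestamp']}")
    return ordered


# Made-up sample points, deliberately out of order.
sample = [
    {"timestamp": "2024-01-03T00:00:00Z", "value": 3.0},
    {"timestamp": "2024-01-01T00:00:00Z", "value": 1.0},
    {"timestamp": "2024-01-02T00:00:00Z", "value": 2.0},
]
for point in prepare_series(sample):
    print(point["timestamp"], point["value"])
```

Running the check locally surfaces ordering problems before a request is made, rather than in the error column afterwards.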
synapse.ml.services.anomaly.DetectLastAnomaly module
- class synapse.ml.services.anomaly.DetectLastAnomaly.DetectLastAnomaly(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectLastAnomaly_1965dde74e92_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectLastAnomaly_1965dde74e92_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader¶ (object) – A Custom Value for Authorization Header
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
customInterval¶ (object) – Sets a non-standard time interval. For example, if the series has a 5-minute interval, set granularity=minutely and customInterval=5.
granularity¶ (object) – Must be one of yearly, monthly, weekly, daily, hourly, or minutely. Granularity is used to verify whether the input series is valid.
handler¶ (object) – Which strategy to use when handling requests
imputeFixedValue¶ (object) – Optional argument, fixed value to use when imputeMode is set to “fixed”
imputeMode¶ (object) – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
maxAnomalyRatio¶ (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
period¶ (object) – Optional argument, periodic value of a time series. If the value is null or not present, the API determines the period automatically.
sensitivity¶ (object) – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
series¶ (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message is returned.
timeout¶ (float) – number of seconds to wait before closing the connection
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc='ServiceParam: Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getCustomInterval()[source]
- Returns
Sets a non-standard time interval. For example, if the series has a 5-minute interval, set granularity=minutely and customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Must be one of yearly, monthly, weekly, daily, hourly, or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getImputeFixedValue()[source]
- Returns
Optional argument, fixed value to use when imputeMode is set to “fixed”
- Return type
imputeFixedValue
- getImputeMode()[source]
- Returns
Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- Return type
imputeMode
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API determines the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message is returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- granularity = Param(parent='undefined', name='granularity', doc='ServiceParam: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imputeFixedValue = Param(parent='undefined', name='imputeFixedValue', doc='ServiceParam: Optional argument, fixed value to use when imputeMode is set to "fixed" ')
- imputeMode = Param(parent='undefined', name='imputeMode', doc='ServiceParam: Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill ')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc='ServiceParam: Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc='ServiceParam: Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc='ServiceParam: Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc='ServiceParam: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomInterval(value)[source]
- Parameters
customInterval¶ – Sets a non-standard time interval. For example, if the series has a 5-minute interval, set granularity=minutely and customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval¶ – Sets a non-standard time interval. For example, if the series has a 5-minute interval, set granularity=minutely and customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity¶ – Must be one of yearly, monthly, weekly, daily, hourly, or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity¶ – Must be one of yearly, monthly, weekly, daily, hourly, or minutely. Granularity is used to verify whether the input series is valid.
- setImputeFixedValue(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeFixedValueCol(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeMode(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setImputeModeCol(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='DetectLastAnomaly_1965dde74e92_error', granularity=None, granularityCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='DetectLastAnomaly_1965dde74e92_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, url=None)[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API determines the period automatically.
- setPeriodCol(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API determines the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- setSensitivityCol(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0 and 99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- setSeries(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message is returned.
- setSeriesCol(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicate timestamps, the API will not work and an error message is returned.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- url = Param(parent='undefined', name='url', doc='Url of the service')
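The documented constraints on sensitivity, imputeMode, and imputeFixedValue can be checked before a transformer is configured. The sketch below is plain Python and makes two assumptions that the descriptions above do not state outright: that the 0 to 99 range is inclusive, and that a fixed value should accompany imputeMode "fixed". `check_options` is a hypothetical helper, not part of synapse.ml.

```python
# Impute modes listed in the imputeMode description above.
IMPUTE_MODES = ("auto", "previous", "linear", "fixed", "zero", "notFill")


def check_options(sensitivity=None, impute_mode=None, impute_fixed_value=None):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    # Assumption: the documented 0-99 range is inclusive on both ends.
    if sensitivity is not None and not 0 <= sensitivity <= 99:
        problems.append(f"sensitivity {sensitivity} is outside 0-99")
    if impute_mode is not None and impute_mode not in IMPUTE_MODES:
        problems.append(f"unknown imputeMode {impute_mode!r}")
    # Assumption: imputeMode "fixed" is only useful together with a fixed value.
    if impute_mode == "fixed" and impute_fixed_value is None:
        problems.append("imputeMode is 'fixed' but imputeFixedValue is not set")
    return problems


for problem in check_options(sensitivity=120, impute_mode="fixed"):
    print(problem)
```

Returning a list rather than raising on the first problem lets a caller report every issue at once.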
synapse.ml.services.anomaly.DetectLastMultivariateAnomaly module
- class synapse.ml.services.anomaly.DetectLastMultivariateAnomaly.DetectLastMultivariateAnomaly(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=300, concurrency=1, concurrentTimeout=None, diagnosticsInfo=None, errorCol='DetectLastMultivariateAnomaly_20e5c67a9dba_error', handler=None, inputVariablesCols=None, modelId=None, outputCol='DetectLastMultivariateAnomaly_20e5c67a9dba_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, timestampCol='timestamp', topContributorCount=10, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader¶ (object) – A Custom Value for Authorization Header
concurrentTimeout¶ (float) – max number of seconds to wait on futures if concurrency >= 1
diagnosticsInfo¶ (object) – diagnosticsInfo for training a multivariate anomaly detection model
handler¶ (object) – Which strategy to use when handling requests
inputVariablesCols¶ (list) – The names of the input variables columns
timeout¶ (float) – number of seconds to wait before closing the connection
topContributorCount¶ (int) – A number N from 1 to 30 that returns the details of the top N contributing variables in the anomaly results. For example, if the model has 100 variables but you only care about the top five contributing variables in the detection results, set this field to 5. The default is 10.
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- batchSize = Param(parent='undefined', name='batchSize', doc='The max size of the buffer')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- diagnosticsInfo = Param(parent='undefined', name='diagnosticsInfo', doc='diagnosticsInfo for training a multivariate anomaly detection model')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
max number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getDiagnosticsInfo()[source]
- Returns
diagnosticsInfo for training a multivariate anomaly detection model
- Return type
diagnosticsInfo
- getInputVariablesCols()[source]
- Returns
The names of the input variables columns
- Return type
inputVariablesCols
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getTopContributorCount()[source]
- Returns
A number N from 1 to 30 that returns the details of the top N contributing variables in the anomaly results. For example, if the model has 100 variables but you only care about the top five contributing variables in the detection results, set this field to 5. The default is 10.
- Return type
topContributorCount
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- inputVariablesCols = Param(parent='undefined', name='inputVariablesCols', doc='The names of the input variables columns')
- modelId = Param(parent='undefined', name='modelId', doc='Format - uuid. Model identifier.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – max number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setDiagnosticsInfo(value)[source]
- Parameters
diagnosticsInfo¶ – diagnosticsInfo for training a multivariate anomaly detection model
- setInputVariablesCols(value)[source]
- Parameters
inputVariablesCols¶ – The names of the input variables columns
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, batchSize=300, concurrency=1, concurrentTimeout=None, diagnosticsInfo=None, errorCol='DetectLastMultivariateAnomaly_20e5c67a9dba_error', handler=None, inputVariablesCols=None, modelId=None, outputCol='DetectLastMultivariateAnomaly_20e5c67a9dba_output', subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, timestampCol='timestamp', topContributorCount=10, url=None)[source]
Set the (keyword only) parameters
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- setTopContributorCount(value)[source]
- Parameters
topContributorCount¶ – A number N from 1 to 30 that returns the details of the top N contributing variables in the anomaly results. For example, if the model has 100 variables but you only care about the top five contributing variables in the detection results, set this field to 5. The default is 10.
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- timestampCol = Param(parent='undefined', name='timestampCol', doc='Timestamp column name')
- topContributorCount = Param(parent='undefined', name='topContributorCount', doc='This is a number that you could specify N from 1 to 30, which will give you the details of top N contributed variables in the anomaly results. For example, if you have 100 variables in the model, but you only care the top five contributed variables in detection results, then you should fill this field with 5. The default number is 10.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
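To make the meaning of topContributorCount concrete, the sketch below selects the top N entries from a made-up mapping of variable names to contribution scores. It only illustrates the idea of "top N contributing variables". The service's actual scoring and response format are not described here, and `top_contributors` is a hypothetical helper.

```python
def top_contributors(scores, n=10):
    """Return the n highest-scoring (variable, score) pairs, highest first."""
    # The documented range for topContributorCount is 1 to 30, with a default of 10.
    if not 1 <= n <= 30:
        raise ValueError("topContributorCount must be between 1 and 30")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Made-up contribution scores for four variables.
example_scores = {"temperature": 0.10, "pressure": 0.55, "flow": 0.30, "vibration": 0.05}
for name, score in top_contributors(example_scores, n=2):
    print(name, score)
```

If the model has fewer variables than N, the slice simply returns all of them.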
synapse.ml.services.anomaly.SimpleDetectAnomalies module
- class synapse.ml.services.anomaly.SimpleDetectAnomalies.SimpleDetectAnomalies(java_obj=None, AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='SimpleDetectAnomalies_f1d2d59b0353_error', granularity=None, granularityCol=None, groupbyCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='SimpleDetectAnomalies_f1d2d59b0353_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, timestampCol='timestamp', url=None, valueCol='value')[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
CustomAuthHeader¶ (object) – A Custom Value for Authorization Header
concurrentTimeout¶ (float) – max number seconds to wait on futures if concurrency >= 1
customInterval¶ (object) – Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5.
granularity¶ (object) – Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid.
handler¶ (object) – Which strategy to use when handling requests
imputeFixedValue¶ (object) – Optional argument, fixed value to use when imputeMode is set to “fixed”
imputeMode¶ (object) – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
maxAnomalyRatio¶ (object) – Optional argument, advanced model parameter, max anomaly ratio in a time series.
period¶ (object) – Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically.
sensitivity¶ (object) – Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted
series¶ (object) – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned.
timeout¶ (float) – number of seconds to wait before closing the connection
timestampCol¶ (str) – column representing the time of the series
valueCol¶ (str) – column representing the value of the series
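The constraints listed above (allowed granularity and imputeMode values, the 0-99 sensitivity range) can be checked locally before any request is sent. The following is a minimal sketch, not part of SynapseML: `build_detector_kwargs` is a hypothetical helper, the sensitivity bounds are assumed to be inclusive, and requiring imputeFixedValue whenever imputeMode is "fixed" is an assumption drawn from the parameter description.

```python
# Allowed values copied from the parameter descriptions above.
GRANULARITIES = {"yearly", "monthly", "weekly", "daily", "hourly", "minutely"}
IMPUTE_MODES = {"auto", "previous", "linear", "fixed", "zero", "notFill"}


def build_detector_kwargs(granularity, sensitivity=None, imputeMode=None,
                          imputeFixedValue=None, customInterval=None):
    """Hypothetical helper: validate settings and return constructor kwargs."""
    if granularity not in GRANULARITIES:
        raise ValueError("granularity must be one of %s" % sorted(GRANULARITIES))
    # Assumption: the documented 0-99 range is inclusive at both ends.
    if sensitivity is not None and not 0 <= sensitivity <= 99:
        raise ValueError("sensitivity must be between 0 and 99")
    if imputeMode is not None and imputeMode not in IMPUTE_MODES:
        raise ValueError("imputeMode must be one of %s" % sorted(IMPUTE_MODES))
    # Assumption: a fixed fill needs an explicit fill value.
    if imputeMode == "fixed" and imputeFixedValue is None:
        raise ValueError("imputeFixedValue is needed when imputeMode is 'fixed'")

    optional = {
        "sensitivity": sensitivity,
        "imputeMode": imputeMode,
        "imputeFixedValue": imputeFixedValue,
        "customInterval": customInterval,
    }
    kwargs = {"granularity": granularity}
    # Only pass along the options that were actually set.
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs


# A series with one point every 5 minutes: granularity=minutely, customInterval=5.
kwargs = build_detector_kwargs("minutely", sensitivity=95, customInterval=5)
```

With SynapseML installed, the resulting dictionary could be passed as `SimpleDetectAnomalies(**kwargs)`, since the constructor signature accepts these names as keyword arguments.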
- AADToken = Param(parent='undefined', name='AADToken', doc='ServiceParam: AAD Token used for authentication')
- CustomAuthHeader = Param(parent='undefined', name='CustomAuthHeader', doc='ServiceParam: A Custom Value for Authorization Header')
- concurrency = Param(parent='undefined', name='concurrency', doc='max number of concurrent calls')
- concurrentTimeout = Param(parent='undefined', name='concurrentTimeout', doc='max number seconds to wait on futures if concurrency >= 1')
- customInterval = Param(parent='undefined', name='customInterval', doc='ServiceParam: Custom Interval is used to set non-standard time interval, for example, if the series is 5 minutes, request can be set as granularity=minutely, customInterval=5. ')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getConcurrentTimeout()[source]
- Returns
maximum number of seconds to wait on futures if concurrency >= 1
- Return type
concurrentTimeout
- getCustomAuthHeader()[source]
- Returns
A Custom Value for Authorization Header
- Return type
CustomAuthHeader
- getCustomInterval()[source]
- Returns
Sets a non-standard time interval. For example, if the series has one point every 5 minutes, the request can use granularity=minutely, customInterval=5.
- Return type
customInterval
- getGranularity()[source]
- Returns
Must be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- Return type
granularity
- getImputeFixedValue()[source]
- Returns
Optional argument, fixed value to use when imputeMode is set to “fixed”
- Return type
imputeFixedValue
- getImputeMode()[source]
- Returns
Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- Return type
imputeMode
- getMaxAnomalyRatio()[source]
- Returns
Optional argument, advanced model parameter, max anomaly ratio in a time series.
- Return type
maxAnomalyRatio
- getPeriod()[source]
- Returns
Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- Return type
period
- getSensitivity()[source]
- Returns
Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- Return type
sensitivity
- getSeries()[source]
- Returns
Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicated timestamps, the API will not work and an error message will be returned.
- Return type
series
- getTimeout()[source]
- Returns
number of seconds to wait before closing the connection
- Return type
timeout
- getTimestampCol()[source]
- Returns
column representing the time of the series
- Return type
timestampCol
- granularity = Param(parent='undefined', name='granularity', doc='ServiceParam: Can only be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used for verify whether input series is valid. ')
- groupbyCol = Param(parent='undefined', name='groupbyCol', doc='column that groups the series')
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- imputeFixedValue = Param(parent='undefined', name='imputeFixedValue', doc='ServiceParam: Optional argument, fixed value to use when imputeMode is set to "fixed" ')
- imputeMode = Param(parent='undefined', name='imputeMode', doc='ServiceParam: Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill ')
- maxAnomalyRatio = Param(parent='undefined', name='maxAnomalyRatio', doc='ServiceParam: Optional argument, advanced model parameter, max anomaly ratio in a time series. ')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- period = Param(parent='undefined', name='period', doc='ServiceParam: Optional argument, periodic value of a time series. If the value is null or does not present, the API will determine the period automatically. ')
- sensitivity = Param(parent='undefined', name='sensitivity', doc='ServiceParam: Optional argument, advanced model parameter, between 0-99, the lower the value is, the larger the margin value will be which means less anomalies will be accepted ')
- series = Param(parent='undefined', name='series', doc='ServiceParam: Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or there is duplicated timestamp, the API will not work. In such case, an error message will be returned. ')
- setConcurrentTimeout(value)[source]
- Parameters
concurrentTimeout¶ – maximum number of seconds to wait on futures if concurrency >= 1
- setCustomAuthHeader(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomAuthHeaderCol(value)[source]
- Parameters
CustomAuthHeader¶ – A Custom Value for Authorization Header
- setCustomInterval(value)[source]
- Parameters
customInterval¶ – Sets a non-standard time interval. For example, if the series has one point every 5 minutes, the request can use granularity=minutely, customInterval=5.
- setCustomIntervalCol(value)[source]
- Parameters
customInterval¶ – Sets a non-standard time interval. For example, if the series has one point every 5 minutes, the request can use granularity=minutely, customInterval=5.
- setGranularity(value)[source]
- Parameters
granularity¶ – Must be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setGranularityCol(value)[source]
- Parameters
granularity¶ – Must be one of yearly, monthly, weekly, daily, hourly or minutely. Granularity is used to verify whether the input series is valid.
- setImputeFixedValue(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeFixedValueCol(value)[source]
- Parameters
imputeFixedValue¶ – Optional argument, fixed value to use when imputeMode is set to “fixed”
- setImputeMode(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setImputeModeCol(value)[source]
- Parameters
imputeMode¶ – Optional argument, impute mode of a time series. Possible values: auto, previous, linear, fixed, zero, notFill
- setMaxAnomalyRatio(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setMaxAnomalyRatioCol(value)[source]
- Parameters
maxAnomalyRatio¶ – Optional argument, advanced model parameter, max anomaly ratio in a time series.
- setParams(AADToken=None, AADTokenCol=None, CustomAuthHeader=None, CustomAuthHeaderCol=None, concurrency=1, concurrentTimeout=None, customInterval=None, customIntervalCol=None, errorCol='SimpleDetectAnomalies_f1d2d59b0353_error', granularity=None, granularityCol=None, groupbyCol=None, handler=None, imputeFixedValue=None, imputeFixedValueCol=None, imputeMode=None, imputeModeCol=None, maxAnomalyRatio=None, maxAnomalyRatioCol=None, outputCol='SimpleDetectAnomalies_f1d2d59b0353_output', period=None, periodCol=None, sensitivity=None, sensitivityCol=None, series=None, seriesCol=None, subscriptionKey=None, subscriptionKeyCol=None, timeout=60.0, timestampCol='timestamp', url=None, valueCol='value')[source]
Set the (keyword only) parameters
- setPeriod(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setPeriodCol(value)[source]
- Parameters
period¶ – Optional argument, periodic value of a time series. If the value is null or not present, the API will determine the period automatically.
- setSensitivity(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- setSensitivityCol(value)[source]
- Parameters
sensitivity¶ – Optional argument, advanced model parameter, between 0-99. The lower the value, the larger the margin value, which means fewer anomalies will be accepted.
- setSeries(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicated timestamps, the API will not work and an error message will be returned.
- setSeriesCol(value)[source]
- Parameters
series¶ – Time series data points. Points should be sorted by timestamp in ascending order to match the anomaly detection result. If the data is not sorted correctly or contains duplicated timestamps, the API will not work and an error message will be returned.
- setTimeout(value)[source]
- Parameters
timeout¶ – number of seconds to wait before closing the connection
- setTimestampCol(value)[source]
- Parameters
timestampCol¶ – column representing the time of the series
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- timeout = Param(parent='undefined', name='timeout', doc='number of seconds to wait before closing the connection')
- timestampCol = Param(parent='undefined', name='timestampCol', doc='column representing the time of the series')
- url = Param(parent='undefined', name='url', doc='Url of the service')
- valueCol = Param(parent='undefined', name='valueCol', doc='column representing the value of the series')
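The series description above states that points must be in ascending timestamp order with no duplicated timestamps, otherwise the service returns an error. A local pre-check along these lines may help catch such problems before a request is made. This is only an illustrative sketch: `check_series` and the timestamp layout in `TS_FORMAT` are assumptions, not part of the library.

```python
from datetime import datetime

# Assumed timestamp layout for this sketch; adjust to match the real data.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S"


def check_series(points):
    """Hypothetical helper: list ordering problems in (timestamp, value) pairs."""
    problems = []
    seen = set()
    previous = None
    for raw_ts, _value in points:
        ts = datetime.strptime(raw_ts, TS_FORMAT)
        if ts in seen:
            problems.append("duplicated timestamp: %s" % raw_ts)
        elif previous is not None and ts < previous:
            problems.append("out of order: %s" % raw_ts)
        seen.add(ts)
        previous = ts
    return problems


points = [
    ("2024-01-01T00:00:00", 1.0),
    ("2024-01-03T00:00:00", 3.0),
    ("2024-01-02T00:00:00", 2.0),  # earlier than the previous point
    ("2024-01-03T00:00:00", 3.5),  # timestamp already used above
]
problems = check_series(points)
```

An empty list means the points are strictly ascending. In a Spark job the same check would normally be expressed on the DataFrame itself, for example by sorting on the column named by timestampCol and dropping duplicates before the transformer runs.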
synapse.ml.services.anomaly.SimpleDetectMultivariateAnomaly module
- class synapse.ml.services.anomaly.SimpleDetectMultivariateAnomaly.SimpleDetectMultivariateAnomaly(java_obj=None, backoffs=[100, 500, 1000], diagnosticsInfo=None, endTime=None, errorCol='SimpleDetectMultivariateAnomaly_8002e2d283c7_error', handler=None, initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, modelId=None, outputCol='SimpleDetectMultivariateAnomaly_8002e2d283c7_output', pollingDelay=300, startTime=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timestampCol='timestamp', topContributorCount=10, url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
diagnosticsInfo¶ (object) – diagnosticsInfo for training a multivariate anomaly detection model
endTime¶ (str) – Required. End time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
handler¶ (object) – Which strategy to use when handling requests
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
intermediateSaveDir¶ (str) – Blob storage location in HDFS where intermediate data is saved while training.
pollingDelay¶ (int) – number of milliseconds to wait between polling
startTime¶ (str) – Required. Start time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
topContributorCount¶ (int) – A number N from 1 to 30 that requests details of the top N contributing variables in the anomaly results. For example, if the model has 100 variables but you only care about the top five contributing variables in the detection results, set this field to 5. The default is 10.
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- diagnosticsInfo = Param(parent='undefined', name='diagnosticsInfo', doc='diagnosticsInfo for training a multivariate anomaly detection model')
- endTime = Param(parent='undefined', name='endTime', doc='A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- getDiagnosticsInfo()[source]
- Returns
diagnosticsInfo for training a multivariate anomaly detection model
- Return type
diagnosticsInfo
- getEndTime()[source]
- Returns
Required. End time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- Return type
endTime
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getIntermediateSaveDir()[source]
- Returns
Blob storage location in HDFS where intermediate data is saved while training.
- Return type
intermediateSaveDir
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getStartTime()[source]
- Returns
Required. Start time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- Return type
startTime
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- getTopContributorCount()[source]
- Returns
A number N from 1 to 30 that requests details of the top N contributing variables in the anomaly results. For example, if the model has 100 variables but you only care about the top five contributing variables in the detection results, set this field to 5. The default is 10.
- Return type
topContributorCount
- handler = Param(parent='undefined', name='handler', doc='Which strategy to use when handling requests')
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- inputCols = Param(parent='undefined', name='inputCols', doc='The names of the input columns')
- intermediateSaveDir = Param(parent='undefined', name='intermediateSaveDir', doc='Blob storage location in HDFS where intermediate data is saved while training.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- modelId = Param(parent='undefined', name='modelId', doc='Format - uuid. Model identifier.')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setDiagnosticsInfo(value)[source]
- Parameters
diagnosticsInfo¶ – diagnosticsInfo for training a multivariate anomaly detection model
- setEndTime(value)[source]
- Parameters
endTime¶ – Required. End time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setIntermediateSaveDir(value)[source]
- Parameters
intermediateSaveDir¶ – Blob storage location in HDFS where intermediate data is saved while training.
- setParams(backoffs=[100, 500, 1000], diagnosticsInfo=None, endTime=None, errorCol='SimpleDetectMultivariateAnomaly_8002e2d283c7_error', handler=None, initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, modelId=None, outputCol='SimpleDetectMultivariateAnomaly_8002e2d283c7_output', pollingDelay=300, startTime=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timestampCol='timestamp', topContributorCount=10, url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setStartTime(value)[source]
- Parameters
startTime¶ – Required. Start time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- setTopContributorCount(value)[source]
- Parameters
topContributorCount¶ – A number N from 1 to 30 that requests details of the top N contributing variables in the anomaly results. For example, if the model has 100 variables but you only care about the top five contributing variables in the detection results, set this field to 5. The default is 10.
- startTime = Param(parent='undefined', name='startTime', doc='A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timestampCol = Param(parent='undefined', name='timestampCol', doc='Timestamp column name')
- topContributorCount = Param(parent='undefined', name='topContributorCount', doc='This is a number that you could specify N from 1 to 30, which will give you the details of top N contributed variables in the anomaly results. For example, if you have 100 variables in the model, but you only care the top five contributed variables in detection results, then you should fill this field with 5. The default number is 10.')
- url = Param(parent='undefined', name='url', doc='Url of the service')
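The detection settings above come with two documented constraints: startTime and endTime should be date-time strings, and topContributorCount should lie between 1 and 30. The sketch below checks both before a detection run. `validate_detection_window` is a hypothetical helper, the `ISO_FORMAT` layout is an assumed date-time format, and requiring startTime to precede endTime is an assumption rather than something the reference states.

```python
from datetime import datetime

# Assumed date-time layout for this sketch (UTC with a trailing "Z").
ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def validate_detection_window(startTime, endTime, topContributorCount=10):
    """Hypothetical helper: check the detection window and contributor count."""
    # strptime raises ValueError if either string is not in the assumed layout.
    start = datetime.strptime(startTime, ISO_FORMAT)
    end = datetime.strptime(endTime, ISO_FORMAT)
    # Assumption: an empty or reversed window is never intended.
    if start >= end:
        raise ValueError("startTime must be earlier than endTime")
    if not 1 <= topContributorCount <= 30:
        raise ValueError("topContributorCount must be between 1 and 30")
    return {
        "startTime": startTime,
        "endTime": endTime,
        "topContributorCount": topContributorCount,
    }


settings = validate_detection_window(
    "2021-01-01T00:00:00Z", "2021-01-02T12:00:00Z", topContributorCount=5
)
```

The returned dictionary uses the same names as the constructor and `setParams` keyword arguments, so it could be unpacked into either once the remaining settings (such as inputCols and subscriptionKey) are supplied.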
synapse.ml.services.anomaly.SimpleFitMultivariateAnomaly module
- class synapse.ml.services.anomaly.SimpleFitMultivariateAnomaly.SimpleFitMultivariateAnomaly(java_obj=None, alignMode='Outer', backoffs=[100, 500, 1000], displayName=None, endTime=None, errorCol='SimpleFitMultivariateAnomaly_97bcc7de4a86_error', fillNAMethod='Linear', initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, outputCol='SimpleFitMultivariateAnomaly_97bcc7de4a86_output', paddingValue=None, pollingDelay=300, slidingWindow=300, startTime=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timestampCol='timestamp', url=None)[source]
Bases: pyspark.ml.util.MLReadable[pyspark.ml.util.RL]
- Parameters
alignMode¶ (str) – Optional. Indicates how different variables are aligned to the same time range, which the model requires. One of {Inner, Outer}.
endTime¶ (str) – Required. End time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
fillNAMethod¶ (str) – Optional. Indicates how missing values are filled. Cannot be set to NotFill when alignMode is Outer. One of {Previous, Subsequent, Linear, Zero, Fixed}.
initialPollingDelay¶ (int) – number of milliseconds to wait before first poll for result
intermediateSaveDir¶ (str) – Blob storage location in HDFS where intermediate data is saved while training.
paddingValue¶ (int) – Optional. Only useful if fillNAMethod is set to Fixed.
pollingDelay¶ (int) – number of milliseconds to wait between polling
slidingWindow¶ (int) – Optional. Indicates how many historical points are used to determine the anomaly score of one subsequent point.
startTime¶ (str) – Required. Start time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
suppressMaxRetriesException¶ (bool) – set to true to suppress the maximum retries exception and report it in the error column
- alignMode = Param(parent='undefined', name='alignMode', doc='An optional field, indicates how we align different variables into the same time-range which is required by the model.{Inner, Outer}')
- backoffs = Param(parent='undefined', name='backoffs', doc='array of backoffs to use in the handler')
- displayName = Param(parent='undefined', name='displayName', doc='optional field, name of the model')
- endTime = Param(parent='undefined', name='endTime', doc='A required field, end time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- errorCol = Param(parent='undefined', name='errorCol', doc='column to hold http errors')
- fillNAMethod = Param(parent='undefined', name='fillNAMethod', doc='An optional field, indicates how missed values will be filled with. Can not be set to NotFill, when alignMode is Outer.{Previous, Subsequent, Linear, Zero, Fixed}')
- getAlignMode()[source]
- Returns
Optional. Indicates how different variables are aligned to the same time range, which the model requires. One of {Inner, Outer}.
- Return type
alignMode
- getEndTime()[source]
- Returns
Required. End time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- Return type
endTime
- getFillNAMethod()[source]
- Returns
Optional. Indicates how missing values are filled. Cannot be set to NotFill when alignMode is Outer. One of {Previous, Subsequent, Linear, Zero, Fixed}.
- Return type
fillNAMethod
- getInitialPollingDelay()[source]
- Returns
number of milliseconds to wait before first poll for result
- Return type
initialPollingDelay
- getIntermediateSaveDir()[source]
- Returns
Blob storage location in HDFS where intermediate data is saved while training.
- Return type
intermediateSaveDir
- getPaddingValue()[source]
- Returns
Optional. Only useful if fillNAMethod is set to Fixed.
- Return type
paddingValue
- getPollingDelay()[source]
- Returns
number of milliseconds to wait between polling
- Return type
pollingDelay
- getSlidingWindow()[source]
- Returns
Optional. Indicates how many historical points are used to determine the anomaly score of one subsequent point.
- Return type
slidingWindow
- getStartTime()[source]
- Returns
Required. Start time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- Return type
startTime
- getSuppressMaxRetriesException()[source]
- Returns
set to true to suppress the maximum retries exception and report it in the error column
- Return type
suppressMaxRetriesException
- initialPollingDelay = Param(parent='undefined', name='initialPollingDelay', doc='number of milliseconds to wait before first poll for result')
- inputCols = Param(parent='undefined', name='inputCols', doc='The names of the input columns')
- intermediateSaveDir = Param(parent='undefined', name='intermediateSaveDir', doc='Blob storage location in HDFS where intermediate data is saved while training.')
- maxPollingRetries = Param(parent='undefined', name='maxPollingRetries', doc='number of times to poll')
- outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
- paddingValue = Param(parent='undefined', name='paddingValue', doc='optional field, is only useful if FillNAMethod is set to Fixed.')
- pollingDelay = Param(parent='undefined', name='pollingDelay', doc='number of milliseconds to wait between polling')
- setAlignMode(value)[source]
- Parameters
alignMode¶ – Optional. Indicates how different variables are aligned to the same time range, which the model requires. One of {Inner, Outer}.
- setEndTime(value)[source]
- Parameters
endTime¶ – Required. End time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- setFillNAMethod(value)[source]
- Parameters
fillNAMethod¶ – Optional. Indicates how missing values are filled. Cannot be set to NotFill when alignMode is Outer. One of {Previous, Subsequent, Linear, Zero, Fixed}.
- setInitialPollingDelay(value)[source]
- Parameters
initialPollingDelay¶ – number of milliseconds to wait before first poll for result
- setIntermediateSaveDir(value)[source]
- Parameters
intermediateSaveDir¶ – Blob storage location in HDFS where intermediate data is saved while training.
- setPaddingValue(value)[source]
- Parameters
paddingValue¶ – Optional. Only useful if fillNAMethod is set to Fixed.
- setParams(alignMode='Outer', backoffs=[100, 500, 1000], displayName=None, endTime=None, errorCol='SimpleFitMultivariateAnomaly_97bcc7de4a86_error', fillNAMethod='Linear', initialPollingDelay=300, inputCols=None, intermediateSaveDir=None, maxPollingRetries=1000, outputCol='SimpleFitMultivariateAnomaly_97bcc7de4a86_output', paddingValue=None, pollingDelay=300, slidingWindow=300, startTime=None, subscriptionKey=None, subscriptionKeyCol=None, suppressMaxRetriesException=False, timestampCol='timestamp', url=None)[source]
Set the (keyword only) parameters
- setPollingDelay(value)[source]
- Parameters
pollingDelay¶ – number of milliseconds to wait between polling
- setSlidingWindow(value)[source]
- Parameters
slidingWindow¶ – Optional. Indicates how many historical points are used to determine the anomaly score of one subsequent point.
- setStartTime(value)[source]
- Parameters
startTime¶ – Required. Start time of the data used for detection or for generating the multivariate anomaly detection model; should be a date-time.
- setSuppressMaxRetriesException(value)[source]
- Parameters
suppressMaxRetriesException¶ – set to true to suppress the maximum retries exception and report it in the error column
- slidingWindow = Param(parent='undefined', name='slidingWindow', doc='An optional field, indicates how many history points will be used to determine the anomaly score of one subsequent point.')
- startTime = Param(parent='undefined', name='startTime', doc='A required field, start time of data to be used for detection/generating multivariate anomaly detection model, should be date-time.')
- subscriptionKey = Param(parent='undefined', name='subscriptionKey', doc='ServiceParam: the API key to use')
- suppressMaxRetriesException = Param(parent='undefined', name='suppressMaxRetriesException', doc='set true to suppress the maxumimum retries exception and report in the error column')
- timestampCol = Param(parent='undefined', name='timestampCol', doc='Timestamp column name')
- url = Param(parent='undefined', name='url', doc='Url of the service')
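The training settings above interact: fillNAMethod cannot be NotFill when alignMode is Outer, and paddingValue only matters when fillNAMethod is Fixed. The following sketch encodes those two rules. `validate_fit_settings` is a hypothetical helper, and including NotFill in the set of accepted names is an assumption based on the text mentioning it as a value.

```python
ALIGN_MODES = {"Inner", "Outer"}
# Listed values plus NotFill, which the description mentions explicitly.
FILL_NA_METHODS = {"Previous", "Subsequent", "Linear", "Zero", "Fixed", "NotFill"}


def validate_fit_settings(alignMode="Outer", fillNAMethod="Linear",
                          paddingValue=None):
    """Hypothetical helper: raise on invalid settings, return advisory notes."""
    if alignMode not in ALIGN_MODES:
        raise ValueError("alignMode must be one of %s" % sorted(ALIGN_MODES))
    if fillNAMethod not in FILL_NA_METHODS:
        raise ValueError("fillNAMethod must be one of %s" % sorted(FILL_NA_METHODS))
    # Documented rule: NotFill is not allowed together with Outer alignment.
    if alignMode == "Outer" and fillNAMethod == "NotFill":
        raise ValueError("fillNAMethod cannot be NotFill when alignMode is Outer")

    notes = []
    # Documented rule: paddingValue is only useful with the Fixed method.
    if paddingValue is not None and fillNAMethod != "Fixed":
        notes.append("paddingValue is only useful when fillNAMethod is Fixed")
    return notes


default_notes = validate_fit_settings()  # the documented defaults
padding_notes = validate_fit_settings(fillNAMethod="Zero", paddingValue=0)
```

Returning notes instead of raising for the paddingValue case is a design choice in this sketch: the reference describes the value as not useful in that situation, not as an error.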
Module contents
SynapseML is an ecosystem of tools aimed at expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.
SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond latency web services, backed by your Spark cluster.
SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.
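The Python and Spark minimums stated above can be checked from a script before anything else is imported. This is a rough sketch: `major_minor` and `meets_requirements` are hypothetical helpers, version strings are assumed to start with numeric `major.minor` components, and the Scala version is not checked because it is not visible from pure Python.

```python
import sys


def major_minor(version):
    """Return (major, minor) from a version string such as '3.2.1'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def meets_requirements(python_version, spark_version):
    """Check the documented minimums: Python 3.6+ and Spark 3.0+."""
    return (major_minor(python_version) >= (3, 6)
            and major_minor(spark_version) >= (3, 0))


# Version of the interpreter running this script, e.g. "3.10".
current_python = "{}.{}".format(*sys.version_info[:2])
```

When PySpark is installed, the Spark version string is available as `pyspark.__version__` and could be passed as the second argument.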