synapse.ml.automl package
Submodules
synapse.ml.automl.BestModel module
- class synapse.ml.automl.BestModel.BestModel(java_obj=None, allModelMetrics=None, bestModel=None, bestModelMetrics=None, rocCurve=None, scoredDataset=None)[source]
Bases: synapse.ml.automl._BestModel._BestModel
synapse.ml.automl.FindBestModel module
- class synapse.ml.automl.FindBestModel.FindBestModel(java_obj=None, evaluationMetric='accuracy', models=None)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaEstimator
- Parameters
- evaluationMetric = Param(parent='undefined', name='evaluationMetric', doc='Metric to evaluate models with')
- models = Param(parent='undefined', name='models', doc='List of models to be evaluated')
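A minimal usage sketch of FindBestModel follows. It assumes a SparkSession, SynapseML installed on the cluster, and TrainClassifier from synapse.ml.train (not part of this module); the estimators, metric, and toy data are purely illustrative, not the canonical recipe.

```python
# Illustrative sketch only: keep the best of several already-trained models.
# TrainClassifier (synapse.ml.train) and the toy DataFrame are assumptions.
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression, RandomForestClassifier
from synapse.ml.automl import FindBestModel
from synapse.ml.train import TrainClassifier

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [(0.0, 1.0, 0), (1.0, 0.0, 1), (0.5, 0.5, 0), (0.9, 0.1, 1)] * 25,
    ["feat1", "feat2", "label"],
)
train, test = df.randomSplit([0.75, 0.25], seed=42)

# Train a couple of candidate models up front; FindBestModel only evaluates them.
trained_models = [
    TrainClassifier(model=LogisticRegression(), labelCol="label").fit(train),
    TrainClassifier(model=RandomForestClassifier(), labelCol="label").fit(train),
]

# Evaluate every trained model on `test` and keep the winner by accuracy.
best = FindBestModel(evaluationMetric="accuracy", models=trained_models).fit(test)
scored = best.transform(test)  # fit returns a BestModel, which scores like any Transformer
```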
synapse.ml.automl.HyperparamBuilder module
- class synapse.ml.automl.HyperparamBuilder.DiscreteHyperParam(values, seed=0)[source]
Bases: object
Specifies a discrete list of values.
- class synapse.ml.automl.HyperparamBuilder.GridSpace(paramValues)[source]
Bases: object
Specifies a predetermined grid of values to search through.
- class synapse.ml.automl.HyperparamBuilder.HyperparamBuilder[source]
Bases: object
Specifies the search space for hyperparameters.
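Taken together, these classes describe a search space that can be handed to TuneHyperparameters below. The sketch assumes an addHyperparam()/build() surface on HyperparamBuilder and that GridSpace accepts the built space; neither is shown in this reference, and the estimators and values are illustrative only.

```python
# Illustrative sketch only: describe a discrete search space over two estimators.
# addHyperparam()/build() on HyperparamBuilder and GridSpace(searchSpace) are
# assumptions; consult the SynapseML hyperparameter-tuning examples for the exact API.
from pyspark.ml.classification import LogisticRegression, RandomForestClassifier
from synapse.ml.automl import DiscreteHyperParam, GridSpace, HyperparamBuilder

logReg = LogisticRegression()
randForest = RandomForestClassifier()

builder = (
    HyperparamBuilder()
    .addHyperparam(logReg, logReg.regParam, DiscreteHyperParam([0.05, 0.1, 0.3]))
    .addHyperparam(randForest, randForest.numTrees, DiscreteHyperParam([5, 10]))
)
searchSpace = builder.build()       # (param, hyperparam) pairs describing the space
gridSpace = GridSpace(searchSpace)  # predetermined grid over those discrete values
```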
synapse.ml.automl.TuneHyperparameters module
- class synapse.ml.automl.TuneHyperparameters.TuneHyperparameters(java_obj=None, evaluationMetric=None, models=None, numFolds=None, numRuns=None, parallelism=None, paramSpace=None, seed=0)[source]
Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaEstimator
- Parameters
evaluationMetric (str) – Metric to evaluate models with
models (object) – Estimators to run
numFolds (int) – Number of folds
numRuns (int) – Termination criteria for randomized search
parallelism (int) – The number of models to run in parallel
paramSpace (object) – Parameter space for generating hyperparameters
seed (long) – Random number generator seed
- evaluationMetric = Param(parent='undefined', name='evaluationMetric', doc='Metric to evaluate models with')
- getParamSpace()[source]
- Returns
Parameter space for generating hyperparameters
- Return type
paramSpace
- models = Param(parent='undefined', name='models', doc='Estimators to run')
- numFolds = Param(parent='undefined', name='numFolds', doc='Number of folds')
- numRuns = Param(parent='undefined', name='numRuns', doc='Termination criteria for randomized search')
- parallelism = Param(parent='undefined', name='parallelism', doc='The number of models to run in parallel')
- paramSpace = Param(parent='undefined', name='paramSpace', doc='Parameter space for generating hyperparameters')
- seed = Param(parent='undefined', name='seed', doc='Random number generator seed')
- setParamSpace(value)[source]
- Parameters
paramSpace – Parameter space for generating hyperparameters
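The sketch below ties the pieces together with TuneHyperparameters. TrainClassifier (synapse.ml.train), the HyperparamBuilder/GridSpace helpers, and the space() accessor are assumptions not documented in this reference; the metric, fold count, run count, and toy data are illustrative values only.

```python
# Illustrative sketch only: cross-validated search over a discrete hyperparameter space.
# addHyperparam()/build(), GridSpace.space(), and TrainClassifier are assumptions here.
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression, RandomForestClassifier
from synapse.ml.automl import DiscreteHyperParam, GridSpace, HyperparamBuilder, TuneHyperparameters
from synapse.ml.train import TrainClassifier

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [(0.0, 1.0, 0), (1.0, 0.0, 1), (0.5, 0.5, 0), (0.9, 0.1, 1)] * 25,
    ["feat1", "feat2", "label"],
)

logReg, randForest = LogisticRegression(), RandomForestClassifier()
searchSpace = (
    HyperparamBuilder()
    .addHyperparam(logReg, logReg.regParam, DiscreteHyperParam([0.05, 0.1]))
    .addHyperparam(randForest, randForest.numTrees, DiscreteHyperParam([5, 10]))
    .build()
)

# The estimators to tune; TuneHyperparameters fits them itself during the search.
models = [TrainClassifier(model=m, labelCol="label") for m in (logReg, randForest)]

tuner = TuneHyperparameters(
    evaluationMetric="accuracy",
    models=models,
    numFolds=3,
    numRuns=len(models) * 2,
    parallelism=1,
    paramSpace=GridSpace(searchSpace).space(),  # assumed accessor exposing the space
    seed=0,
)
best = tuner.fit(df)         # returns a TuneHyperparametersModel (see module below)
scored = best.transform(df)
```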
synapse.ml.automl.TuneHyperparametersModel module
Module contents
SynapseML is an ecosystem of tools aimed at expanding the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with the Microsoft Cognitive Toolkit (CNTK), LightGBM, and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.
SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services backed by your Spark cluster.
SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.