synapse.ml.nn package

Submodules

synapse.ml.nn.ConditionalBallTree module

class synapse.ml.nn.ConditionalBallTree.ConditionalBallTree(keys, values, labels, leafSize, java_obj=None)[source]

Bases: object

findMaximumInnerProducts(queryPoint, conditioner, k)[source]

Find the best matches for queryPoint in this tree, restricted to the labels in conditioner.

Parameters
  • queryPoint – array vector to use to query for nearest neighbors

  • conditioner – set of labels that will subset or condition the nearest-neighbor query

  • k (int) – maximum number of neighbors to return

Returns

array of tuples representing the index of the match and its distance
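As a rough illustration of what a conditional maximum inner product query computes, here is a brute-force sketch in plain Python. It is not the ball tree implementation behind this class. The function name brute_force_conditional_mips and the sample data are hypothetical, and it assumes the second element of each returned tuple is the inner product score.

```python
from typing import Hashable, List, Sequence, Set, Tuple


def brute_force_conditional_mips(
    keys: Sequence[Sequence[float]],
    labels: Sequence[Hashable],
    query_point: Sequence[float],
    conditioner: Set[Hashable],
    k: int,
) -> List[Tuple[int, float]]:
    """Return up to k (index, inner product) pairs, best match first.

    Hypothetical reference implementation: it scans every key instead of
    pruning with a ball tree, so it is only suitable for small inputs.
    """
    scored = []
    for index, (key, label) in enumerate(zip(keys, labels)):
        # The conditioner restricts the search to keys with an allowed label.
        if label not in conditioner:
            continue
        score = sum(a * b for a, b in zip(key, query_point))
        scored.append((index, score))
    # Larger inner product means a better match.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Illustrative data: four 2-d keys, each tagged with one label.
keys = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 0.0]]
labels = ["a", "b", "a", "c"]

# Only keys labelled "a" or "b" are eligible, so index 3 is never returned.
result = brute_force_conditional_mips(keys, labels, [1.0, 1.0], {"a", "b"}, k=2)
print(result)
```

A tree-based search should return the same matches as this scan while visiting fewer keys, which makes a brute-force version useful as a correctness check on small data.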

static load(filename)[source]
save(filename)[source]

synapse.ml.nn.ConditionalKNN module

class synapse.ml.nn.ConditionalKNN.ConditionalKNN(java_obj=None, conditionerCol='conditioner', featuresCol='features', k=5, labelCol='labels', leafSize=50, outputCol='ConditionalKNN_2f37bc8ec7f9_output', valuesCol='values')[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaEstimator

Parameters
  • conditionerCol (object) – column holding identifiers for features that will be returned when queried

  • featuresCol (object) – The name of the features column

  • k (int) – number of matches to return

  • labelCol (object) – The name of the label column

  • leafSize (int) – max size of the leaves of the tree

  • outputCol (object) – The name of the output column

  • valuesCol (object) – column holding values for each feature (key) that will be returned when queried

conditionerCol = Param(parent='undefined', name='conditionerCol', doc='column holding identifiers for features that will be returned when queried')
featuresCol = Param(parent='undefined', name='featuresCol', doc='The name of the features column')
getConditionerCol()[source]
Returns

column holding identifiers for features that will be returned when queried

Return type

conditionerCol

getFeaturesCol()[source]
Returns

The name of the features column

Return type

featuresCol

static getJavaPackage()[source]

Returns package name String.

getK()[source]
Returns

number of matches to return

Return type

k

getLabelCol()[source]
Returns

The name of the label column

Return type

labelCol

getLeafSize()[source]
Returns

max size of the leaves of the tree

Return type

leafSize

getOutputCol()[source]
Returns

The name of the output column

Return type

outputCol

getValuesCol()[source]
Returns

column holding values for each feature (key) that will be returned when queried

Return type

valuesCol

k = Param(parent='undefined', name='k', doc='number of matches to return')
labelCol = Param(parent='undefined', name='labelCol', doc='The name of the label column')
leafSize = Param(parent='undefined', name='leafSize', doc='max size of the leaves of the tree')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setConditionerCol(value)[source]
Parameters

conditionerCol – column holding identifiers for features that will be returned when queried

setFeaturesCol(value)[source]
Parameters

featuresCol – The name of the features column

setK(value)[source]
Parameters

k – number of matches to return

setLabelCol(value)[source]
Parameters

labelCol – The name of the label column

setLeafSize(value)[source]
Parameters

leafSize – max size of the leaves of the tree

setOutputCol(value)[source]
Parameters

outputCol – The name of the output column

setParams(conditionerCol='conditioner', featuresCol='features', k=5, labelCol='labels', leafSize=50, outputCol='ConditionalKNN_2f37bc8ec7f9_output', valuesCol='values')[source]

Set the (keyword only) parameters

setValuesCol(value)[source]
Parameters

valuesCol – column holding values for each feature (key) that will be returned when queried

valuesCol = Param(parent='undefined', name='valuesCol', doc='column holding values for each feature (key) that will be returned when queried')

synapse.ml.nn.ConditionalKNNModel module

class synapse.ml.nn.ConditionalKNNModel.ConditionalKNNModel(java_obj=None, ballTree=None, conditionerCol=None, featuresCol=None, k=None, labelCol=None, leafSize=None, outputCol=None, valuesCol=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaModel

Parameters
  • ballTree (object) – the ballTree model used for performing queries

  • conditionerCol (object) – column holding identifiers for features that will be returned when queried

  • featuresCol (object) – The name of the features column

  • k (int) – number of matches to return

  • labelCol (object) – The name of the label column

  • leafSize (int) – max size of the leaves of the tree

  • outputCol (object) – The name of the output column

  • valuesCol (object) – column holding values for each feature (key) that will be returned when queried

ballTree = Param(parent='undefined', name='ballTree', doc='the ballTree model used for performing queries')
conditionerCol = Param(parent='undefined', name='conditionerCol', doc='column holding identifiers for features that will be returned when queried')
featuresCol = Param(parent='undefined', name='featuresCol', doc='The name of the features column')
getBallTree()[source]
Returns

the ballTree model used for performing queries

Return type

ballTree

getConditionerCol()[source]
Returns

column holding identifiers for features that will be returned when queried

Return type

conditionerCol

getFeaturesCol()[source]
Returns

The name of the features column

Return type

featuresCol

static getJavaPackage()[source]

Returns package name String.

getK()[source]
Returns

number of matches to return

Return type

k

getLabelCol()[source]
Returns

The name of the label column

Return type

labelCol

getLeafSize()[source]
Returns

max size of the leaves of the tree

Return type

leafSize

getOutputCol()[source]
Returns

The name of the output column

Return type

outputCol

getValuesCol()[source]
Returns

column holding values for each feature (key) that will be returned when queried

Return type

valuesCol

k = Param(parent='undefined', name='k', doc='number of matches to return')
labelCol = Param(parent='undefined', name='labelCol', doc='The name of the label column')
leafSize = Param(parent='undefined', name='leafSize', doc='max size of the leaves of the tree')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setBallTree(value)[source]
Parameters

ballTree – the ballTree model used for performing queries

setConditionerCol(value)[source]
Parameters

conditionerCol – column holding identifiers for features that will be returned when queried

setFeaturesCol(value)[source]
Parameters

featuresCol – The name of the features column

setK(value)[source]
Parameters

k – number of matches to return

setLabelCol(value)[source]
Parameters

labelCol – The name of the label column

setLeafSize(value)[source]
Parameters

leafSize – max size of the leaves of the tree

setOutputCol(value)[source]
Parameters

outputCol – The name of the output column

setParams(ballTree=None, conditionerCol=None, featuresCol=None, k=None, labelCol=None, leafSize=None, outputCol=None, valuesCol=None)[source]

Set the (keyword only) parameters

setValuesCol(value)[source]
Parameters

valuesCol – column holding values for each feature (key) that will be returned when queried

valuesCol = Param(parent='undefined', name='valuesCol', doc='column holding values for each feature (key) that will be returned when queried')

synapse.ml.nn.KNN module

class synapse.ml.nn.KNN.KNN(java_obj=None, featuresCol='features', k=5, leafSize=50, outputCol='KNN_12971ac2f3db_output', valuesCol='values')[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaEstimator

Parameters
  • featuresCol (object) – The name of the features column

  • k (int) – number of matches to return

  • leafSize (int) – max size of the leaves of the tree

  • outputCol (object) – The name of the output column

  • valuesCol (object) – column holding values for each feature (key) that will be returned when queried
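To give a feel for the leafSize parameter, the following is a generic ball tree construction sketch in plain Python. It is not SynapseML's implementation. The names BallNode, build_ball_tree and leaf_sizes are hypothetical, and the median split on the widest dimension is just one common choice. The point is that leaf_size caps how many points a leaf may hold: smaller values give deeper trees with smaller leaves.

```python
import math
from typing import List, Optional, Sequence


class BallNode:
    """One ball: a center, a covering radius, and either points or children."""

    def __init__(self, center: List[float], radius: float,
                 indices: Optional[List[int]] = None,
                 left: "Optional[BallNode]" = None,
                 right: "Optional[BallNode]" = None) -> None:
        self.center = center
        self.radius = radius
        self.indices = indices or []  # point indices, filled only for leaves
        self.left = left
        self.right = right


def _distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ball_tree(points: Sequence[Sequence[float]],
                    indices: List[int], leaf_size: int) -> BallNode:
    """Recursively split a non-empty index list until leaves fit leaf_size."""
    if leaf_size < 1:
        raise ValueError("leaf_size must be at least 1")
    dims = range(len(points[0]))
    center = [sum(points[i][d] for i in indices) / len(indices) for d in dims]
    radius = max(_distance(center, points[i]) for i in indices)
    if len(indices) <= leaf_size:
        return BallNode(center, radius, indices=list(indices))
    # Split at the median of the dimension with the widest spread.
    split_dim = max(dims, key=lambda d: max(points[i][d] for i in indices)
                    - min(points[i][d] for i in indices))
    ordered = sorted(indices, key=lambda i: points[i][split_dim])
    mid = len(ordered) // 2
    return BallNode(center, radius,
                    left=build_ball_tree(points, ordered[:mid], leaf_size),
                    right=build_ball_tree(points, ordered[mid:], leaf_size))


def leaf_sizes(node: BallNode) -> List[int]:
    """Collect the number of points stored in each leaf, left to right."""
    if node.left is None or node.right is None:
        return [len(node.indices)]
    return leaf_sizes(node.left) + leaf_sizes(node.right)


# Illustrative data: ten 2-d points, at most three points per leaf.
points = [(float(i), float(i % 3)) for i in range(10)]
tree = build_ball_tree(points, list(range(10)), leaf_size=3)
sizes = leaf_sizes(tree)
print(sizes)
```

Larger leaves mean fewer nodes to traverse but more points to scan linearly once a leaf is reached, so the best value generally depends on the data and is worth measuring.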

featuresCol = Param(parent='undefined', name='featuresCol', doc='The name of the features column')
getFeaturesCol()[source]
Returns

The name of the features column

Return type

featuresCol

static getJavaPackage()[source]

Returns package name String.

getK()[source]
Returns

number of matches to return

Return type

k

getLeafSize()[source]
Returns

max size of the leaves of the tree

Return type

leafSize

getOutputCol()[source]
Returns

The name of the output column

Return type

outputCol

getValuesCol()[source]
Returns

column holding values for each feature (key) that will be returned when queried

Return type

valuesCol

k = Param(parent='undefined', name='k', doc='number of matches to return')
leafSize = Param(parent='undefined', name='leafSize', doc='max size of the leaves of the tree')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setFeaturesCol(value)[source]
Parameters

featuresCol – The name of the features column

setK(value)[source]
Parameters

k – number of matches to return

setLeafSize(value)[source]
Parameters

leafSize – max size of the leaves of the tree

setOutputCol(value)[source]
Parameters

outputCol – The name of the output column

setParams(featuresCol='features', k=5, leafSize=50, outputCol='KNN_12971ac2f3db_output', valuesCol='values')[source]

Set the (keyword only) parameters

setValuesCol(value)[source]
Parameters

valuesCol – column holding values for each feature (key) that will be returned when queried

valuesCol = Param(parent='undefined', name='valuesCol', doc='column holding values for each feature (key) that will be returned when queried')

synapse.ml.nn.KNNModel module

class synapse.ml.nn.KNNModel.KNNModel(java_obj=None, ballTree=None, featuresCol=None, k=None, leafSize=None, outputCol=None, valuesCol=None)[source]

Bases: synapse.ml.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaModel

Parameters
  • ballTree (object) – the ballTree model used for performing queries

  • featuresCol (object) – The name of the features column

  • k (int) – number of matches to return

  • leafSize (int) – max size of the leaves of the tree

  • outputCol (object) – The name of the output column

  • valuesCol (object) – column holding values for each feature (key) that will be returned when queried

ballTree = Param(parent='undefined', name='ballTree', doc='the ballTree model used for performing queries')
featuresCol = Param(parent='undefined', name='featuresCol', doc='The name of the features column')
getBallTree()[source]
Returns

the ballTree model used for performing queries

Return type

ballTree

getFeaturesCol()[source]
Returns

The name of the features column

Return type

featuresCol

static getJavaPackage()[source]

Returns package name String.

getK()[source]
Returns

number of matches to return

Return type

k

getLeafSize()[source]
Returns

max size of the leaves of the tree

Return type

leafSize

getOutputCol()[source]
Returns

The name of the output column

Return type

outputCol

getValuesCol()[source]
Returns

column holding values for each feature (key) that will be returned when queried

Return type

valuesCol

k = Param(parent='undefined', name='k', doc='number of matches to return')
leafSize = Param(parent='undefined', name='leafSize', doc='max size of the leaves of the tree')
outputCol = Param(parent='undefined', name='outputCol', doc='The name of the output column')
classmethod read()[source]

Returns an MLReader instance for this class.

setBallTree(value)[source]
Parameters

ballTree – the ballTree model used for performing queries

setFeaturesCol(value)[source]
Parameters

featuresCol – The name of the features column

setK(value)[source]
Parameters

k – number of matches to return

setLeafSize(value)[source]
Parameters

leafSize – max size of the leaves of the tree

setOutputCol(value)[source]
Parameters

outputCol – The name of the output column

setParams(ballTree=None, featuresCol=None, k=None, leafSize=None, outputCol=None, valuesCol=None)[source]

Set the (keyword only) parameters

setValuesCol(value)[source]
Parameters

valuesCol – column holding values for each feature (key) that will be returned when queried

valuesCol = Param(parent='undefined', name='valuesCol', doc='column holding values for each feature (key) that will be returned when queried')

Module contents

SynapseML is an ecosystem of tools that extends the distributed computing framework Apache Spark in several new directions. SynapseML adds many deep learning and data science tools to the Spark ecosystem, including seamless integration of Spark Machine Learning pipelines with Microsoft Cognitive Toolkit (CNTK), LightGBM and OpenCV. These tools enable powerful and highly scalable predictive and analytical models for a variety of data sources.

SynapseML also brings new networking capabilities to the Spark ecosystem. With the HTTP on Spark project, users can embed any web service into their SparkML models. In this vein, SynapseML provides easy-to-use SparkML transformers for a wide variety of Microsoft Cognitive Services. For production-grade deployment, the Spark Serving project enables high-throughput, sub-millisecond-latency web services backed by your Spark cluster.

SynapseML requires Scala 2.12, Spark 3.0+, and Python 3.6+.
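The Python part of this requirement can be checked before any Spark session is created. This is a minimal sketch: MIN_PYTHON and python_is_supported are illustrative names, not part of SynapseML, and the Scala and Spark versions would need to be verified separately against the cluster.

```python
import sys
from typing import Sequence

# Minimum interpreter version, taken from the requirement stated above.
MIN_PYTHON = (3, 6)


def python_is_supported(version_info: Sequence[int] = sys.version_info) -> bool:
    """Return True if the (major, minor) version meets MIN_PYTHON."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    raise RuntimeError(
        "Python %d.%d or newer is required, found %d.%d"
        % (MIN_PYTHON + tuple(sys.version_info[:2]))
    )
```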