mmlspark.lime package

Submodules

mmlspark.lime.ImageLIME module

class mmlspark.lime.ImageLIME.ImageLIME(cellSize=16.0, inputCol=None, model=None, modifier=130.0, nSamples=900, outputCol=None, predictionCol='prediction', regularization=0.0, samplingFraction=0.3, superpixelCol='superpixels')[source]

Bases: mmlspark.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters
  • cellSize (double) – Number that controls the size of the superpixels (default: 16.0)

  • inputCol (str) – The name of the input column

  • model (object) – Model to try to locally approximate

  • modifier (double) – Controls the trade-off between spatial and color distance (default: 130.0)

  • nSamples (int) – The number of samples to generate (default: 900)

  • outputCol (str) – The name of the output column

  • predictionCol (str) – prediction column name (default: prediction)

  • regularization (double) – regularization param for the lasso (default: 0.0)

  • samplingFraction (double) – The fraction of superpixels to keep on (default: 0.3)

  • superpixelCol (str) – The column holding the superpixel decompositions (default: superpixels)
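
As a minimal sketch of how the keyword parameters listed above might be assembled, the helper below merges overrides into the documented defaults and rejects unknown names. The helper names `image_lime_kwargs` and `build_image_lime` are hypothetical, as are the column names in the usage note. The import is done lazily because constructing the class presumably needs mmlspark installed and an active Spark session.

```python
# Defaults copied from the ImageLIME signature documented above.
IMAGE_LIME_DEFAULTS = {
    "cellSize": 16.0,
    "inputCol": None,
    "model": None,
    "modifier": 130.0,
    "nSamples": 900,
    "outputCol": None,
    "predictionCol": "prediction",
    "regularization": 0.0,
    "samplingFraction": 0.3,
    "superpixelCol": "superpixels",
}


def image_lime_kwargs(**overrides):
    """Merge overrides into the documented defaults (hypothetical helper)."""
    unknown = sorted(set(overrides) - set(IMAGE_LIME_DEFAULTS))
    if unknown:
        raise TypeError(f"unknown ImageLIME parameter(s): {unknown}")
    merged = dict(IMAGE_LIME_DEFAULTS)
    merged.update(overrides)
    return merged


def build_image_lime(**overrides):
    """Construct an ImageLIME stage (hypothetical helper)."""
    # Imported lazily: assumes mmlspark is installed and Spark is running.
    from mmlspark.lime.ImageLIME import ImageLIME
    return ImageLIME(**image_lime_kwargs(**overrides))
```

Since ImageLIME derives from JavaTransformer, a call such as `build_image_lime(inputCol="image", outputCol="weights", model=fitted_model).transform(df)` would be the expected usage, where `fitted_model`, `df`, and both column names are placeholders.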

getCellSize()[source]
Returns

Number that controls the size of the superpixels (default: 16.0)

Return type

double

getInputCol()[source]
Returns

The name of the input column

Return type

str

static getJavaPackage()[source]

Returns package name String.

getModel()[source]
Returns

Model to try to locally approximate

Return type

object

getModifier()[source]
Returns

Controls the trade-off between spatial and color distance (default: 130.0)

Return type

double

getNSamples()[source]
Returns

The number of samples to generate (default: 900)

Return type

int

getOutputCol()[source]
Returns

The name of the output column

Return type

str

getPredictionCol()[source]
Returns

prediction column name (default: prediction)

Return type

str

getRegularization()[source]
Returns

regularization param for the lasso (default: 0.0)

Return type

double

getSamplingFraction()[source]
Returns

The fraction of superpixels to keep on (default: 0.3)

Return type

double

getSuperpixelCol()[source]
Returns

The column holding the superpixel decompositions (default: superpixels)

Return type

str

classmethod read()[source]

Returns an MLReader instance for this class.

setCellSize(value)[source]
Parameters

cellSize (double) – Number that controls the size of the superpixels (default: 16.0)

setInputCol(value)[source]
Parameters

inputCol (str) – The name of the input column

setModel(value)[source]
Parameters

model (object) – Model to try to locally approximate

setModifier(value)[source]
Parameters

modifier (double) – Controls the trade-off between spatial and color distance (default: 130.0)

setNSamples(value)[source]
Parameters

nSamples (int) – The number of samples to generate (default: 900)

setOutputCol(value)[source]
Parameters

outputCol (str) – The name of the output column

setParams(cellSize=16.0, inputCol=None, model=None, modifier=130.0, nSamples=900, outputCol=None, predictionCol='prediction', regularization=0.0, samplingFraction=0.3, superpixelCol='superpixels')[source]

Set the (keyword only) parameters

Parameters
  • cellSize (double) – Number that controls the size of the superpixels (default: 16.0)

  • inputCol (str) – The name of the input column

  • model (object) – Model to try to locally approximate

  • modifier (double) – Controls the trade-off between spatial and color distance (default: 130.0)

  • nSamples (int) – The number of samples to generate (default: 900)

  • outputCol (str) – The name of the output column

  • predictionCol (str) – prediction column name (default: prediction)

  • regularization (double) – regularization param for the lasso (default: 0.0)

  • samplingFraction (double) – The fraction of superpixels to keep on (default: 0.3)

  • superpixelCol (str) – The column holding the superpixel decompositions (default: superpixels)

setPredictionCol(value)[source]
Parameters

predictionCol (str) – prediction column name (default: prediction)

setRegularization(value)[source]
Parameters

regularization (double) – regularization param for the lasso (default: 0.0)

setSamplingFraction(value)[source]
Parameters

samplingFraction (double) – The fraction of superpixels to keep on (default: 0.3)

setSuperpixelCol(value)[source]
Parameters

superpixelCol (str) – The column holding the superpixel decompositions (default: superpixels)
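
To make the roles of nSamples and samplingFraction more concrete, here is a conceptual sketch in plain Python. It assumes each superpixel is kept independently with probability samplingFraction, which is one common way to perturb images for LIME; the library's actual sampling scheme is not described above and may differ. The function name and the `seed` argument are hypothetical.

```python
import random


def sample_superpixel_masks(n_superpixels, n_samples=900,
                            sampling_fraction=0.3, seed=0):
    """Return n_samples on/off masks, one boolean per superpixel.

    Assumption: each superpixel stays "on" independently with
    probability sampling_fraction.
    """
    rng = random.Random(seed)
    return [
        [rng.random() < sampling_fraction for _ in range(n_superpixels)]
        for _ in range(n_samples)
    ]


masks = sample_superpixel_masks(n_superpixels=50)
# Average share of superpixels left on across all perturbed samples.
kept_share = sum(sum(mask) for mask in masks) / (len(masks) * 50)
```

Each mask would correspond to one perturbed copy of the image that the model is asked to score; the surrogate model is then fitted on the masks and the resulting predictions.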

mmlspark.lime.SuperpixelTransformer module

class mmlspark.lime.SuperpixelTransformer.SuperpixelTransformer(cellSize=16.0, inputCol=None, modifier=130.0, outputCol=None)[source]

Bases: mmlspark.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters
  • cellSize (double) – Number that controls the size of the superpixels (default: 16.0)

  • inputCol (str) – The name of the input column

  • modifier (double) – Controls the trade-off between spatial and color distance (default: 130.0)

  • outputCol (str) – The name of the output column (default: [self.uid]_output)
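
The interplay of cellSize and modifier can be illustrated with the distance measure from the classic SLIC superpixel algorithm. This is only an assumption for illustration: the text above does not state which formula SuperpixelTransformer uses, so the sketch shows the general idea of a spatial/color trade-off rather than the library's implementation. The function name and the pixel representation are hypothetical.

```python
import math


def slic_style_distance(pixel, center, cell_size=16.0, modifier=130.0):
    """Combined color/spatial distance in the style of SLIC.

    pixel and center are (x, y, (c1, c2, c3)) tuples. Under this
    formulation a larger modifier weights spatial distance more heavily,
    and the spatial term is normalised by the cell size.
    """
    (px, py, p_color), (cx, cy, c_color) = pixel, center
    d_color = math.dist(p_color, c_color)
    d_space = math.dist((px, py), (cx, cy))
    return math.sqrt(d_color ** 2 + (d_space / cell_size) ** 2 * modifier ** 2)


center = (0, 0, (0.0, 0.0, 0.0))
color_only = slic_style_distance((0, 0, (10.0, 0.0, 0.0)), center)
space_only = slic_style_distance((16, 0, (0.0, 0.0, 0.0)), center)
```

In this formulation, a pixel one full cell away from a cluster center is penalised by exactly the modifier value, which is what makes the parameter a weight between the two kinds of distance.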

getCellSize()[source]
Returns

Number that controls the size of the superpixels (default: 16.0)

Return type

double

getInputCol()[source]
Returns

The name of the input column

Return type

str

static getJavaPackage()[source]

Returns package name String.

getModifier()[source]
Returns

Controls the trade-off between spatial and color distance (default: 130.0)

Return type

double

getOutputCol()[source]
Returns

The name of the output column (default: [self.uid]_output)

Return type

str

classmethod read()[source]

Returns an MLReader instance for this class.

setCellSize(value)[source]
Parameters

cellSize (double) – Number that controls the size of the superpixels (default: 16.0)

setInputCol(value)[source]
Parameters

inputCol (str) – The name of the input column

setModifier(value)[source]
Parameters

modifier (double) – Controls the trade-off between spatial and color distance (default: 130.0)

setOutputCol(value)[source]
Parameters

outputCol (str) – The name of the output column (default: [self.uid]_output)

setParams(cellSize=16.0, inputCol=None, modifier=130.0, outputCol=None)[source]

Set the (keyword only) parameters

Parameters
  • cellSize (double) – Number that controls the size of the superpixels (default: 16.0)

  • inputCol (str) – The name of the input column

  • modifier (double) – Controls the trade-off between spatial and color distance (default: 130.0)

  • outputCol (str) – The name of the output column (default: [self.uid]_output)

mmlspark.lime.TabularLIME module

class mmlspark.lime.TabularLIME.TabularLIME(inputCol=None, model=None, nSamples=1000, outputCol=None, predictionCol='prediction', regularization=0.0, samplingFraction=0.3)[source]

Bases: mmlspark.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaEstimator

Parameters
  • inputCol (str) – The name of the input column

  • model (object) – Model to try to locally approximate

  • nSamples (int) – The number of samples to generate (default: 1000)

  • outputCol (str) – The name of the output column

  • predictionCol (str) – prediction column name (default: prediction)

  • regularization (double) – regularization param for the lasso (default: 0.0)

  • samplingFraction (double) – The fraction of superpixels to keep on (default: 0.3)

getInputCol()[source]
Returns

The name of the input column

Return type

str

static getJavaPackage()[source]

Returns package name String.

getModel()[source]
Returns

Model to try to locally approximate

Return type

object

getNSamples()[source]
Returns

The number of samples to generate (default: 1000)

Return type

int

getOutputCol()[source]
Returns

The name of the output column

Return type

str

getPredictionCol()[source]
Returns

prediction column name (default: prediction)

Return type

str

getRegularization()[source]
Returns

regularization param for the lasso (default: 0.0)

Return type

double

getSamplingFraction()[source]
Returns

The fraction of superpixels to keep on (default: 0.3)

Return type

double

classmethod read()[source]

Returns an MLReader instance for this class.

setInputCol(value)[source]
Parameters

inputCol (str) – The name of the input column

setModel(value)[source]
Parameters

model (object) – Model to try to locally approximate

setNSamples(value)[source]
Parameters

nSamples (int) – The number of samples to generate (default: 1000)

setOutputCol(value)[source]
Parameters

outputCol (str) – The name of the output column

setParams(inputCol=None, model=None, nSamples=1000, outputCol=None, predictionCol='prediction', regularization=0.0, samplingFraction=0.3)[source]

Set the (keyword only) parameters

Parameters
  • inputCol (str) – The name of the input column

  • model (object) – Model to try to locally approximate

  • nSamples (int) – The number of samples to generate (default: 1000)

  • outputCol (str) – The name of the output column

  • predictionCol (str) – prediction column name (default: prediction)

  • regularization (double) – regularization param for the lasso (default: 0.0)

  • samplingFraction (double) – The fraction of superpixels to keep on (default: 0.3)

setPredictionCol(value)[source]
Parameters

predictionCol (str) – prediction column name (default: prediction)

setRegularization(value)[source]
Parameters

regularization (double) – regularization param for the lasso (default: 0.0)

setSamplingFraction(value)[source]
Parameters

samplingFraction (double) – The fraction of superpixels to keep on (default: 0.3)
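
The regularization parameter is described as applying to "the lasso". As a rough illustration of what that penalty does, the sketch below fits an L1-penalised linear model by coordinate descent in plain Python. It is not the library's solver: the objective scaling, the absence of an intercept, and the function names are all assumptions made for the example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, regularization=0.0, n_iters=200):
    """Minimise (1/2n) * ||y - Xw||^2 + regularization * ||w||_1.

    Simplified sketch: no intercept term, fixed number of sweeps.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, regularization) / z if z > 0 else 0.0
    return w


# Rows are on/off feature masks; y plays the role of model predictions.
X = [[1, 0], [0, 1], [1, 1], [0, 0]]
y = [2.0, 0.1, 2.1, 0.0]
dense = lasso_coordinate_descent(X, y, regularization=0.0)
sparse = lasso_coordinate_descent(X, y, regularization=0.2)
```

With the default of 0.0 the fit reduces to ordinary least squares and both features receive a weight; with a positive penalty the weakly relevant second feature is driven to exactly zero, which is the usual reason for using a lasso in an explanation model.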

class mmlspark.lime.TabularLIME.TabularLIMEModel(java_model=None)[source]

Bases: mmlspark.core.schema.Utils.ComplexParamsMixin, pyspark.ml.wrapper.JavaModel, pyspark.ml.util.JavaMLWritable, pyspark.ml.util.JavaMLReadable

Model fitted by TabularLIME.

This class is left empty on purpose. All necessary methods are exposed through inheritance.
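
Because TabularLIME derives from JavaEstimator while the fitted TabularLIMEModel derives from JavaModel, the expected workflow is fit followed by transform. The sketch below shows that pattern under the assumption that mmlspark is installed and a Spark session is active; `tabular_lime_params`, `explain_tabular`, and their arguments are hypothetical names introduced for illustration.

```python
# Defaults copied from the TabularLIME signature documented above.
TABULAR_LIME_DEFAULTS = {
    "nSamples": 1000,
    "predictionCol": "prediction",
    "regularization": 0.0,
    "samplingFraction": 0.3,
}


def tabular_lime_params(model, input_col, output_col, **overrides):
    """Build the keyword arguments for TabularLIME (hypothetical helper)."""
    params = dict(TABULAR_LIME_DEFAULTS)
    params.update(overrides)
    params.update(model=model, inputCol=input_col, outputCol=output_col)
    return params


def explain_tabular(df, model, input_col, output_col, **overrides):
    """Fit TabularLIME on df, then apply the fitted model to df."""
    # Imported lazily: assumes mmlspark is installed and Spark is running.
    from mmlspark.lime.TabularLIME import TabularLIME
    estimator = TabularLIME(
        **tabular_lime_params(model, input_col, output_col, **overrides)
    )
    lime_model = estimator.fit(df)   # yields a TabularLIMEModel
    return lime_model.transform(df)  # adds the output column
```

The same DataFrame is used for fitting and transforming here only to keep the sketch short; fitting on one dataset and explaining rows from another would follow the same two calls.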

static getJavaPackage()[source]

Returns package name String.

classmethod read()[source]

Returns an MLReader instance for this class.

mmlspark.lime.TabularLIMEModel module

class mmlspark.lime.TabularLIMEModel.TabularLIMEModel(columnMeans=None, columnSTDs=None, inputCol=None, model=None, nSamples=None, outputCol=None, predictionCol='prediction', regularization=None, samplingFraction=None)[source]

Bases: mmlspark.core.schema.Utils.ComplexParamsMixin, pyspark.ml.util.JavaMLReadable, pyspark.ml.util.JavaMLWritable, pyspark.ml.wrapper.JavaTransformer

Parameters
  • columnMeans (list) – the means of each of the columns for perturbation

  • columnSTDs (list) – the standard deviations of each of the columns for perturbation

  • inputCol (str) – The name of the input column

  • model (object) – Model to try to locally approximate

  • nSamples (int) – The number of samples to generate

  • outputCol (str) – The name of the output column

  • predictionCol (str) – prediction column name (default: prediction)

  • regularization (double) – regularization param for the lasso

  • samplingFraction (double) – The fraction of superpixels to keep on
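
The columnMeans and columnSTDs parameters are described as being used "for perturbation". One plausible reading, mirroring the reference LIME approach for tabular data, is that perturbed rows are drawn from a normal distribution per column. The sketch below illustrates that reading in plain Python; it is an assumption, not the library's confirmed behaviour, and the function names, the use of the population standard deviation, and the `seed` argument are all hypothetical.

```python
import random
import statistics


def column_stats(rows):
    """Return per-column means and (population) standard deviations."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return means, stds


def perturb(means, stds, n_samples, seed=0):
    """Draw n_samples rows, each column sampled from N(mean, std)."""
    rng = random.Random(seed)
    return [
        [rng.gauss(mean, std) for mean, std in zip(means, stds)]
        for _ in range(n_samples)
    ]


rows = [[1.0, 10.0], [3.0, 10.0]]
means, stds = column_stats(rows)
samples = perturb(means, stds, n_samples=5)
```

A constant column has a standard deviation of zero, so it is left unchanged in every perturbed row, while columns with more spread are varied more widely.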

getColumnMeans()[source]
Returns

the means of each of the columns for perturbation

Return type

list

getColumnSTDs()[source]
Returns

the standard deviations of each of the columns for perturbation

Return type

list

getInputCol()[source]
Returns

The name of the input column

Return type

str

static getJavaPackage()[source]

Returns package name String.

getModel()[source]
Returns

Model to try to locally approximate

Return type

object

getNSamples()[source]
Returns

The number of samples to generate

Return type

int

getOutputCol()[source]
Returns

The name of the output column

Return type

str

getPredictionCol()[source]
Returns

prediction column name (default: prediction)

Return type

str

getRegularization()[source]
Returns

regularization param for the lasso

Return type

double

getSamplingFraction()[source]
Returns

The fraction of superpixels to keep on

Return type

double

classmethod read()[source]

Returns an MLReader instance for this class.

setColumnMeans(value)[source]
Parameters

columnMeans (list) – the means of each of the columns for perturbation

setColumnSTDs(value)[source]
Parameters

columnSTDs (list) – the standard deviations of each of the columns for perturbation

setInputCol(value)[source]
Parameters

inputCol (str) – The name of the input column

setModel(value)[source]
Parameters

model (object) – Model to try to locally approximate

setNSamples(value)[source]
Parameters

nSamples (int) – The number of samples to generate

setOutputCol(value)[source]
Parameters

outputCol (str) – The name of the output column

setParams(columnMeans=None, columnSTDs=None, inputCol=None, model=None, nSamples=None, outputCol=None, predictionCol='prediction', regularization=None, samplingFraction=None)[source]

Set the (keyword only) parameters

Parameters
  • columnMeans (list) – the means of each of the columns for perturbation

  • columnSTDs (list) – the standard deviations of each of the columns for perturbation

  • inputCol (str) – The name of the input column

  • model (object) – Model to try to locally approximate

  • nSamples (int) – The number of samples to generate

  • outputCol (str) – The name of the output column

  • predictionCol (str) – prediction column name (default: prediction)

  • regularization (double) – regularization param for the lasso

  • samplingFraction (double) – The fraction of superpixels to keep on

setPredictionCol(value)[source]
Parameters

predictionCol (str) – prediction column name (default: prediction)

setRegularization(value)[source]
Parameters

regularization (double) – regularization param for the lasso

setSamplingFraction(value)[source]
Parameters

samplingFraction (double) – The fraction of superpixels to keep on

Module contents

MicrosoftML is a library of Python classes that interface with the Microsoft Scala APIs, using Apache Spark to create distributed machine learning models.

MicrosoftML simplifies training and scoring classifiers and regressors, and facilitates the creation of models that use the CNTK library, images, and text.