mmlspark.lightgbm package

Submodules

mmlspark.lightgbm.LightGBMClassifier module

class mmlspark.lightgbm.LightGBMClassifier.LightGBMClassificationModel(java_model=None)

Bases: mmlspark.lightgbm._LightGBMClassifier._LightGBMClassificationModel

getFeatureImportances(importance_type='split')

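Return the feature importances as a list. The importance_type can be 'split' (the number of times a feature is used in a split) or 'gain' (the total gain of the splits that use the feature).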
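The returned list carries no feature names, so it usually has to be paired with them by position. The following is a minimal sketch of that step, assuming the list is ordered like the slots of the features vector; `rank_importances`, `top_features`, and `feature_names` are hypothetical names and not part of mmlspark.

```python
def rank_importances(feature_names, importances):
    """Pair each feature name with its importance, highest first."""
    if len(feature_names) != len(importances):
        raise ValueError("expected one importance value per feature")
    pairs = zip(feature_names, importances)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def top_features(model, feature_names, k=5, importance_type="gain"):
    """Return the k most important (name, importance) pairs.

    `model` is assumed to be a fitted model exposing getFeatureImportances,
    for example a LightGBMClassificationModel.
    """
    importances = model.getFeatureImportances(importance_type=importance_type)
    return rank_importances(feature_names, importances)[:k]
```

With a fitted model, a call such as `top_features(model, ["age", "income"], k=2)` would return the two names sorted by gain; the column names here are only placeholders.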

static loadNativeModelFromFile(filename, labelColName='label', featuresColName='features', predictionColName='prediction', probColName='probability', rawPredictionColName='rawPrediction')

Load the model from a native LightGBM text file.

static loadNativeModelFromString(model, labelColName='label', featuresColName='features', predictionColName='prediction', probColName='probability', rawPredictionColName='rawPrediction')

Load the model from a native LightGBM model string.

saveNativeModel(filename, overwrite=True)

Save the booster in string format to a local or remote WASB location.
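Saving and loading are meant to be used as a pair. The sketch below writes a fitted classification model in native format and loads it back with the default column names. It assumes that the location written by saveNativeModel can be read directly by loadNativeModelFromFile; `save_and_reload` and the path argument are hypothetical.

```python
def save_and_reload(model, path):
    """Write a fitted classification model in native format, then load it back."""
    # Imported lazily so the sketch can be read without an mmlspark install.
    from mmlspark.lightgbm.LightGBMClassifier import LightGBMClassificationModel

    # overwrite=True replaces anything already stored at `path`.
    model.saveNativeModel(path, overwrite=True)

    # The column-name keyword arguments are left at their defaults here.
    return LightGBMClassificationModel.loadNativeModelFromFile(path)
```

If the scoring DataFrame uses different column names, they would be passed through the keyword arguments of loadNativeModelFromFile, for example `featuresColName`.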

class mmlspark.lightgbm.LightGBMClassifier.LightGBMClassifier(baggingFraction=1.0, baggingFreq=0, baggingSeed=3, boostFromAverage=True, boostingType='gbdt', categoricalSlotIndexes=None, categoricalSlotNames=None, defaultListenPort=12400, earlyStoppingRound=0, featureFraction=1.0, featuresCol='features', initScoreCol=None, isProvideTrainingMetric=False, isUnbalance=False, labelCol='label', lambdaL1=0.0, lambdaL2=0.0, learningRate=0.1, maxBin=255, maxDepth=-1, metric='', minSumHessianInLeaf=0.001, modelString='', numBatches=0, numIterations=100, numLeaves=31, objective='binary', parallelism='data_parallel', predictionCol='prediction', probabilityCol='probability', rawPredictionCol='rawPrediction', thresholds=None, timeout=1200.0, useBarrierExecutionMode=False, validationIndicatorCol=None, verbosity=1, weightCol=None)

Bases: mmlspark.lightgbm._LightGBMClassifier._LightGBMClassifier
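Every constructor argument has a default, so a typical call overrides only a few of them. The following sketch collects such overrides in a dictionary and passes them to the constructor. It assumes `train_df` is a Spark DataFrame with a vector column named 'features' and a column named 'label'; `classifier_params` and `train_classifier` are hypothetical helpers, and the chosen values are illustrative rather than recommended.

```python
def classifier_params(**overrides):
    """Start from a few explicit settings and apply caller overrides."""
    params = {
        "objective": "binary",
        "learningRate": 0.1,
        "numLeaves": 31,
        "numIterations": 100,
        "isUnbalance": False,
    }
    params.update(overrides)
    return params


def train_classifier(train_df, **overrides):
    """Fit a LightGBMClassifier on `train_df` and return the fitted model."""
    # Imported lazily so the parameter helper works without mmlspark.
    from mmlspark.lightgbm.LightGBMClassifier import LightGBMClassifier

    classifier = LightGBMClassifier(**classifier_params(**overrides))
    return classifier.fit(train_df)
```

For example, `train_classifier(train_df, numLeaves=63, isUnbalance=True)` would fit a model with larger trees on an imbalanced label.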

mmlspark.lightgbm.LightGBMRanker module

class mmlspark.lightgbm.LightGBMRanker.LightGBMRanker(baggingFraction=1.0, baggingFreq=0, baggingSeed=3, boostFromAverage=True, boostingType='gbdt', categoricalSlotIndexes=None, categoricalSlotNames=None, defaultListenPort=12400, earlyStoppingRound=0, evalAt=[1, 2, 3, 4, 5], featureFraction=1.0, featuresCol='features', groupCol=None, initScoreCol=None, isProvideTrainingMetric=False, labelCol='label', labelGain=[], lambdaL1=0.0, lambdaL2=0.0, learningRate=0.1, maxBin=255, maxDepth=-1, maxPosition=20, metric='', minSumHessianInLeaf=0.001, modelString='', numBatches=0, numIterations=100, numLeaves=31, objective='lambdarank', parallelism='data_parallel', predictionCol='prediction', timeout=1200.0, useBarrierExecutionMode=False, validationIndicatorCol=None, verbosity=1, weightCol=None)

Bases: mmlspark.lightgbm._LightGBMRanker._LightGBMRanker
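Unlike the classifier, the ranker has a groupCol argument that defaults to None, so the column identifying each query group has to be supplied by the caller. The sketch below fits a ranker and then orders scored rows within each group. The column name 'query_id' and both helper functions are hypothetical, and the per-group ordering is done in plain Python on collected rows, which only suits small results.

```python
from collections import defaultdict


def fit_ranker(train_df, group_col="query_id"):
    """Fit a LightGBMRanker; `train_df` is assumed to hold features, label and group columns."""
    # Imported lazily so the ordering helper works without mmlspark.
    from mmlspark.lightgbm.LightGBMRanker import LightGBMRanker

    ranker = LightGBMRanker(groupCol=group_col, evalAt=[1, 3, 5], objective="lambdarank")
    return ranker.fit(train_df)


def top_k_per_group(rows, k=3):
    """Order (group, item, prediction) tuples by prediction within each group.

    Returns a dict mapping each group to its k highest-scoring items.
    """
    by_group = defaultdict(list)
    for group, item, score in rows:
        by_group[group].append((item, score))
    return {
        group: [item for item, _ in sorted(items, key=lambda p: p[1], reverse=True)[:k]]
        for group, items in by_group.items()
    }
```

The rows could come from something like `model.transform(df).select("query_id", "item_id", "prediction").collect()`, where 'item_id' is again a placeholder column name.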

class mmlspark.lightgbm.LightGBMRanker.LightGBMRankerModel(java_model=None)

Bases: mmlspark.lightgbm._LightGBMRanker._LightGBMRankerModel

getFeatureImportances(importance_type='split')

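Return the feature importances as a list. The importance_type can be 'split' (the number of times a feature is used in a split) or 'gain' (the total gain of the splits that use the feature).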

static loadNativeModelFromFile(filename, labelColName='label', featuresColName='features', predictionColName='prediction')

Load the model from a native LightGBM text file.

static loadNativeModelFromString(model, labelColName='label', featuresColName='features', predictionColName='prediction')

Load the model from a native LightGBM model string.

saveNativeModel(filename, overwrite=True)

Save the booster in string format to a local or remote WASB location.

mmlspark.lightgbm.LightGBMRegressor module

class mmlspark.lightgbm.LightGBMRegressor.LightGBMRegressionModel(java_model=None)

Bases: mmlspark.lightgbm._LightGBMRegressor._LightGBMRegressionModel

getFeatureImportances(importance_type='split')

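Return the feature importances as a list. The importance_type can be 'split' (the number of times a feature is used in a split) or 'gain' (the total gain of the splits that use the feature).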

static loadNativeModelFromFile(filename, labelColName='label', featuresColName='features', predictionColName='prediction')

Load the model from a native LightGBM text file.

static loadNativeModelFromString(model, labelColName='label', featuresColName='features', predictionColName='prediction')

Load the model from a native LightGBM model string.

saveNativeModel(filename, overwrite=True)

Save the booster in string format to a local or remote WASB location.

class mmlspark.lightgbm.LightGBMRegressor.LightGBMRegressor(alpha=0.9, baggingFraction=1.0, baggingFreq=0, baggingSeed=3, boostFromAverage=True, boostingType='gbdt', categoricalSlotIndexes=None, categoricalSlotNames=None, defaultListenPort=12400, earlyStoppingRound=0, featureFraction=1.0, featuresCol='features', initScoreCol=None, isProvideTrainingMetric=False, labelCol='label', lambdaL1=0.0, lambdaL2=0.0, learningRate=0.1, maxBin=255, maxDepth=-1, metric='', minSumHessianInLeaf=0.001, modelString='', numBatches=0, numIterations=100, numLeaves=31, objective='regression', parallelism='data_parallel', predictionCol='prediction', timeout=1200.0, tweedieVariancePower=1.5, useBarrierExecutionMode=False, validationIndicatorCol=None, verbosity=1, weightCol=None)

Bases: mmlspark.lightgbm._LightGBMRegressor._LightGBMRegressor
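The alpha argument has no effect under the default 'regression' objective. The sketch below shows one place where it plausibly matters, assuming that objective='quantile' is accepted and that alpha is then interpreted as the quantile level, as in native LightGBM. It fits two regressors to bracket a prediction interval; `quantile_params`, `fit_interval`, and the output column names are hypothetical.

```python
def quantile_params(quantile, prediction_col):
    """Keyword arguments for a quantile regressor writing to `prediction_col`."""
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    return {
        "objective": "quantile",
        "alpha": quantile,  # assumed to be the quantile level for this objective
        "predictionCol": prediction_col,
    }


def fit_interval(train_df, lower=0.1, upper=0.9):
    """Fit one model per interval bound and return (lower_model, upper_model)."""
    # Imported lazily so the parameter helper works without mmlspark.
    from mmlspark.lightgbm.LightGBMRegressor import LightGBMRegressor

    low = LightGBMRegressor(**quantile_params(lower, "prediction_low")).fit(train_df)
    high = LightGBMRegressor(**quantile_params(upper, "prediction_high")).fit(train_df)
    return low, high
```

Because each model writes to its own prediction column, the two can be applied one after the other with `high.transform(low.transform(df))` to get both bounds in a single DataFrame.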

Module contents

MicrosoftML is a library of Python classes that interface with the Microsoft Scala APIs and use Apache Spark to create distributed machine learning models.

MicrosoftML simplifies training and scoring classifiers and regressors, and facilitates the creation of models using the CNTK library, images, and text.