mmlspark.lightgbm package

Submodules

mmlspark.lightgbm.LightGBMClassifier module

class mmlspark.lightgbm.LightGBMClassifier.LightGBMClassificationModel(java_model=None)

Bases: mmlspark.lightgbm._LightGBMClassifier._LightGBMClassificationModel

getFeatureImportances(importance_type='split'): Get the feature importances as a list. The importance_type can be “split” or “gain”.

static loadNativeModelFromFile(filename, labelColName='label', featuresColName='features', predictionColName='prediction', probColName='probability', rawPredictionColName='rawPrediction'): Load the model from a native LightGBM text file.

static loadNativeModelFromString(model, labelColName='label', featuresColName='features', predictionColName='prediction', probColName='probability', rawPredictionColName='rawPrediction'): Load the model from a native LightGBM model string.

saveNativeModel(filename, overwrite=True): Save the booster in string format to a local or remote WASB location.
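As a rough illustration of getFeatureImportances, the sketch below pairs the returned list with feature names and sorts it. The helper name rank_feature_importances and the feature_names argument are hypothetical, and the sketch assumes the list is ordered like the slots of the features vector, which is worth verifying against your own pipeline.

```python
def rank_feature_importances(model, feature_names, importance_type="split"):
    """Return (feature_name, importance) pairs, largest importance first."""
    if importance_type not in ("split", "gain"):
        raise ValueError("importance_type must be 'split' or 'gain'")
    # getFeatureImportances is the documented model method; it returns a list.
    importances = list(model.getFeatureImportances(importance_type))
    if len(importances) != len(feature_names):
        raise ValueError("expected one importance value per feature name")
    # Assumption: importances[i] belongs to feature_names[i].
    pairs = zip(feature_names, importances)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

The helper only relies on the getFeatureImportances method, so it should work the same way with a fitted ranker or regression model.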

class mmlspark.lightgbm.LightGBMClassifier.LightGBMClassifier(baggingFraction=1.0, baggingFreq=0, baggingSeed=3, boostFromAverage=True, boostingType='gbdt', categoricalSlotIndexes=None, categoricalSlotNames=None, defaultListenPort=12400, earlyStoppingRound=0, featureFraction=1.0, featuresCol='features', initScoreCol=None, isProvideTrainingMetric=False, isUnbalance=False, labelCol='label', lambdaL1=0.0, lambdaL2=0.0, learningRate=0.1, maxBin=255, maxDepth=-1, metric='', minSumHessianInLeaf=0.001, modelString='', numBatches=0, numIterations=100, numLeaves=31, objective='binary', parallelism='data_parallel', predictionCol='prediction', probabilityCol='probability', rawPredictionCol='rawPrediction', thresholds=None, timeout=1200.0, useBarrierExecutionMode=False, validationIndicatorCol=None, verbosity=1, weightCol=None)

Bases: mmlspark.lightgbm._LightGBMClassifier._LightGBMClassifier
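A minimal sketch of how the constructor keywords above might be used to train a binary classifier. The names CLASSIFIER_PARAMS, fit_classifier, and train_df are hypothetical; train_df is assumed to be a Spark DataFrame with a vector column named features and a numeric column named label. The import is deferred so the snippet can be loaded without a Spark session.

```python
# Keyword arguments taken from the documented constructor signature.
# The chosen values are illustrative, not recommended settings.
CLASSIFIER_PARAMS = {
    "objective": "binary",
    "numIterations": 200,
    "learningRate": 0.05,
    "numLeaves": 31,
    "isUnbalance": True,
    "featuresCol": "features",
    "labelCol": "label",
}


def fit_classifier(train_df, params=None):
    """Fit a LightGBMClassifier on train_df and return the fitted model."""
    # Deferred import: requires mmlspark and an active Spark session.
    from mmlspark.lightgbm.LightGBMClassifier import LightGBMClassifier

    classifier = LightGBMClassifier(**(params or CLASSIFIER_PARAMS))
    return classifier.fit(train_df)
```

Calling transform on the returned model should add the prediction, probability, and rawPrediction columns named in the signature.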

mmlspark.lightgbm.LightGBMRanker module

class mmlspark.lightgbm.LightGBMRanker.LightGBMRanker(baggingFraction=1.0, baggingFreq=0, baggingSeed=3, boostFromAverage=True, boostingType='gbdt', categoricalSlotIndexes=None, categoricalSlotNames=None, defaultListenPort=12400, earlyStoppingRound=0, evalAt=[1, 2, 3, 4, 5], featureFraction=1.0, featuresCol='features', groupCol=None, initScoreCol=None, isProvideTrainingMetric=False, labelCol='label', labelGain=[], lambdaL1=0.0, lambdaL2=0.0, learningRate=0.1, maxBin=255, maxDepth=-1, maxPosition=20, metric='', minSumHessianInLeaf=0.001, modelString='', numBatches=0, numIterations=100, numLeaves=31, objective='lambdarank', parallelism='data_parallel', predictionCol='prediction', timeout=1200.0, useBarrierExecutionMode=False, validationIndicatorCol=None, verbosity=1, weightCol=None)

Bases: mmlspark.lightgbm._LightGBMRanker._LightGBMRanker
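The ranker differs from the other estimators mainly in groupCol and evalAt. The sketch below shows one way those keywords might be set. RANKER_PARAMS, fit_ranker, and the column name query_id are hypothetical; train_df is assumed to be a Spark DataFrame in which each row carries the identifier of the query it belongs to.

```python
# Keyword arguments taken from the documented constructor signature.
RANKER_PARAMS = {
    "objective": "lambdarank",
    "groupCol": "query_id",   # hypothetical column identifying each query group
    "evalAt": [1, 3, 5],      # positions at which the ranking metric is evaluated
    "featuresCol": "features",
    "labelCol": "label",
}


def fit_ranker(train_df, params=None):
    """Fit a LightGBMRanker on train_df and return the fitted model."""
    # Deferred import: requires mmlspark and an active Spark session.
    from mmlspark.lightgbm.LightGBMRanker import LightGBMRanker

    ranker = LightGBMRanker(**(params or RANKER_PARAMS))
    return ranker.fit(train_df)
```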

class mmlspark.lightgbm.LightGBMRanker.LightGBMRankerModel(java_model=None)

Bases: mmlspark.lightgbm._LightGBMRanker._LightGBMRankerModel

getFeatureImportances(importance_type='split'): Get the feature importances as a list. The importance_type can be “split” or “gain”.

static loadNativeModelFromFile(filename, labelColName='label', featuresColName='features', predictionColName='prediction'): Load the model from a native LightGBM text file.

static loadNativeModelFromString(model, labelColName='label', featuresColName='features', predictionColName='prediction'): Load the model from a native LightGBM model string.

saveNativeModel(filename, overwrite=True): Save the booster in string format to a local or remote WASB location.

mmlspark.lightgbm.LightGBMRegressor module

class mmlspark.lightgbm.LightGBMRegressor.LightGBMRegressionModel(java_model=None)

Bases: mmlspark.lightgbm._LightGBMRegressor._LightGBMRegressionModel

getFeatureImportances(importance_type='split'): Get the feature importances as a list. The importance_type can be “split” or “gain”.

static loadNativeModelFromFile(filename, labelColName='label', featuresColName='features', predictionColName='prediction'): Load the model from a native LightGBM text file.

static loadNativeModelFromString(model, labelColName='label', featuresColName='features', predictionColName='prediction'): Load the model from a native LightGBM model string.

saveNativeModel(filename, overwrite=True): Save the booster in string format to a local or remote WASB location.
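The save and load methods can be combined into a round trip, sketched below. The helper name save_and_reload is hypothetical, and the model class is passed in as an argument so the same function could serve any of the three model types. Whether a file written by saveNativeModel can be read back unchanged by loadNativeModelFromFile is an assumption to check on your own storage.

```python
def save_and_reload(model, model_class, path):
    """Write model in native LightGBM format, then load it back from path."""
    # saveNativeModel and loadNativeModelFromFile are the documented methods.
    model.saveNativeModel(path, overwrite=True)
    # The column-name arguments are left at their documented defaults.
    return model_class.loadNativeModelFromFile(path)


# Hypothetical usage, assuming fitted_model came from LightGBMRegressor.fit:
# from mmlspark.lightgbm.LightGBMRegressor import LightGBMRegressionModel
# reloaded = save_and_reload(fitted_model, LightGBMRegressionModel, "/tmp/lgbm_model.txt")
```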

class mmlspark.lightgbm.LightGBMRegressor.LightGBMRegressor(alpha=0.9, baggingFraction=1.0, baggingFreq=0, baggingSeed=3, boostFromAverage=True, boostingType='gbdt', categoricalSlotIndexes=None, categoricalSlotNames=None, defaultListenPort=12400, earlyStoppingRound=0, featureFraction=1.0, featuresCol='features', initScoreCol=None, isProvideTrainingMetric=False, labelCol='label', lambdaL1=0.0, lambdaL2=0.0, learningRate=0.1, maxBin=255, maxDepth=-1, metric='', minSumHessianInLeaf=0.001, modelString='', numBatches=0, numIterations=100, numLeaves=31, objective='regression', parallelism='data_parallel', predictionCol='prediction', timeout=1200.0, tweedieVariancePower=1.5, useBarrierExecutionMode=False, validationIndicatorCol=None, verbosity=1, weightCol=None)

Bases: mmlspark.lightgbm._LightGBMRegressor._LightGBMRegressor
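The alpha keyword is easiest to see with a non-default objective. The sketch below assumes that, as in native LightGBM, alpha selects the target quantile when objective is set to 'quantile'; the helper names quantile_params and fit_quantile_regressor, and the DataFrame train_df, are hypothetical.

```python
def quantile_params(quantile, **overrides):
    """Build constructor keywords for a quantile-regression LightGBMRegressor."""
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must lie strictly between 0 and 1")
    params = {
        "objective": "quantile",
        "alpha": quantile,  # assumed to be the target quantile for this objective
        "featuresCol": "features",
        "labelCol": "label",
    }
    params.update(overrides)
    return params


def fit_quantile_regressor(train_df, quantile):
    """Fit a regressor that predicts the given conditional quantile."""
    # Deferred import: requires mmlspark and an active Spark session.
    from mmlspark.lightgbm.LightGBMRegressor import LightGBMRegressor

    return LightGBMRegressor(**quantile_params(quantile)).fit(train_df)
```

Fitting one model at a low quantile and another at a high quantile is one way to approximate a prediction interval.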

Module contents

MicrosoftML is a library of Python classes that interface with the Microsoft Scala APIs, which use Apache Spark to create distributed machine learning models.

MicrosoftML simplifies training and scoring classifiers and regressors, and facilitates the creation of models that use the CNTK library, images, and text.