package lightgbm

Type Members

  1. abstract class BasePartitionTask extends Serializable with Logging

    Class for handling the execution of Tasks on workers for each partition. Only runs on worker Tasks.

  2. class BulkPartitionTask extends BasePartitionTask

    Class for handling the execution of bulk-based Tasks on workers for each partition.

  3. case class ColumnParams(labelColumn: String, featuresColumn: String, weightColumn: Option[String], initScoreColumn: Option[String], groupColumn: Option[String]) extends Product with Serializable
  4. class GroupIdManager extends AnyRef

    Class for converting column values to group ID.

    Int values are returned as-is, while a map of Long and String values is maintained so that each distinct value receives a unique and consistent ID.
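
    The mapping described above can be sketched as follows. This is a minimal, hypothetical illustration and not the SynapseML implementation: the class name `GroupIdSketch`, the method `getId`, the use of two separate maps, and the choice of sequential IDs starting at 0 are all assumptions made for the example.

```scala
import scala.collection.mutable

// Hypothetical sketch; names and ID scheme are assumptions, not the library's API.
final class GroupIdSketch {
  // Lookup tables so that a repeated Long or String value maps to the same ID.
  private val longIds = mutable.Map.empty[Long, Int]
  private val stringIds = mutable.Map.empty[String, Int]
  private var nextId = 0

  // Hands out the next unused sequential ID.
  private def freshId(): Int = {
    val id = nextId
    nextId += 1
    id
  }

  def getId(value: Any): Int = value match {
    case i: Int    => i // Ints are returned unchanged
    case l: Long   => longIds.getOrElseUpdate(l, freshId())
    case s: String => stringIds.getOrElseUpdate(s, freshId())
    case other =>
      throw new IllegalArgumentException(
        s"Unsupported group column type: ${other.getClass.getName}")
  }
}
```

    Because `getOrElseUpdate` evaluates its default only when the key is absent, a value seen twice gets the same ID both times. The sketch is not thread-safe; concurrent callers would need external synchronization.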

  5. trait HasActualNumClasses extends Params

    Special parameter for classification models that holds the actual number of classes in the dataset.

  6. class InstrumentationMeasures extends Serializable

    Class for encapsulating performance instrumentation measures of overall training.

  7. trait LightGBMBase[TrainedModel <: Model[TrainedModel] with LightGBMModelParams] extends Estimator[TrainedModel] with LightGBMParams with ComplexParamsWritable with HasFeaturesCol with HasLabelCol with LightGBMPerformance with SynapseMLLogging
  8. class LightGBMClassificationModel extends ProbabilisticClassificationModel[Vector, LightGBMClassificationModel] with LightGBMModelParams with LightGBMModelMethods with LightGBMPredictionParams with HasActualNumClasses with ComplexParamsWritable with SynapseMLLogging

    Model produced by LightGBMClassifier.

  9. class LightGBMClassifier extends ProbabilisticClassifier[Vector, LightGBMClassifier, LightGBMClassificationModel] with LightGBMBase[LightGBMClassificationModel] with SynapseMLLogging

    Trains a LightGBM Classification model, a fast, distributed, high performance gradient boosting framework based on decision tree algorithms. For more information, see https://github.com/Microsoft/LightGBM. For parameter details, see https://github.com/Microsoft/LightGBM/blob/master/docs/Parameters.rst

  10. trait LightGBMDelegate extends Serializable
  11. trait LightGBMModelMethods extends LightGBMModelParams with Logging

    Contains common LightGBM model methods across all LightGBM learner types.

  12. trait LightGBMPerformance extends Serializable
  13. class LightGBMRanker extends Ranker[Vector, LightGBMRanker, LightGBMRankerModel] with LightGBMBase[LightGBMRankerModel] with SynapseMLLogging

    Trains a LightGBMRanker model, a fast, distributed, high performance gradient boosting framework based on decision tree algorithms. For more information, see https://github.com/Microsoft/LightGBM. For parameter details, see https://github.com/Microsoft/LightGBM/blob/master/docs/Parameters.rst

  14. class LightGBMRankerModel extends RankerModel[Vector, LightGBMRankerModel] with LightGBMModelParams with LightGBMModelMethods with LightGBMPredictionParams with ComplexParamsWritable with SynapseMLLogging

    Model produced by LightGBMRanker.

  15. class LightGBMRegressionModel extends RegressionModel[Vector, LightGBMRegressionModel] with LightGBMModelParams with LightGBMModelMethods with LightGBMPredictionParams with ComplexParamsWritable with SynapseMLLogging

    Model produced by LightGBMRegressor.

  16. class LightGBMRegressor extends BaseRegressor[Vector, LightGBMRegressor, LightGBMRegressionModel] with LightGBMBase[LightGBMRegressionModel] with SynapseMLLogging

    Trains a LightGBM Regression model, a fast, distributed, high performance gradient boosting framework based on decision tree algorithms. For more information, see https://github.com/Microsoft/LightGBM. For parameter details, see https://github.com/Microsoft/LightGBM/blob/master/docs/Parameters.rst

    Note: The application parameter supports the following values:

    • regression_l2, L2 loss, alias=regression, mean_squared_error, mse, l2_root, root_mean_squared_error, rmse
    • regression_l1, L1 loss, alias=mean_absolute_error, mae
    • huber, Huber loss
    • fair, Fair loss
    • poisson, Poisson regression
    • quantile, Quantile regression
    • mape, MAPE loss, alias=mean_absolute_percentage_error
    • gamma, Gamma regression with log-link. It might be useful, e.g., for modeling insurance claims severity, or for any target that might be gamma-distributed
    • tweedie, Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any target that might be tweedie-distributed
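
    The alias listing above can be expressed as a small lookup table. The following is a hypothetical sketch in plain Scala: the object `RegressionObjectiveAliases` and its `resolve` method are names invented for this example and are not part of SynapseML or LightGBM. The table mirrors the list above exactly and does not claim to reproduce how LightGBM itself parses the parameter.

```scala
// Hypothetical helper; mirrors the alias list in this page, nothing more.
object RegressionObjectiveAliases {
  // Canonical value names, as listed first in each bullet.
  val canonical: Set[String] = Set(
    "regression_l2", "regression_l1", "huber", "fair", "poisson",
    "quantile", "mape", "gamma", "tweedie")

  // Alias -> canonical name, taken from the "alias=" entries.
  private val aliases: Map[String, String] = Map(
    "regression" -> "regression_l2",
    "mean_squared_error" -> "regression_l2",
    "mse" -> "regression_l2",
    "l2_root" -> "regression_l2",
    "root_mean_squared_error" -> "regression_l2",
    "rmse" -> "regression_l2",
    "mean_absolute_error" -> "regression_l1",
    "mae" -> "regression_l1",
    "mean_absolute_percentage_error" -> "mape")

  // Returns the canonical name, or None if the value is not in the list.
  def resolve(name: String): Option[String] =
    if (canonical.contains(name)) Some(name) else aliases.get(name)
}
```

    For instance, `RegressionObjectiveAliases.resolve("mae")` yields `Some("regression_l1")`, while a value that does not appear in the list yields `None`. The lookup is case-sensitive, which is an assumption of this sketch.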
  17. case class NetworkManager(numTasks: Int, driverServerSocket: ServerSocket, host: String, port: Int, timeout: Double, useBarrierExecutionMode: Boolean) extends Logging with Product with Serializable

    Object to encapsulate all Spark/LightGBM network topology information, along with operations on the network.

  18. case class NetworkParams(defaultListenPort: Int, ipAddress: String, port: Int, barrierExecutionMode: Boolean) extends Product with Serializable
  19. case class NetworkTopologyInfo(lightgbmNetworkString: String, executorPartitionIdList: Array[Int], localListenPort: Int) extends Product with Serializable
  20. case class PartitionDataState(aggregatedTrainingData: Option[BaseAggregatedColumns], aggregatedValidationData: Option[BaseAggregatedColumns]) extends Product with Serializable

    Object to encapsulate all intermediate data calculations. Note that only bulk mode uses these properties, but BasePartitionTask uses this class for consistent interfaces.

  21. case class PartitionResult(booster: Option[LightGBMBooster], taskMeasures: TaskInstrumentationMeasures) extends Product with Serializable

    Object to encapsulate results from mapPartitions call.

  22. case class PartitionTaskContext(trainingCtx: TrainingContext, partitionId: Int, taskId: Long, measures: TaskInstrumentationMeasures, networkTopologyInfo: NetworkTopologyInfo, shouldExecuteTraining: Boolean, isEmptyPartition: Boolean, shouldReturnBooster: Boolean, shouldCalcValidationDataset: Boolean) extends Product with Serializable

    Object to encapsulate most setup information about a particular partition Task.

  23. case class PartitionTaskTrainingState(ctx: PartitionTaskContext, booster: LightGBMBooster) extends Product with Serializable

    Object to encapsulate all training state on a single partition, plus the actual Booster.

  24. class SharedDatasetState extends AnyRef
  25. class SharedState extends AnyRef
  26. class StreamingPartitionTask extends BasePartitionTask

    Class for handling the execution of streaming-based Tasks on workers for each partition.

  27. case class StreamingState(ctx: PartitionTaskContext, dataset: LightGBMDataset, threadIndex: Int) extends Product with Serializable
  28. class TaskInstrumentationMeasures extends Serializable

    Class for encapsulating performance instrumentation measures of each partition Task.

  29. case class TaskMessageInfo(status: String, taskHost: String, localListenPort: Int, partitionId: Int, executorId: String) extends Product with Serializable
  30. case class TrainingContext(batchIndex: Int, sharedStateSingleton: SharedSingleton[SharedState], schema: StructType, numCols: Int, numInitScoreClasses: Int, trainingParams: BaseTrainParams, networkParams: NetworkParams, columnParams: ColumnParams, datasetParams: String, featureNames: Option[Array[String]], numTasksPerExecutor: Int, validationData: Option[Broadcast[Array[Row]]], serializedReferenceDataset: Option[Array[Byte]], partitionCounts: Option[Array[Long]]) extends Serializable with Product

    Object to encapsulate all information about a training session that does not change during execution and can be created on the driver. There is also a reference to the shared state in an executor, which can change over time.