class BulkPartitionTask extends BasePartitionTask
Class for handling the execution of bulk-based Tasks on workers for each partition.
Inheritance (linearization):
- BulkPartitionTask
- BasePartitionTask
- Logging
- Serializable
- Serializable
- AnyRef
- Any
Instance Constructors
- new BulkPartitionTask()
Value Members
- final def !=(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def ##(): Int
  - Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  - Definition Classes: Any
- def cleanupInternal(ctx: PartitionTaskContext): Unit
  Clean up the task.
- def clone(): AnyRef
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()
- def determineMatrixType(ctx: PartitionTaskContext, inputRows: Iterator[Row]): PeekingIterator[Row]
  - Attributes: protected
  - Definition Classes: BasePartitionTask
- final def eq(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- def finalize(): Unit
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( classOf[java.lang.Throwable] )
- def generateFinalDatasetInternal(ctx: PartitionTaskContext, dataState: PartitionDataState, forValidation: Boolean, referenceDataset: Option[LightGBMDataset]): LightGBMDataset
  Generate the final dataset for this task. Internal implementation for specific execution modes.
  - ctx: The training context.
  - dataState: Any intermediate data state (used mainly by bulk execution mode).
  - forValidation: Whether to generate the validation dataset rather than the training dataset.
  - referenceDataset: A reference dataset to start from (used mainly for the validation dataset).
  - returns: LightGBM dataset Java wrapper.
  - Attributes: protected
  - Definition Classes: BulkPartitionTask → BasePartitionTask
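As a hedged illustration of how the forValidation and referenceDataset parameters relate (only the method and its signature come from this page; the surrounding val names are invented, and the code assumes it runs inside a BasePartitionTask subclass where this protected method is accessible), the training dataset can be built first and then used as the reference for the validation dataset:

```scala
// Sketch only: build the training dataset, then the validation dataset
// against it as a reference.
val trainDataset: LightGBMDataset =
  generateFinalDatasetInternal(ctx, dataState, forValidation = false, referenceDataset = None)
val validationDataset: LightGBMDataset =
  generateFinalDatasetInternal(ctx, dataState, forValidation = true, referenceDataset = Some(trainDataset))
```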
- final def getClass(): Class[_]
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def getTaskContext(trainingCtx: TrainingContext, partitionId: Int, taskId: Long, measures: TaskInstrumentationMeasures, networkTopologyInfo: NetworkTopologyInfo, shouldExecuteTraining: Boolean, isEmptyPartition: Boolean, shouldReturnBooster: Boolean): PartitionTaskContext
  Initialize and customize the context for the task.
  - trainingCtx: The training context information.
  - partitionId: The id of the Spark partition.
  - taskId: The id of the Spark task.
  - measures: The task instrumentation measures.
  - networkTopologyInfo: Information about the network.
  - shouldExecuteTraining: Whether this task should participate in LightGBM training.
  - isEmptyPartition: Whether the partition is empty (has no rows).
  - shouldReturnBooster: Whether the task should return a booster.
  - returns: The updated context information for the task.
  - Attributes: protected
  - Definition Classes: BasePartitionTask
- def hashCode(): Int
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def initializeInternal(ctx: TrainingContext, shouldExecuteTraining: Boolean, isEmptyPartition: Boolean): Unit
  Initialize the task. Internal implementation for specific execution modes.
  - ctx: The training context information.
  - shouldExecuteTraining: Whether this task should participate in LightGBM training.
  - isEmptyPartition: Whether the partition is empty (has no rows).
  - Attributes: protected
  - Definition Classes: BulkPartitionTask → BasePartitionTask
- def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
  - Attributes: protected
  - Definition Classes: Logging
- def initializeLogIfNecessary(isInterpreter: Boolean): Unit
  - Attributes: protected
  - Definition Classes: Logging
- final def isInstanceOf[T0]: Boolean
  - Definition Classes: Any
- def isTraceEnabled(): Boolean
  - Attributes: protected
  - Definition Classes: Logging
- def log: Logger
  - Attributes: protected
  - Definition Classes: Logging
- def logDebug(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logDebug(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logError(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logError(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logInfo(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logInfo(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logName: String
  - Attributes: protected
  - Definition Classes: Logging
- def logTrace(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logTrace(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logWarning(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logWarning(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def mapPartitionTask(ctx: TrainingContext)(inputRows: Iterator[Row]): Iterator[PartitionResult]
  This method is passed to Spark's mapPartitions method and handles execution of training on the workers. Main stages (each execution mode provides an "Internal" version to perform mode-specific operations): initialize(), preparePartitionData(), finalizeDatasetAndTrain(), cleanup().
  - ctx: The training context.
  - inputRows: The Spark rows for a partition as an iterator.
  - returns: Result iterator (to comply with Spark's mapPartitions API).
  - Definition Classes: BasePartitionTask
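Because mapPartitionTask is curried, partially applying it with the training context yields an Iterator[Row] => Iterator[PartitionResult] that fits Spark's mapPartitions directly. A minimal driver-side sketch (the `df` DataFrame and populated `trainingCtx` are assumed to exist; only the method shown comes from this page):

```scala
// Sketch: wiring the task into Spark's mapPartitions.
val task = new BulkPartitionTask()
val results: RDD[PartitionResult] =
  df.rdd.mapPartitions(task.mapPartitionTask(trainingCtx))
```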
- final def ne(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- final def notify(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def notifyAll(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- def preparePartitionDataInternal(ctx: PartitionTaskContext, inputRows: Iterator[Row]): PartitionDataState
  Prepare any data objects for this particular partition. Implement for specific execution modes.
  - ctx: The task context information.
  - inputRows: The Spark rows for a partition as an iterator.
  - returns: Any intermediate data state (used mainly by bulk execution mode) to pass to future stages.
  - Attributes: protected
  - Definition Classes: BulkPartitionTask → BasePartitionTask
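The per-partition stage ordering described under mapPartitionTask can be sketched as follows for bulk mode (illustrative wiring only: the `runPartition` helper is invented, while the methods it calls are the ones documented on this page):

```scala
// Illustrative only: how the "Internal" hooks are sequenced per partition.
def runPartition(ctx: PartitionTaskContext, rows: Iterator[Row]): LightGBMDataset = {
  // Bulk mode buffers the partition's rows into intermediate state...
  val dataState = preparePartitionDataInternal(ctx, rows)
  // ...then materializes the final LightGBM dataset from that state...
  val dataset = generateFinalDatasetInternal(ctx, dataState,
                                             forValidation = false,
                                             referenceDataset = None)
  // ...and finally releases any task-level resources.
  cleanupInternal(ctx)
  dataset
}
```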
- final def synchronized[T0](arg0: ⇒ T0): T0
  - Definition Classes: AnyRef
- def toString(): String
  - Definition Classes: AnyRef → Any
- final def wait(): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()