package dataset
Type Members
- class LightGBMDataset extends AutoCloseable
Represents a LightGBM dataset. Wraps the native implementation.
- class PeekingIterator[T] extends Iterator[T]
- case class SampledData(numRows: Int, numCols: Int) extends Product with Serializable
SampledData: Encapsulates the sampled data needed to initialize a LightGBM dataset.
LightGBM expects sampled data to be an array of vectors, where each feature column has a sparse representation of its non-zero values (i.e., an index vector and a data vector). It also needs an array with one entry per feature holding that feature's element count, so it knows how long each column is.
Since we create sampled data as a self-contained set with ONLY sampled data and nothing else, the indexes are trivial (0 until #elements). We don't need to maintain the original raw indexes: LightGBM only uses this data to get distributions and does not care about raw row indexes.
This class keeps all the indexing in sync, so callers can just push rows of data into it and retrieve the resulting pointers at the end.
Note: the sample data row count is not expected to exceed max(Int), so we index with Ints.
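The column-wise layout described for SampledData can be illustrated with a minimal sketch. This is not the library's implementation: the class name `SampleBufferSketch` and all of its methods are hypothetical, it uses plain JVM buffers rather than native pointers, and it assumes that "trivial indexes" means each stored index is the row's position within the sample.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical illustration only; NOT the actual SampledData class.
// Stores, per feature column, the non-zero values and the sample-row index of each value.
final class SampleBufferSketch(numRows: Int, numCols: Int) {
  private val values = Array.fill(numCols)(ArrayBuffer.empty[Double])
  private val indexes = Array.fill(numCols)(ArrayBuffer.empty[Int])
  private var rowsPushed = 0

  // Push one dense row; only non-zero entries are recorded.
  def pushRow(row: Array[Double]): Unit = {
    require(row.length == numCols, s"expected $numCols features, got ${row.length}")
    require(rowsPushed < numRows, s"buffer was sized for $numRows rows")
    var col = 0
    while (col < numCols) {
      val v = row(col)
      if (v != 0.0) {
        values(col) += v
        // Assumption: the index is the position within the sample, not a raw dataset index.
        indexes(col) += rowsPushed
      }
      col += 1
    }
    rowsPushed += 1
  }

  def columnValues(col: Int): Array[Double] = values(col).toArray
  def columnIndexes(col: Int): Array[Int] = indexes(col).toArray

  // One entry per feature: how many non-zero elements that column holds.
  def elementCounts: Array[Int] = values.map(_.length)
}

object SampleBufferSketchDemo {
  def main(args: Array[String]): Unit = {
    val buf = new SampleBufferSketch(numRows = 3, numCols = 2)
    buf.pushRow(Array(1.5, 0.0))
    buf.pushRow(Array(0.0, 0.0))
    buf.pushRow(Array(2.5, 4.0))
    // Column 0 holds values 1.5 and 2.5 at sample rows 0 and 2; column 1 holds 4.0 at row 2.
    println(buf.elementCounts.mkString(", "))
  }
}
```

Keeping the per-column values, indexes, and element counts behind a single `pushRow` call is what lets callers feed rows one at a time without tracking per-column offsets themselves.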
Value Members
- object DatasetUtils
- object ReferenceDatasetUtils