case class SampledData(numRows: Int, numCols: Int) extends Product with Serializable

SampledData: Encapsulates the sampled data needed to initialize a LightGBM dataset.

LightGBM expects sampled data to be an array of vectors, where each feature column has a sparse representation of its non-zero values (i.e. an index vector and a data vector). It also needs an array of size #features holding the element count per feature, so that it knows how long each column is.

Since we create sampled data as a self-contained set with ONLY sampled data and nothing else, the indexes are trivial (0 until #elements). We don't need to maintain the original raw indexes: LightGBM only uses this data to get distributions and does not care about raw row indexes.

This class keeps all the indexing in sync, so callers can just push rows of data into it and retrieve the resulting pointers at the end.

Note: the sampled row count is not expected to exceed Int.MaxValue, so we index with Ints.
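The column-wise layout described above can be illustrated with a minimal sketch. It is not the SynapseML implementation: `ColumnSampleSketch` is a hypothetical stand-in that uses plain Scala collections instead of the native SWIG arrays, and it assumes that zero entries are skipped when a dense row is pushed.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the layout described above. It uses plain Scala
// collections, not the native SWIG arrays that the real class manages.
final class ColumnSampleSketch(numRows: Int, numCols: Int) {
  // Per feature column: the non-zero values and the sample-local row index of each value.
  private val values = Array.fill(numCols)(new ArrayBuffer[Double]())
  private val indexes = Array.fill(numCols)(new ArrayBuffer[Int]())

  // Push one dense row. Assumption: zero entries are skipped so columns stay sparse.
  def pushRow(row: Array[Double], index: Int): Unit = {
    require(row.length == numCols, s"expected $numCols features, got ${row.length}")
    require(index >= 0 && index < numRows, s"row index $index out of range")
    var col = 0
    while (col < numCols) {
      if (row(col) != 0.0) {
        values(col) += row(col)
        indexes(col) += index
      }
      col += 1
    }
  }

  // Element count per feature column (the "#features sized array").
  def rowCounts: Array[Int] = values.map(_.length)

  // Non-zero values per feature column.
  def sampleData: Array[Array[Double]] = values.map(_.toArray)

  // Sample-local row indexes per feature column, parallel to sampleData.
  def sampleIndexes: Array[Array[Int]] = indexes.map(_.toArray)
}
```

Pushing the rows (1.0, 0.0), (0.0, 2.0) and (3.0, 4.0) at indexes 0, 1 and 2 leaves two elements in each column: column 0 holds values 1.0 and 3.0 at indexes 0 and 2, and column 1 holds values 2.0 and 4.0 at indexes 1 and 2. The three accessors play the role that `getRowCounts`, `getSampleData` and `getSampleIndices` play for the native pointers.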

Linear Supertypes

  1. Serializable
  2. Serializable
  3. Product
  4. Equals
  5. AnyRef
  6. Any

Instance Constructors

  1. new SampledData(numRows: Int, numCols: Int)

Value Members

  1. def delete(): Unit
  2. def getRowCounts: SWIGTYPE_p_int
  3. def getSampleData: SWIGTYPE_p_p_double
  4. def getSampleIndices: SWIGTYPE_p_p_int
  5. val numCols: Int
  6. val numRows: Int
  7. def pushRow(rowData: SparseVector, index: Int): Unit
  8. def pushRow(rowData: Array[Double], index: Int): Unit
  9. def pushRow(rowData: DenseVector, index: Int): Unit
  10. def pushRow(rowData: Row, index: Int, featureColName: String): Unit
  11. val rowCounts: IntSwigArray
  12. val sampleData: DoublePointerSwigArray
  13. val sampleIndexes: IntPointerSwigArray