object DatasetUtils

Linear Supertypes

  1. AnyRef
  2. Any

Type Members

  1. case class CardinalityTriplet[T](groupCounts: List[Int], currentValue: T, currentCount: Int) extends Product with Serializable

Value Members

  1. def countCardinality[T](input: Seq[T]): Array[Int]
  2. def getArrayType(rowsIter: Iterator[Row], matrixType: String, featuresColumn: String): (Iterator[Row], Boolean)

    Determines whether to use dense or sparse data, based on configuration and/or data sampling.

    rowsIter

    Iterator of rows.

    matrixType

    Matrix type as configured by the user.

    featuresColumn

    The name of the features column.

    returns

    A reconstructed iterator over the original rows, paired with a flag indicating whether the matrix should be sparse or dense.
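
    The configuration-or-sampling decision described above can be sketched roughly as follows. This is an illustration, not the library's implementation: the object name `ArrayTypeSketch`, the helper `chooseArrayType`, the accepted strings "sparse", "dense" and "auto", and the convention that `true` means sparse are all assumptions. Plain Scala iterators stand in for `Iterator[Row]`.

```scala
object ArrayTypeSketch {
  // Sketch only: the accepted strings and the meaning of the Boolean
  // (true = sparse) are assumptions made for illustration.
  def chooseArrayType[A](rows: Iterator[A], matrixType: String)(
      sampleRows: Iterator[A] => (Iterator[A], Boolean)): (Iterator[A], Boolean) =
    matrixType match {
      // An explicit configuration wins and the iterator is passed through untouched.
      case "sparse" => (rows, true)
      case "dense"  => (rows, false)
      // Otherwise defer to a data-sampling strategy supplied by the caller.
      case "auto"   => sampleRows(rows)
      case other =>
        throw new IllegalArgumentException(s"Unknown matrix type: $other")
    }
}
```

    Passing the sampler as a function keeps the configured paths free of any data access, so the rows are only inspected when the configuration does not settle the question.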

  3. def getRowAsDoubleArray(row: Row, columnParams: ColumnParams): Array[Double]
  4. def sampleRowsForArrayType(rowsIter: Iterator[Row], featuresColumn: String): (Iterator[Row], Boolean)

    Samples the first several rows to determine whether to construct a sparse or dense matrix in LightGBM native code.

    rowsIter

    Iterator of rows.

    featuresColumn

    The name of the features column.

    returns

    A reconstructed iterator over the original rows, paired with a flag indicating whether the matrix should be sparse or dense.
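
    A minimal sketch of the sample-then-reconstruct pattern described above, under stated assumptions. The `Features` types are hypothetical stand-ins for the vector stored in the features column, and the sample size of 10, the majority-vote rule, and the `true` = sparse convention are all illustrative choices rather than documented behavior.

```scala
import scala.collection.mutable.ArrayBuffer

object SampleSketch {
  // Hypothetical stand-ins for a dense or sparse feature vector.
  sealed trait Features
  final case class Dense(values: Array[Double]) extends Features
  final case class Sparse(size: Int, indices: Array[Int], values: Array[Double]) extends Features

  def sampleForArrayType(rows: Iterator[Features],
                         sampleSize: Int = 10): (Iterator[Features], Boolean) = {
    // Pull up to sampleSize rows off the front of the iterator by hand,
    // so the remaining rows can still be read from `rows` afterwards.
    val sampled = ArrayBuffer.empty[Features]
    while (sampled.length < sampleSize && rows.hasNext) sampled += rows.next()

    // Assumed rule: choose sparse when sparse rows outnumber dense rows in the sample.
    val numSparse = sampled.count(_.isInstanceOf[Sparse])
    val isSparse = numSparse > sampled.length - numSparse

    // Reconstruct the iterator: sampled rows first, then the untouched remainder.
    (sampled.iterator ++ rows, isSparse)
  }
}
```

    Because the sampled rows are buffered and prepended to the remainder, the caller sees every original row exactly once and in the original order.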

  5. def validateGroupColumn(col: String, schema: StructType): Unit