Package

com.microsoft.ml.spark

featurize

package featurize

Type Members

  1. class AssembleFeatures extends Estimator[AssembleFeaturesModel] with HasFeaturesCol with Wrappable with DefaultParamsWritable

    Creates a vector column of features from a collection of feature columns.

  2. class AssembleFeaturesModel extends Model[AssembleFeaturesModel] with Params with ConstructorWritable[AssembleFeaturesModel]

    Model produced by AssembleFeatures.

  3. class CleanMissingData extends Estimator[CleanMissingDataModel] with HasInputCols with HasOutputCols with Wrappable with DefaultParamsWritable

    Removes missing values from the input dataset by replacing them according to the chosen mode. The following modes are supported:

      Mean - replaces missing values with the mean of the fit column
      Median - replaces missing values with the approximate median of the fit column
      Custom - replaces missing values with a custom value specified by the user

    For Mean and Median modes, only numeric column types are supported: Int, Long, Float, Double. For Custom mode, String and Boolean are supported in addition to the types above.
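
    The three modes can be illustrated with a minimal sketch on plain Scala collections. This is not the library's implementation and does not use Spark: the object name CleanMissingSketch, the Option-based representation of missing values, and the fit/transform helper functions are all assumptions made for illustration. The sketch also computes an exact median, whereas the class documents an approximate one.

```scala
// Plain-Scala sketch of the Mean / Median / Custom modes.
// Hypothetical helper, NOT the CleanMissingData implementation.
object CleanMissingSketch {
  // A missing value is modelled as None (an assumption of this sketch).
  type Column = Seq[Option[Double]]

  private def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Exact median for illustration; the documented mode is approximate.
  private def median(xs: Seq[Double]): Double = {
    val sorted = xs.sorted
    val n = sorted.size
    if (n % 2 == 1) sorted(n / 2)
    else (sorted(n / 2 - 1) + sorted(n / 2)) / 2.0
  }

  // "Fit": derive the replacement value from the non-missing entries.
  def fit(column: Column, mode: String, custom: Option[Double] = None): Double = {
    val present = column.flatten
    mode match {
      case "Mean"   => mean(present)
      case "Median" => median(present)
      case "Custom" => custom.getOrElse(sys.error("Custom mode needs a value"))
      case other    => sys.error(s"Unknown mode: $other")
    }
  }

  // "Transform": substitute the fitted value for every missing entry.
  def transform(column: Column, replacement: Double): Seq[Double] =
    column.map(_.getOrElse(replacement))
}
```

    For the column Seq(Some(1.0), None, Some(3.0), Some(8.0)), fitting in Mean mode yields 4.0 and in Median mode yields 3.0; transform then fills the single missing entry with that value.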

  4. class CleanMissingDataModel extends Model[CleanMissingDataModel] with ConstructorWritable[CleanMissingDataModel]

    Model produced by CleanMissingData.

  5. class ColumnNamesToFeaturize extends Serializable

    Class containing the list of column names to perform special featurization steps for.

      colNamesToHash - List of column names to hash.
      colNamesToDuplicateForMissings - List of column names containing doubles to duplicate so we can remove missing values from them.
      colNamesToTypes - Map of column names to their types.
      colNamesToCleanMissings - List of column names to clean missing values from (ignore).
      colNamesToVectorize - List of column names to vectorize using FastVectorAssembler.
      categoricalColumns - List of categorical columns to pass through or turn into indicator array.
      conversionColumnNamesMap - Map from old column names to new.
      addedColumnNamesMap - Map from old columns to newly generated columns for featurization.

    Annotations
    @SerialVersionUID()
  6. class DataConversion extends Transformer with Wrappable with DefaultParamsWritable

    Converts the specified list of columns to the specified type. Returns a new DataFrame with the converted columns.

  7. class Featurize extends Estimator[PipelineModel] with Wrappable with DefaultParamsWritable

    Featurizes a dataset. Converts the specified columns to feature columns.

  8. class IndexToValue extends Transformer with HasInputCol with HasOutputCol with Wrappable with DefaultParamsWritable

    Takes a categorical column with MML-style attributes and transforms it back to the original values. This extends Spark ML's IndexToString by allowing the transformation back to values of any type.

  9. class NullOrdering[T] extends Ordering[T]

  10. class ValueIndexer extends Estimator[ValueIndexerModel] with ValueIndexerParams

    Fits a dictionary of values from the input column. The model then transforms a column into a categorical column of the given array of values. Similar to StringIndexer, except that it can be used on any value type.
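
    The fit-then-transform idea, together with the reverse mapping that IndexToValue describes, can be sketched on plain Scala collections. This is a hedged illustration only, not the library's code: the object name ValueIndexerSketch and its three functions are hypothetical, and ordering the fitted values by sorting them is an assumption of the sketch rather than something this page states.

```scala
// Plain-Scala sketch of value indexing and its inverse.
// Hypothetical helper, NOT the ValueIndexer / IndexToValue implementation.
object ValueIndexerSketch {
  // "Fit": collect the distinct values of the input column.
  // Sorting them is an assumption made so that indices are deterministic.
  def fit[T: Ordering](column: Seq[T]): Vector[T] =
    column.distinct.sorted.toVector

  // "Transform": replace each value with its position in the fitted levels.
  // Throws NoSuchElementException for a value that was not seen during fit.
  def transform[T](levels: Vector[T], column: Seq[T]): Seq[Int] = {
    val index = levels.zipWithIndex.toMap
    column.map(index)
  }

  // Inverse mapping: turn indices back into the original values.
  def inverse[T](levels: Vector[T], indices: Seq[Int]): Seq[T] =
    indices.map(levels)
}
```

    Because fit is generic in T, the same sketch works for integers, doubles, or strings, which mirrors the point that indexing is not restricted to string columns. For the input Seq(30, 10, 20, 10), fit returns Vector(10, 20, 30), transform returns Seq(2, 0, 1, 0), and inverse recovers the original sequence.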

  11. class ValueIndexerModel extends Model[ValueIndexerModel] with ValueIndexerParams with ComplexParamsWritable

    Model produced by ValueIndexer.

  12. trait ValueIndexerParams extends Wrappable with DefaultParamsWritable with HasInputCol with HasOutputCol

Value Members

  1. object AssembleFeatures extends DefaultParamsReadable[AssembleFeatures] with Serializable

  2. object AssembleFeaturesModel extends ConstructorReadable[AssembleFeaturesModel] with Serializable

  3. object CleanMissingData extends DefaultParamsReadable[CleanMissingData] with Serializable

  4. object CleanMissingDataModel extends ConstructorReadable[CleanMissingDataModel] with Serializable

  5. object Featurize extends DefaultParamsReadable[Featurize] with Serializable

  6. object IndexToValue extends DefaultParamsReadable[IndexToValue] with Serializable

  7. object NullOrdering extends Serializable

  8. object ValueIndexer extends DefaultParamsReadable[ValueIndexer] with Serializable

  9. object ValueIndexerModel extends ComplexParamsReadable[ValueIndexerModel] with Serializable

  10. package text
