class TextFeaturizer extends Estimator[PipelineModel] with TextFeaturizerParams with HasInputCol with HasOutputCol with SynapseMLLogging

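Featurizes a text column. Calling fit returns a PipelineModel; the parameters listed under Value Members control optional tokenization, stop word removal, n-gram enumeration, hashing into numFeatures features, and IDF scaling.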
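
The following is a minimal end-to-end sketch of fitting and applying the estimator, using only setters that appear in the member list on this page. The import path, the local SparkSession, the column names "text" and "features", and the sample rows are assumptions made for illustration; the page itself does not state the package or any defaults.

```scala
import org.apache.spark.sql.SparkSession
// Assumed package path; adjust it to match your SynapseML version.
import com.microsoft.azure.synapse.ml.featurize.text.TextFeaturizer

object TextFeaturizerExample {
  // Builds a configured estimator. All values below are illustrative choices.
  def buildFeaturizer(): TextFeaturizer =
    new TextFeaturizer()
      .setInputCol("text")           // hypothetical input column name
      .setOutputCol("features")      // hypothetical output column name
      .setUseStopWordsRemover(true)  // drop stop words after tokenizing
      .setUseIDF(true)               // scale term frequencies by IDF
      .setMinDocFreq(1)              // keep terms seen in at least one document
      .setNumFeatures(1 << 12)       // hash each document into 4096 features

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("TextFeaturizerExample")
      .getOrCreate()
    import spark.implicits._

    // Tiny made-up dataset with a single string column.
    val df = Seq("The quick brown fox", "jumps over the lazy dog").toDF("text")

    // fit returns a PipelineModel, which is then applied like any Transformer.
    val model = buildFeaturizer().fit(df)
    model.transform(df).show(false)

    spark.stop()
  }
}
```

Because fit returns a standard PipelineModel, the fitted result can be saved, loaded, and chained like any other Spark ML model.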

Linear Supertypes
SynapseMLLogging, HasOutputCol, HasInputCol, TextFeaturizerParams, DefaultParamsWritable, MLWritable, Wrappable, DotnetWrappable, RWrappable, PythonWrappable, BaseWrappable, Estimator[PipelineModel], PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new TextFeaturizer()
  2. new TextFeaturizer(uid: String)

    uid

    The id of the module
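
As a small sketch of the two constructors above: the no-argument form leaves the uid to be generated, while the second form takes it explicitly. The import path and the uid string "reviewFeaturizer" are assumptions for illustration only.

```scala
// Assumed package path; adjust it to match your SynapseML version.
import com.microsoft.azure.synapse.ml.featurize.text.TextFeaturizer

object ConstructorSketch {
  // The uid is generated for this instance.
  val auto = new TextFeaturizer()

  // The uid is supplied by the caller; "reviewFeaturizer" is a hypothetical name.
  val named = new TextFeaturizer("reviewFeaturizer")

  def main(args: Array[String]): Unit = {
    println(auto.uid)
    println(named.uid)
    // explainParams() is inherited from Params and describes every parameter.
    println(named.explainParams())
  }
}
```

An explicit uid can make a stage easier to spot in logs and in saved pipeline metadata.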

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T
    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. val binary: BooleanParam

    When set to true, all nonzero word counts are set to 1.

    Definition Classes
    TextFeaturizerParams
  7. val caseSensitiveStopWords: BooleanParam

    Indicates whether a case-sensitive comparison is performed on stop words.

    Definition Classes
    TextFeaturizerParams
  8. lazy val classNameHelper: String
    Attributes
    protected
    Definition Classes
    BaseWrappable
  9. final def clear(param: Param[_]): TextFeaturizer.this.type
    Definition Classes
    Params
  10. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  11. def companionModelClassName: String
    Attributes
    protected
    Definition Classes
    BaseWrappable
  12. def copy(extra: ParamMap): TextFeaturizer.this.type
    Definition Classes
    TextFeaturizer → Estimator → PipelineStage → Params
  13. def copyValues[T <: Params](to: T, extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  14. lazy val copyrightLines: String
    Attributes
    protected
    Definition Classes
    BaseWrappable
  15. final def defaultCopy[T <: Params](extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  16. val defaultStopWordLanguage: Param[String]

    Specifies the language to use for stop word removal. Use the custom setting when supplying your own words through the stopWords input.

    Definition Classes
    TextFeaturizerParams
  17. def dotnetAdditionalMethods: String
    Definition Classes
    DotnetWrappable
  18. def dotnetClass(): String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  19. lazy val dotnetClassName: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  20. lazy val dotnetClassNameString: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  21. lazy val dotnetClassWrapperName: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  22. lazy val dotnetCopyrightLines: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  23. def dotnetExtraEstimatorImports: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  24. def dotnetExtraMethods: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  25. lazy val dotnetInternalWrapper: Boolean
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  26. def dotnetMLReadWriteMethods: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  27. lazy val dotnetNamespace: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  28. lazy val dotnetObjectBaseClass: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  29. def dotnetParamGetter(p: Param[_]): String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  30. def dotnetParamGetters: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  31. def dotnetParamSetter(p: Param[_]): String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  32. def dotnetParamSetters: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  33. def dotnetWrapAsTypeMethod: String
    Attributes
    protected
    Definition Classes
    DotnetWrappable
  34. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  35. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  36. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  37. def explainParams(): String
    Definition Classes
    Params
  38. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  39. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  40. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  41. def fit(dataset: Dataset[_]): PipelineModel
    Definition Classes
    TextFeaturizer → Estimator
  42. def fit(dataset: Dataset[_], paramMaps: Seq[ParamMap]): Seq[PipelineModel]
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  43. def fit(dataset: Dataset[_], paramMap: ParamMap): PipelineModel
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  44. def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): PipelineModel
    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" ) @varargs()
  45. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  46. final def getBinary: Boolean

    Definition Classes
    TextFeaturizerParams
  47. final def getCaseSensitiveStopWords: Boolean

    Definition Classes
    TextFeaturizerParams
  48. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  49. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  50. final def getDefaultStopWordLanguage: String

    Definition Classes
    TextFeaturizerParams
  51. def getInputCol: String

    Definition Classes
    HasInputCol
  52. final def getMinDocFreq: Int

    Definition Classes
    TextFeaturizerParams
  53. final def getMinTokenLength: Int

    Definition Classes
    TextFeaturizerParams
  54. final def getNGramLength: Int

    Definition Classes
    TextFeaturizerParams
  55. final def getNumFeatures: Int

    Definition Classes
    TextFeaturizerParams
  56. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  57. def getOutputCol: String

    Definition Classes
    HasOutputCol
  58. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  59. def getParamInfo(p: Param[_]): ParamInfo[_]
    Definition Classes
    BaseWrappable
  60. def getPayload(methodName: String, numCols: Option[Int], executionSeconds: Option[Double], exception: Option[Exception]): Map[String, String]
    Attributes
    protected
    Definition Classes
    SynapseMLLogging
  61. final def getStopWords: String

    Definition Classes
    TextFeaturizerParams
  62. final def getToLowercase: Boolean

    Definition Classes
    TextFeaturizerParams
  63. final def getTokenizerGaps: Boolean

    Definition Classes
    TextFeaturizerParams
  64. final def getTokenizerPattern: String

    Definition Classes
    TextFeaturizerParams
  65. final def getUseIDF: Boolean

    Definition Classes
    TextFeaturizerParams
  66. final def getUseNGram: Boolean

    Definition Classes
    TextFeaturizerParams
  67. final def getUseStopWordsRemover: Boolean

    Definition Classes
    TextFeaturizerParams
  68. final def getUseTokenizer: Boolean

    Definition Classes
    TextFeaturizerParams
  69. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  70. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  71. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  72. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  73. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  74. val inputCol: Param[String]

    The name of the input column.

    Definition Classes
    HasInputCol
  75. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  76. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  77. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  78. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  79. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  80. def logBase(info: Map[String, String], featureName: Option[String]): Unit
    Attributes
    protected
    Definition Classes
    SynapseMLLogging
  81. def logBase(methodName: String, numCols: Option[Int], executionSeconds: Option[Double], featureName: Option[String]): Unit
    Attributes
    protected
    Definition Classes
    SynapseMLLogging
  82. def logClass(featureName: String): Unit
    Definition Classes
    SynapseMLLogging
  83. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  84. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  85. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  86. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  87. def logErrorBase(methodName: String, e: Exception): Unit
    Attributes
    protected
    Definition Classes
    SynapseMLLogging
  88. def logFit[T](f: ⇒ T, columns: Int): T
    Definition Classes
    SynapseMLLogging
  89. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  90. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  91. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  92. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  93. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  94. def logTransform[T](f: ⇒ T, columns: Int): T
    Definition Classes
    SynapseMLLogging
  95. def logVerb[T](verb: String, f: ⇒ T, columns: Option[Int] = None): T
    Definition Classes
    SynapseMLLogging
  96. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  97. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  98. def makeDotnetFile(conf: CodegenConfig): Unit
    Definition Classes
    DotnetWrappable
  99. def makePyFile(conf: CodegenConfig): Unit
    Definition Classes
    PythonWrappable
  100. def makeRFile(conf: CodegenConfig): Unit
    Definition Classes
    RWrappable
  101. val minDocFreq: IntParam

    Minimum number of documents in which a term should appear.

    Definition Classes
    TextFeaturizerParams
  102. val minTokenLength: IntParam

    Minimum token length; must be 0 or greater.

    Definition Classes
    TextFeaturizerParams
  103. val nGramLength: IntParam

    The size of the n-grams.

    Definition Classes
    TextFeaturizerParams
  104. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  105. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  106. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  107. val numFeatures: IntParam

    The number of features to hash each document to.

    Definition Classes
    TextFeaturizerParams
  108. val outputCol: Param[String]

    The name of the output column.

    Definition Classes
    HasOutputCol
  109. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  110. def pyAdditionalMethods: String
    Definition Classes
    PythonWrappable
  111. lazy val pyClassDoc: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  112. lazy val pyClassName: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  113. def pyExtraEstimatorImports: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  114. def pyExtraEstimatorMethods: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  115. lazy val pyInheritedClasses: Seq[String]
    Attributes
    protected
    Definition Classes
    PythonWrappable
  116. def pyInitFunc(): String
    Definition Classes
    PythonWrappable
  117. lazy val pyInternalWrapper: Boolean
    Attributes
    protected
    Definition Classes
    PythonWrappable
  118. lazy val pyObjectBaseClass: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  119. def pyParamArg[T](p: Param[T]): String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  120. def pyParamDefault[T](p: Param[T]): Option[String]
    Attributes
    protected
    Definition Classes
    PythonWrappable
  121. def pyParamGetter(p: Param[_]): String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  122. def pyParamSetter(p: Param[_]): String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  123. def pyParamsArgs: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  124. def pyParamsDefaults: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  125. lazy val pyParamsDefinitions: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  126. def pyParamsGetters: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  127. def pyParamsSetters: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  128. def pythonClass(): String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  129. def rClass(): String
    Attributes
    protected
    Definition Classes
    RWrappable
  130. def rDocString: String
    Attributes
    protected
    Definition Classes
    RWrappable
  131. def rExtraBodyLines: String
    Attributes
    protected
    Definition Classes
    RWrappable
  132. def rExtraInitLines: String
    Attributes
    protected
    Definition Classes
    RWrappable
  133. lazy val rFuncName: String
    Attributes
    protected
    Definition Classes
    RWrappable
  134. lazy val rInternalWrapper: Boolean
    Attributes
    protected
    Definition Classes
    RWrappable
  135. def rParamArg[T](p: Param[T]): String
    Attributes
    protected
    Definition Classes
    RWrappable
  136. def rParamsArgs: String
    Attributes
    protected
    Definition Classes
    RWrappable
  137. def rSetterLines: String
    Attributes
    protected
    Definition Classes
    RWrappable
  138. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  139. final def set(paramPair: ParamPair[_]): TextFeaturizer.this.type
    Attributes
    protected
    Definition Classes
    Params
  140. final def set(param: String, value: Any): TextFeaturizer.this.type
    Attributes
    protected
    Definition Classes
    Params
  141. final def set[T](param: Param[T], value: T): TextFeaturizer.this.type
    Definition Classes
    Params
  142. def setBinary(value: Boolean): TextFeaturizer.this.type

  143. def setCaseSensitiveStopWords(value: Boolean): TextFeaturizer.this.type

  144. final def setDefault(paramPairs: ParamPair[_]*): TextFeaturizer.this.type
    Attributes
    protected
    Definition Classes
    Params
  145. final def setDefault[T](param: Param[T], value: T): TextFeaturizer.this.type
    Attributes
    protected
    Definition Classes
    Params
  146. def setDefaultStopWordLanguage(value: String): TextFeaturizer.this.type

  147. def setInputCol(value: String): TextFeaturizer.this.type

    Definition Classes
    HasInputCol
  148. def setMinDocFreq(value: Int): TextFeaturizer.this.type

  149. def setMinTokenLength(value: Int): TextFeaturizer.this.type

  150. def setNGramLength(value: Int): TextFeaturizer.this.type

  151. def setNumFeatures(value: Int): TextFeaturizer.this.type

  152. def setOutputCol(value: String): TextFeaturizer.this.type

    Definition Classes
    HasOutputCol
  153. def setStopWords(value: String): TextFeaturizer.this.type

  154. def setToLowercase(value: Boolean): TextFeaturizer.this.type

  155. def setTokenizerGaps(value: Boolean): TextFeaturizer.this.type

  156. def setTokenizerPattern(value: String): TextFeaturizer.this.type

  157. def setUseIDF(value: Boolean): TextFeaturizer.this.type

  158. def setUseNGram(value: Boolean): TextFeaturizer.this.type

  159. def setUseStopWordsRemover(value: Boolean): TextFeaturizer.this.type

  160. def setUseTokenizer(value: Boolean): TextFeaturizer.this.type
  161. val stopWords: Param[String]

    The words to be filtered out. This is a comma-separated list of words, encoded as a single string. For example, "a, the, and".

    Definition Classes
    TextFeaturizerParams
  162. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  163. val thisStage: Params
    Attributes
    protected
    Definition Classes
    BaseWrappable
  164. val toLowercase: BooleanParam

    Indicates whether to convert all characters to lowercase before tokenizing.

    Definition Classes
    TextFeaturizerParams
  165. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  166. val tokenizerGaps: BooleanParam

    Indicates whether the regex splits on gaps (true) or matches tokens (false).

    Definition Classes
    TextFeaturizerParams
  167. val tokenizerPattern: Param[String]

    Regex pattern used to match delimiters when tokenizerGaps is true, or tokens when it is false.

    Definition Classes
    TextFeaturizerParams
  168. def transformSchema(schema: StructType): StructType
    Definition Classes
    TextFeaturizer → PipelineStage
  169. def transformSchema(schema: StructType, logging: Boolean): StructType
    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  170. val uid: String
    Definition Classes
    TextFeaturizer → SynapseMLLogging → Identifiable
  171. val useIDF: BooleanParam

    Scale the term frequencies by IDF when set to true.

    Definition Classes
    TextFeaturizerParams
  172. val useNGram: BooleanParam

    Enumerate n-grams when set to true.

    Definition Classes
    TextFeaturizerParams
  173. val useStopWordsRemover: BooleanParam

    Indicates whether to remove stop words from tokenized data.

    Definition Classes
    TextFeaturizerParams
  174. val useTokenizer: BooleanParam

    Tokenize the input when set to true.

    Definition Classes
    TextFeaturizerParams
  175. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  176. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  177. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  178. def write: MLWriter
    Definition Classes
    DefaultParamsWritable → MLWritable
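
The tokenizerGaps, tokenizerPattern, toLowercase, and minTokenLength parameters listed above interact in a way that is easier to see in code. The plain-Scala sketch below mimics the documented semantics (split on the pattern when gaps is true, match the pattern when it is false) without using Spark. The `tokenize` helper and its default argument values are hypothetical and are not part of the library; this is not the library's implementation.

```scala
object TokenizerSketch {
  /** Hypothetical helper that imitates regex tokenization with a gaps flag. */
  def tokenize(
      text: String,
      pattern: String,
      gaps: Boolean,
      toLowercase: Boolean = true,
      minTokenLength: Int = 0): Seq[String] = {
    // Lowercasing happens before tokenizing, as the toLowercase description states.
    val input = if (toLowercase) text.toLowerCase else text
    val regex = pattern.r
    val tokens: Seq[String] =
      if (gaps) regex.split(input).toSeq // the pattern describes the delimiters
      else regex.findAllIn(input).toSeq  // the pattern describes the tokens
    // Drop tokens shorter than the minimum length.
    tokens.filter(_.length >= minTokenLength)
  }

  def main(args: Array[String]): Unit = {
    val text = "Hello, big World"
    // Splitting on whitespace keeps punctuation attached to the tokens.
    println(tokenize(text, "\\s+", gaps = true))
    // Matching word characters drops the punctuation instead.
    println(tokenize(text, "\\w+", gaps = false))
    // A minimum length of 4 filters out the short token.
    println(tokenize(text, "\\w+", gaps = false, minTokenLength = 4))
  }
}
```

The same pattern string can therefore produce different tokens depending on the gaps flag, so the two parameters are best chosen together.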
