class DistributionBalanceMeasure extends Transformer with DataBalanceParams with ComplexParamsWritable with Wrappable with SynapseMLLogging

This transformer computes data balance measures by comparing the observed distribution of each sensitive feature against a reference distribution. For now, we only support a uniform reference distribution.

The output is a dataframe that contains two columns:

  • The sensitive feature name.
  • A struct of measure names and values that quantify the difference between the observed and reference distributions. The following measures are computed:
    • Kullback-Leibler Divergence - https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
    • Jensen-Shannon Distance - https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence
    • Wasserstein Distance - https://en.wikipedia.org/wiki/Wasserstein_metric
    • Infinity Norm Distance - https://en.wikipedia.org/wiki/Chebyshev_distance
    • Total Variation Distance - https://en.wikipedia.org/wiki/Total_variation_distance_of_probability_measures
    • Chi-Squared Test - https://en.wikipedia.org/wiki/Chi-squared_test

The output dataframe contains a row per sensitive feature.
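
To make the listed measures concrete, here is a minimal pure-Scala sketch of several of them computed against a uniform reference. It uses the textbook definitions with natural logarithms; the object and method names are hypothetical, and the transformer's own implementation (log base, handling of zero probabilities, the chi-squared p-value) may differ. Wasserstein distance is left out because it depends on a choice of ground metric between categories.

```scala
// Hypothetical helper, not part of SynapseML: textbook measures of how far
// an observed categorical distribution is from the uniform distribution.
object UniformReferenceMeasures {

  // Turn raw class counts into observed probabilities.
  def probabilities(counts: Seq[Long]): Seq[Double] = {
    val total = counts.sum.toDouble
    counts.map(_ / total)
  }

  // Uniform reference distribution over n classes.
  def uniform(n: Int): Seq[Double] = Seq.fill(n)(1.0 / n)

  // Kullback-Leibler divergence KL(p || q); terms with p_i = 0 contribute 0.
  def klDivergence(p: Seq[Double], q: Seq[Double]): Double =
    p.zip(q).collect { case (pi, qi) if pi > 0 => pi * math.log(pi / qi) }.sum

  // Jensen-Shannon distance: square root of the Jensen-Shannon divergence.
  def jsDistance(p: Seq[Double], q: Seq[Double]): Double = {
    val m = p.zip(q).map { case (pi, qi) => (pi + qi) / 2 }
    math.sqrt((klDivergence(p, m) + klDivergence(q, m)) / 2)
  }

  // Infinity norm (Chebyshev) distance: largest absolute difference.
  def infNormDistance(p: Seq[Double], q: Seq[Double]): Double =
    p.zip(q).map { case (pi, qi) => math.abs(pi - qi) }.max

  // Total variation distance: half the sum of absolute differences.
  def totalVariationDistance(p: Seq[Double], q: Seq[Double]): Double =
    0.5 * p.zip(q).map { case (pi, qi) => math.abs(pi - qi) }.sum

  // Pearson chi-squared statistic with equal expected counts per class.
  def chiSquaredStatistic(observed: Seq[Long]): Double = {
    val expected = observed.sum.toDouble / observed.size
    observed.map(o => math.pow(o - expected, 2) / expected).sum
  }
}
```

For example, counts of 6 and 2 give observed probabilities 0.75 and 0.25, so the infinity norm and total variation distances to the uniform reference are both 0.25 and the chi-squared statistic is 2.0. A perfectly balanced feature yields 0 for every measure in this sketch.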

Annotations
@Experimental()
Linear Supertypes
SynapseMLLogging, Wrappable, RWrappable, PythonWrappable, BaseWrappable, ComplexParamsWritable, MLWritable, DataBalanceParams, HasOutputCol, Transformer, PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new DistributionBalanceMeasure()
  2. new DistributionBalanceMeasure(uid: String)

    uid

    The unique ID.
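
A typical use of the no-argument constructor might look like the sketch below. It relies only on setters listed on this page (`setSensitiveCols`, `setFeatureNameCol`, `setOutputCol`) and on `transform`. The import path `com.microsoft.azure.synapse.ml.exploratory`, the example object name, the column names, and the sample data are assumptions made for illustration, and the code needs Spark and SynapseML on the classpath.

```scala
// Assumed package for the class; check it against your SynapseML version.
import com.microsoft.azure.synapse.ml.exploratory.DistributionBalanceMeasure
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical example object, not part of SynapseML.
object DistributionBalanceExample {

  def run(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Illustrative data with two sensitive columns.
    val df = Seq(
      ("Female", "Asian"),
      ("Female", "White"),
      ("Male", "White"),
      ("Male", "White")
    ).toDF("Gender", "Ethnicity")

    // Output column names are set explicitly instead of relying on defaults.
    new DistributionBalanceMeasure()
      .setSensitiveCols(Array("Gender", "Ethnicity"))
      .setFeatureNameCol("FeatureName")
      .setOutputCol("DistributionBalanceMeasure")
      .transform(df)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("distribution-balance-sketch")
      .getOrCreate()

    // Expect one row per sensitive column, each with a struct of measures.
    run(spark).show(truncate = false)
    spark.stop()
  }
}
```

Setting the column names explicitly keeps the sketch independent of whatever defaults the installed version uses.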

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T
    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. lazy val classNameHelper: String
    Attributes
    protected
    Definition Classes
    BaseWrappable
  7. final def clear(param: Param[_]): DistributionBalanceMeasure.this.type
    Definition Classes
    Params
  8. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  9. def companionModelClassName: String
    Attributes
    protected
    Definition Classes
    BaseWrappable
  10. def copy(extra: ParamMap): Transformer
    Definition Classes
    DistributionBalanceMeasure → Transformer → PipelineStage → Params
  11. def copyValues[T <: Params](to: T, extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  12. lazy val copyrightLines: String
    Attributes
    protected
    Definition Classes
    BaseWrappable
  13. final def defaultCopy[T <: Params](extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  14. val emptyReferenceDistribution: Array[Map[String, Double]]
  15. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  16. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  17. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  18. def explainParams(): String
    Definition Classes
    Params
  19. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  20. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  21. val featureNameCol: Param[String]
  22. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  23. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  24. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  25. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  26. def getFeatureNameCol: String
  27. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  28. final def getOutputCol: String
    Definition Classes
    HasOutputCol
  29. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  30. def getParamInfo(p: Param[_]): ParamInfo[_]
    Definition Classes
    BaseWrappable
  31. def getPayload(methodName: String, numCols: Option[Int], executionSeconds: Option[Double], exception: Option[Exception]): Map[String, String]
    Attributes
    protected
    Definition Classes
    SynapseMLLogging
  32. def getReferenceDistribution: Array[Map[String, Double]]
  33. def getSensitiveCols: Array[String]
    Definition Classes
    DataBalanceParams
  34. def getVerbose: Boolean
    Definition Classes
    DataBalanceParams
  35. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  36. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  37. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  38. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  39. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  40. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  41. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  42. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  43. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  44. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  45. def logBase(info: Map[String, String], featureName: Option[String]): Unit
    Attributes
    protected
    Definition Classes
    SynapseMLLogging
  46. def logBase(methodName: String, numCols: Option[Int], executionSeconds: Option[Double], featureName: Option[String]): Unit
    Attributes
    protected
    Definition Classes
    SynapseMLLogging
  47. def logClass(featureName: String): Unit
    Definition Classes
    SynapseMLLogging
  48. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  49. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  50. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  51. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  52. def logErrorBase(methodName: String, e: Exception): Unit
    Attributes
    protected
    Definition Classes
    SynapseMLLogging
  53. def logFit[T](f: ⇒ T, columns: Int): T
    Definition Classes
    SynapseMLLogging
  54. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  55. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  56. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  57. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  58. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  59. def logTransform[T](f: ⇒ T, columns: Int): T
    Definition Classes
    SynapseMLLogging
  60. def logVerb[T](verb: String, f: ⇒ T, columns: Option[Int] = None): T
    Definition Classes
    SynapseMLLogging
  61. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  62. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  63. def makePyFile(conf: CodegenConfig): Unit
    Definition Classes
    PythonWrappable
  64. def makeRFile(conf: CodegenConfig): Unit
    Definition Classes
    RWrappable
  65. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  66. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  67. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  68. final val outputCol: Param[String]
    Definition Classes
    HasOutputCol
  69. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  70. def pyAdditionalMethods: String
    Definition Classes
    PythonWrappable
  71. lazy val pyClassDoc: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  72. lazy val pyClassName: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  73. def pyExtraEstimatorImports: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  74. def pyExtraEstimatorMethods: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  75. lazy val pyInheritedClasses: Seq[String]
    Attributes
    protected
    Definition Classes
    PythonWrappable
  76. def pyInitFunc(): String
    Definition Classes
    PythonWrappable
  77. lazy val pyInternalWrapper: Boolean
    Attributes
    protected
    Definition Classes
    PythonWrappable
  78. lazy val pyObjectBaseClass: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  79. def pyParamArg[T](p: Param[T]): String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  80. def pyParamDefault[T](p: Param[T]): Option[String]
    Attributes
    protected
    Definition Classes
    PythonWrappable
  81. def pyParamGetter(p: Param[_]): String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  82. def pyParamSetter(p: Param[_]): String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  83. def pyParamsArgs: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  84. def pyParamsDefaults: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  85. lazy val pyParamsDefinitions: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  86. def pyParamsGetters: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  87. def pyParamsSetters: String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  88. def pythonClass(): String
    Attributes
    protected
    Definition Classes
    PythonWrappable
  89. def rClass(): String
    Attributes
    protected
    Definition Classes
    RWrappable
  90. def rDocString: String
    Attributes
    protected
    Definition Classes
    RWrappable
  91. def rExtraBodyLines: String
    Attributes
    protected
    Definition Classes
    RWrappable
  92. def rExtraInitLines: String
    Attributes
    protected
    Definition Classes
    RWrappable
  93. lazy val rFuncName: String
    Attributes
    protected
    Definition Classes
    RWrappable
  94. lazy val rInternalWrapper: Boolean
    Attributes
    protected
    Definition Classes
    RWrappable
  95. def rParamArg[T](p: Param[T]): String
    Attributes
    protected
    Definition Classes
    RWrappable
  96. def rParamsArgs: String
    Attributes
    protected
    Definition Classes
    RWrappable
  97. def rSetterLines: String
    Attributes
    protected
    Definition Classes
    RWrappable
  98. val referenceDistribution: ArrayMapParam
  99. def save(path: String): Unit
    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  100. val sensitiveCols: StringArrayParam
    Definition Classes
    DataBalanceParams
  101. final def set(paramPair: ParamPair[_]): DistributionBalanceMeasure.this.type
    Attributes
    protected
    Definition Classes
    Params
  102. final def set(param: String, value: Any): DistributionBalanceMeasure.this.type
    Attributes
    protected
    Definition Classes
    Params
  103. final def set[T](param: Param[T], value: T): DistributionBalanceMeasure.this.type
    Definition Classes
    Params
  104. final def setDefault(paramPairs: ParamPair[_]*): DistributionBalanceMeasure.this.type
    Attributes
    protected
    Definition Classes
    Params
  105. final def setDefault[T](param: Param[T], value: T): DistributionBalanceMeasure.this.type
    Attributes
    protected[org.apache.spark.ml]
    Definition Classes
    Params
  106. def setFeatureNameCol(value: String): DistributionBalanceMeasure.this.type
  107. def setOutputCol(value: String): DistributionBalanceMeasure.this.type
    Definition Classes
    DataBalanceParams
  108. def setReferenceDistribution(value: ArrayList[HashMap[String, Double]]): DistributionBalanceMeasure.this.type
  109. def setReferenceDistribution(value: Array[Map[String, Double]]): DistributionBalanceMeasure.this.type
  110. def setSensitiveCols(values: Array[String]): DistributionBalanceMeasure.this.type
    Definition Classes
    DataBalanceParams
  111. def setVerbose(value: Boolean): DistributionBalanceMeasure.this.type
    Definition Classes
    DataBalanceParams
  112. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  113. val thisStage: Params
    Attributes
    protected
    Definition Classes
    BaseWrappable
  114. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  115. def transform(dataset: Dataset[_]): DataFrame
    Definition Classes
    DistributionBalanceMeasure → Transformer
  116. def transform(dataset: Dataset[_], paramMap: ParamMap): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" )
  117. def transform(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): DataFrame
    Definition Classes
    Transformer
    Annotations
    @Since( "2.0.0" ) @varargs()
  118. def transformSchema(schema: StructType): StructType
    Definition Classes
    DistributionBalanceMeasure → PipelineStage
  119. def transformSchema(schema: StructType, logging: Boolean): StructType
    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  120. val uid: String
    Definition Classes
    DistributionBalanceMeasure → SynapseMLLogging → Identifiable
  121. def validateSchema(schema: StructType): Unit
  122. val verbose: BooleanParam
    Definition Classes
    DataBalanceParams
  123. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  124. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  125. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  126. def write: MLWriter
    Definition Classes
    ComplexParamsWritable → MLWritable
