package exploratory
Type Members
- class AggregateBalanceMeasure extends Transformer with DataBalanceParams with ComplexParamsWritable with Wrappable with BasicLogging
This transformer computes a set of aggregated balance measures that represent how balanced the given dataframe is along the given sensitive features.
The output is a dataframe that contains one column:
- A struct containing measure names and their values, which summarize inequality across the sensitive features as a whole.
The following measures are computed:
- Atkinson Index - https://en.wikipedia.org/wiki/Atkinson_index
- Theil Index (L and T) - https://en.wikipedia.org/wiki/Theil_index
The output dataframe contains one row.
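As an illustration of the measures named above, the following is a minimal sketch in plain Scala (no Spark), not the transformer's implementation. It assumes the input is a sequence of positive values, for example the share of rows in each group, and it assumes an Atkinson inequality-aversion parameter `epsilon` defaulting to 1. The object and method names are hypothetical.

```scala
// Illustrative sketch only: names are hypothetical and not part of the library's API.
object AggregateMeasuresSketch {
  // Arithmetic mean of a non-empty sequence.
  private def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Atkinson index. The default epsilon of 1.0 is an assumption made for this sketch.
  def atkinson(xs: Seq[Double], epsilon: Double = 1.0): Double = {
    require(xs.nonEmpty && xs.forall(_ > 0.0), "values must be positive")
    val mu = mean(xs)
    if (epsilon == 1.0) {
      // Limit case: one minus the ratio of the geometric mean to the arithmetic mean.
      1.0 - math.exp(mean(xs.map(x => math.log(x)))) / mu
    } else {
      val powerMean =
        math.pow(mean(xs.map(x => math.pow(x, 1.0 - epsilon))), 1.0 / (1.0 - epsilon))
      1.0 - powerMean / mu
    }
  }

  // Theil T index: mean of (x / mu) * ln(x / mu).
  def theilT(xs: Seq[Double]): Double = {
    require(xs.nonEmpty && xs.forall(_ > 0.0), "values must be positive")
    val mu = mean(xs)
    mean(xs.map(x => (x / mu) * math.log(x / mu)))
  }

  // Theil L index (mean log deviation): mean of ln(mu / x).
  def theilL(xs: Seq[Double]): Double = {
    require(xs.nonEmpty && xs.forall(_ > 0.0), "values must be positive")
    val mu = mean(xs)
    mean(xs.map(x => math.log(mu / x)))
  }

  def main(args: Array[String]): Unit = {
    val skewed = Seq(0.7, 0.1, 0.1, 0.1) // hypothetical group shares
    println(s"Atkinson: ${atkinson(skewed)}")
    println(s"Theil T:  ${theilT(skewed)}")
    println(s"Theil L:  ${theilL(skewed)}")
  }
}
```

All three functions return 0 for perfectly equal values and grow as the values become more unequal. With `epsilon` equal to 1, the Atkinson index equals `1 - exp(-theilL)`.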
- Annotations
- @Experimental()
- trait DataBalanceParams extends Params with HasOutputCol
- class DistributionBalanceMeasure extends Transformer with DataBalanceParams with ComplexParamsWritable with Wrappable with BasicLogging
This transformer computes data balance measures based on a reference distribution. Currently, only a uniform reference distribution is supported.
The output is a dataframe that contains two columns:
- The sensitive feature name.
- A struct containing measure names and their values showing differences between the observed and reference distributions.
The following measures are computed:
- Kullback-Leibler Divergence - https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
- Jensen-Shannon Distance - https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence
- Wasserstein Distance - https://en.wikipedia.org/wiki/Wasserstein_metric
- Infinity Norm Distance - https://en.wikipedia.org/wiki/Chebyshev_distance
- Total Variation Distance - https://en.wikipedia.org/wiki/Total_variation_distance_of_probability_measures
- Chi-Squared Test - https://en.wikipedia.org/wiki/Chi-squared_test
The output dataframe contains a row per sensitive feature.
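To make the comparison against a uniform reference concrete, here is a minimal sketch in plain Scala (no Spark) that evaluates several of the listed measures using their textbook definitions. It is not the transformer's implementation. It assumes the input is a sequence of row counts per category of one sensitive feature, and all names are hypothetical. Wasserstein distance and the chi-squared p-value are omitted because they depend on choices this page does not specify.

```scala
// Illustrative sketch only: names are hypothetical and not part of the library's API.
object DistributionMeasuresSketch {
  // Turn raw category counts into an observed probability distribution.
  def normalize(counts: Seq[Double]): Seq[Double] = {
    val total = counts.sum
    counts.map(c => c / total)
  }

  // Uniform reference distribution over k categories.
  def uniform(k: Int): Seq[Double] = Seq.fill(k)(1.0 / k)

  // Kullback-Leibler divergence KL(p || q), treating 0 * ln(0) as 0.
  def klDivergence(p: Seq[Double], q: Seq[Double]): Double =
    p.zip(q).map { case (pi, qi) => if (pi == 0.0) 0.0 else pi * math.log(pi / qi) }.sum

  // Jensen-Shannon distance: square root of the Jensen-Shannon divergence.
  def jsDistance(p: Seq[Double], q: Seq[Double]): Double = {
    val m = p.zip(q).map { case (pi, qi) => (pi + qi) / 2.0 }
    math.sqrt(0.5 * klDivergence(p, m) + 0.5 * klDivergence(q, m))
  }

  // Infinity norm (Chebyshev) distance: largest absolute difference.
  def infinityNorm(p: Seq[Double], q: Seq[Double]): Double =
    p.zip(q).map { case (pi, qi) => math.abs(pi - qi) }.max

  // Total variation distance: half the sum of absolute differences.
  def totalVariation(p: Seq[Double], q: Seq[Double]): Double =
    0.5 * p.zip(q).map { case (pi, qi) => math.abs(pi - qi) }.sum

  // Pearson chi-squared statistic of observed counts against uniform expected counts.
  def chiSquaredStatistic(counts: Seq[Double]): Double = {
    val expected = counts.sum / counts.size
    counts.map(o => (o - expected) * (o - expected) / expected).sum
  }

  def main(args: Array[String]): Unit = {
    val counts = Seq(90.0, 10.0) // hypothetical counts for two categories
    val p = normalize(counts)
    val q = uniform(counts.size)
    println(s"KL divergence:   ${klDivergence(p, q)}")
    println(s"JS distance:     ${jsDistance(p, q)}")
    println(s"Infinity norm:   ${infinityNorm(p, q)}")
    println(s"Total variation: ${totalVariation(p, q)}")
    println(s"Chi-squared:     ${chiSquaredStatistic(counts)}")
  }
}
```

Every function returns 0 when the observed distribution is already uniform, so larger values indicate a feature whose categories are further from equal representation.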
- Annotations
- @Experimental()
- class FeatureBalanceMeasure extends Transformer with DataBalanceParams with HasLabelCol with ComplexParamsWritable with Wrappable with BasicLogging
This transformer computes a set of balance measures from the given dataframe and sensitive features.
The output is a dataframe that contains four columns:
- The sensitive feature name.
- A feature value within the sensitive feature.
- Another feature value within the sensitive feature.
- A struct containing measure names and their values showing parities between the two feature values.
The following measures are computed:
- Demographic Parity - https://en.wikipedia.org/wiki/Fairness_(machine_learning)
- Pointwise Mutual Information - https://en.wikipedia.org/wiki/Pointwise_mutual_information
- Sorensen-Dice Coefficient - https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient
- Jaccard Index - https://en.wikipedia.org/wiki/Jaccard_index
- Kendall Rank Correlation - https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient
- Log-Likelihood Ratio - https://en.wikipedia.org/wiki/Likelihood_function#Likelihood_ratio
- t-test - https://en.wikipedia.org/wiki/Student's_t-test
The output dataframe contains a row per combination of feature values for each sensitive feature.
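The pairwise comparison of two feature values can be sketched in plain Scala (no Spark) as follows. This is a hedged illustration, not the transformer's implementation: it assumes a binary label with a "positive" outcome, uses textbook definitions for three of the listed measures, and assumes the parity between two values is expressed as a difference. The object, case class, and method names are hypothetical.

```scala
// Illustrative sketch only: names are hypothetical and not part of the library's API.
object FeatureMeasuresSketch {
  // Counts for one feature value: rows having the value, and those that also have a positive label.
  final case class ValueCounts(rows: Long, positiveRows: Long)

  // P(positive | value): share of rows with this value that have a positive label.
  def positiveRate(c: ValueCounts): Double = c.positiveRows.toDouble / c.rows

  // Demographic parity gap between two feature values (assumed here to be a difference of rates).
  def demographicParityGap(a: ValueCounts, b: ValueCounts): Double =
    positiveRate(a) - positiveRate(b)

  // Pointwise mutual information between a feature value and the positive label.
  def pmi(c: ValueCounts, totalRows: Long, totalPositive: Long): Double = {
    val pXY = c.positiveRows.toDouble / totalRows
    val pX = c.rows.toDouble / totalRows
    val pY = totalPositive.toDouble / totalRows
    math.log(pXY / (pX * pY))
  }

  // Jaccard index between "has this value" and "has a positive label".
  def jaccard(c: ValueCounts, totalPositive: Long): Double =
    c.positiveRows.toDouble / (c.rows + totalPositive - c.positiveRows)

  def main(args: Array[String]): Unit = {
    // Hypothetical dataset: 100 rows, 40 of them with a positive label.
    val a = ValueCounts(rows = 60, positiveRows = 30)
    val b = ValueCounts(rows = 40, positiveRows = 10)
    println(s"Demographic parity gap: ${demographicParityGap(a, b)}")
    println(s"PMI gap:                ${pmi(a, 100, 40) - pmi(b, 100, 40)}")
    println(s"Jaccard gap:            ${jaccard(a, 40) - jaccard(b, 40)}")
  }
}
```

Under these assumptions, a gap of 0 means the two feature values are associated with the positive label to the same degree, and the sign shows which value is favored.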
- Annotations
- @Experimental()
Value Members
- object AggregateBalanceMeasure extends ComplexParamsReadable[AggregateBalanceMeasure] with Serializable
- object DistributionBalanceMeasure extends ComplexParamsReadable[DistributionBalanceMeasure] with Serializable
- object FeatureBalanceMeasure extends ComplexParamsReadable[FeatureBalanceMeasure] with Serializable