# DoubleMLEstimator 

### Companion object DoubleMLEstimator

#### class DoubleMLEstimator extends Estimator[DoubleMLModel] with ComplexParamsWritable with DoubleMLParams with SynapseMLLogging with Wrappable

Double ML estimator. The estimator follows a two-stage process: a set of nuisance functions is estimated in the first stage in a cross-fitting manner, and a final stage estimates the average treatment effect (ATE) model. The goal is to estimate the constant marginal ATE $\Theta(X)$.

In this estimator, the ATE is estimated using the following estimating equation:

$$Y - \mathbb{E}[Y \mid X, W] = \Theta(X) \cdot (T - \mathbb{E}[T \mid X, W]) + \epsilon$$

Thus, if we estimate the nuisance functions $q(X, W) = \mathbb{E}[Y \mid X, W]$ and $f(X, W) = \mathbb{E}[T \mid X, W]$ in the first stage, we can estimate the final-stage ATE for each treatment $t$ by running a regression that minimizes the residual-on-residual square loss. Estimating $\Theta(X)$ is then a final regression problem: regressing $\tilde{Y}$ on $X$ and $\tilde{T}$.

$$\hat{\Theta} = \arg\min_{\Theta} \mathbb{E}_n\left[ (\tilde{Y} - \Theta(X) \cdot \tilde{T})^2 \right]$$

Here $\tilde{Y} = Y - \mathbb{E}[Y \mid X, W]$ and $\tilde{T} = T - \mathbb{E}[T \mid X, W]$ denote the residual outcome and the residual treatment.
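To make the residual-on-residual step concrete, the following is a minimal sketch in plain Scala, assuming a constant effect (so $\Theta(X)$ reduces to a single number). In that special case the squared loss has the closed-form minimizer: the sum of products of residual outcome and residual treatment, divided by the sum of squared residual treatments. The object name, the helper `constantTheta`, and all numeric values are hypothetical and chosen for illustration only; this is not the library's internal implementation.

```scala
object ResidualOnResidualSketch {

  /** Least-squares slope through the origin:
    * argmin over theta of sum((yRes - theta * tRes)^2).
    */
  def constantTheta(yRes: Seq[Double], tRes: Seq[Double]): Double = {
    require(yRes.length == tRes.length, "residual vectors must have equal length")
    val numerator = yRes.zip(tRes).map { case (y, t) => y * t }.sum
    val denominator = tRes.map(t => t * t).sum
    require(denominator > 0.0, "residual treatment must not be identically zero")
    numerator / denominator
  }

  // Toy observations and toy first-stage predictions (made-up values).
  val y: Seq[Double] = Seq(3.0, 1.0, 6.0, 0.0)    // observed outcome Y
  val qHat: Seq[Double] = Seq(1.0, 3.0, 2.0, 4.0) // stand-in for E[Y | X, W]
  val t: Seq[Double] = Seq(1.0, 0.0, 3.0, -1.0)   // observed treatment T
  val fHat: Seq[Double] = Seq(0.0, 1.0, 1.0, 1.0) // stand-in for E[T | X, W]

  // Residualize: subtract the first-stage predictions.
  val yRes: Seq[Double] = y.zip(qHat).map { case (a, b) => a - b }
  val tRes: Seq[Double] = t.zip(fHat).map { case (a, b) => a - b }

  // Final stage: regress the residual outcome on the residual treatment.
  val thetaHat: Double = constantTheta(yRes, tRes)

  def main(args: Array[String]): Unit =
    println(s"theta_hat = $thetaHat")
}
```

In this toy data the residual outcome is exactly twice the residual treatment, so the estimate recovers an effect of 2.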

Estimating the nuisance function $q$ is a standard machine learning problem. Use setOutcomeModel to supply the Spark ML estimator (a Regressor or a ProbabilisticClassifier) that is used internally to solve it.

Estimating the nuisance function $f$ is also a machine learning problem. Use setTreatmentModel to supply the Spark ML estimator (a Regressor or a ProbabilisticClassifier) that is used internally to solve it.
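A configuration along these lines might look like the sketch below. It assumes the class lives in the package `com.microsoft.azure.synapse.ml.causal`, that the treatment is binary (hence a `LogisticRegression` for the treatment model), and that the dataset has columns named `treatment` and `outcome`. The column names, the object name, and the parameter values are hypothetical. Running it requires Spark ML and SynapseML on the classpath.

```scala
import com.microsoft.azure.synapse.ml.causal.DoubleMLEstimator
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.regression.LinearRegression

object DoubleMLConfigSketch {
  // Hypothetical column names: replace them with the columns of your dataset.
  val dml: DoubleMLEstimator = new DoubleMLEstimator()
    .setTreatmentCol("treatment")
    .setTreatmentModel(new LogisticRegression()) // ProbabilisticClassifier for f(X, W)
    .setOutcomeCol("outcome")
    .setOutcomeModel(new LinearRegression())     // Regressor for q(X, W)
    .setMaxIter(20)                              // more than 1 iteration, so a CI is bootstrapped
    .setConfidenceLevel(0.95)

  // With a Dataset in scope, training would then be:
  //   val model = dml.fit(dataset)
}
```

The nuisance models are passed as unfitted estimators. A classifier suits a binary treatment, while a continuous treatment would call for a regressor instead.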

### Instance Constructors

1. new DoubleMLEstimator()
2. new DoubleMLEstimator(uid: String)

### Value Members

11. val confidenceLevel: DoubleParam
Definition Classes
DoubleMLParams
41. def fit(dataset: Dataset[_])

Fits the DoubleML model.

dataset

The input dataset to train.

returns

The trained DoubleML model, from which you can get the ATE and confidence interval (CI) values.

Definition Classes
DoubleMLEstimator → Estimator
47. def getConfidenceLevel
Definition Classes
DoubleMLParams
53. def getOutcomeCol: String
Definition Classes
HasOutcomeCol
54. def getOutcomeModel: Estimator[_ <: Model[_]]
Definition Classes
DoubleMLParams
55. def getParallelism: Int
Definition Classes
HasParallelism
98. val outcomeCol: Param[String]
Definition Classes
HasOutcomeCol
99. val outcomeModel
Definition Classes
DoubleMLParams
130. val sampleSplitRatio: DoubleArrayParam
Definition Classes
DoubleMLParams
135. def setConfidenceLevel(value: Double): DoubleMLEstimator.this.type

138. def setFeaturesCol(value: String): DoubleMLEstimator.this.type

Definition Classes
HasFeaturesCol
139. def setMaxIter(value: Int): DoubleMLEstimator.this.type

Set the maximum number of confidence interval bootstrapping iterations. The default is 1, which means no confidence interval is calculated. To get CI values, set a value greater than 1.

Definition Classes
DoubleMLParams
Set the name of the column to use as the outcome (the outcomeCol parameter).

Definition Classes
HasOutcomeCol
141. def setOutcomeModel(value: Estimator[_ <: Model[_]]): DoubleMLEstimator.this.type

Set the outcome model. It can be any model derived from 'org.apache.spark.ml.regression.Regressor' or 'org.apache.spark.ml.classification.ProbabilisticClassifier'.

Definition Classes
DoubleMLParams
142. def setParallelism(value: Int): DoubleMLEstimator.this.type
Definition Classes
DoubleMLParams
Set the sample split ratio (the sampleSplitRatio parameter). The default is Array(0.5, 0.5).

Definition Classes
DoubleMLParams
Set the name of the column to use as the treatment (the treatmentCol parameter).

Definition Classes
HasTreatmentCol
145. def setTreatmentModel(value: Estimator[_ <: Model[_]]): DoubleMLEstimator.this.type

Set the treatment model. It can be any model derived from 'org.apache.spark.ml.regression.Regressor' or 'org.apache.spark.ml.classification.ProbabilisticClassifier'.

Definition Classes
DoubleMLParams
146. def setWeightCol(value: String): DoubleMLEstimator.this.type

Definition Classes
HasWeightCol
152. val treatmentCol: Param[String]
Definition Classes
HasTreatmentCol
153. val treatmentModel
Definition Classes
DoubleMLParams
