Target
Average treatment effect
The target parameter on this page is the ATE, not a general collection of adaptive estimands.
Adaptive DML
Adaptive DML estimates the binary-treatment ATE by learning a treatment-effect summary, calibrating that one-dimensional score with isotonic regression, and debiasing within the induced working model.
What
The target on this page is the binary-treatment ATE.
calibrated_rlearner is the main adaptive mode: it uses the estimated CATE (how treatment effects vary across units) as a scalar summary of effect heterogeneity and targets the ATE through the induced one-dimensional working model.
Working summary
The estimated CATE compresses the effect model to one dimension before calibration and debiasing.
Primary mode
calibrated_rlearner: the main adaptive path documented on this page.
How
In the main documented path, a partially linear R-learner setup separates baseline outcome prediction from treatment-effect learning, then calibrates and debiases the learned effect summary.
Step 1
Estimate the outcome and treatment-assignment functions, then fit a preliminary CATE.
Step 2
Use the estimated CATE as a one-dimensional summary and calibrate along that summary.
Step 3
Debias in the adaptive working model and average to estimate the ATE.
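The three steps above can be sketched end-to-end in a few lines. This is an illustrative simulation using scikit-learn, not the package's implementation: cross-fitting is shown for the nuisances only, the full sample-splitting scheme of the paper is omitted, and all variable names below are local to the sketch.

```python
# Sketch of Steps 1-3 on simulated data (NOT the package's implementation).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 2000
W = rng.normal(size=(n, 3))
pi_true = 1 / (1 + np.exp(-W[:, 0]))          # true propensity (simulation only)
A = rng.binomial(1, pi_true)
tau_true = 0.5 + 0.5 * W[:, 1]                # true CATE (simulation only), ATE = 0.5
Y = W[:, 0] + tau_true * A + rng.normal(size=n)

def rf():
    return RandomForestRegressor(n_estimators=200, min_samples_leaf=20, random_state=0)

# Step 1: cross-fitted nuisances m(W) = E[Y | W] and pi(W) = Pr(A=1 | W),
# then a preliminary CATE via the R-learner's weighted regression.
m_hat = cross_val_predict(rf(), W, Y, cv=5)
pi_hat = np.clip(cross_val_predict(rf(), W, A, cv=5), 0.01, 0.99)
Y_res, A_res = Y - m_hat, A - pi_hat
A_safe = np.where(A_res >= 0, 1.0, -1.0) * np.maximum(np.abs(A_res), 1e-3)
tau_hat = rf().fit(W, Y_res / A_safe, sample_weight=A_res**2).predict(W)

# Step 2: calibrate along the one-dimensional summary tau_hat with
# isotonic regression, weighted by the R-learner weights.
iso = IsotonicRegression(out_of_bounds="clip")
iso.fit(tau_hat, Y_res / A_safe, sample_weight=A_res**2)
tau_cal = iso.predict(tau_hat)

# Step 3: debias within the induced working model and average; after
# calibration the correction term is small by construction.
correction = np.mean(A_res * (Y_res - tau_cal * A_res)) / np.mean(A_res**2)
ate_hat = np.mean(tau_cal) + correction
```

The isotonic step is what makes the reduction a proper working model: within each level set of the calibrated summary, the weighted residual moment used in Step 3 is approximately solved, so the final correction mostly guards against residual misfit.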
Why
The key difference across estimators is how much inverse-overlap
variation they allow. Fully nonparametric estimators vary with
all of W, constant-effect partially linear models
collapse to one regime, and the adaptive estimator varies only
along the learned one-dimensional CATE summary.
Here, \( \pi(W) = \Pr(A=1 \mid W) \) is the propensity score and \( \tau(W) = \mathbb{E}[Y(1)-Y(0) \mid W] \) is the CATE.
Smaller effective overlap weights generally mean a more stable, lower-variance ATE estimator.
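To make this concrete, here is a small numeric illustration (not from the package) of the inverse-overlap weight \( 1/(\pi(W)(1-\pi(W))) \): it stays moderate under strong overlap but blows up as propensities approach 0 or 1, which is exactly when the fully nonparametric route becomes unstable.

```python
# Numeric illustration (not from the package): inverse-overlap weights
# 1 / (pi * (1 - pi)) under strong vs weak overlap.
import numpy as np

strong = np.array([0.3, 0.4, 0.5, 0.6, 0.7])     # propensities bounded away from 0 and 1
weak = np.array([0.02, 0.10, 0.50, 0.90, 0.98])  # propensities near the boundary

w_strong = 1 / (strong * (1 - strong))
w_weak = 1 / (weak * (1 - weak))

print(w_strong.max())   # stays below 5
print(w_weak.max())     # exceeds 50: a few units dominate the variance
```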
Fully nonparametric AIPW
This route allows rich heterogeneity but typically needs the most overlap and can have the highest variance.
Constant-effect PLM
This can work with much less overlap, but it can be biased if the constant-effect model is wrong.
Learned CATE model
Adaptive DML uses the estimated CATE as a scalar reduction, so it is more structured than the nonparametric route while remaining more flexible than a constant-effect model.
The adaptive estimator reduces the relevant overlap problem from the full covariate space to variation along the learned CATE summary. When that summary captures the treatment-effect structure well, this can substantially reduce variance. In favorable cases, this yields the super-efficiency phenomenon studied in the paper: lower variance than the fully nonparametric estimator at those truths.
When
Adaptive DML is mainly worth considering when overlap is too weak for reliable or precise fully nonparametric inference and the CATE may be close to constant or well-approximated by a simpler one-dimensional effect structure.
This page is most relevant when the fully nonparametric route is unstable because inverse-overlap weights are too variable.
Constant-effect models are contained in the adaptive working model, so the estimator adapts naturally when the CATE is close to constant.
Inference is based on the adaptive working model. The theory treats the learned reduction as approximating an oracle model built from the true CATE-based reduction; see the paper for the formal conditions.
Modes
The package exposes two adaptive modes for binary-treatment ATE estimation.
Primary mode
calibrated_rlearner: uses the estimated CATE as the adaptive one-dimensional summary for calibration and debiasing.
Alternative mode
plugin: uses the outcome regression as the adaptive summary instead.
Python
AdaptiveCalibratedDML: set mode="calibrated_rlearner".
R
adaptive_calibrated_dml(): set mode = "calibrated_rlearner".
References
Key references for the adaptive estimator.