Adaptive DML

Adaptive DML for binary-treatment ATE estimation.

Adaptive DML estimates the binary-treatment ATE by learning a treatment-effect summary, calibrating that one-dimensional score with isotonic regression, and debiasing within the induced working model.

  • Binary treatment only
  • ATE estimated through a scalar CATE summary
  • Isotonic calibration internally

What

Target and estimator

The target on this page is the binary-treatment ATE. calibrated_rlearner is the main adaptive mode: it uses the estimated CATE (the conditional average treatment effect, i.e., how treatment effects vary across units) as a scalar summary of effect heterogeneity and targets the ATE through the induced one-dimensional working model.

Target

Average treatment effect

The target parameter on this page is the ATE, not a general collection of adaptive estimands.

Working summary

Estimated CATE as a scalar reduction

The estimated CATE compresses the effect model to one dimension before calibration and debiasing.

Primary mode

calibrated_rlearner

This is the main adaptive path documented on this page.

How

Estimation steps

In the main documented path, a partially linear R-learner setup separates baseline outcome prediction from treatment-effect learning, then calibrates and debiases the learned effect summary.

Step 1

Fit nuisances and a preliminary CATE

Estimate the outcome and treatment-assignment functions, then fit a preliminary CATE.

Step 2

Calibrate along the estimated CATE

Use the estimated CATE as a one-dimensional summary and calibrate along that summary.
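The key property of this calibration step can be illustrated with scikit-learn's IsotonicRegression (a minimal standalone sketch, not the package's internal code; the score and pseudo-outcomes are simulated):

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(1)
score = rng.normal(size=1000)                      # preliminary CATE score
pseudo = score + rng.normal(scale=2.0, size=1000)  # noisy pseudo-outcomes

# Isotonic regression fits the best monotone function of the score, so the
# calibrated values preserve the ordering of the original score while
# shrinking its noise toward a step function.
iso = IsotonicRegression(out_of_bounds="clip").fit(score, pseudo)
calibrated = iso.predict(score)
```

Because the fit is monotone, sorting by the original score also sorts the calibrated values, which is what makes the calibrated score usable as a one-dimensional working model.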

Step 3

Debias and average to the ATE

Debias in the adaptive working model and average to estimate the ATE.
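The three steps above can be sketched end-to-end with scikit-learn on simulated data. This is an illustrative reimplementation under simplifying assumptions (linear nuisance models, two-fold cross-fitting, quantile binning for the conditional variance in the debiasing term), not the package's actual code:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n = 20_000
W = rng.normal(size=(n, 2))
pi = 1.0 / (1.0 + np.exp(-W[:, 0]))       # true propensity
A = rng.binomial(1, pi)
tau = 1.0 + W[:, 0]                        # true CATE, so true ATE = 1.0
Y = W[:, 0] + W[:, 1] + A * tau + rng.normal(size=n)

# Step 1: cross-fitted nuisances m(W) ~ E[Y | W] and pi(W) ~ E[A | W],
# then a preliminary CATE from the R-learner's weighted regression.
m_hat, pi_hat = np.empty(n), np.empty(n)
for tr, te in KFold(n_splits=2, shuffle=True, random_state=0).split(W):
    m_hat[te] = LinearRegression().fit(W[tr], Y[tr]).predict(W[te])
    pi_hat[te] = LogisticRegression().fit(W[tr], A[tr]).predict_proba(W[te])[:, 1]
ry, ra = Y - m_hat, A - pi_hat             # outcome / treatment residuals
pseudo, wts = ry / ra, ra**2               # R-learner pseudo-outcomes, weights
tau_prelim = LinearRegression().fit(W, pseudo, sample_weight=wts).predict(W)

# Step 2: isotonic calibration of the pseudo-outcomes along the scalar score.
iso = IsotonicRegression(out_of_bounds="clip")
tau_cal = iso.fit(tau_prelim, pseudo, sample_weight=wts).predict(tau_prelim)

# Step 3: debias within the one-dimensional working model and average.
# E[(A - pi)^2 | score] is approximated here by quantile binning on the score.
q = np.quantile(tau_prelim, np.linspace(0, 1, 21))
idx = np.clip(np.digitize(tau_prelim, q[1:-1]), 0, 19)
v_hat = np.array([wts[idx == k].mean() for k in range(20)])[idx]
ate_hat = np.mean(tau_cal + ra * (ry - ra * tau_cal) / v_hat)
print(round(ate_hat, 2))
```

With this data-generating process the true ATE is 1.0, and the debiased average should land close to it; the binning step is a crude stand-in for the package's internal conditional-variance estimate.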

Why

Overlap reduction

The key difference across estimators is how much inverse-overlap variation they allow: fully nonparametric estimators use weights that vary with all of W, constant-effect partially linear models collapse them to a single constant, and the adaptive estimator lets them vary only along the learned one-dimensional CATE summary.

Here, \( \pi(W) = \Pr(A=1 \mid W) \) is the propensity score and \( \tau(W) = \mathbb{E}[Y(1)-Y(0) \mid W] \) is the CATE.

Smaller effective overlap weights generally mean a more stable, lower-variance ATE estimator.

Fully nonparametric AIPW

Most flexible, strongest overlap demands

\[ \frac{1}{\pi(W)(1-\pi(W))} \]

This route allows rich heterogeneity but typically needs the most overlap and can have the highest variance.

Constant-effect PLM

Most structured, least overlap-sensitive

\[ \frac{1}{\mathbb{E}[\pi(W)(1-\pi(W))]} \]

This can work with much less overlap, but it can be biased if the constant-effect model is wrong.

Learned CATE model

Middle ground

\[ \frac{1}{\mathbb{E}[\pi(W)(1-\pi(W)) \mid \tau(W)]} \]

Adaptive DML uses the estimated CATE as a scalar reduction, so it is more structured than the nonparametric route while remaining more flexible than a constant-effect model.

The adaptive estimator reduces the relevant overlap problem from the full covariate space to variation along the learned CATE summary. When that summary captures the treatment-effect structure well, this can substantially reduce variance. In favorable cases, this yields the super-efficiency phenomenon studied in the paper: lower variance than the fully nonparametric estimator at those truths.
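A quick simulation (illustrative only, not from the package) shows the ordering of the three weight families above. The propensity depends on two covariates while the CATE depends on only one, and the conditional expectation given \( \tau(W) \) is approximated by quantile binning, which is an assumption of this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
W = rng.normal(size=(n, 2))
# Propensity depends on both covariates; the CATE depends only on W[:, 0].
pi = 1.0 / (1.0 + np.exp(-(1.5 * W[:, 0] + 1.0 * W[:, 1])))
tau = W[:, 0]
v = pi * (1.0 - pi)                        # conditional treatment variance

w_np = 1.0 / v                             # fully nonparametric weight
w_const = np.full(n, 1.0 / v.mean())       # constant-effect PLM weight
# Adaptive weight: 1 / E[pi(1-pi) | tau(W)], with the conditional mean
# approximated by quantile binning on tau (a crude stand-in).
q = np.quantile(tau, np.linspace(0, 1, 21))
idx = np.clip(np.digitize(tau, q[1:-1]), 0, 19)
w_adapt = 1.0 / np.array([v[idx == k].mean() for k in range(20)])[idx]

print(w_const.std(), w_adapt.std(), w_np.std())
```

The constant-effect weight has zero variability, the adaptive weight varies moderately along the effect summary, and the fully nonparametric weight is by far the most variable, matching the stability ordering described above.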

When

When it can help

Adaptive DML is mainly worth considering when overlap is too weak for reliable, precise fully nonparametric inference, and when the CATE is plausibly close to constant or well approximated by a simpler one-dimensional effect structure.

Weak overlap is the main use case

This page is most relevant when the fully nonparametric route is unstable because inverse-overlap weights are too variable.

Constant-effect models are nested inside it

Constant-effect models are contained in the adaptive working model, so the estimator adapts naturally when the CATE is close to constant.

Inference theory is model-based

Inference is based on the adaptive working model. The theory treats the learned reduction as approximating an oracle model built from the true CATE-based reduction; see the paper for the formal conditions.

Modes

Available modes

The package exposes two adaptive modes for binary-treatment ATE estimation.

Primary mode

calibrated_rlearner

Uses the estimated CATE as the adaptive one-dimensional summary for calibration and debiasing.

Alternative mode

plugin

Uses the outcome regression as the adaptive summary instead.

Python

AdaptiveCalibratedDML

Set mode="calibrated_rlearner".

R

adaptive_calibrated_dml()

Set mode = "calibrated_rlearner".

References

References

Key references for the adaptive estimator.