Semisupervised Mean Inference
ppi_aipw is for the small-labeled, large-unlabeled setting,
where predictions from a fixed model can improve power and
precision. It uses AIPW (Robins et al., 1994) as the safe
debiased baseline and adds calibration methods that go beyond
mean correction, together with uncertainty quantification, in
one API.
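As a rough sketch of the idea (not the package's implementation), the AIPW-style debiased mean averages the predictions over the unlabeled sample and corrects them with the mean residual on the labeled sample:

```python
import numpy as np

# Sketch of the AIPW-style debiased mean (not the package's code):
# average the predictions over the unlabeled sample, then correct the
# bias with the mean residual on the labeled sample.
def aipw_mean(Y, Yhat, Yhat_unlabeled):
    Y = np.asarray(Y, dtype=float)
    Yhat = np.asarray(Yhat, dtype=float)
    Yhat_unlabeled = np.asarray(Yhat_unlabeled, dtype=float)
    return Yhat_unlabeled.mean() + (Y - Yhat).mean()

# Toy data: predictions run uniformly low by 1, so the residual
# correction shifts the unlabeled mean back up.
print(aipw_mean([2.0, 3.0], [1.0, 2.0], [1.0, 2.0, 3.0]))  # -> 3.0
```

Because the correction term has mean zero when the predictions are unbiased, a useless score leaves the estimate safe rather than breaking it.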
A useful model score can help even if it is not perfectly calibrated. Calibration is the step that puts that prediction score on the right outcome scale before averaging it.
Install
PyPI for the package release, GitHub for the latest version, or a local editable install for development. If you want the native R package instead, head to the R package page.
python -m pip install ppi-aipw
python -m pip install "git+https://github.com/Larsvanderlaan/ppi-aipw.git"
python -m pip install -e .
Quickstart
mean_inference(...) is the main entry point. It
returns the point estimate, standard error, confidence
interval, fitted calibrator, and diagnostics in one call.
Prefer a runnable walkthrough? The quickstart notebook opens
directly in Colab, installs ppi-aipw automatically, and covers
both the mean API and the causal API in one compact example.
For data-adaptive selection, set method="auto" and
pass candidate_methods=("aipw", "linear", "monotone_spline", "isotonic").
By default, selection uses num_folds=100.
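To make the selection idea concrete, here is a minimal numpy sketch in the spirit of method="auto" (not the package's selector): cross-fit each candidate calibrator and keep the one whose out-of-fold residuals have the smallest variance, a stand-in for the cross-validated influence-function variance criterion. The two toy candidates below (raw score vs. linear recalibration) are illustrative only.

```python
import numpy as np

# Toy candidate calibrators (illustrative, not the package's methods).
def fit_identity(y_tr, s_tr):
    return lambda s: s  # use the raw score as-is

def fit_linear(y_tr, s_tr):
    slope, intercept = np.polyfit(s_tr, y_tr, 1)  # linear recalibration
    return lambda s: slope * s + intercept

def select_calibrator(Y, Yhat, candidates, num_folds=5):
    Y, Yhat = np.asarray(Y, float), np.asarray(Yhat, float)
    folds = np.arange(len(Y)) % num_folds
    scores = {}
    for name, fit in candidates.items():
        resid = np.empty_like(Y)
        for k in range(num_folds):
            g = fit(Y[folds != k], Yhat[folds != k])  # fit on train folds
            resid[folds == k] = Y[folds == k] - g(Yhat[folds == k])
        scores[name] = resid.var()  # out-of-fold residual variance
    return min(scores, key=scores.get)

rng = np.random.default_rng(0)
s = rng.uniform(size=200)
Y = 2.0 * s + 0.5  # outcomes are an exact linear rescaling of the score
candidates = {"identity": fit_identity, "linear": fit_linear}
print(select_calibrator(Y, s, candidates))  # -> "linear"
```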
For a quick human-readable Wald summary, use
result.summary(). If you want an optional honest
out-of-fold calibration check after fitting, use
calibration_diagnostics(result, Y, Yhat), and
optionally plot_calibration(...) if
matplotlib is installed.
All numeric inputs must be finite. The package rejects
NaN and Inf values in outcomes,
predictions, covariates, and weights with clear validation
errors.
mean_inference(...) returns a result object with
pointestimate, se, ci, and
diagnostics attributes, plus
result.summary() for a quick human-readable Wald
summary.
Y: Observed outcomes for the labeled sample.
Yhat: Predictions on the same labeled rows.
Yhat_unlabeled: Predictions on the unlabeled sample.
method: One of "aipw", "linear", "prognostic_linear", "sigmoid", "monotone_spline", "isotonic", or "auto".
candidate_methods: Candidate methods considered when method="auto"; the selector minimizes cross-validated influence-function variance. If "aipw" is included, the selector also compares an efficiency-maximized AIPW candidate.
num_folds: Number of folds used by method="auto". The default is 100, capped at the labeled sample size.
inference: Choose "wald" for a fast analytic interval, "jackknife" for the recommended resampling-style normal interval, or "bootstrap" for percentile bootstrap intervals.
efficiency_maximization: Optional rescaling to lambda * m(X), where m(X) is the raw prediction score for method="aipw" and the calibrated score otherwise. The scaling factor lambda is chosen by empirical efficiency maximization.
w, w_unlabeled: Optional observation weights for the labeled and unlabeled samples. These can be inverse-probability-of-missingness weights to adjust for informative missingness, or balancing weights to reweight toward a covariate-adjusted target population. Uniform weights reproduce the unweighted behavior.
X, X_unlabeled: Optional extra covariates used by method="prognostic_linear". The score and intercept stay unpenalized, while the extra covariates get ridge tuning on the labeled sample.
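A hedged sketch of the empirical efficiency-maximization idea behind the lambda rescaling (not the package's implementation, and assuming a large unlabeled sample and a single scalar lambda, as in PPI++-style tuning): the variance of the debiased mean is then driven by Var(Y - lambda * m), which is minimized at the least-squares slope of Y on m.

```python
import numpy as np

# Sketch of empirical efficiency maximization for the rescaling
# lambda * m(X) (not the package's code). With a large unlabeled
# sample, the variance of the debiased mean is driven by
# Var(Y - lambda * m), minimized at lambda = Cov(Y, m) / Var(m).
def efficiency_lambda(Y, m):
    Y, m = np.asarray(Y, float), np.asarray(m, float)
    return np.cov(Y, m)[0, 1] / m.var(ddof=1)

# If outcomes are exactly 3x the score, the optimal rescaling is 3.
m = np.array([0.1, 0.4, 0.5, 0.9])
Y = 3.0 * m
print(efficiency_lambda(Y, m))  # approximately 3.0
```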
Calibration
Calibration is about getting the prediction scale right, not just the ranking.
A prediction score is calibrated when its numeric values mean
what they say. If a model outputs values near 0.8,
we want outcomes near 0.8 on average for examples
scored around 0.8. A model output can still be
useful before calibration; calibration fixes the scale rather
than deciding whether the model helps at all.
Here we do not just rank units; we average a prediction score and use it inside AIPW-style estimators. That means miscalibration can directly affect bias correction and efficiency, while recalibration can improve semisupervised mean inference without retraining the original model.
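To illustrate one such calibrator, here is a minimal pool-adjacent-violators (PAVA) sketch of isotonic calibration (not the package's implementation). It assumes the outcomes are already ordered by increasing prediction score and returns the monotone fit on that order:

```python
import numpy as np

# Minimal pool-adjacent-violators (PAVA) sketch of isotonic calibration
# (not the package's implementation). Assumes outcomes are already
# ordered by increasing prediction score.
def pava(y):
    blocks = []  # list of [mean, weight] blocks
    for v in y:
        blocks.append([float(v), 1.0])
        # merge backwards while monotonicity is violated
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2])
    out = []
    for m, w in blocks:
        out.extend([m] * int(w))
    return np.array(out)

# The violating pair (3, 2) gets pooled to its average, 2.5.
print(pava([1.0, 3.0, 2.0]))  # -> [1.0, 2.5, 2.5]
```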
Method Explorer
Each method comes with a short description, a typical use case, and its main tradeoffs.
For example, monotone_spline fits a smooth monotone spline calibration curve, then plugs the calibrated predictions into the AIPW estimator.
Schematic view of how the raw prediction score
m(X) is transformed before the semisupervised
mean step.
Intervals
Jackknife and bootstrap both refit the calibration step under resampling; jackknife uses delete-a-group folds, while bootstrap uses classical resampling with replacement.
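A minimal numpy sketch of the two resampling schemes applied to a plain mean (not the package's code, which also refits the calibration step inside each replicate): a delete-a-group jackknife standard error and a percentile bootstrap interval.

```python
import numpy as np

# Delete-a-group jackknife: drop one fold at a time, re-estimate,
# and scale the spread of the leave-out estimates into a standard error.
def jackknife_se(x, num_groups=10):
    x = np.asarray(x, float)
    groups = np.arange(len(x)) % num_groups
    theta = np.array([x[groups != g].mean() for g in range(num_groups)])
    return np.sqrt((num_groups - 1) / num_groups
                   * ((theta - theta.mean()) ** 2).sum())

# Percentile bootstrap: resample with replacement and take quantiles
# of the resampled estimates as the interval endpoints.
def bootstrap_ci(x, num_boot=2000, alpha=0.05, seed=0):
    x = np.asarray(x, float)
    rng = np.random.default_rng(seed)
    stats = np.array([rng.choice(x, size=len(x), replace=True).mean()
                      for _ in range(num_boot)])
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

x = np.random.default_rng(1).normal(size=200)
se = jackknife_se(x)
lo, hi = bootstrap_ci(x)
print(se, (lo, hi))
```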
Wald gives fast analytic intervals for routine use.
References
The calibration methods implemented here can be viewed as special cases of calibrated debiased machine learning and targeted minimum loss estimation.
The AIPW baseline cited above goes back to Robins, Rotnitzky, and Zhao (1994), "Estimation of regression coefficients when some regressors are not always observed", Journal of the American Statistical Association 89(427): 846-866.
Main paper themes reflected here:
efficiency_maximization=True for method="aipw" targets the same empirical efficiency-maximization problem as PPI++; see Rubin and van der Laan (2008). The official ppi_py implementation of PPI++ may differ in finite samples because it clips the optimized scale to lie in [0, 1]. Plain PPI is not implemented because raw-score AIPW is typically at least as efficient.
Calibration references:
Semiparametric, debiased/targeted machine learning foundations:
Prognostic-score adjustment and efficiency maximization:
Semisupervised mean estimation: