Getting Started

Installation

Python:

python3 -m pip install -e python

R:

install.packages("r/causalCalibration", repos = NULL, type = "source")

For method="isotonic" in R, install reticulate and make sure the active Python environment includes lightgbm.

Mental model

The package calibrates predictions from a learner you already have. Bring:

  • a treatment-effect prediction vector,
  • observed treatment and outcome data,
  • nuisance estimates needed by the chosen loss,
  • for cross-calibration, a matrix of fold-specific predictions,
  • and, optionally, fold IDs that tell the package which column is the out-of-fold prediction for each observation.
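Concretely, these inputs are just plain arrays. A minimal sketch in Python (the variable names here are illustrative, not the package's required argument names), assuming n observations and K folds:

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 8, 4

predictions = rng.normal(size=n)          # one treatment-effect score per observation
treatment = rng.integers(0, 2, size=n)    # observed binary treatment
outcome = rng.normal(size=n)              # observed outcome

# Nuisance estimates for the DR loss: outcome regressions and propensity scores.
mu0 = rng.normal(size=n)
mu1 = rng.normal(size=n)
propensity = rng.uniform(0.1, 0.9, size=n)

# For cross-calibration: one column of predictions per training fold,
# plus fold IDs marking which column is out-of-fold for each observation.
fold_predictions = rng.normal(size=(n, K))
fold_ids = rng.integers(0, K, size=n)

assert fold_predictions.shape == (n, K)
assert fold_ids.shape == (n,)
```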

Workflow

  1. Train or otherwise obtain a treatment-effect predictor.
  2. Compute the nuisance estimates required by the calibration loss.
  3. Decide whether you want standard calibration or cross-calibration.
  4. Fit the calibrator.
  5. Predict calibrated effect scores.
  6. Diagnose remaining miscalibration.
  7. Revisit the loss if overlap diagnostics suggest the original-population target is unstable.
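The numbered steps above can be sketched end to end. The snippet below is a self-contained illustration of the idea, not the package's implementation: it forms the doubly robust (DR) pseudo-outcome from the nuisance estimates and calibrates the raw scores against it with isotonic regression (pool-adjacent-violators).

```python
import numpy as np

def dr_pseudo_outcome(y, t, mu0, mu1, e):
    """Doubly robust pseudo-outcome whose conditional mean is the treatment effect."""
    return mu1 - mu0 + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)

def pava(y_sorted):
    """Pool-adjacent-violators: isotonic (non-decreasing) fit to y_sorted."""
    blocks = []  # each block is [mean, weight]
    for y in y_sorted:
        blocks.append([float(y), 1.0])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            b2, b1 = blocks.pop(), blocks.pop()
            w = b1[1] + b2[1]
            blocks.append([(b1[0] * b1[1] + b2[0] * b2[1]) / w, w])
    out = []
    for mean, w in blocks:
        out.extend([mean] * int(w))
    return np.array(out)

# Steps 1-2: pretend these came from an upstream learner.
rng = np.random.default_rng(1)
n = 200
tau = rng.normal(size=n)                        # raw treatment-effect predictions
e = np.full(n, 0.5)                             # known propensity (toy randomized setting)
t = rng.integers(0, 2, size=n)
mu0, mu1 = np.zeros(n), tau                     # toy nuisance estimates
y = t * mu1 + (1 - t) * mu0 + rng.normal(scale=0.1, size=n)

# Steps 4-5: calibrate raw scores against the DR pseudo-outcome.
psi = dr_pseudo_outcome(y, t, mu0, mu1, e)
order = np.argsort(tau)
calibrated = np.empty(n)
calibrated[order] = pava(psi[order])

# Step 6: calibrated scores are monotone in the raw score by construction.
assert np.all(np.diff(calibrated[order]) >= -1e-9)
```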

Which function should I use?

Use fit_calibrator() when

  • you have one prediction per observation,
  • ordinary calibration is enough for your workflow,
  • or you are calibrating on a separate calibration sample.

Use fit_cross_calibrator() when

  • your learner produced cross-fitted predictions,
  • you want to fit and calibrate in sample,
  • and you have both pooled OOF predictions and fold-specific predictions.
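One way to picture those inputs: fold_ids selects, for each observation, the column of fold_predictions produced by a model that never saw that observation. A sketch of that bookkeeping (the names and shapes follow the table below; the indexing itself is illustrative, not the package's internals):

```python
import numpy as np

n, K = 6, 3
# Column k holds predictions from the model trained with fold k held out.
fold_predictions = np.arange(n * K, dtype=float).reshape(n, K)
fold_ids = np.array([0, 1, 2, 0, 1, 2])   # fold each observation belongs to

# Pooled out-of-fold predictions: for row i, take column fold_ids[i].
oof = fold_predictions[np.arange(n), fold_ids]
assert oof.tolist() == [0.0, 4.0, 8.0, 9.0, 13.0, 17.0]
```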

Use diagnose_calibration() when you want to

  • quantify remaining calibration error,
  • compare raw and calibrated predictions,
  • check whether the score carries a non-flat treatment-effect signal through the BLP slope,
  • or understand whether your calibration choice is stable under your overlap conditions.
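The BLP slope here is the coefficient from regressing a calibration target (for example, the DR pseudo-outcome) on the candidate score; a slope near 1 indicates well-scaled treatment-effect signal, while a slope near 0 means the score is essentially flat. A minimal OLS sketch of that idea (not the package's diagnostic):

```python
import numpy as np

def blp_slope(psi, score):
    """OLS slope of pseudo-outcome psi on a centered score."""
    s = score - score.mean()
    return float(np.dot(s, psi - psi.mean()) / np.dot(s, s))

rng = np.random.default_rng(2)
score = rng.normal(size=500)
psi = 1.0 * score + rng.normal(size=500)   # true slope 1: well-calibrated signal

slope = blp_slope(psi, score)
assert 0.8 < slope < 1.2   # close to 1 for a well-scaled score
```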

Workflow objects at a glance

Object                    Meaning                                     Used by
predictions               One score per observation                   fit_calibrator, fit_cross_calibrator, diagnose_calibration
fold_predictions          n x K matrix of fold-specific predictions   fit_cross_calibrator, predict(cross_calibrator, ...)
fold_ids                  Length-n integer vector                     fit_cross_calibrator, validate_crossfit_bundle
mu0, mu1, propensity      DR-loss nuisance inputs                     loss="dr"
outcome_mean, propensity  R-loss nuisance inputs                      loss="r"

First recommendation

For most users, start with:

  • loss="dr"
  • method="isotonic"
  • cross-calibration when cross-fitted predictions are available

Then use diagnostics to check remaining calibration error, read the BLP slope CI, and see whether weak overlap suggests moving to loss="r" for a more overlap-focused target. assess_overlap() is the package’s default screen for that decision; its thresholds are practical defaults, not universal rules.
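In spirit, such an overlap screen flags datasets where many propensity scores sit near 0 or 1, which is where DR weights explode. The sketch below illustrates the idea only; its thresholds (0.05 boundary margin, 10% share) are hypothetical defaults, not assess_overlap()'s actual rule:

```python
import numpy as np

def overlap_flag(propensity, eps=0.05, max_share=0.1):
    """Flag weak overlap if too many propensities fall within eps of 0 or 1."""
    near_boundary = (propensity < eps) | (propensity > 1 - eps)
    return float(near_boundary.mean()) > max_share

rng = np.random.default_rng(3)
good = rng.uniform(0.2, 0.8, size=1000)   # well inside (0, 1): no flag
bad = rng.uniform(0.0, 0.06, size=1000)   # piled up near 0: flagged

assert not overlap_flag(good)
assert overlap_flag(bad)
```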

Workflow helpers

If your upstream learner already produces vectors, fold matrices, and nuisance estimates, you can package them with:

  • calibration_bundle(...) or calibration_bundle_from_data_frame(...)
  • crossfit_bundle(...) or crossfit_bundle_from_data_frame(...)

Those helpers are optional, but they make the downstream workflow easier to validate.
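Conceptually, a bundle is just a named collection of the arrays above plus consistency checks. A hedged sketch of that idea (the dict-and-check shape here is an assumption for illustration; the real helpers may validate more):

```python
import numpy as np

def make_crossfit_bundle(predictions, fold_predictions, fold_ids):
    """Illustrative bundle: pack arrays and check that their shapes agree."""
    n = len(predictions)
    if fold_predictions.shape[0] != n:
        raise ValueError("fold_predictions must have one row per observation")
    if len(fold_ids) != n:
        raise ValueError("fold_ids must have length n")
    if fold_ids.min() < 0 or fold_ids.max() >= fold_predictions.shape[1]:
        raise ValueError("fold_ids must index columns of fold_predictions")
    return {"predictions": predictions,
            "fold_predictions": fold_predictions,
            "fold_ids": fold_ids}

rng = np.random.default_rng(4)
bundle = make_crossfit_bundle(rng.normal(size=10),
                              rng.normal(size=(10, 5)),
                              rng.integers(0, 5, size=10))
assert set(bundle) == {"predictions", "fold_predictions", "fold_ids"}
```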

Full examples