R Package Guide

High-level API for CATE and CRR analysis

hte3 is organized around a short sequence: create a task, fit the estimand, then generate predictions or summaries. This page is the primary guide to the public wrappers and their main arguments.

Use the API section for function-level details, the quickstart for a minimal runnable example, and the sl3 guide for learner customization.

  • hte_task() for setup
  • fit_cate() and fit_crr() for fitting
  • predict() and summary() for output

High-Level API

What to call, what to pass, and what you get back

These are the main public entrypoints. Start with hte_task(), fit with fit_cate() or fit_crr(), then inspect the result with predict() and summary().

For learner selection and customization, use the sl3 guide.

Starter data

hte3_example_data()

Creates a small synthetic data set so you can try the package before wiring in your own analysis data.

hte3_example_data(n = 200, seed = 123)
Key args: n (number of rows), seed (random seed for reproducibility).
Returns: A data.table with simulated covariates, treatment, outcomes, and truth columns such as tau.
Use when: You want a runnable example or a small test run before using real data.
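
A quick way to see what the generator returns is to inspect its columns. The sketch below assumes the column names used elsewhere on this page (W1, W2, W3, A, Y, Y_binary, tau); the actual data set may contain additional columns.

```r
library(hte3)

# Simulate a small data set with a fixed seed for reproducibility.
dat <- hte3_example_data(n = 200, seed = 123)

# Inspect the structure: covariates, treatment, outcomes, and truth columns.
str(dat)

# Column names this page relies on (assumed from the quickstart and its note).
expected_cols <- c("W1", "W2", "W3", "A", "Y", "Y_binary", "tau")

# Any names listed here are assumed on this page but absent from the data.
setdiff(expected_cols, names(dat))
```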

Recommended entrypoint

hte_task()

Builds the main analysis task from raw data, modifier columns, confounders, treatment, and outcome.

hte_task(data, modifiers, confounders = modifiers, treatment, outcome, ...)
Key args: data, modifiers, confounders, treatment, outcome, cross_fit.
Returns: An hte3_Task object that stores the analysis inputs and nuisance setup.
Use when: You are starting a new analysis from raw data and want the main high-level workflow.
Default behavior: Propensity and outcome learners default to get_autoML(); override them only when custom nuisance models are required. The stack always includes core learners and adds the optional earth, ranger, and xgboost components when they are installed.
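
As a minimal sketch of the signature above, the call below omits confounders so that the documented default (confounders = modifiers) applies, and leaves the nuisance learners at their defaults. Column names are those of the example data as used in the quickstart.

```r
library(hte3)

dat <- hte3_example_data(n = 200, seed = 123)

# Omitting `confounders` relies on the documented default, confounders = modifiers.
task <- hte_task(
  data = dat,
  modifiers = c("W1", "W2", "W3"),
  treatment = "A",
  outcome = "Y"
)

# Documented return value: an hte3_Task object.
class(task)
```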

CATE wrapper

fit_cate()

Fits a conditional average treatment effect model from an hte3_Task.

fit_cate(task, method = c("dr", "r", "t", "ep"), ...)
Key args: task, method, base_learner, treatment_level, control_level, cross_validate.
Methods: Supported methods are "dr", "r", "t", and "ep". A character vector creates a cross-validated portfolio.
Returns: An hte3_model object that you can pass to predict() and summary().
Use when: Your target is a conditional mean difference. For continuous-treatment CATE tasks, the only high-level method currently supported is "r", which uses the partially linear R-learner effect model A * tau(X). When the modifiers V are a strict subset of the confounders, "dr", "ep", and the default two-stage T-learner target E[Y(1) - Y(0) | V]; the current R-learner issues a warning because it does not generally target that reduced-modifier estimand.
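
To make the reduced-modifier estimand concrete, here is a minimal base R sketch of the doubly robust idea: build a pseudo-outcome from nuisance estimates fitted on all confounders, then regress it on the modifier V alone. This is an illustration on simulated data with plain lm() and glm() fits, not hte3's implementation; the variable names and the data-generating process are invented for the example.

```r
set.seed(1)
n <- 2000

# Confounders are W1 and W2; the modifier of interest is V = W1 only.
W1 <- runif(n, -1, 1)
W2 <- rnorm(n)
pi_true <- plogis(0.5 * W1 + 0.5 * W2)   # true propensity score
A <- rbinom(n, 1, pi_true)
tau_full <- 1 + W1 + W2                  # effect given all confounders
Y <- W1 - W2 + A * tau_full + rnorm(n)
dat <- data.frame(W1, W2, A, Y)

# Nuisance fits use all confounders.
pi_hat <- fitted(glm(A ~ W1 + W2, family = binomial(), data = dat))
mu_fit <- lm(Y ~ A * (W1 + W2), data = dat)
mu1_hat <- predict(mu_fit, transform(dat, A = 1))
mu0_hat <- predict(mu_fit, transform(dat, A = 0))

# Doubly robust pseudo-outcome for the treatment effect.
phi <- mu1_hat - mu0_hat +
  A * (Y - mu1_hat) / pi_hat -
  (1 - A) * (Y - mu0_hat) / (1 - pi_hat)

# Regressing the pseudo-outcome on V alone targets E[Y(1) - Y(0) | V].
# In this simulation E[W2 | W1] = 0, so the target is 1 + W1.
cate_fit <- lm(phi ~ W1, data = dat)
coef(cate_fit)
```

The fitted intercept and slope should both land near 1, because averaging the full effect 1 + W1 + W2 over W2 leaves 1 + W1.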

CRR wrapper

fit_crr()

Fits a conditional risk-ratio model when a relative effect scale is more natural than a difference.

fit_crr(task, method = c("ep", "ipw", "t"), ...)
Key args: task, method, base_learner, treatment_level, control_level, cross_validate.
Methods: Supported methods are "ep", "ipw", and "t". A character vector creates a cross-validated portfolio.
Returns: An hte3_model object for CRR prediction and summary.
Use when: You want a conditional risk ratio and the outcome is nonnegative.
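
A minimal CRR sketch, assuming the example data contains the nonnegative Y_binary column mentioned in the quickstart note. This page does not state the scale on which CRR predictions are returned (ratio or log ratio), so check the function reference before interpreting the values.

```r
library(hte3)

dat <- hte3_example_data(n = 200, seed = 123)

# Build the task with a nonnegative outcome, as fit_crr() requires.
task_crr <- hte_task(
  data = dat,
  modifiers = c("W1", "W2", "W3"),
  confounders = c("W1", "W2", "W3"),
  treatment = "A",
  outcome = "Y_binary"
)

# Fit a single method; "ipw" is one of the three documented options.
fit_rr <- fit_crr(task_crr, method = "ipw")

summary(fit_rr)
pred_rr <- predict(fit_rr)   # predictions on the training task
head(pred_rr)
```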

Prediction

predict()

Generates fitted effect predictions on the training task or on a new data set.

predict(object, new_data = NULL)
Key args: object, new_data.
Returns: A numeric vector of predicted heterogeneous treatment effects on the chosen data.
New data rules: If new_data = NULL, predictions are returned on the training task. For a data frame or data.table, the required modifier columns must be present; missing non-modifier analysis columns are filled internally when possible.
Use when: You want fitted values on the training sample or effect predictions for a new cohort with the same modifier schema.
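
The new-data rules can be sketched as follows. The new cohort here is hypothetical: it is simply a second draw from hte3_example_data() reduced to the modifier columns, to show that treatment and outcome columns are not needed at prediction time.

```r
library(hte3)

train <- hte3_example_data(n = 200, seed = 123)
task <- hte_task(
  data = train,
  modifiers = c("W1", "W2", "W3"),
  treatment = "A",
  outcome = "Y"
)
fit <- fit_cate(task, method = "dr")

# Hypothetical new cohort carrying only the modifier columns.
new_cohort <- as.data.frame(hte3_example_data(n = 50, seed = 456))
new_cohort <- new_cohort[, c("W1", "W2", "W3")]

pred_train <- predict(fit)             # new_data = NULL: training task
pred_new <- predict(fit, new_cohort)   # new data with the same modifier schema

length(pred_train)
length(pred_new)
```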

Model summary

summary()

Prints the main training metadata for a fitted high-level model.

summary(object)
Key args: object.
Returns: A compact summary object containing the target, method, cross-validation flag, row count, modifiers, and treatment variable.
Use when: You want a fast check of what model was fit before moving on to downstream analysis or comparison.

Advanced entrypoint

make_hte3_Task_tx()

The lower-level task constructor for users who want tighter control over nuisance learners or user-supplied nuisance estimates.

make_hte3_Task_tx(data, modifiers, confounders, treatment, outcome, ...)
Key args: learner_pi, learner_mu, learner_m; user-supplied nuisance estimates pi.hat, mu.hat, and m.hat; cross_fit_and_cv.
Returns: An hte3_Task object closer to the original advanced interface.
Use when: You need direct nuisance control, external nuisance estimates, or a workflow closer to the original learner-level API.
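
A hedged sketch of the advanced constructor with explicit nuisance learners. It assumes that learner_pi and learner_mu accept sl3 learner objects and that they correspond to the propensity and outcome models respectively; neither assumption is confirmed on this page, so verify both against the function reference. Lrnr_glm is a standard sl3 learner, chosen here only for speed.

```r
library(hte3)
library(sl3)

dat <- hte3_example_data(n = 200, seed = 123)

# Assumption: learner_pi / learner_mu take sl3 learners for the
# propensity and outcome nuisance models.
task_adv <- make_hte3_Task_tx(
  data = dat,
  modifiers = c("W1", "W2", "W3"),
  confounders = c("W1", "W2", "W3"),
  treatment = "A",
  outcome = "Y",
  learner_pi = Lrnr_glm$new(),
  learner_mu = Lrnr_glm$new()
)

# The task is documented as an hte3_Task, so the fit wrappers should accept it.
fit_adv <- fit_cate(task_adv, method = "dr")
summary(fit_adv)
```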

Common Defaults

What you can usually leave alone

Most users can leave these settings at their defaults until they need method comparison or learner customization.

get_autoML()

hte_task() defaults the propensity and outcome learners to get_autoML(), and the fit wrappers default base_learner to the same stack. The stack always includes its core learners and adds earth, ranger, and xgboost components when those optional packages are available.
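
As a sketch of the difference between the default stack and an override, the example below fits the same task twice. It assumes base_learner accepts a single sl3 learner such as Lrnr_glm$new(); see the sl3 guide for the supported forms.

```r
library(hte3)
library(sl3)

dat <- hte3_example_data(n = 200, seed = 123)
task <- hte_task(
  data = dat,
  modifiers = c("W1", "W2", "W3"),
  treatment = "A",
  outcome = "Y"
)

# Default: base_learner falls back to the get_autoML() stack.
fit_default <- fit_cate(task, method = "dr")

# Override with a single sl3 learner (assumed to be an accepted input).
fit_glm <- fit_cate(task, method = "dr", base_learner = Lrnr_glm$new())

summary(fit_default)
summary(fit_glm)
```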

cross_fit in hte_task()

This controls nuisance cross-fitting. In most analyses, the default setting is appropriate.

cross_validate in fit_*()

This controls selection across candidate learners. It turns on automatically for learner portfolios and for stacked base learners.
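
For example, passing several methods at once creates a portfolio, which per the description above turns cross-validation on automatically. This is a minimal sketch on the example data; the summary output is expected to include the cross-validation flag.

```r
library(hte3)

dat <- hte3_example_data(n = 200, seed = 123)
task <- hte_task(
  data = dat,
  modifiers = c("W1", "W2", "W3"),
  treatment = "A",
  outcome = "Y"
)

# A character vector of methods creates a cross-validated portfolio.
fit_portfolio <- fit_cate(task, method = c("dr", "r", "t"))

# The summary reports the cross-validation flag along with other metadata.
summary(fit_portfolio)
```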

When to ignore tuning

For an initial analysis, keep the default learners, fit one method at a time, and postpone EP tuning or learner-library customization until needed.

Quickstart

A minimal CATE analysis

library(hte3)

data <- hte3_example_data(n = 150, seed = 1)

task <- hte_task(
  data = data,
  modifiers = c("W1", "W2", "W3"),
  confounders = c("W1", "W2", "W3"),
  treatment = "A",
  outcome = "Y",
  cross_fit = FALSE
)

fit <- fit_cate(
  task,
  method = "dr",
  cross_validate = FALSE
)

pred <- predict(fit, data)
summary(fit)

For CRR: switch the outcome to a nonnegative column such as Y_binary and call fit_crr() instead.
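
Because the example data includes truth columns, the CATE quickstart can be extended with a simple accuracy check. The sketch below repeats the quickstart fit and assumes that tau holds the true conditional effect for each row on the same scale as the CATE predictions.

```r
library(hte3)

data <- hte3_example_data(n = 150, seed = 1)

task <- hte_task(
  data = data,
  modifiers = c("W1", "W2", "W3"),
  confounders = c("W1", "W2", "W3"),
  treatment = "A",
  outcome = "Y",
  cross_fit = FALSE
)

fit <- fit_cate(task, method = "dr", cross_validate = FALSE)
pred <- predict(fit, data)

# Assumption: `tau` is the true CATE per row, comparable to `pred`.
rmse <- sqrt(mean((pred - data$tau)^2))
rmse

# Correlation between predicted and true effects as a second sanity check.
cor(pred, data$tau)
```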

Workflows

Choose the guide that matches where you are

The package vignettes are organized by analysis type, not only by internals.

CATE workflow

Examples of the high-level workflow for conditional average treatment effect estimation.

CRR workflow

Use this vignette when relative effects and nonnegative outcomes are the appropriate target.

EP-learner sieve tuning

Explains the Sieve-based EP construction, basis ordering, and paper-faithful sieve CV grids.

Advanced sl3 integration

Documents lower-level learner control and manual customization.

Legacy Reproducibility

Paper reproduction stays available without cluttering the main workflow

The package preserves the original paper reproduction workflow, but the website keeps it separate from the main workflow.

Legacy paper reproduction

Use the frozen reproducibility path when the goal is to rerun the original research scripts rather than begin a new analysis.