hte3 R Package Guide

High-Level API

What to call, what to pass, and what you get back

These are the main public entrypoints. Start with hte_task(), fit with fit_cate() or fit_crr(), then inspect the result with predict() and summary().

For learner selection and customization, use the sl3 guide.

Starter data

`hte3_example_data()`

Creates a small synthetic data set so you can try the package before wiring in your own analysis data.

hte3_example_data(n = 200, seed = 123)

Key args n (number of rows), seed (random seed for reproducibility)

Returns A data.table with simulated covariates, treatment, outcomes, and truth columns such as tau.

Use when You want a runnable example or a small test run before using real data.

Recommended entrypoint

`hte_task()`

Builds the main analysis task from raw data, modifier columns, confounders, treatment, and outcome.

hte_task(data, modifiers, confounders = modifiers, treatment, outcome, ...)

Key args data, modifiers, confounders, treatment, outcome, cross_fit

Returns An hte3_Task object that stores the analysis inputs and nuisance setup.

Use when You are starting a new analysis from raw data and want the main high-level workflow.

Default behavior Propensity and outcome learners default to get_autoML(); override them only when custom nuisance models are required. The stack always includes core learners and adds the optional earth, ranger, and xgboost components when they are installed.

CATE wrapper

`fit_cate()`

Fits a conditional average treatment effect model from an hte3_Task.

fit_cate(task, method = c("dr", "r", "t", "ep"), ...)

Key args task, method, base_learner, treatment_level, control_level, cross_validate

Methods Supported methods are "dr", "r", "t", and "ep". A character vector creates a cross-validated portfolio.

Returns An hte3_model object that you can pass to predict() and summary().

Use when Your target is a conditional mean difference. For continuous-treatment CATE tasks, the supported high-level method is currently "r", which uses the partially linear R-learner effect model A * tau(X). When the modifiers V are a strict subset of the confounders, the "dr" and "ep" methods and the default two-stage T-learner target E[Y(1) - Y(0) | V]. The current R-learner issues a warning in that setting because it does not generally target the reduced-modifier estimand.
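To make the reduced-modifier estimand concrete, here is a minimal base-R sketch of a DR-learner-style two-stage fit. It is an illustration under stated assumptions, not hte3's internal code: the simulated data, the parametric glm/lm nuisance models, and every variable name below are hypothetical. The modifier V is W1 alone, while the confounders are W1 and W2.

```r
# Base-R sketch of a doubly robust pseudo-outcome regression (not hte3 internals).
set.seed(1)
n  <- 2000
W1 <- rnorm(n)
W2 <- rnorm(n)
A  <- rbinom(n, 1, plogis(0.5 * W1))   # binary treatment, confounded by W1
tau_true <- 1 + W1                     # true CATE, linear in W1
Y  <- W1 + W2 + A * tau_true + rnorm(n)
dat <- data.frame(W1, W2, A, Y)

# Stage 1: nuisance fits on the full confounder set (W1, W2).
pi_fit  <- glm(A ~ W1 + W2, family = binomial(), data = dat)
mu1_fit <- lm(Y ~ W1 + W2, data = dat[dat$A == 1, ])
mu0_fit <- lm(Y ~ W1 + W2, data = dat[dat$A == 0, ])

pi_hat  <- predict(pi_fit, newdata = dat, type = "response")
mu1_hat <- predict(mu1_fit, newdata = dat)
mu0_hat <- predict(mu0_fit, newdata = dat)

# Doubly robust (AIPW) pseudo-outcome: plug-in contrast plus a weighted residual.
mu_A_hat <- ifelse(dat$A == 1, mu1_hat, mu0_hat)
phi <- mu1_hat - mu0_hat +
  (dat$A - pi_hat) / (pi_hat * (1 - pi_hat)) * (dat$Y - mu_A_hat)

# Stage 2: regress the pseudo-outcome on the modifier V = W1 only,
# which targets E[Y(1) - Y(0) | V].
cate_fit <- lm(phi ~ W1, data = dat)
coef(cate_fit)
```

Because the simulated effect is 1 + W1, the second-stage intercept and slope should both land near 1. In the package, flexible learners would take the place of the glm and lm fits used here.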

CRR wrapper

`fit_crr()`

Fits a conditional risk-ratio model when a relative effect scale is more natural than a difference.

fit_crr(task, method = c("ep", "ipw", "t"), ...)

Key args task, method, base_learner, treatment_level, control_level, cross_validate

Methods Supported methods are "ep", "ipw", and "t". A character vector creates a cross-validated portfolio.

Returns An hte3_model object for CRR prediction and summary.

Use when You want a conditional risk ratio and the outcome is nonnegative.
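As a rough illustration of the estimand, the base-R sketch below computes a plug-in conditional risk ratio in the spirit of a T-learner: one outcome regression per arm, then the ratio of fitted means. It is not hte3's implementation. The simulated data, the log-link Poisson working models, and all variable names are assumptions made for this example.

```r
# Base-R sketch of a plug-in conditional risk ratio (not hte3 internals).
set.seed(2)
n  <- 2000
W1 <- rnorm(n)
A  <- rbinom(n, 1, 0.5)                 # randomized binary treatment
log_rr_true <- 0.3 + 0.2 * W1           # true log risk ratio
p  <- 0.2 * exp(A * log_rr_true)        # baseline risk 0.2, scaled under treatment
Y  <- rbinom(n, 1, p)                   # binary, hence nonnegative, outcome
dat <- data.frame(W1, A, Y)

# One log-link outcome regression per arm; a log link keeps fitted means positive.
fit1 <- glm(Y ~ W1, family = poisson(link = "log"), data = dat[dat$A == 1, ])
fit0 <- glm(Y ~ W1, family = poisson(link = "log"), data = dat[dat$A == 0, ])

# The conditional risk ratio is the ratio of fitted means, computed on the log scale.
log_crr <- predict(fit1, newdata = dat, type = "link") -
  predict(fit0, newdata = dat, type = "link")
crr <- exp(log_crr)
summary(crr)
```

The ratio is only meaningful when both conditional means are positive, which is one way to read the nonnegative-outcome requirement above.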

Prediction

`predict()`

Generates fitted effect predictions on the training task or on a new data set.

predict(object, new_data = NULL)

Key args object, new_data

Returns A numeric vector of predicted heterogeneous treatment effects on the chosen data.

New data rules If new_data = NULL, predictions are returned on the training task. When new_data is a data frame or data.table, the required modifier columns must be present; missing non-modifier analysis columns are filled internally when possible.
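A quick pre-flight check can catch a missing modifier column before prediction. The helper below is a hypothetical base-R sketch and is not part of hte3; the function name check_modifiers() and the example column names are assumptions.

```r
# Hypothetical helper (not an hte3 function): fail early when a new cohort
# lacks any of the modifier columns the model was trained on.
check_modifiers <- function(new_data, modifiers) {
  missing_cols <- setdiff(modifiers, names(new_data))
  if (length(missing_cols) > 0) {
    stop("new_data is missing modifier columns: ",
         paste(missing_cols, collapse = ", "), call. = FALSE)
  }
  invisible(new_data)
}

modifiers  <- c("W1", "W2", "W3")
new_cohort <- data.frame(W1 = c(0.1, -0.4), W2 = c(1.2, 0.3), W3 = c(0, 1))

# Passes silently: all modifier columns are present.
check_modifiers(new_cohort, modifiers)

# Fails: W3 has been dropped, so the error message is captured instead.
incomplete <- new_cohort[, c("W1", "W2")]
result <- tryCatch(
  check_modifiers(incomplete, modifiers),
  error = function(e) conditionMessage(e)
)
result
```

Running a check like this first keeps schema problems separate from modeling errors raised later in predict().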

Use when You want fitted values on the training sample or effect predictions for a new cohort with the same modifier schema.

Model summary

`summary()`

Prints the main training metadata for a fitted high-level model.

summary(object)

Key args object

Returns A compact summary object containing the target, method, cross-validation flag, row count, modifiers, and treatment variable.

Use when You want a fast check of what model was fit before moving on to downstream analysis or comparison.

Advanced entrypoint

`make_hte3_Task_tx()`

The lower-level task constructor for users who want tighter control over nuisance learners or user-supplied nuisance estimates.

make_hte3_Task_tx(data, modifiers, confounders, treatment, outcome, ...)

Key args learner_pi, learner_mu, learner_m; the user-supplied nuisance estimates pi.hat, mu.hat, and m.hat; and cross_fit_and_cv

Returns An hte3_Task object closer to the original advanced interface.

Use when You need direct nuisance control, external nuisance estimates, or a workflow closer to the original learner-level API.

Common Defaults

What you can usually leave alone

Most users can leave these settings at their defaults until they need method comparison or learner customization.

`get_autoML()`

hte_task() defaults the propensity and outcome learners to get_autoML(), and the fit wrappers default base_learner to the same stack. The stack always includes its core learners and adds earth, ranger, and xgboost components when those optional packages are available.
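To see which optional components could be added on a given machine, one option is to check for the three optional packages directly. This is a small base-R sketch and does not call get_autoML() itself. Note that requireNamespace() loads a package's namespace when it is installed.

```r
# Check which of the optional learner packages are installed.
optional_pkgs <- c("earth", "ranger", "xgboost")
available <- vapply(optional_pkgs, requireNamespace, logical(1), quietly = TRUE)
available

# Names of any optional packages that are not installed.
names(available)[!available]
```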

`cross_fit` in `hte_task()`

This controls nuisance cross-fitting. In most analyses, the default setting is appropriate.
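For readers new to the idea, the base-R sketch below shows what nuisance cross-fitting means in general: each row's nuisance prediction comes from a model that never saw that row. It is a generic illustration on simulated data, not hte3's fold logic. The fold count and the logistic propensity model are assumptions.

```r
# Base-R sketch of K-fold cross-fitting for a propensity score (not hte3 internals).
set.seed(3)
n  <- 500
W1 <- rnorm(n)
A  <- rbinom(n, 1, plogis(0.8 * W1))
dat <- data.frame(W1, A)

K    <- 5
fold <- sample(rep(seq_len(K), length.out = n))  # balanced random fold labels
pi_hat <- rep(NA_real_, n)

for (k in seq_len(K)) {
  held_out <- fold == k
  # Fit on the other K - 1 folds, then predict only for the held-out fold.
  fit_k <- glm(A ~ W1, family = binomial(), data = dat[!held_out, ])
  pi_hat[held_out] <- predict(fit_k, newdata = dat[held_out, ], type = "response")
}

summary(pi_hat)
```

Out-of-fold predictions like these limit the overfitting bias that flexible nuisance learners can otherwise pass on to the effect estimate.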

`cross_validate` in `fit_*()`

This controls selection across candidate learners. It turns on automatically for learner portfolios and for stacked base learners.

When to ignore tuning

For an initial analysis, keep the default learners, fit one method at a time, and postpone EP tuning or learner-library customization until needed.

Quickstart

A minimal CATE analysis

library(hte3)

data <- hte3_example_data(n = 150, seed = 1)

task <- hte_task(
  data = data,
  modifiers = c("W1", "W2", "W3"),
  confounders = c("W1", "W2", "W3"),
  treatment = "A",
  outcome = "Y",
  cross_fit = FALSE
)

fit <- fit_cate(
  task,
  method = "dr",
  cross_validate = FALSE
)

pred <- predict(fit, data)
summary(fit)

For CRR: switch the outcome to a nonnegative column such as Y_binary and call fit_crr() instead.
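The CRR variant of the quickstart might look like the sketch below. It reuses only the calls and column names shown on this page. The choice of method = "t" is arbitrary and for illustration only, and the example has not been run against a particular hte3 version, so treat it as a starting point. The requireNamespace() guard lets the script run even where hte3 is not installed.

```r
crr_pred <- NULL  # stays NULL when hte3 is not installed

if (requireNamespace("hte3", quietly = TRUE)) {
  library(hte3)

  data <- hte3_example_data(n = 150, seed = 1)

  # Same task as the CATE quickstart, but with the nonnegative outcome Y_binary.
  task <- hte_task(
    data = data,
    modifiers = c("W1", "W2", "W3"),
    confounders = c("W1", "W2", "W3"),
    treatment = "A",
    outcome = "Y_binary",
    cross_fit = FALSE
  )

  # Fit a single CRR method; "t" is an illustrative choice among "ep", "ipw", "t".
  fit <- fit_crr(
    task,
    method = "t",
    cross_validate = FALSE
  )

  crr_pred <- predict(fit, data)
  print(summary(fit))
}
```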

Workflows

Choose the guide that matches where you are

The package vignettes are organized by analysis type rather than only by package internals.

Quickstart

Use this vignette for a minimal end-to-end example.

CATE workflow

Examples of the high-level workflow for conditional average treatment effect estimation.

CRR workflow

Use this vignette when relative effects and nonnegative outcomes are the appropriate target.

EP-learner sieve tuning

Explains the Sieve-based EP construction, basis ordering, and paper-faithful sieve CV grids.

Advanced sl3 integration

Documents lower-level learner control and manual customization.

Legacy Reproducibility

Paper reproduction stays available without cluttering the main workflow

The package preserves the original paper reproduction workflow, but the website keeps it separate from the main workflow.

Legacy paper reproduction

Use the frozen reproducibility path when the goal is to rerun the original research scripts rather than begin a new analysis.
