sl3 Guide

How hte3 uses the sl3 learner framework

sl3 is the learner framework underneath hte3. It supplies the nuisance models, learner libraries, and cross-validation tools used by the package. This page explains the parts of sl3 that matter when you want to understand or customize those modeling components.

Start with the R guide for the main wrappers and their arguments. Use this page for learner customization and lower-level control settings.

  • Which learners you pass where
  • What get_autoML() already does
  • How Stack, Lrnr_sl, and CV fit together

Learner Slots

Where do learners plug into hte3?

propensity_learner

Estimates the treatment assignment mechanism. This matters especially for DR-, R-, EP-, and IPW-style workflows.

outcome_learner

Estimates the outcome regression. This is often one of the main quality drivers for CATE and CRR performance.

mean_learner

Provides the marginal mean nuisance when the workflow requires it. In many analyses, the package default is sufficient.

base_learner

Used inside the chosen meta-learner to learn the final heterogeneity surface after the pseudo-outcome step.
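The four slots above enter the pipeline at two different points. As a sketch, using the wrapper signatures shown later in this guide (`df`, `mods`, and `confs` are assumed to be defined elsewhere):

```r
library(hte3)
library(sl3)

# The three nuisance slots are supplied when building the task:
task <- hte_task(
  data = df,
  modifiers = mods,
  confounders = confs,
  treatment = "A",
  outcome = "Y",
  propensity_learner = Lrnr_glmnet$new(),  # treatment assignment model
  outcome_learner = Lrnr_ranger$new(),     # outcome regression
  mean_learner = Lrnr_mean$new()           # marginal mean nuisance
)

# base_learner enters later, inside the chosen meta-learner:
fit <- fit_cate(task, method = "dr", base_learner = Lrnr_gam$new())
```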

Default Stack

What does get_autoML() use?

Stack$new(
  Lrnr_glmnet$new(),
  Lrnr_gam$new(),
  # Added when optional runtime packages are installed:
  Lrnr_earth$new(degree = 2),
  Lrnr_ranger$new(max.depth = 10),
  Lrnr_xgboost_early_stopping$new(
    min_child_weight = 15, max_depth = 2, eta = 0.2,
    subsample = 0.8, colsample_bytree = 0.8
  ),
  Lrnr_xgboost_early_stopping$new(min_child_weight = 15, max_depth = 3, eta = 0.15, subsample = 0.9),
  Lrnr_xgboost_early_stopping$new(min_child_weight = 15, max_depth = 4, eta = 0.15, subsample = 0.9),
  Lrnr_xgboost_early_stopping$new(min_child_weight = 15, max_depth = 5, eta = 0.15, subsample = 0.9),
  Lrnr_xgboost_early_stopping$new(
    min_child_weight = 15, max_depth = 4, eta = 0.08,
    subsample = 0.8, colsample_bytree = 0.8
  )
)

Why this is useful

It mixes linear, additive, spline, tree, and boosting-style learners, which gives you a reasonable first-pass library without designing one from scratch.

When to override it

Override the default when you need shorter iteration cycles, stronger interpretability, or a task-specific learner library.

One caveat

Some sl3 learners depend on supporting packages being available in your R environment, so actual availability can vary by setup.
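One way to mirror what the default does is to guard optional learners behind an availability check. A minimal sketch; the specific packages guarded here are illustrative, not the exact list used by get_autoML():

```r
library(sl3)

# Start with learners whose dependencies ship with a typical setup:
learners <- list(Lrnr_glmnet$new(), Lrnr_gam$new())

# Add learners only when their supporting package is installed:
if (requireNamespace("ranger", quietly = TRUE)) {
  learners <- c(learners, list(Lrnr_ranger$new()))
}
if (requireNamespace("xgboost", quietly = TRUE)) {
  learners <- c(learners, list(Lrnr_xgboost$new()))
}

# Assemble whatever survived the checks into a Stack:
learner_library <- do.call(Stack$new, learners)
```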

Stacking

How Stack and Lrnr_sl work in practice

In sl3, a Stack is just a library of candidate learners. A Super Learner is what you get when you wrap that library in Lrnr_sl with a metalearner that combines or selects among the candidates.

Step 1

Build a learner library

Use Stack$new(...) to enumerate the models you want to compare. This does not fit an ensemble by itself. It just defines the candidate set.

library(sl3)

learner_library <- Stack$new(
  Lrnr_glm_fast$new(),
  Lrnr_ranger$new(),
  Lrnr_xgboost$new()
)
Step 2

Wrap it in a Super Learner

Use Lrnr_sl$new(...) when you want cross-validated selection or weighted combination across that library.

sl_fit <- Lrnr_sl$new(
  learners = learner_library,
  metalearner = Lrnr_nnls$new(),
  cv_control = list(V = 5)
)$train(task)

What Stack means in hte3

You can pass a stacked library anywhere hte3 expects a learner, such as propensity_learner, outcome_learner, or base_learner.

What Lrnr_sl adds

Lrnr_sl adds outer learner selection or weighted combination on top of a library, using a metalearner and fold structure defined through cv_control.

Common hte3 default

get_autoML() returns a Stack built from the safe default learners available in your environment, giving you a ready-made candidate library without listing learners manually.
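Because the return value is an ordinary sl3 Stack, it can be passed directly to any learner slot, or wrapped in Lrnr_sl when cross-validated combination is wanted. A sketch, assuming a `task` built as elsewhere in this guide:

```r
library(hte3)
library(sl3)

# Use the default library as-is for the final heterogeneity model:
default_library <- get_autoML()
fit <- fit_cate(task, method = "dr", base_learner = default_library)

# Or wrap it in a Super Learner for weighted combination:
sl_library <- Lrnr_sl$new(
  learners = get_autoML(),
  metalearner = Lrnr_nnls$new()
)
```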

Practical rule: use Stack when you want to define the candidates, and use Lrnr_sl when you want sl3 to cross-validate and combine them.

Cross-Validation

There are three different CV layers to keep straight

Users often say “cross-validation” when they mean different parts of the pipeline. In hte3, it helps to separate nuisance cross-fitting, sl3 learner-library CV, and outer HTE learner selection.

1. Nuisance cross-fitting

Controlled by cross_fit = TRUE in hte_task() or cross_fit_and_cv = TRUE in the low-level task builder. This is about estimating nuisance functions more robustly, not selecting among HTE methods.

task <- hte_task(
  data = df,
  modifiers = mods,
  confounders = confs,
  treatment = "A",
  outcome = "Y",
  cross_fit = TRUE
)

2. sl3 library CV

Controlled inside sl3 with Lrnr_sl and cv_control = list(V = ...). This is where a Super Learner compares members of a Stack.

sl_fit <- Lrnr_sl$new(
  learners = learner_library,
  metalearner = Lrnr_nnls$new(),
  cv_control = list(V = 5)
)$train(task)

3. Outer HTE learner CV

Controlled by cross_validate = TRUE in fit_cate() or fit_crr(), or explicitly with cross_validate_cate() and cross_validate_crr(). This is where you compare DR, R, T, EP, or CRR families.

fit <- fit_cate(
  task,
  method = c("dr", "r", "ep"),
  base_learner = learner_library,
  cross_validate = TRUE,
  cv_control = list(V = 5)
)
Recommended mental model: cross_fit is for nuisance estimation, Lrnr_sl is for sl3 Super Learner selection inside a learner library, and cross_validate = TRUE in the wrappers is for selection among HTE learner families.
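The three layers compose in a single pipeline. A hedged sketch combining them, reusing the signatures shown above (`df`, `mods`, `confs` assumed defined):

```r
library(hte3)
library(sl3)

# Layer 1: nuisance cross-fitting at task construction
task <- hte_task(
  data = df, modifiers = mods, confounders = confs,
  treatment = "A", outcome = "Y",
  cross_fit = TRUE
)

# Layer 2: sl3 library CV inside a Super Learner used as base_learner
# Layer 3: outer CV across HTE learner families via cross_validate = TRUE
fit <- fit_cate(
  task,
  method = c("dr", "r", "ep"),
  base_learner = Lrnr_sl$new(
    learners = Stack$new(Lrnr_glm_fast$new(), Lrnr_ranger$new()),
    metalearner = Lrnr_nnls$new(),
    cv_control = list(V = 5)
  ),
  cross_validate = TRUE
)
```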

Starter Recipes

Common learner strategies for practitioners

Fast baseline

Simple generalized linear learners

Use learners like Lrnr_glm_fast or Lrnr_mean for rapid iteration, debugging, or an interpretable baseline.
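A minimal baseline library along these lines:

```r
library(sl3)

# Fast, interpretable candidates for iteration and debugging:
baseline_library <- Stack$new(
  Lrnr_mean$new(),     # intercept-only reference point
  Lrnr_glm_fast$new()  # fast main-effects GLM
)
```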

Balanced default

Keep get_autoML()

Use this when a broad learner library is needed without constructing one manually.

More nonlinear signal

Tree and boosting heavy

Favor learners such as Lrnr_ranger and Lrnr_xgboost when interactions and nonlinearities are likely to matter.
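For example, a tree- and boosting-heavy library might look like the following; the hyperparameter values here are illustrative starting points, not package defaults:

```r
library(sl3)

nonlinear_library <- Stack$new(
  Lrnr_ranger$new(num.trees = 500),
  Lrnr_xgboost$new(max_depth = 3, eta = 0.1, nrounds = 200),
  Lrnr_xgboost$new(max_depth = 5, eta = 0.05, nrounds = 400)
)
```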

Custom library

Hand-built Stack or Lrnr_sl

Use this path when a preferred learner library already exists or when comparing a curated set of candidate learners.
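A hand-built library follows the same Stack-then-Lrnr_sl pattern described earlier; the particular candidates here are just an example curated set:

```r
library(sl3)

custom_sl <- Lrnr_sl$new(
  learners = Stack$new(
    Lrnr_glm_fast$new(),
    Lrnr_earth$new(degree = 2),
    Lrnr_ranger$new()
  ),
  metalearner = Lrnr_nnls$new()
)
# Pass custom_sl anywhere hte3 expects a learner, e.g. outcome_learner
# in hte_task() or base_learner in fit_cate().
```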

Code Patterns

Small patterns you can copy into real analyses

Fast and explicit

hte_task(
  data = df,
  modifiers = mods,
  confounders = confs,
  treatment = "A",
  outcome = "Y",
  propensity_learner = Lrnr_glm_fast$new(),
  outcome_learner = Lrnr_glm_fast$new(),
  mean_learner = Lrnr_mean$new()
)

Custom base learner for CATE

fit_cate(
  task,
  method = "dr",
  base_learner = Lrnr_ranger$new(),
  cross_validate = FALSE
)
Continuous-treatment note: in the current package, the continuous-treatment CATE path is the R-learner. That implementation uses the partially linear effect-model view with an A * tau(X) term rather than a fully general treatment-response surface.
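In the standard residualized form of the R-learner (this is the generic formulation, not package-specific notation), the partially linear view above can be written as:

```latex
% Partially linear effect model assumed by the R-learner path:
%   Y = m(X) + (A - e(X))\,\tau(X) + \varepsilon,
% with m(X) = E[Y \mid X] and e(X) = E[A \mid X].
% \tau is estimated by the residual-on-residual least-squares criterion:
\hat{\tau} = \arg\min_{\tau}\;
  \frac{1}{n} \sum_{i=1}^{n}
  \Big[ \big(Y_i - \hat{m}(X_i)\big)
      - \big(A_i - \hat{e}(X_i)\big)\,\tau(X_i) \Big]^2
```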

Reduced-modifier-set note: suppose the target modifiers are V and the nuisance adjustment set is W, with V a strict subset of W. The target of interest is then E[Y(1) - Y(0) | V] = E[tau(W) | V]. In the supported binary/categorical-treatment setting, the DR- and EP-learners target that surface. The current R-learner instead targets the overlap-weighted projection f_R(V) = E[Var(A|W) tau(W) | V] / E[Var(A|W) | V], which for binary treatment becomes E[e(W)(1 - e(W)) tau(W) | V] / E[e(W)(1 - e(W)) | V].

Official sl3 Resources

References for further detail

These are the upstream references most likely to help an hte3 user make better modeling decisions.

sl3 package site

The main package homepage with reference docs and articles.

Intro to sl3

A practical article for understanding tasks, learners, training, and prediction.

tlverse handbook chapter

A longer-form chapter that is often easier to learn from than jumping straight into reference pages.