Advanced Bayesian multilevel adjustment

Aim

The previous adjustment vignette introduced Bayesian multilevel modelling as one of the adjustment methods implemented in debiasR. This vignette describes the Bayesian model options in more detail. It explains how adjust_multilevel_bayes() can be used to adjust origin-destination (OD) flows derived from mobile phone data (MPD), what data inputs the function expects, which parameters define each model variant, and how to interpret the outputs returned by debiasR.

The vignette focuses on the three observation_model variants: coverage_offset, latent_two_level and reduced_form. The variants share a common purpose: they use observed MPD flows, coverage, area characteristics and OD distance to produce adjusted flows. They differ in how the adjusted flow is defined and on which scale it is reported. The vignette also explains model_engine = "frequentist" as a deterministic way to test data structures, functional forms and covariates before fitting a Bayesian model. The examples use the local authority district (LAD) data supplied via debiasR_example_data().

LAD data inputs

The Bayesian adjustment function needs an OD table with observed MPD flows. Depending on the model variant, it can also use coverage, covariates and distances. The LAD example prepares a complete OD grid so every origin-destination pair has a row.

lad_data <- debiasR_example_data(
  n_areas = Inf,
  complete_grid = TRUE,
  geography = "lad"
)

mpd_od_df <- lad_data$mpd_od
mpd_od_df$mpd_time <- "single_period"

coverage_df <- lad_data$coverage
coverage_df$mpd_time <- "single_period"

covariates_df <- lad_data$covariates
distance_df <- lad_data$distance
benchmark_od_df <- lad_data$benchmark_od
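
The complete-grid property can be checked by comparing an OD table with every origin-destination combination of its area codes. The sketch below uses a small hypothetical table (toy_od) instead of the LAD data, and the helper is_complete_grid() is illustrative only; it is not a debiasR function.

```r
# Hypothetical toy OD table with one missing pair (B -> A), for illustration only.
toy_od <- data.frame(
  origin      = c("A", "A", "B"),
  destination = c("A", "B", "B"),
  flow        = c(10, 2, 7)
)

# Illustrative helper (not a debiasR function): TRUE when every
# origin-destination combination of the area codes has exactly one row.
is_complete_grid <- function(od) {
  areas <- sort(unique(c(od$origin, od$destination)))
  keys <- paste(od$origin, od$destination, sep = "|")
  full <- expand.grid(origin = areas, destination = areas,
                      stringsAsFactors = FALSE)
  full_keys <- paste(full$origin, full$destination, sep = "|")
  !anyDuplicated(keys) && setequal(keys, full_keys)
}

is_complete_grid(toy_od)

# Zero-fill the missing pairs so the grid becomes complete.
areas <- sort(unique(c(toy_od$origin, toy_od$destination)))
full_grid <- expand.grid(origin = areas, destination = areas,
                         stringsAsFactors = FALSE)
toy_complete <- merge(full_grid, toy_od, all.x = TRUE)
toy_complete$flow[is.na(toy_complete$flow)] <- 0

is_complete_grid(toy_complete)
```

A complete grid over n areas has n squared rows, so the row count is a quick first check before running the comparison.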

mpd_od_df contains the observed MPD flows. flow is the observed MPD count for an origin-destination pair. mpd_source and mpd_time identify the digital source and period represented by the row.

head(mpd_od_df)

# A tibble: 6 × 8
  origin    destination mpd_source   mpd_observed mpd_zero_filled mpd_row_status
  <chr>     <chr>       <chr>        <lgl>        <lgl>           <chr>
1 E06000001 E06000001   locomizer_t… TRUE         FALSE           observed
2 E06000001 E06000002   locomizer_t… TRUE         FALSE           observed
3 E06000001 E06000003   locomizer_t… TRUE         FALSE           observed
4 E06000001 E06000004   locomizer_t… TRUE         FALSE           observed
5 E06000001 E06000005   locomizer_t… TRUE         FALSE           observed
6 E06000001 E06000006   locomizer_t… FALSE        TRUE            zero_filled
# ℹ 2 more variables: flow <dbl>, mpd_time <chr>

coverage_df contains the estimated MPD coverage for each area, source and period. In this example, the coverage rate is the share of the benchmark population observed in the MPD source, that is user_count divided by population (about 0.048 for E06000001).

head(coverage_df)

# A tibble: 6 × 6
  origin    destination population user_count mpd_source                mpd_time
  <chr>     <chr>            <dbl>      <dbl> <chr>                     <chr>
1 E06000001 E06000001        25042       1207 locomizer_travel_to_work… single_…
2 E06000002 E06000002        35807       1028 locomizer_travel_to_work… single_…
3 E06000003 E06000003        33845       1144 locomizer_travel_to_work… single_…
4 E06000004 E06000004        53104       1755 locomizer_travel_to_work… single_…
5 E06000005 E06000005        28299       1165 locomizer_travel_to_work… single_…
6 E06000006 E06000006        36310       1744 locomizer_travel_to_work… single_…
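
As a small worked example, the coverage rate can be computed from the population and user_count columns shown above. The sketch assumes that the rate is user_count divided by population; the data frame coverage_toy and the column name coverage_rate are introduced here for illustration.

```r
# First rows of the coverage table shown above (population and user_count only).
coverage_toy <- data.frame(
  origin     = c("E06000001", "E06000002", "E06000003"),
  population = c(25042, 35807, 33845),
  user_count = c(1207, 1028, 1144)
)

# Assumed definition: coverage rate = MPD users / benchmark population.
coverage_toy$coverage_rate <- coverage_toy$user_count / coverage_toy$population

round(coverage_toy$coverage_rate, 3)
# 0.048 0.029 0.034
```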

covariates_df provides origin and destination characteristics that can enter the mobility model. distance_df supplies the OD distance used in the gravity component of the model.

head(covariates_df)

# A tibble: 6 × 10
  area      name         year per_ukborn per_age_20.29 per_age_70plus per_level4
  <chr>     <chr>       <int>      <dbl>         <dbl>          <dbl>      <dbl>
1 E06000001 Hartlepool   2021       96.0          11.5           14.1       24.8
2 E06000002 Middlesbro…  2021       87.7          14.2           11.8       26.4
3 E06000003 Redcar and…  2021       97.1          10.3           17.0       24.9
4 E06000004 Stockton-o…  2021       93.8          11.0           13.4       29.5
5 E06000005 Darlington   2021       92.2          11.0           14.9       29.0
6 E06000006 Halton       2021       95.2          11.4           13.0       23.9
# ℹ 3 more variables: per_hh_no_centralheat <dbl>,
#   per_NS_SeC_L13_routine <dbl>, rural_pct <dbl>

head(distance_df)

# A tibble: 6 × 4
  origin    destination distance_km distance_source
  <chr>     <chr>             <dbl> <chr>
1 E06000001 E06000001           0   debiasRdata_lad_centroids
2 E06000001 E06000002          15.1 debiasRdata_lad_centroids
3 E06000001 E06000003          20.9 debiasRdata_lad_centroids
4 E06000001 E06000004          13.5 debiasRdata_lad_centroids
5 E06000001 E06000005          24.8 debiasRdata_lad_centroids
6 E06000001 E06000006         176.  debiasRdata_lad_centroids

benchmark_od_df is not used to fit the Bayesian adjustment model. It is used after adjustment to validate the adjusted flows against an external benchmark.

head(benchmark_od_df)

# A tibble: 6 × 6
  origin    destination benchmark_observed benchmark_zero_filled
  <chr>     <chr>       <lgl>              <lgl>
1 E06000001 E06000001   TRUE               FALSE
2 E06000001 E06000002   TRUE               FALSE
3 E06000001 E06000003   TRUE               FALSE
4 E06000001 E06000004   TRUE               FALSE
5 E06000001 E06000005   TRUE               FALSE
6 E06000001 E06000006   FALSE              TRUE
# ℹ 2 more variables: benchmark_row_status <chr>, flow <dbl>

Bayesian options

adjust_multilevel_bayes() exposes three Bayesian model variants through the observation_model argument.

Function argument	Main question	Adjusted flow scale	When to start here
`observation_model = "coverage_offset"`	What true-flow pattern is most consistent with the observed MPD counts, after accounting for coverage?	`flow_adj` and `flow_true_pred` are on the true-flow scale.	Use this as the main starting point when coverage rates are available.
`observation_model = "latent_two_level"`	What shared hidden OD or OD-time flow is consistent with repeated source/time observations?	`flow_adj` and `flow_true_pred` summarize the latent true-flow intensity.	Use this when the same OD or OD-time flow is observed through real repeated source or time rows.
`observation_model = "reduced_form"`	What fitted MPD flow would be expected if the fitted bias term were neutral?	`flow_adj` is an MPD-scale counterfactual.	Use this for compatibility checks or sensitivity analyses on the MPD scale.

The function also offers a deterministic fitting option: model_engine = "frequentist". This is not a Bayesian model. It is a useful way to test joins, formulas, functional forms, covariates and count-family choices before fitting a Bayesian model. The frequentist option is available for coverage_offset and reduced_form; latent_two_level requires model_engine = "bayesian".

Model equations

This section gives the general equations behind the three model variants. The notation is deliberately close to the debiasR inputs and outputs:

\(Y_{ij}\) is the observed MPD flow from origin \(i\) to destination \(j\).
\(Y_{ijst}\) is the observed MPD flow for origin \(i\), destination \(j\), source \(s\) and time period \(t\).
\(q\) is the observation_probability derived from the selected coverage_scale.
\(F^{true}\) is the true-flow intensity targeted by the true-flow model variants.
flow_mpd_pred is the fitted MPD-scale prediction.
flow_true_pred is the fitted true-flow-scale prediction where that scale is defined by the selected model variant.
flow_adj is the adjusted flow returned by adjust_multilevel_bayes().

The equations are there to clarify what each function option is doing. The key idea is simple: each model starts with an observed MPD count, then decides whether to treat that count as a coverage-scaled observation of a true flow, a repeated observation of a shared latent flow, or an MPD-scale counterfactual.

`coverage_offset`

The coverage_offset model treats the observed MPD flow as a coverage-scaled observation of a true OD flow:

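\[ Y_{ij} \sim \operatorname{Poisson}\left(q_{ij} F^{true}_{ij}\right) \]

\[ \log \mathbb{E}[Y_{ij}] = \log(q_{ij}) + \eta^{true}_{ij} \]

\[ \eta^{true}_{ij} = \alpha + X_i \beta_o + X_j \beta_d + \gamma \log(d_{ij}) + u_i \]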

Here, \(X_i\) and \(X_j\) are origin and destination covariates, \(d_{ij}\) is distance and \(u_i\) is an optional origin random intercept. The offset \(\log(q_{ij})\) enters the observation equation, so the model estimates a true-flow intensity and then maps it back to the MPD observation scale.

For this model variant, flow_adj is the adjusted true-flow prediction. It is reported on the same scale as flow_true_pred. flow_mpd_pred keeps the coverage-adjusted fitted value on the MPD observation scale.
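
The relationship between the two prediction scales can be illustrated with a plain Poisson regression that uses the log observation probability as an offset. This is a minimal sketch on simulated data with base R's glm(), not the code debiasR runs internally; the variable names, coefficients and sample size are assumptions chosen for illustration, and the random intercept is omitted.

```r
set.seed(1)

# Simulated toy data; all names and values here are illustrative assumptions.
n <- 500
toy <- data.frame(
  log_pop_o    = rnorm(n, 10, 0.5),
  log_distance = rnorm(n, 3, 0.7),
  q            = runif(n, 0.02, 0.08)   # observation probability (coverage)
)
true_flow <- exp(-1 + 0.8 * toy$log_pop_o - 0.9 * toy$log_distance)
toy$flow  <- rpois(n, toy$q * true_flow)  # observed MPD count

# Poisson regression with log(q) as a fixed offset in the observation equation.
fit <- glm(
  flow ~ log_pop_o + log_distance + offset(log(q)),
  family = poisson(),
  data = toy
)

# MPD-scale fitted value: includes the offset.
flow_mpd_pred <- predict(fit, type = "response")

# True-flow-scale value: same linear predictor with the offset removed.
flow_true_pred <- exp(predict(fit, type = "link") - log(toy$q))

# The two scales differ exactly by the observation probability.
all.equal(unname(flow_mpd_pred), unname(toy$q * flow_true_pred))
```

Because the offset has a fixed coefficient of one, dividing the MPD-scale fitted value by the observation probability recovers the true-flow-scale value row by row.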

`latent_two_level`

The latent_two_level model is designed for repeated observations. For example, the same OD pair may be observed by more than one MPD source, or the same OD pair may be observed across several time periods. The model separates a shared hidden flow from the source/time observation process:

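\[ Y_{ijst} \sim \operatorname{Poisson}(\mu_{ijst}) \]

\[ \log(\mu_{ijst}) = \log(q_{ijst}) + \theta_{g(i,j,t)} + z_{ijst} \]

\[ \theta_{g(i,j,t)} \sim \operatorname{Normal}\left(\eta^{true}_{g(i,j,t)}, \sigma_{\theta}\right) \]

The term \(\theta_{g(i,j,t)}\) is the latent true-flow state shared by the rows that refer to the same OD or OD-time unit, and \(z_{ijst}\) is the source/time-specific part of the observation layer. The argument latent_flow_unit controls whether that state is defined at the OD level or the OD-time level. The returned column latent_flow_id identifies the shared state.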

For this model variant, flow_adj and flow_true_pred summarize the latent true-flow intensity. flow_mpd_pred remains source/time-specific because it includes the observation layer.
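
The value of repeated observations can be seen in a small simulation. The sketch below is not the Stan model fitted by debiasR: it simulates two hypothetical sources observing the same latent units and then uses the simple pooled Poisson estimate, sum of counts divided by sum of observation probabilities, without the hierarchical prior or any source/time-specific term. All names and values are assumptions.

```r
set.seed(2)

# Toy setting (illustrative assumptions): 200 latent OD units, each observed
# once by two hypothetical sources with different coverage.
n_units  <- 200
eta_true <- rnorm(n_units, mean = 5, sd = 1)           # structured mean, log scale
theta    <- rnorm(n_units, mean = eta_true, sd = 0.3)  # latent log true-flow state

obs <- data.frame(
  latent_flow_id = rep(seq_len(n_units), times = 2),
  mpd_source     = rep(c("source_a", "source_b"), each = n_units),
  q              = rep(c(0.05, 0.10), each = n_units)
)
obs$flow <- rpois(nrow(obs), obs$q * exp(theta[obs$latent_flow_id]))

# With Y_s ~ Poisson(q_s * F), the maximum-likelihood estimate of the shared
# intensity F pools the rows of a unit: sum(Y_s) / sum(q_s).
pooled <- aggregate(cbind(flow, q) ~ latent_flow_id, data = obs, FUN = sum)
pooled$flow_true_hat <- pooled$flow / pooled$q

# Agreement between the pooled estimate and the simulated latent state.
cor(log1p(pooled$flow_true_hat), theta)
```

Rows that share a latent_flow_id contribute to one estimate of the shared intensity, which is the role the latent state plays in the equations above.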

`reduced_form`

The reduced_form model stays on the MPD observation scale. It models the observed MPD flow directly:

\[ Y_{ij} \sim \operatorname{Poisson}\left(\mu^{mpd}_{ij}\right) \]

\[ \log\left(\mu^{mpd}_{ij}\right) = \alpha + X_i \beta_o + X_j \beta_d + \gamma \log(d_{ij}) + B_{ij} \]

The term \(B_{ij}\) represents the fitted bias component supplied through bias_formula. The adjusted value is a counterfactual MPD prediction in which the fitted bias component is set to its neutral value.

For this model variant, flow_adj is an MPD-scale counterfactual. It should not be interpreted as the same true-flow quantity returned by coverage_offset or latent_two_level.
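
The counterfactual idea can be sketched with a plain Poisson regression: fit the MPD-scale model with a bias covariate, then predict again with that covariate fixed at a neutral value. The example uses simulated data and base R's glm(); the covariate bias_x, the choice of zero as the neutral value and all coefficients are assumptions for illustration, not the debiasR implementation.

```r
set.seed(3)

# Simulated toy data; variable names and coefficients are illustrative assumptions.
n <- 500
toy <- data.frame(
  log_pop_o    = rnorm(n, 10, 0.5),
  log_distance = rnorm(n, 3, 0.7),
  bias_x       = rnorm(n)            # hypothetical bias covariate, centred at 0
)
toy$flow <- rpois(
  n,
  exp(-2 + 0.8 * toy$log_pop_o - 0.9 * toy$log_distance + 0.5 * toy$bias_x)
)

# MPD-scale model: mobility terms plus a bias term.
fit <- glm(flow ~ log_pop_o + log_distance + bias_x,
           family = poisson(), data = toy)

# Fitted MPD-scale prediction with the bias term as observed.
flow_mpd_pred <- predict(fit, type = "response")

# Counterfactual: same rows with the bias covariate set to its assumed
# neutral value (0 here), so the bias term contributes nothing.
neutral  <- transform(toy, bias_x = 0)
flow_adj <- predict(fit, newdata = neutral, type = "response")

head(cbind(flow = toy$flow, flow_mpd_pred, flow_adj))
```

Both columns stay on the MPD observation scale: the counterfactual removes the fitted bias term but applies no coverage correction.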

Source and time structures

Origin-destination flows can be produced by different digital sources and reported in different time units. One provider may supply daily OD flows, another may supply weekly flows, and a research project may aggregate either of them to a monthly or study-period total. adjust_multilevel_bayes() uses source_col, time_col, scenario and repeated_observation to describe this structure.

Data structure	Function arguments	Interpretation
One source, one time period	`scenario = "s1"`, `repeated_observation = "none"`	Each OD pair appears once. This is the LAD structure used below.
One source, multiple time periods	`scenario = "s2"`, `repeated_observation = "time"`	The same source observes OD flows repeatedly over time.
Multiple sources, one time period	`scenario = "s3"`, `repeated_observation = "source"`	Several sources observe the same OD flow for the same period.
Multiple sources, multiple time periods	`scenario = "s4"`, `repeated_observation = "source_time"`	Several sources observe OD flows across several periods.

The source/time structure matters most for latent_two_level. A latent model needs repeated rows that genuinely observe the same underlying OD or OD-time flow. In contrast, coverage_offset can be fitted to a single source and a single period, as in the LAD example below.
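
The mapping in the table above can be written as a small lookup on the number of distinct sources and periods in the OD table. The helper suggest_structure() below is hypothetical and not part of debiasR; it only counts distinct values and does not verify that repeated rows genuinely observe the same underlying flow.

```r
# Illustrative helper (not part of debiasR): map the number of distinct
# sources and periods in an OD table to the settings in the table above.
suggest_structure <- function(od, source_col = "mpd_source",
                              time_col = "mpd_time") {
  multi_source <- length(unique(od[[source_col]])) > 1
  multi_time   <- length(unique(od[[time_col]])) > 1
  if (!multi_source && !multi_time) {
    list(scenario = "s1", repeated_observation = "none")
  } else if (!multi_source && multi_time) {
    list(scenario = "s2", repeated_observation = "time")
  } else if (multi_source && !multi_time) {
    list(scenario = "s3", repeated_observation = "source")
  } else {
    list(scenario = "s4", repeated_observation = "source_time")
  }
}

# Hypothetical toy table: two sources, one period.
toy_od <- data.frame(
  origin      = "A",
  destination = "B",
  mpd_source  = c("source_a", "source_b"),
  mpd_time    = "single_period",
  flow        = c(12, 20)
)

suggest_structure(toy_od)
```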

Implementation on LAD data

This section demonstrates how the model variants are called and how the returned objects should be read. The implementation uses the LAD inputs created above.

Check the data and formula first

A useful first step is to fit the coverage_offset model with model_engine = "frequentist". This tests the data joins, formula and output structure before the Bayesian fit.

adj_frequentist <- adjust_multilevel_bayes(
  mpd_od_df = mpd_od_df,
  coverage_df = coverage_df,
  covariates_df = covariates_df,
  distance_df = distance_df,
  mobility_formula = ~ log_pop_o + log_pop_d + log_distance +
    rural_pct_o + rural_pct_d + per_level4_o + per_level4_d,
  bias_formula = ~ 0,
  target_scale = "true_flow",
  observation_model = "coverage_offset",
  coverage_scale = "origin",
  model_engine = "frequentist",
  scenario = "s1",
  source_col = "mpd_source",
  time_col = "mpd_time",
  repeated_observation = "none",
  prediction_scope = "complete_grid",
  random_intercept = "none",
  model_family = "poisson"
)

head(
  adj_frequentist[
    c(
      "origin", "destination", "flow", "flow_adj",
      "flow_mpd_pred", "flow_true_pred", "observation_probability"
    )
  ]
)

# A tibble: 6 × 7
  origin    destination  flow flow_adj flow_mpd_pred flow_true_pred
  <chr>     <chr>       <dbl>    <dbl>         <dbl>          <dbl>
1 E06000001 E06000001     854  14140.         682.          14140.
2 E06000001 E06000002      33    117.           5.62          117.
3 E06000001 E06000003      14    101.           4.87          101.
4 E06000001 E06000004      95    138.           6.68          138.
5 E06000001 E06000005      15     99.8          4.81           99.8
6 E06000001 E06000006       0     83.9          4.04           83.9
# ℹ 1 more variable: observation_probability <dbl>

The columns shown are a subset of those returned by adjust_multilevel_bayes(). In this example, flow is the observed MPD count, observation_probability is the coverage-derived observation probability, flow_mpd_pred is the fitted MPD-scale value, and flow_adj is the adjusted true-flow prediction, which equals flow_true_pred for this model variant.

`coverage_offset` with `model_engine = "bayesian"`

The Bayesian coverage_offset call uses the same data and formula, but returns posterior summaries for the adjusted flows. The origin random intercept (1 | origin) lets the model partially pool information across origins, so estimates for origins with little data are pulled towards the overall pattern.

adj_coverage_offset <- adjust_multilevel_bayes(
  mpd_od_df = mpd_od_df,
  coverage_df = coverage_df,
  covariates_df = covariates_df,
  distance_df = distance_df,
  mobility_formula = ~ log_pop_o + log_pop_d + log_distance +
    rural_pct_o + rural_pct_d + per_level4_o + per_level4_d +
    (1 | origin),
  bias_formula = ~ 0,
  target_scale = "true_flow",
  observation_model = "coverage_offset",
  coverage_scale = "origin",
  model_engine = "bayesian",
  scenario = "s1",
  source_col = "mpd_source",
  time_col = "mpd_time",
  repeated_observation = "none",
  prediction_scope = "complete_grid",
  random_intercept = "origin",
  model_family = "poisson",
  flow_adj_summary = "median",
  iter = 1000,
  chains = 4,
  seed = 20260630,
  refresh = 0
)

The output below shows the column names returned by debiasR for the LAD coverage_offset fit.

origin	destination	flow	flow_adj	flow_mpd_pred	flow_true_pred	flow_adj_mean	flow_adj_median	flow_adj_q2.5	flow_adj_q97.5	observation_probability
E06000001	E06000001	854	14461.65	697.04	14461.65	14461.65	14459.63	14164.86	14788.37	0.05
E06000001	E06000002	33	120.80	5.82	120.80	120.80	120.79	118.21	123.54	0.05
E06000001	E06000003	14	104.62	5.04	104.62	104.62	104.61	102.37	106.94	0.05
E06000001	E06000004	95	143.62	6.92	143.62	143.62	143.60	140.53	146.81	0.05
E06000001	E06000005	15	103.58	4.99	103.58	103.58	103.56	101.35	105.95	0.05

The first displayed row is the within-area LAD flow from E06000001 to E06000001. The observed MPD flow is 854. The model estimates a much larger flow_adj because the observation_probability is about 0.048 (displayed as 0.05 after rounding): only a share of the benchmark population is observed in the MPD source. The 95% posterior interval is reported through flow_adj_q2.5 and flow_adj_q97.5.
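
The link between the two prediction columns can be checked by hand from the displayed rows. The sketch below copies values from the output table above and assumes that the coverage rate for an origin is user_count divided by population, using the coverage_df row for E06000001.

```r
# Values copied from the coverage_offset output table above.
flow_true_pred <- c(14461.65, 120.80, 104.62, 143.62, 103.58)
flow_mpd_pred  <- c(697.04, 5.82, 5.04, 6.92, 4.99)

# Implied observation probability on each row.
implied_q <- flow_mpd_pred / flow_true_pred
round(implied_q, 3)
# 0.048 0.048 0.048 0.048 0.048

# Coverage for origin E06000001 from coverage_df: user_count / population.
round(1207 / 25042, 3)
# 0.048
```

All five rows share the same origin, so with coverage_scale = "origin" they show the same implied probability.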

The fit also carries metadata and diagnostics as attributes:

attribute	field	value
attr(result, "result_metadata")	model_engine	bayesian
attr(result, "result_metadata")	backend	rstanarm
attr(result, "result_metadata")	model_family	poisson
attr(result, "result_metadata")	target_scale	true_flow
attr(result, "result_metadata")	observation_model	coverage_offset
attr(result, "result_metadata")	prediction_scope	complete_grid
attr(result, "result_metadata")	coverage_scale	origin
attr(result, "result_metadata")	random_intercept	origin
attr(result, "result_metadata")	n_fit_rows	42455
attr(result, "result_metadata")	n_prediction_rows	97969
attr(result, "diagnostics")$convergence	status	available
attr(result, "diagnostics")$convergence	rhat_max	1.030
attr(result, "diagnostics")$convergence	n_eff_min	233
validate_flow_overall(result, benchmark_od)	mae	187.0
validate_flow_overall(result, benchmark_od)	rmse	725.5
validate_flow_overall(result, benchmark_od)	pearson_r	0.933
validate_flow_overall(result, benchmark_od)	spearman_rho	0.476

These rows correspond to the information users access with:

attr(adj_coverage_offset, "result_metadata")
attr(adj_coverage_offset, "diagnostics")
attr(adj_coverage_offset, "model_terms")

The same adjusted object can be compared with the benchmark OD flows:

validate_flow_overall(adj_coverage_offset, benchmark_od_df)
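
The four summary measures reported above (mae, rmse, pearson_r, spearman_rho) can be reproduced in base R for any pair of flow vectors. The sketch below uses hypothetical values and standard definitions; it is not a copy of how validate_flow_overall() computes them, and that function may treat zero-filled or missing cells differently.

```r
# Hypothetical adjusted and benchmark flows for five OD pairs.
flow_adj       <- c(120, 35, 0, 410, 18)
flow_benchmark <- c(100, 40, 5, 400, 20)

err <- flow_adj - flow_benchmark

metrics <- c(
  mae          = mean(abs(err)),
  rmse         = sqrt(mean(err^2)),
  pearson_r    = cor(flow_adj, flow_benchmark, method = "pearson"),
  spearman_rho = cor(flow_adj, flow_benchmark, method = "spearman")
)
round(metrics, 3)
```

Pearson correlation is dominated by the largest flows, whereas Spearman correlation weights every cell's rank equally. In the LAD results above, a high pearson_r with a lower spearman_rho would be consistent with large flows being matched well while the ordering of the many small flows is less stable.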

`latent_two_level`

Use latent_two_level when the input data contain repeated observations of the same underlying OD or OD-time flow. The LAD input above is an S1 data structure: one source and one period. That structure is appropriate for coverage_offset, but it does not give a latent two-level model repeated source/time evidence to separate a shared latent flow from observation-specific variation.

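For data with repeated source/time rows, the call has the same shape but uses observation_model = "latent_two_level" and a repeated-observation setting that matches the data. In the call below, mpd_source_time_df and coverage_source_time_df stand for user-supplied tables with repeated source/time rows; they are not created from the LAD example in this vignette.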

adj_latent_two_level <- adjust_multilevel_bayes(
  mpd_od_df = mpd_source_time_df,
  coverage_df = coverage_source_time_df,
  covariates_df = covariates_df,
  distance_df = distance_df,
  mobility_formula = ~ log_pop_o + log_pop_d + log_distance +
    rural_pct_o + rural_pct_d + per_level4_o + per_level4_d,
  bias_formula = ~ 0,
  target_scale = "true_flow",
  observation_model = "latent_two_level",
  coverage_scale = "origin",
  model_engine = "bayesian",
  scenario = "s4",
  source_col = "mpd_source",
  time_col = "mpd_time",
  repeated_observation = "source_time",
  prediction_scope = "observed",
  latent_flow_unit = "od_time",
  model_family = "poisson",
  iter = 1000,
  chains = 4,
  seed = 20260630,
  refresh = 0
)

The key additional returned column is latent_flow_id. It identifies the shared OD or OD-time state used by the latent model. Rows that share a latent_flow_id share the same latent true-flow component, while flow_mpd_pred can still differ across source/time rows because the observation layer includes source/time-specific information.

attribute	field	value
attr(result, "result_metadata")	observation_model	latent_two_level
attr(result, "result_metadata")	scenario	s3 and s4
attr(result, "result_metadata")	prediction_scope	observed
attr(result, "result_metadata")	backend	stan_latent
attr(result, "result_metadata")	latent_max_treedepth	15
attr(result, "diagnostics")$convergence	status	available
attr(result, "diagnostics")$convergence	divergences	0
attr(result, "diagnostics")$convergence	treedepth_hits	0
attr(result, "diagnostics")$convergence	ebfmi_min	> 0.91
attr(result, "diagnostics")$convergence	rhat_max	about 1.023
attr(result, "diagnostics")$convergence	n_eff_min	about 190
provenance	recorded_decision_date	2026-06-25
provenance	source_note	TASK_BOARD.md and STATUS.md empirical latent approval notes

Users should inspect attr(adj_latent_two_level, "diagnostics") carefully. For latent models, important fields include divergent transitions, treedepth hits, effective sample sizes and \(\hat{R}\).
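
One way to make this inspection repeatable is a small function that compares the convergence fields with chosen thresholds. The sketch below builds a plain list with the field names and values shown in the table above. The helper check_convergence() and its thresholds are assumptions for illustration, not part of debiasR, and the structure of the real diagnostics attribute may differ.

```r
# Hypothetical convergence list using the field names shown in the table above;
# in practice it would come from
# attr(adj_latent_two_level, "diagnostics")$convergence.
convergence <- list(
  divergences    = 0,
  treedepth_hits = 0,
  ebfmi_min      = 0.91,
  rhat_max       = 1.023,
  n_eff_min      = 190
)

# Illustrative helper (not part of debiasR). The thresholds are assumed
# defaults and should be tightened or relaxed to suit the analysis.
check_convergence <- function(x, rhat_limit = 1.05, n_eff_floor = 100,
                              ebfmi_floor = 0.3) {
  c(
    no_divergences = x$divergences == 0,
    no_treedepth   = x$treedepth_hits == 0,
    ebfmi_ok       = x$ebfmi_min >= ebfmi_floor,
    rhat_ok        = x$rhat_max <= rhat_limit,
    n_eff_ok       = x$n_eff_min >= n_eff_floor
  )
}

checks <- check_convergence(convergence)
checks
all(checks)
```

With a stricter limit such as rhat_limit = 1.01, the rhat_max of about 1.023 would be flagged, so the threshold should be stated alongside any pass or fail summary.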

`reduced_form`

The reduced_form model is useful when users want continuity with earlier MPD-scale analyses. It uses bias_formula to model the fitted bias component and then returns an MPD-scale counterfactual through flow_adj.

adj_reduced_form <- adjust_multilevel_bayes(
  mpd_od_df = mpd_od_df,
  coverage_df = coverage_df,
  covariates_df = covariates_df,
  distance_df = distance_df,
  mobility_formula = ~ log_pop_o + log_pop_d + log_distance +
    rural_pct_o + rural_pct_d + per_level4_o + per_level4_d,
  bias_formula = ~ bias_e_origin,
  target_scale = "mpd_counterfactual",
  observation_model = "reduced_form",
  model_engine = "bayesian",
  scenario = "s1",
  source_col = "mpd_source",
  time_col = "mpd_time",
  repeated_observation = "none",
  prediction_scope = "complete_grid",
  random_intercept = "none",
  model_family = "poisson",
  flow_adj_summary = "median",
  iter = 1000,
  chains = 4,
  seed = 20260630,
  refresh = 0
)

A deterministic reduced_form check can be fitted on the same LAD inputs.

adj_reduced_form_check <- adjust_multilevel_bayes(
  mpd_od_df = mpd_od_df,
  coverage_df = coverage_df,
  covariates_df = covariates_df,
  distance_df = distance_df,
  mobility_formula = ~ log_pop_o + log_pop_d + log_distance +
    rural_pct_o + rural_pct_d + per_level4_o + per_level4_d,
  bias_formula = ~ bias_e_origin,
  target_scale = "mpd_counterfactual",
  observation_model = "reduced_form",
  model_engine = "frequentist",
  scenario = "s1",
  source_col = "mpd_source",
  time_col = "mpd_time",
  repeated_observation = "none",
  prediction_scope = "complete_grid",
  random_intercept = "none",
  model_family = "poisson"
)

head(
  adj_reduced_form_check[
    c("origin", "destination", "flow", "flow_adj", "flow_mpd_pred")
  ]
)

# A tibble: 6 × 5
  origin    destination  flow    flow_adj flow_mpd_pred
  <chr>     <chr>       <dbl>       <dbl>         <dbl>
1 E06000001 E06000001     854 1395706302.            NA
2 E06000001 E06000002      33   11596630.            NA
3 E06000001 E06000003      14   10043934.            NA
4 E06000001 E06000004      95   13770281.            NA
5 E06000001 E06000005      15    9936053.            NA
6 E06000001 E06000006       0    8333877.            NA

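For reduced_form, flow_adj is on the MPD scale. It is not the same target quantity as flow_adj from coverage_offset or latent_two_level, where the adjusted flow is on a true-flow scale. In the output above, flow_mpd_pred is NA and flow_adj is several orders of magnitude larger than the observed flow, so this run is best read as a check of the call and output structure; inspect the fitted bias term before interpreting the values.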

Reading `adjust_multilevel_bayes()` results

The returned object is a data frame with OD identifiers, fitted values and model-specific columns. Some columns appear for every model variant; others are specific to the observation model or engine.

Name	Where it appears	How to read it
`flow`	All model variants	Observed MPD flow from `mpd_od_df`.
`flow_adj`	All model variants	Main adjusted flow returned by `debiasR`; its scale depends on `observation_model`.
`flow_mpd_pred`	All model variants	Fitted prediction on the MPD observation scale.
`flow_true_pred`	`coverage_offset` and `latent_two_level`; present for some compatibility paths	Fitted true-flow-scale prediction where the selected model defines a true-flow target.
`flow_adj_mean`	Bayesian fits when posterior flow summaries are available	Posterior mean of the adjusted flow.
`flow_adj_median`	Bayesian fits when posterior flow summaries are available	Posterior median of the adjusted flow.
`flow_adj_q2.5`	Bayesian fits when posterior flow summaries are available	Lower 2.5 percent posterior quantile for the adjusted flow.
`flow_adj_q97.5`	Bayesian fits when posterior flow summaries are available	Upper 97.5 percent posterior quantile for the adjusted flow.
`observation_probability`	`coverage_offset` and `latent_two_level`	Coverage-derived probability used in the observation equation.
`latent_flow_id`	`latent_two_level`	Identifier for the shared OD or OD-time latent state.
`attr(result, "result_metadata")`	All model variants	Model choices, data structure and prediction scope recorded with the result.
`attr(result, "diagnostics")`	All model variants	Fit diagnostics, including convergence information when available.
`attr(result, "model_terms")`	All model variants	Resolved model terms used by the fitted model.

The most important interpretation rule is to read flow_adj together with observation_model:

observation_model	What the LAD application shows	How to read the output
coverage_offset	The LAD example uses observed MPD flows, active-user coverage, area covariates and OD distances to estimate true-flow-scale predictions. Larger S4 LAD/LTLA applications use the same returned columns with source/time coverage in the observation model.	`flow_adj` is the true-flow-scale prediction and equals `flow_true_pred`; `flow_mpd_pred` keeps the fitted MPD observation scale with the coverage offset included. Benchmark OD cells are not used to fit the Bayesian coverage-offset model.
reduced_form	Real LAD S1 applications use the same OD identifiers, covariates and distance inputs, but this model variant is retained as a compatibility and sensitivity option rather than as the recommended true-flow model.	`flow_adj` is an MPD-scale counterfactual with the fitted bias term neutralised. `flow_true_pred` is not a true-flow quantity for this variant.
latent_two_level	Repeated-source S3/S4 applications use source/time MPD rows, active-user coverage, Census benchmark validation and LAD centroid distances to estimate shared latent OD or OD-time true-flow states.	`latent_flow_id` identifies the shared OD or OD-time state. `flow_adj` and `flow_true_pred` summarize the latent true-flow intensity; `flow_mpd_pred` remains source/time-specific because it includes the observation layer.
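
This rule can be encoded as a small lookup that is consulted before results from different variants are compared. The function flow_adj_scale() below is a hypothetical helper, not part of debiasR; the scale labels reuse the target_scale values from the calls earlier in this vignette.

```r
# Illustrative lookup (not part of debiasR): the scale of flow_adj for each
# observation_model, as described in the table above.
flow_adj_scale <- function(observation_model) {
  scales <- c(
    coverage_offset  = "true_flow",
    latent_two_level = "true_flow",
    reduced_form     = "mpd_counterfactual"
  )
  if (!observation_model %in% names(scales)) {
    stop("Unknown observation_model: ", observation_model)
  }
  unname(scales[observation_model])
}

flow_adj_scale("coverage_offset")
flow_adj_scale("reduced_form")

# Two results are only comparable on flow_adj when their scales match.
flow_adj_scale("coverage_offset") == flow_adj_scale("latent_two_level")
```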

Choosing a variant in practice

Start with coverage_offset when you have coverage rates and want adjusted flows on a true-flow scale. Use model_engine = "frequentist" first when you want to test formulas and covariates, then fit model_engine = "bayesian" when you want posterior summaries and uncertainty intervals.

Use latent_two_level when the data structure contains real repeated source/time observations of the same OD or OD-time flow. This model can be more informative than coverage_offset, but only when the repeated rows provide information about a shared latent state.

Use reduced_form when the goal is an MPD-scale counterfactual or a compatibility check with earlier reduced-form analyses. Do not interpret its flow_adj column as the same true-flow quantity returned by coverage_offset or latent_two_level.
