Global coefficient calibration with multiple model families

Calibrate a multiplicative coefficient that links MPD-derived flows to benchmark flows, following the "coefficient" approach in Chi et al.

Usage

adjust_coefficient(
  mpd_od_df,
  benchmark_od_df,
  flow_col_mpd = "flow",
  flow_col_bench = "flow",
  model_family = c("ols", "poisson", "negbin", "zinb"),
  level = c("od", "origin", "destination"),
  fit_intercept = FALSE,
  by_source = FALSE,
  keep_cols = character()
)

Arguments

mpd_od_df

Data frame of MPD flows with at least: origin, destination and flow_col_mpd. Optionally mpd_source for by_source = TRUE.

benchmark_od_df

Data frame of benchmark flows with at least: origin, destination and flow_col_bench.

flow_col_mpd

Name of MPD flow column. Default "flow".

flow_col_bench

Name of benchmark flow column. Default "flow".

model_family

One of "ols", "poisson", "negbin", "zinb". Default "ols".

level

Aggregation level for calibration:

"od" (default): use OD pairs.
"origin": use origin totals.
"destination": use destination totals.

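The three levels can be illustrated in base R. This is a minimal sketch on an invented toy table; it assumes origin and destination totals are plain sums of the flow column, which this page does not state explicitly.

```r
# Toy OD table: column names follow the documented inputs, values are invented.
od <- data.frame(
  origin      = c("A", "A", "B", "B"),
  destination = c("B", "C", "A", "C"),
  flow        = c(10, 5, 8, 2)
)

# level = "od": the table is used as is, one row per OD pair.

# level = "origin": one total per origin (assumed to be a simple sum).
origin_totals <- aggregate(flow ~ origin, data = od, FUN = sum)

# level = "destination": one total per destination (assumed to be a simple sum).
destination_totals <- aggregate(flow ~ destination, data = od, FUN = sum)
```
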
fit_intercept

For model_family = "ols" only. If FALSE (default), fit through the origin, $F^{bench} = \beta F^{mpd}$. If TRUE, fit $F^{bench} = \alpha + \beta F^{mpd}$ and derive a flow-specific correction factor $CF_{ij} = \hat{F}_{ij}^{bench}/F_{ij}^{mpd}$. Ignored for count models, whose functional form is fixed (see Details).
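
The two OLS variants can be reproduced with stats::lm(). The sketch below uses invented numbers and is not the function's internal code; the vectors mpd and bench are illustrative stand-ins for already matched flows.

```r
# Invented, already matched flows; bench equals 5 + 2 * mpd exactly.
mpd   <- c(10, 20, 30, 40)
bench <- c(25, 45, 65, 85)

# fit_intercept = FALSE: regression through the origin, bench = beta * mpd.
fit_origin  <- lm(bench ~ 0 + mpd)
beta_origin <- unname(coef(fit_origin)["mpd"])

# fit_intercept = TRUE: bench = alpha + beta * mpd.
fit_int <- lm(bench ~ mpd)

# Flow-specific correction factor: predicted benchmark flow over MPD flow.
cf <- unname(fitted(fit_int)) / mpd
```

With an intercept, the correction factor varies by flow size (here it shrinks towards the slope as the MPD flow grows), whereas the through-origin fit applies a single factor to every flow.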

by_source

Logical; if TRUE and both inputs contain mpd_source, estimate separate coefficients per source.
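
Per-source estimation amounts to fitting the same model once per mpd_source group. A minimal sketch, assuming the MPD and benchmark flows have already been joined into one table; the column names flow_mpd and flow_bench and all values are hypothetical.

```r
# Hypothetical joined table: one row per matched flow, tagged by source.
joined <- data.frame(
  mpd_source = c("s1", "s1", "s2", "s2"),
  flow_mpd   = c(10, 20, 10, 20),
  flow_bench = c(20, 40, 30, 60)
)

# One through-origin OLS coefficient per source.
beta_by_source <- sapply(
  split(joined, joined$mpd_source),
  function(d) unname(coef(lm(flow_bench ~ 0 + flow_mpd, data = d)))
)
```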

keep_cols

Optional character vector of extra columns from mpd_od_df to retain.

Value

A tibble with:

origin, destination (and mpd_source if present),
flow: original MPD flow,
flow_adj: adjusted flow,
coef_factor: applied multiplicative factor.

Attributes:

"coef": estimated coefficient(s)
"model": data frame summarising fits

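The shape of the return value can be mocked up as follows. This sketch builds a stand-in result by hand with a base data frame and a hypothetical coefficient of 2.5; it assumes flow_adj is flow multiplied by coef_factor, which is implied by the column descriptions but not stated.

```r
# Stand-in for a returned table (a plain data frame, not an actual tibble).
res <- data.frame(
  origin      = c("A", "B"),
  destination = c("B", "A"),
  flow        = c(10, 20)
)

beta <- 2.5                                 # hypothetical estimated coefficient
res$coef_factor <- beta                     # factor applied to each row
res$flow_adj    <- res$flow * res$coef_factor  # assumed relation
attr(res, "coef") <- beta

# Attributes are read back with attr().
estimated <- attr(res, "coef")
```
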
Details

Notation used throughout:

$F_{ij}^{mpd}$: observed MPD flow from origin $i$ to destination $j$
$F_{ij}^{bench}$: benchmark flow for the same OD pair
$F_{ij}^{adj}$: adjusted flow
$\beta > 0$: multiplicative calibration coefficient

All supported families impose a proportional structure $$E[F_{ij}^{bench}] = \beta F_{ij}^{mpd}$$ (the one exception is "ols" with fit_intercept = TRUE, which adds an intercept). They differ in the distribution assumed for the benchmark flows:

"ols": linear regression (baseline).
"poisson": Poisson GLM with offset $\log(F_{ij}^{mpd})$.
"negbin": Negative Binomial GLM with offset $\log(F_{ij}^{mpd})$.
"zinb": Zero-inflated Negative Binomial with offset $\log(F_{ij}^{mpd})$.

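For "poisson", "negbin", and "zinb", we fit an intercept-only model on the log scale: $$\log(\mu_{ij}) = \alpha + \log(F_{ij}^{mpd})$$ so that $\mu_{ij} = \exp(\alpha) F_{ij}^{mpd}$ and $\beta = \exp(\alpha)$. Here $\alpha$ is the log-link intercept, not the OLS intercept described under fit_intercept.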
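
The offset formulation can be checked with stats::glm(). The sketch below covers the Poisson case only, on invented counts. For an intercept-only Poisson model with a log offset, the score equation $\sum_{ij}(F_{ij}^{bench} - \mu_{ij}) = 0$ gives $\hat\beta = \sum F_{ij}^{bench} / \sum F_{ij}^{mpd}$, which the code uses as a cross-check.

```r
# Invented, already matched flows; benchmark values are counts.
mpd   <- c(10, 20, 30, 40)
bench <- c(22, 37, 65, 76)

# Intercept-only Poisson GLM; the offset fixes the coefficient of log(mpd) at 1.
fit <- glm(bench ~ 1 + offset(log(mpd)), family = poisson())

alpha <- unname(coef(fit)["(Intercept)"])
beta  <- exp(alpha)

# Closed-form check: ratio of total benchmark flow to total MPD flow.
beta_closed <- sum(bench) / sum(mpd)
```

The log offset is only defined for strictly positive MPD flows, so rows with zero MPD flow cannot enter a fit of this form.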

Requires overlapping positive flows in both the MPD and benchmark data after aggregation.
"negbin" requires the MASS package (listed under Suggests).
"zinb" requires the pscl package (listed under Suggests).
If a required package is not installed, the function stops with an informative error.
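
A common way to guard an optional dependency is requireNamespace(). The helper below is a hypothetical sketch of that pattern; its name, arguments, and error wording are assumptions, not the package's actual code.

```r
# Hypothetical helper: stop with a clear message if a suggested package is missing.
require_suggested <- function(pkg, family) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      sprintf(
        'model_family = "%s" requires the %s package. Install it with install.packages("%s").',
        family, pkg, pkg
      ),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Example: a package name that is not installed triggers the error message.
msg <- tryCatch(
  require_suggested("notARealPackage12345", "negbin"),
  error = function(e) conditionMessage(e)
)
```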