brms is an R package for fitting Bayesian regression models using Stan as a backend. This post is a practical walkthrough of fitting a nonlinear model in brms using the epilepsy count dataset. The dataset has a kind of skewed, overdispersed structure that makes it a good test case. At the time of writing, brms had some limitations around the identifiability of latent parameters in certain nonlinear specifications, but this is fixed in newer versions.
library(bayesian)
library(patchwork)
epilepsy <- epilepsy %>% tibble()
epilepsy_longer <- epilepsy %>% pivot_longer(c(Base, count))
p_1 <- epilepsy %>%
ggplot(aes(Base, count)) +
geom_point(alpha = 0.5)
p_2 <- epilepsy_longer %>%
ggplot(aes(x = value)) +
geom_histogram() +
facet_wrap(~name) +
labs(y = "n", x = "value")
p_3 <- epilepsy_longer %>%
ggplot(aes(x = value)) +
geom_histogram() +
scale_x_log10() +
facet_wrap(~name) +
labs(y = "n", x = expression(log[10](value)))
p_4 <- epilepsy_longer %>% ggplot(aes(value, col = name)) +
stat_ecdf()
(p_1 + p_2 + p_4 + p_3)
Using an poisson fit with uninformed priors.
model_poisson <- brm(
data = epilepsy,
family = poisson(),
count ~ Base,
prior = set_prior("normal(0,5)"),
iter = 2000,
warmup = 1000,
chains = 1,
cores = 4,
seed = 123,
file = "./models/model_poisson.rds"
)
summary(model_poisson)
## Family: poisson
## Links: mu = log
## Formula: count ~ Base
## Data: epilepsy (Number of observations: 236)
## Draws: 1 chains, each with iter = 2000; warmup = 1000; thin = 1;
## total post-warmup draws = 1000
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.19 0.04 1.12 1.26 1.00 211 276
## Base 0.02 0.00 0.02 0.02 1.00 415 562
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
new_data <- tibble(Base = seq(from = 0, to = 150, length.out = 1000))
epilepsy_predictions <- model_poisson %>%
predict(new_data) %>%
data.frame() %>%
bind_cols(new_data)
epilepsy %>%
ggplot(aes(Base, count)) +
geom_smooth(
data = epilepsy_predictions,
aes(y = Estimate, ymin = Q2.5, ymax = Q97.5),
stat = "identity",
fill = "lightblue",
color = "blue",
size = 1 / 4
) +
geom_point(size = 0.3) +
coord_cartesian(ylim = c(0, 110)) +
labs(title = "nls fit with brms")