The nimbleMacros
package has two goals:
We intend the macros in this package to be widely useful, but also
just a starting point and example for NIMBLE users to write their own
macros. To begin, load the nimbleMacros
package, which will
also load the NIMBLE
package:
Macro functionality is currently enabled by default in
NIMBLE
. Loading nimbleMacros
will also
activate it automatically. If you need to do so yourself, you can set
the option manually:
We’ll begin by showing off the macros available in the package, and then discuss building your own macros at the end.
LM
: A macro for complete linear modelsThe LM
macro generates complete NIMBLE model code for a
variety of linear, generalized linear, and generalized linear mixed
models. It uses very similar syntax to R functions like lm
,
glm
, lme4::lmer
, and
lme4::glmer
.
mtcars
Here’s a logistic regression in R using the built-in
mtcars
dataset. We’ll model transmission type
(am
, where am = 1
is manual transmission) as a
function of standardized miles per gallon.
##
## Call:
## glm(formula = am ~ scale(mpg), family = binomial, data = mtcars)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.4351 0.4543 -0.958 0.33816
## scale(mpg) 1.8504 0.6921 2.673 0.00751 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 43.230 on 31 degrees of freedom
## Residual deviance: 29.675 on 30 degrees of freedom
## AIC: 33.675
##
## Number of Fisher Scoring iterations: 5
To create this model with the LM
macro, we’ll start by
organizing the data and constants into a list.
Next we’ll specify the prior distributions we want to use for the
intercept and the slope coefficient(s). This is done using the
setPriors
function. We’ll specify a uniform(-10, 10)
distribution for the intercept and a normal distribution with mean 0 and
precision 0.1 for the slope coefficient. Note that these distribution
definitions are wrapped in quote
; you can also supply them
as strings.
Finally, we specify the NIMBLE code, which is a single call to the
LM
macro that should look familiar to our call to
glm
above.
Two things to note: 1. We include a call to scale
in the
formula as with glm()
; LM
will do the scaling
automatically. 2. You don’t need to specify dimensions of the variables
with LM
, as this is handled automatically, with
LM
assuming all data and constants are vectors of the same
length. In more complex situations you can use the LINPRED
macro; see the next section.
We create a NIMBLE model with this input code, and show what the final code looks like after macro expansion.
## {
## "# LM"
## " ## FORLOOP"
## for (i_1 in 1:32) {
## am[i_1] ~ dbinom(mu_[i_1], size = 1)
## }
## " ## ----"
## " ## LINPRED"
## " ### nimbleMacros::FORLOOP"
## for (i_2 in 1:32) {
## logit(mu_[i_2]) <- beta_Intercept + beta_mpg_scaled *
## mpg_scaled[i_2]
## }
## " ### ----"
## " ### nimbleMacros::LINPRED_PRIORS"
## beta_Intercept ~ dunif(-10, 10)
## beta_mpg_scaled ~ dnorm(0, 0.1)
## " ### ----"
## " ## ----"
## "# ----"
## }
Finally, we’ll fit the model using MCMC, which yields similar results
to glm
.
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
##
## Iterations = 1:1000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## beta_Intercept -0.4651 0.4693 0.01484 0.03274
## beta_mpg_scaled 2.0391 0.7063 0.02234 0.04881
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## beta_Intercept -1.4296 -0.7595 -0.4548 -0.134 0.4101
## beta_mpg_scaled 0.7872 1.5560 2.0039 2.464 3.5055
ChickWeights
The LM
macro can also generate code for random effects
models. To illustrate this we will use the built-in
ChickWeight
dataset, which contains repeated measurements
of chick weights over time. First, fit the model in R using the
lme4
package:
## Linear mixed model fit by REML ['lmerMod']
## Formula: weight ~ Time + (1 | Chick)
## Data: ChickWeight
##
## REML criterion at convergence: 5619.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.0086 -0.5528 -0.0888 0.4975 3.4588
##
## Random effects:
## Groups Name Variance Std.Dev.
## Chick (Intercept) 717.9 26.79
## Residual 799.4 28.27
## Number of obs: 578, groups: Chick, 50
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 27.8451 4.3877 6.346
## Time 8.7261 0.1755 49.716
##
## Correlation of Fixed Effects:
## (Intr)
## Time -0.422
To fit this model with NIMBLE using LM
, we’ll first
organize the ChickWeight
data into a list of constants and
data for NIMBLE.
chick_const <- list(weight = ChickWeight$weight, Time = ChickWeight$Time, Chick = ChickWeight$Chick)
We can create the model with LM
using the same formula
notation as lme4
with the (1|Chick)
part of
the formula indicating random intercepts by chick.
Next we create the model object and show the expanded code.
## {
## "# LM"
## " ## FORLOOP"
## for (i_1 in 1:578) {
## weight[i_1] ~ dnorm(mu_[i_1], sd = sd_residual)
## }
## " ## ----"
## " ## LINPRED"
## " ### nimbleMacros::FORLOOP"
## for (i_2 in 1:578) {
## mu_[i_2] <- beta_Intercept + beta_Time * Time[i_2] +
## beta_Chick[Chick[i_2]]
## }
## " ### ----"
## " ### nimbleMacros::LINPRED_PRIORS"
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## " #### nimbleMacros::FORLOOP"
## for (i_3 in 1:50) {
## beta_Chick[i_3] ~ dnorm(0, sd = sd_Chick)
## }
## " #### ----"
## " ### ----"
## " ## ----"
## sd_residual ~ dunif(0, 100)
## "# ----"
## }
Fit the model:
## |-------------|-------------|-------------|-------------|
## |-------------------------------------------------------|
##
## Iterations = 1:1000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 1000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## beta_Intercept 28.294 5.1556 0.163035 0.95444
## beta_Time 8.727 0.1888 0.005971 0.01540
## sd_Chick 27.444 3.1085 0.098299 0.25186
## sd_residual 28.376 0.9509 0.030069 0.06735
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## beta_Intercept 18.807 24.475 28.33 31.810 38.099
## beta_Time 8.375 8.593 8.73 8.858 9.098
## sd_Chick 21.609 25.371 27.27 29.478 34.189
## sd_residual 26.579 27.802 28.31 29.031 30.293
LINPRED
As the name suggests, the LINPRED
macro generates a
linear predictor from a provided R formula. It is used internally by the
LM
macro and can be used separately as well. The macro acts
something like the model.matrix
function in R, although
instead of creating a model matrix, it creates the NIMBLE model code for
the linear predictor. The macro supports random effects using the same
syntax as the lme4
package.
To demonstrate the macro, we’ll use the built-in
ChickWeight
dataset again. In this dataset there are
repeated weight measurements (weight
) of chicks
(Chick
) over time (Time
).
## weight Time Chick Diet
## 1 42 0 1 1
## 2 51 2 1 1
## 3 59 4 1 1
## 4 64 6 1 1
## 5 76 8 1 1
## 6 93 10 1 1
To create a linear predictor with time as a covariate in R, we would
specify the formula as ~Time
.
## (Intercept) Time
## 1 1 0
## 2 1 2
## 3 1 4
## 4 1 6
## 5 1 8
## 6 1 10
Based on the model matrix, the formula implies two parameters: the intercept and the coefficient/slope for time.
To generate the linear predictor NIMBLE model code, we start by
setting up the constants/data list. We’ll use the
chick_const
list we used with LM
, and add the
sample size N
so we can use it to define dimensions.
In the model code, we specify the parameter name for the linear
predictor (mu
) and calculate it with the
LINPRED
macro and the same formula as above. Here we also
set an argument priors = NULL
; we’ll come back to this
later. Notice that unlike with LM
, we explicitly define the
dimensions of mu
and Time
.
Next create the model object and view the code.
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## " ## ----"
## "# ----"
## }
NIMBLE expands the LINPRED
macro into a for
loop calculating a value of mu
for each sample in the
dataset, where mu
is a function of time. Notice two
parameters were created: beta_Intercept
(the intercept) and
beta_Time
, the coefficient associated with time.
We can get a list of the parameters created by macros in the model
using the getMacroParameters()
function:
## $LINPRED
## $LINPRED[[1]]
## [1] "beta_Intercept" "beta_Time"
##
##
## $`nimbleMacros::FORLOOP`
## $`nimbleMacros::FORLOOP`[[1]]
## NULL
When all the data and constants used by LINPRED
have the
same dimensions, you can specify the dimensions for the left-hand-side
only:
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time, priors = NULL)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## " ## ----"
## "# ----"
## }
In more complicated situations you may have different dimensions for
different data/constants. For example, we’ll convert Time
into a one-column matrix, but keep mu
as a vector.
chick_const2 <- chick_const
chick_const2$Time <- matrix(chick_const2$Time, ncol = 1)
str(chick_const2)
## List of 4
## $ weight: num [1:578] 42 51 59 64 76 93 106 125 149 171 ...
## $ Time : num [1:578, 1] 0 2 4 6 8 10 12 14 16 18 ...
## $ Chick : Ord.factor w/ 50 levels "18"<"16"<"15"<..: 15 15 15 15 15 15 15 15 15 15 ...
## $ N : int 578
To handle this we just need to specify different dimensions for each
in the call to LINPRED
.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N,1], priors = NULL)
})
mod <- nimbleModel(code, chick_const2)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1, 1]
## }
## " ## ----"
## "# ----"
## }
The LINPRED
macro can automatically handle categorical
covariates (what R calls factors). A categorical covariate can be
provided in the constants as either an R factor, or as a character
vector/matrix/array (which will be converted automatically to a factor).
For example, suppose we included a covariate for diet type
(Diet
) in the model. The Diet
covariate is
already coded as an R factor:
## [1] "factor"
## [1] "1" "2" "3" "4"
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + Diet[1:N], priors = NULL)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Diet[Diet[i_1]]
## }
## " ## ----"
## "# ----"
## }
A new vector parameter beta_Diet
is created with four
elements, one per level of Diet
. The actual parameter
Diet
is then converted to numeric and used as the index for
this vector parameter. It isn’t obvious from this code, but the first
element of beta_Diet
will be fixed at 0 (serving as the
reference level). We’ll show this below.
We could also add a link function to the left-hand-side of the linear
predictor calculation. Suppose we wanted to force mu
to be
positive; we could use a log link function, specifying this with the
link
argument.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N], priors = NULL, link = log)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## log(mu[i_1]) <- beta_Intercept + beta_Time * Time[i_1]
## }
## " ## ----"
## "# ----"
## }
The LINPRED
macro supports a handful of functions
commonly included in formulas: offset()
,
scale()
, log()
, and I()
. For
example, to build a linear predictor with the Time
covariate scaled to have mean 0 and SD 1:
code <- nimbleCode({
mu[1:N] <- LINPRED(~scale(Time[1:N]), priors = NULL, link = log)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## log(mu[i_1]) <- beta_Intercept + beta_Time_scaled * Time_scaled[i_1]
## }
## " ## ----"
## "# ----"
## }
This adds a new constant Time_scaled
to the
constants:
## [1] -1.5858774 -1.2899493 -0.9940213 -0.6980932 -0.4021652 -0.1062371
LINPRED
uses a modular system to handle formula
functions. You can add support for additional functions in
LINPRED
formulas by writing a function with a specific
name, class, and output. See the source code of the package,
specifically the file formulaFunction.R
, for examples of
how to create these functions.
If we do not manually set the priors
argument,
LINPRED
will automatically generate a default set of priors
for the two parameters it created (beta_Intercept
and
beta_Time
). Internally, LINPRED
is calling the
LINPRED_PRIORS
macro (described below).
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + Diet[1:N])
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## "# LINPRED"
## " ## nimbleMacros::FORLOOP"
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Diet[Diet[i_1]]
## }
## " ## ----"
## " ## nimbleMacros::LINPRED_PRIORS"
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## beta_Diet[1] <- 0
## beta_Diet[2] ~ dnorm(0, sd = 1000)
## beta_Diet[3] ~ dnorm(0, sd = 1000)
## beta_Diet[4] ~ dnorm(0, sd = 1000)
## " ## ----"
## "# ----"
## }
We now have priors for each parameter. Note that as mentioned
previously, the first value of the beta_Diet
vector is
fixed at 0, serving as the reference level.
Warning: one should check the default priors carefully to make sure that they make sense for your problem. In particular, if the values in your data are of magnitudes such that parameters of the linear predictor could be large in magnitude, the default priors could have a strong influence on your results.
We can also suppress the added macro comments to make things a little more concise.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N])
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## }
The LINPRED
macro supports random effects using
lme4
syntax. For example, a random intercept by
Chick
:
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + (1|Chick[1:N]))
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Chick[Chick[i_1]]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## for (i_2 in 1:50) {
## beta_Chick[i_2] ~ dnorm(0, sd = sd_Chick)
## }
## }
Note that by default, all random effects are mean 0, the same
approach used by lme4
. We can also get both random
intercepts and (time) slopes by Chick
; by default these are
correlated.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + (Time[1:N]|Chick[1:N]))
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Chick[Chick[i_1]] +
## beta_Time_Chick[Chick[i_1]] * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## sd_Time_Chick ~ dunif(0, 100)
## re_sds_Chick[1] <- sd_Chick
## re_sds_Chick[2] <- sd_Time_Chick
## Ustar_Chick[1:2, 1:2] ~ dlkj_corr_cholesky(1, 2)
## U_Chick[1:2, 1:2] <- uppertri_mult_diag(Ustar_Chick[1:2,
## 1:2], re_sds_Chick[1:2])
## re_means_Chick[1] <- 0
## re_means_Chick[2] <- 0
## for (i_2 in 1:50) {
## B_Chick[i_2, 1:2] ~ dmnorm(re_means_Chick[1:2], cholesky = U_Chick[1:2,
## 1:2], prec_param = 0)
## beta_Chick[i_2] <- B_Chick[i_2, 1]
## beta_Time_Chick[i_2] <- B_Chick[i_2, 2]
## }
## }
The correlated random effects model here uses an LKJ distribution prior.
As with lme4
we can also specify uncorrelated random
slopes and intercepts using ||
instead of
|
.
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N] + (Time[1:N]||Chick[1:N]))
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1] + beta_Chick[Chick[i_1]] +
## beta_Time_Chick[Chick[i_1]] * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## sd_Chick ~ dunif(0, 100)
## for (i_2 in 1:50) {
## beta_Chick[i_2] ~ dnorm(0, sd = sd_Chick)
## }
## sd_Time_Chick ~ dunif(0, 100)
## for (i_3 in 1:50) {
## beta_Time_Chick[i_3] ~ dnorm(0, sd = sd_Time_Chick)
## }
## }
LINPRED
in a complete modelUp to this point we’ve generated just the code needed to calculate
the linear predictor. Here’s a complete linear regression model of chick
weight as a function of time making use of LINPRED
(similar
to the code generated by LM
):
code <- nimbleCode({
mu[1:N] <- LINPRED(~Time[1:N])
weight[1:N] ~ FORLOOP(dnorm(mu[1:N], sd = sigma))
sigma ~ dunif(0, 100)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## for (i_1 in 1:N) {
## mu[i_1] <- beta_Intercept + beta_Time * Time[i_1]
## }
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## for (i_2 in 1:N) {
## weight[i_2] ~ dnorm(mu[i_2], sd = sigma)
## }
## sigma ~ dunif(0, 100)
## }
LINPRED_PRIORS
This macro generates a set of priors for parameters of a linear
predictor, also based on the R formula. In most cases, we won’t use this
macro directly; rather it will be called automatically when we use
LINPRED
(see discussion of priors
argument
above). However, if for some reason we want to generate code for only
the priors, and not the corresponding linear predictor, we can use
LINPRED_PRIORS
separately as we show below.
The formula ~Time
implies two parameters as we described
above, and LINPRED_PRIORS
will generate a prior for each
depending on the type of the parameter (e.g. intercept vs. slope). It is
not necessary to specify the dimensions in the formula when using
LINPRED_PRIORS
directly.
## {
## beta_Intercept ~ dnorm(0, sd = 1000)
## beta_Time ~ dnorm(0, sd = 1000)
## }
We can also manually set the values for different types of priors
such as intercepts and coefficients (slopes). See
?setPriors
for more details.
pr <- setPriors(intercept = quote(dnorm(-5, 5)),
coefficient = quote(dnorm(0, sd = 5)))
code <- nimbleCode({
LINPRED_PRIORS(~Time + Diet, priors = pr)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## beta_Intercept ~ dnorm(-5, 5)
## beta_Time ~ dnorm(0, sd = 5)
## beta_Diet[1] <- 0
## beta_Diet[2] ~ dnorm(0, sd = 5)
## beta_Diet[3] ~ dnorm(0, sd = 5)
## beta_Diet[4] ~ dnorm(0, sd = 5)
## }
Treating beta_Diet
as a vector parameter makes sense in
the NIMBLE model code, but it does not match the way categorical
parameters are handled by model.matrix
; i.e., dummy
variables:
## [1] "(Intercept)" "Time" "Diet2" "Diet3" "Diet4"
Here model.matrix
has created several separate
parameters Diet2
, Diet3
, and
Diet4
rather than a vector. We can replicate this structure
in NIMBLE model code by setting the LINPRED_PRIORS
option
modelMatrixNames = TRUE
.
code <- nimbleCode({
LINPRED_PRIORS(~Time + Diet, priors = pr, modelMatrixNames = TRUE)
})
mod <- nimbleModel(code, chick_const)
mod$getCode()
## {
## beta_Intercept ~ dnorm(-5, 5)
## beta_Time ~ dnorm(0, sd = 5)
## beta_Diet[1] <- 0
## beta_Diet[2] <- beta_Diet2
## beta_Diet2 ~ dnorm(0, sd = 5)
## beta_Diet[3] <- beta_Diet3
## beta_Diet3 ~ dnorm(0, sd = 5)
## beta_Diet[4] <- beta_Diet4
## beta_Diet4 ~ dnorm(0, sd = 5)
## }
When this option is set, a few extra lines of NIMBLE model code are
added to yield parameter names that match model.matrix
.
FORLOOP
The FORLOOP
macro is used by LM
,
LINPRED
, and LINPRED_PRIORS
and is a shorthand
way of creating for
loops (possibly nested) from a single
line of code. For example, suppose you want to calculate a linear
predictor mu
from N
values of a covariate
x
. The NIMBLE model code would be:
The FORLOOP
macro condenses this information into a
single line using index information inside parameter brackets:
Here’s the result after expansion, after creating some example values for the constants.
## {
## for (i_1 in 1:N) {
## mu[i_1] <- alpha + beta * x[i_1]
## }
## }
Note that one can use vectorization in NIMBLE model code directly, such as
mu[1:N] <- alpha + beta * X[1:N]
The distinction is that that creates a single multivariate node,
mu[1:N]
, while the loop creates N
separate
scalar nodes. Which is best will depend on the model structure and the
algorithm used with the model as
discussed in the NIMBLE user manual.
It may seem a bit trivial to make a macro for such a simple chunk of code. However it can be nice to use this when you have nested for loops, which are a bit more complicated.
constants <- list(x = matrix(rnorm(10), 5, 2), N = 5, J = 2)
code <- nimbleCode({
mu[1:N, 1:J] <- FORLOOP(alpha + beta * x[1:N, 1:J])
})
mod <- nimbleModel(code, constants)
mod$getCode()
## {
## for (i_1 in 1:N) {
## for (i_2 in 1:J) {
## mu[i_1, i_2] <- alpha + beta * x[i_1, i_2]
## }
## }
## }
Note that one place you can’t use this macro directly are situations
when you use the same index in multiple nested loops. For example,
suppose the parameter mu
shown above was a square
matrix.
constants <- list(x = matrix(rnorm(25), 5, 5), N = 5)
code <- nimbleCode({
mu[1:N, 1:N] <- FORLOOP(alpha + beta * x[1:N, 1:N])
})
mod <- nimbleModel(code, constants)
## Error : Not sure how to expand duplicated index 1:N in code:
## mu[1:N, 1:N]
## Error: Model macro FORLOOP failed.
The structure of this for
loop with duplicated indices
could be ambiguous in more complex situations, so FORLOOP
errors. The solution in this case would simply be to provide different
variables for each dimension, even if they are the same value.
constants <- list(x = matrix(rnorm(25), 5, 5), N = 5, K = 5)
code <- nimbleCode({
mu[1:N, 1:K] <- FORLOOP(alpha + beta * x[1:N, 1:K])
})
mod <- nimbleModel(code, constants)
mod$getCode()
## {
## for (i_1 in 1:N) {
## for (i_2 in 1:K) {
## mu[i_1, i_2] <- alpha + beta * x[i_1, i_2]
## }
## }
## }
Suppose you wanted to create a macro called MYDIST
,
which creates a normal distribution with mean 0 and provided standard
deviation. The macro would look like this in the NIMBLE model code:
Note that by convention, we define the macro name in all capital
letters. All the provided macros in nimbleMacros
follow
this convention. We want the final, processed output to look like:
A macro is, fundamentally, a specially-formatted R function that converts one line of code to another. For example:
fun <- function(input){
lhs <- input[[2]]
sdev <- input[[3]]$sdev
output <- substitute(LHS ~ dnorm(0, sd = SDEV),
list(LHS = lhs, SDEV = sdev))
output
}
Note that we have to do a bit of clunky code-processing to extract
the left-hand-side of the assignment and the argument value. (The use of
the [[
extraction with a code object extracts information
from the abstract syntax tree representation of an R expression.)
## y ~ dnorm(0, sd = 2)
In practice, nimble
expects the macro function converter
(fun
in the example above) to have two additional
arguments: modelInfo
and .env
. When processing
a macro, nimble
will provide a named list to the
modelInfo
argument which contains some additional model
information. Most importantly, this list includes
constants
, which can be used and/or changed by the macro.
And, as you can probably guess, nimble
provides the R
environment in which nimbleModel
was called to the
.env
argument, to help with scoping issues.
In addition to these two arguments, the function also needs to return
a particular structure in the output. The output should be a named list
with two elements: code
, which contains the output code (as
in the original function above), and modelInfo
which is the
list of model information, which we could modify or leave as-is.
fun <- function(input_code, modelInfo, .env){
LHS <- input_code[[2]]
sdev <- input_code[[3]]$sdev
output_code <- substitute(LHS ~ dnorm(0, sd = SDEV),
list(LHS = LHS, SDEV = sdev))
# Modify the constants
modelInfo$constants$new <- 1
list(code = output_code, modelInfo = modelInfo)
}
input <- quote(y ~ MYDIST(sdev = 2))
fun(input, modelInfo = list(constants = NULL), .env = environment())
## $code
## y ~ dnorm(0, sd = 2)
##
## $modelInfo
## $modelInfo$constants
## $modelInfo$constants$new
## [1] 1
Finally, nimble
needs a way to know that this function
is intended to be a macro, and that it is called MYDIST
.
The final step then is to create a wrapper list
with our
function inserted as an element called process
, and assign
the class "model_macro"
. (In the next section we’ll see
that the helper function buildMacro
will create this
wrapper list for us.)
Now, we can test the macro in a NIMBLE model:
code <- nimbleCode({
y ~ MYDIST(sdev = 2)
})
mod <- nimbleModel(code = code, data = list(y = 1))
mod$getCode()
## {
## y ~ dnorm(0, sd = 2)
## }
buildMacro
to create the macroManually creating the macro is fine, but it gets complicated if we
want to extract a lot of separate pieces of information from the code
line, e.g., if our macro has many arguments. As an alternative, we can
use the buildMacro
function included with
nimble
, which handles some of this code processing for us.
The first argument to buildMacro
, is a function structured
like the one we wrote above fun
. The other two arguments
use3pieces
and unpackArgs
determine how the
arguments of fun
should be structured.
First, suppose both use3pieces
and
unpackArgs
are set to FALSE
. In this case, the
first argument to fun
is the entire original line of code
containing the macro, i.e., y ~ MYDIST(sdev = 2)
. This is
identical to how we wrote the function above. In this case, all
buildMacro
will do is create the required macro list
structure and assign the class for us.
code <- nimbleCode({
y ~ MYDIST2(sdev = 2)
})
mod <- nimbleModel(code = code, data = list(y = 1))
mod$getCode()
## {
## y ~ dnorm(0, sd = 2)
## }
Now we will set use3pieces = TRUE
. In this case,
buildMacro
will automatically do some initial processing on
the code line for us. Specifically, it will identify if the assignment
is stochastic (stoch
, TRUE
if ~
and FALSE
if <-
), and separate and return
the left-hand-side (LHS
) and right-hand-side
(RHS
) of the assignment. With these settings, our
function’s first three arguments need to be stoch
,
LHS
, and RHS
, in that order (plus we keep the
always-required modelInfo
and .env
).
fun3 <- function(stoch, lhs, rhs, modelInfo, .env){
sdev <- rhs$sdev
output_code <- substitute(LHS ~ dnorm(0, sd = SDEV),
list(LHS = lhs, SDEV = sdev))
# Modify the constants
modelInfo$constants$new <- 1
list(code = output_code, modelInfo = modelInfo)
}
We do a little less code processing this time since
buildMacro
has already separated LHS and RHS for us. Note
that in this simple example we aren’t using stoch
because
we want our output code to always be stochastic.
## Error in (function (stoch, lhs, rhs, modelInfo, .env) :
## unused arguments (LHS = quote(y), RHS = quote(MYDIST3(sdev = 2)))
## Error: Model macro MYDIST3 failed.
## {
## y ~ dnorm(0, sd = 2)
## }
Finally, consider the case where we also set
unpackArgs = TRUE
. Now, buildMacro
will dig
into the right-hand-side of the macro and extract the argument values
for us (in this case just sdev
). Thus, we need to include
all these arguments explicitly as arguments in our function.
fun4 <- function(stoch, lhs, sdev, modelInfo, .env){
output_code <- substitute(LHS ~ dnorm(0, sd = SDEV),
list(LHS = lhs, SDEV = sdev))
# Modify the constants
modelInfo$constants$new <- 1
list(code = output_code, modelInfo = modelInfo)
}
Even less code processing is required in this case, since
buildMacro
will split out the sdev
argument
for us.
## Error in (function (stoch, lhs, sdev, modelInfo, .env) :
## unused argument (LHS = quote(y))
## Error: Model macro MYDIST4 failed.
## {
## y ~ dnorm(0, sd = 2)
## }
The choice of arguments will depend on the macro: sometimes the
short-cuts enabled by buildMacro
will be useful, other
times you may need more manual control over how the input code line is
processed. See ?buildMacro
for additional detailed
documentation and examples, including an example when
use3pieces = FALSE
and unpackArgs = TRUE
. You
may also find the source code for the macros included in
nimbleMacros
, found here,
a useful resource for writing your own.