Lecture 6—Wednesday, September 12, 2012

Topics

Randomized block design
An example of a randomized complete block design (RCBD)
The fixed effects approach to analyzing randomized block designs
Using random effects to account for blocking
Estimating a model with random effects
- Estimating a RCBD model using the nlme package
- Estimating a RCBD model using the lme4 package
The p-value problem in mixed effects models
The parametric bootstrap
Markov chain Monte Carlo methods
Cited references
R code

R functions and commands demonstrated

as.data.frame checks if an object is a data frame and if not tries to coerce it into one.
detach can be used to remove packages from memory.
fixef extracts estimates of the fixed effects from an lmer or lme object.
for defines a for loop, a programming tool that allows a set of instructions to be executed repeatedly.
HPDinterval (from the lme4 package) obtains highest probability density credible intervals from the posterior distributions generated by the mcmcsamp function.
lme (from the nlme package) fits mixed effects models with a normally distributed response variable.
lmer (from the lme4 package) fits generalized mixed effects models with hierarchical or crossed random effects.
mcmcsamp (from the lme4 package) carries out Markov chain Monte Carlo sampling to approximate the posterior distributions of parameters from an lmer object.
numeric creates a vector or coerces an object into a vector of a specified length.
quantile estimates specified quantiles from the empirical distribution of a variable.
ranef extracts predictions of the random effects from an lmer or lme object.

R function options

auto.key= (argument to many lattice plotting functions) adds a crude key to the graph.
groups= (argument to many lattice plotting functions) specifies a variable that identifies different groups within a panel for plotting.
level= (argument to predict) when predict is applied to an lme object level controls which random effects are used in predicting the mean. The choice level=0 corresponds to no random effects at all and yields the marginal (population average) prediction.
random= (argument to the lme function) is used to specify the random effects structure of an lme model. The basic syntax is random=~x|group where x is the variable whose coefficient is to be made random and group is the variable or variables that identifies the hierarchical structure of the data.

Additional R packages used

lattice for the dotplot function.
lme4 for the lmer, mcmcsamp, and HPDinterval functions.
nlme for the lme function.

Randomized block design

Up until now we’ve worked only with completely randomized statistical designs. In a completely randomized design treatments are randomly assigned to units and the only recognizable differences between those units are the different treatments that have been applied to them. If the units are in fact heterogeneous then the hope is that the random assignment of treatments to units has mixed things up so that on balance the heterogeneity of units assigned to the different treatment groups will be roughly the same.

This is not always optimal. Sometimes we can recognize the heterogeneity of the units ahead of time and wish to take advantage of it. For instance, if we can group similar units together such that each of the similar units gets a different treatment, we could restrict ourselves to making comparisons between the similar units and thus remove some of the background variation that might otherwise make detecting a treatment effect more difficult. If the units are similar then the differences we detect are more likely due to treatment differences. A common example of this is to use genetically related individuals as experimental subjects in which individuals from the same litter or cuttings from the same plant are assigned different treatments. In statistics we use the term block to refer to a group of similar experimental units.

Sometimes the creation of blocks is accidental. The set-up of the experiment may cause some units to be more similar to each other than to others.

Consider a greenhouse experiment with plants in pots. To save space we grow multiple plants in the same pot. As a result plants in the same plot share a common environment that is different from other plants that are growing in different pots.
In this same greenhouse experiment groups of pots are placed benches. Some of the benches are near a door or a window and it's thought that this may affect the results..
In a field experiment in which the experimental unit is a plot, some plots occur on a slope. To prevent the confounding of treatment effects with slope effects we may choose to partition the hillside into blocks to ensure that all the treatments are represented at each height of the hill.

In all of these examples the variable that identifies the similar units is called a block. When we include a blocking variable in an analysis we are trying to account for all the things we failed to measure that makes one block different from the next block (and make the individuals in the same block more similar to each other). Statistically a block appears in a regression model as just another categorical variable. So, it would seem that a block is no different than an ordinary factor in analysis of variance. This is more or less true but there are some fundamental differences in how we treat blocks and treatments in regression models.

Blocks are usually small and typically, but not always, there are only enough experimental units for a single replication of the individual treatments in a block. This is because blocks are supposed to be homogeneous units. The larger and more expansive we make the block the less homogeneous it is likely to be. Without replication of the treatments within a block we can't formally estimate a block × treatment interaction. Graphical methods and various analytical methods that test for specific kinds of interactions are useful in this case.
Unlike treatment effects the estimates of the block effects are generally of no interest to us. Because there are often a lot of blocks the presence of block effect estimates in a summary table tends to create a confusing mess.
As is the case with any categorical variable, when we treat block as a factor variable in a regression analysis one block gets chosen as the reference block and all the other block estimates become effects—deviations from this reference group. Consequently the estimates of the mean response for a given treatment will be for this specific reference block rather than for an average block.
If we plan to use the model to predict the mean response for new individuals we will need to supply the value of the block variable for these individuals in order to obtain a prediction. Because the blocks appearing in the analysis correspond to unique categories and refer only to the subjects in the current study, prediction for new individuals is essentially impossible.

An example of a randomized complete block design (RCBD)

This example comes from Sokal and Rohlf (1995), p. 365. "Blakeslee (1921) studied length/width ratios of second seedling leaves of two types of Jimsonweed called globe (G) and nominal (N). Three seeds of each type were planted in 16 pots. Is there sufficient evidence to conclude that globe and nominal differ in length/width ratio?" The file jimsonweed.txt is a tab-delimited text file.

plants <- read.table('ecol 563/jimsonweed.txt', header=T, sep='\t')

plants[1:8,]

lw.rat   pot type
1   1.67 16533    G
2   1.53 16533    G
3   1.61 16533    G
4   2.18 16533    N
5   2.23 16533    N
6   2.32 16533    N
7   1.68 16534    G
8   1.70 16534    G

We can examine the structure of the data set by using a dot plot in which we display the length-width ratios of plants from different pots using different colors to denote the different treatments. For this I use the dotplot function from lattice. To color observations by the levels of the treatment variable "type" I specify that variable in the groups argument of dotplot. To get a crude key that identifies the groups I include the argument auto.key=T.

library(lattice)

dotplot(factor(pot)~lw.rat, groups=type, data=plants, auto.key=T)

Fig. 1 Dot plot of length-width ratios separately by pot colored by treatment

Even if we didn't know that this was a randomized block design the blocking structure is immediately apparent from Fig. 1. In a completely randomized design with treatments randomly assigned to plants and pot playing no role in the treatment assignment we would not be seeing an equal number of G and N treatment assignments in each pot. It's also obvious from Fig. 1 that there is a marked treatment effect with nominal (N) plants having a larger length-width ratio than globe (G) plants. This pattern holds up in every pot which suggests that there is no interaction between treatment and block.

Fig. 1 does reveal some systematic differences across pots (whole distributions are shifted to the left or the right) so accounting for the blocking in the analysis is potentially useful. One unusual thing about this experiment is that treatments are replicated within blocks (pots). Thus we will be able to formally test for a block × treatment interaction.

The fixed effects approach to analyzing randomized block designs

Because a block is just a categorical variable we can use analysis of variance type models to analyze the randomized block design. Below I fit an additive model (pot + type) and an interaction model (pot*type) to the data. Because the variable that labels the pot categories is numeric I have to convert it explicitly to a factor in the function call.

mod1 <- lm(lw.rat~factor(pot)+type, data=plants)

mod2 <- lm(lw.rat~factor(pot)*type, data=plants)

anova(mod2)

Analysis of Variance Table

Response: lw.rat
                 Df Sum Sq Mean Sq F value    Pr(>F)
factor(pot)      15 0.8977 0.0598   3.373 0.000342 ***
type              1 7.3206 7.3206 412.575 < 2.2e-16 ***
factor(pot):type 15 0.3050 0.0203   1.146 0.336418
Residuals        64 1.1356 0.0177
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

From the ANOVA table we see that block × treatment interaction is not significant. On the other hand both the treatment and block effects are statistically significant.

anova(mod1)

Analysis of Variance Table

Response: lw.rat
            Df Sum Sq Mean Sq F value    Pr(>F)
factor(pot) 15 0.8977 0.0598   3.282 0.0002944 ***
type         1 7.3206 7.3206 401.444 < 2.2e-16 ***
Residuals   79 1.4406 0.0182
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

It's worthwhile at this point to compare the randomized block design results with what we would have obtained had we ignored the blocking structure of the experiment and had instead carried out a one-way analysis of variance (which in this example is just an independent samples t-test).

anova(mod0)

Analysis of Variance Table

Response: lw.rat
          Df Sum Sq Mean Sq F value    Pr(>F)
type       1 7.3206 7.3206 294.28 < 2.2e-16 ***
Residuals 94 2.3384 0.0249
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The crucial thing to look at in comparing these models is the mean squared error. This is the estimate of the background noise that is then used as the gold standard in the construction of the reported F-tests. For the randomized block design the mean squared error is reported to be 0.0182; for the one-way analysis of variance it is 0.0249. So there's about a 25% reduction in the variance that has been accounted for by the blocking.

If we examine the summary table of the model we see that we've estimated a large number of uninteresting terms, the block effects.

summary(mod1)

Call:
lm(formula = lw.rat ~ factor(pot) + type, data = plants)

Residuals:
Min 1Q Median 3Q Max
-0.30719 -0.09349 0.01333 0.08010 0.36052

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)       1.64719    0.05683 28.986 < 2e-16 ***
factor(pot)16534 -0.06167    0.07797 -0.791 0.43134
factor(pot)16550 0.04000    0.07797   0.513 0.60935
factor(pot)16668 -0.13000    0.07797 -1.667 0.09939 .
factor(pot)16767 -0.07667    0.07797 -0.983 0.32844
factor(pot)16768 0.02833    0.07797   0.363 0.71727
factor(pot)16770 -0.10833    0.07797 -1.390 0.16858
factor(pot)16771 -0.08167    0.07797 -1.047 0.29807
factor(pot)16773 -0.17667    0.07797 -2.266 0.02619 *
factor(pot)16775 -0.23833    0.07797 -3.057 0.00305 **
factor(pot)16776 -0.21167    0.07797 -2.715 0.00814 **
factor(pot)16777 -0.25000    0.07797 -3.207 0.00194 **
factor(pot)16780 -0.25833    0.07797 -3.313 0.00139 **
factor(pot)16781 -0.09333    0.07797 -1.197 0.23484
factor(pot)16787 -0.19000    0.07797 -2.437 0.01706 *
factor(pot)16789 -0.24000    0.07797 -3.078 0.00286 **
typeN             0.55229    0.02756 20.036 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.135 on 79 degrees of freedom
Multiple R-squared: 0.8509, Adjusted R-squared: 0.8206
F-statistic: 28.17 on 16 and 79 DF, p-value: < 2.2e-16

The only interesting estimate, the treatment effect, appears at the bottom. From the output we conclude that the mean length-width ratio of nominal (N) plants is 0.553 units bigger than it is for globe (G) plants.

So the above analysis provides an estimate of the treatment effect but if we want to know what the average length-width ratios are for the two treatment groups then we have a problem. If this were a one-way analysis of variance model then the intercept would represent the mean length-width ratio for globe plants and to obtain the mean for nominal plants we would just add the treatment effect 0.552 to the intercept. In a two-way design the intercept represents the mean response when both factors are set to their reference values. Similarly in a randomized block design the intercept represents the mean response when the treatment is at its reference value and block is at its reference value. So, in the above design the intercept, 1.647, is the mean length-ratio for a globe plant that was reared in pot #16533, the reference block. When we add the treatment effect to this value we obtain the mean length-ratio for a nominal plant but again only for those plants reared in pot #16533.

So, in order to obtain the length-width ratio for a given treatment we have to specify a pot number too. If we try to use the predict function on the model just specifying values for type, R complains that it doesn't have enough information.

predict(mod1, newdata=data.frame(type=c('G','N')))

Error in factor(pot) : object 'pot' not found

If instead we create a data frame consisting of values of type and pot in all possible combinations we can obtain the treatment means for the specific pots used in the experiment.

nd <- expand.grid(pot=levels(factor(plants$pot)), type=levels(factor(plants$type)))

out.p <- predict(mod1, newdata=nd)

data.frame(nd, prediction=out.p)[1:8,]

    pot type prediction
1 16533    G   1.647187
2 16534    G   1.585521
3 16550    G   1.687187
4 16668    G   1.517188
5 16767    G   1.570521
6 16768    G   1.675521
7 16770    G   1.538854
8 16771    G   1.565521

So while the randomized design when analyzed in the fashion described above is useful for estimating treatment effects it is not particularly useful for estimating treatment means.

Using random effects to account for blocking

The analysis we carried out above is called the fixed effects approach to the randomized complete block design (RCBD). In it we define separate sets of dummy variables for treatment and block. If i denotes the block, here numbered 1 through 16, and j the observation in that block, j = 1 through 6, then we create the following set of dummy variables.

, z factor

The model for the mean response is

fixed mean

and the model for an individual observation is

observation

where . In this formulation the intercept β₀ represents the mean response for type = 'G' in block 1, β₁ is the treatment effect which is the same in each block (because there is no interaction), and β₂ through β₁₆ are individual block effects.

There is a second way to formulate the RCBD using what are called random effects. In this approach the model for the mean is the following.

and the model for an individual observation is

	(1)

where as before but in addition . The u_0i are called random effects, more specifically in the RCBD, random intercepts. There is one random effect for each block, a total of 16 in all for the current design. They are called random effects because they have a distribution instead of being constants like the β_i parameters in the model. The β_i are called fixed effects, the u_0i are called random effects, and the entire model is called a mixed effects model.

According to the distributional assumption the random effects have a mean of zero, as do the random errors. Thus if we take the mean of eqn (1) we obtain just the fixed effects portion of the model: . So in the mixed effects approach, unlike the fixed effects model of the RCBD, the fixed effects portion of the model corresponds to the treatment mean averaged over all the blocks. β₀ is the mean length-width ratio for the globe plants averaged across pots and β₀ + β₁ is the mean length-width ratio for the nominal plants averaged across pots. If the blocks used in the experiment are a random sample from a population of such blocks then it is legitimate to refer to the fixed effect portion of the model as the population-average model.

Estimating a model with random effects

For an ordinary linear regression model least squares provides us with an explicit solution for the regression parameters. For a mixed effects model there is no explicit solution; parameter estimates have to be obtained iteratively using some variation of Newton's method. Two different packages in R can be used to fit mixed effects models: nlme and lme4. The nlme package is the older of the two and can only fit models in which the response variable has a normal distribution. It is best suited for fitting models with what are called hierarchical random effects. We previously used the gls function from the nlme package in lecture 3. The lme4 package is a recent development and extends the nlme package to include additional probability distributions for the response besides the normal distribution. It can also fit models with both crossed and hierarchical random effects. I illustrate fitting the RCBD to the jimsonweed data using both packages.

Estimating a RCBD model using the nlme package

Although the nlme package is part of the standard installation of R, it still needs to be loaded into memory at the start of each R session. The function in the nlme package that fits linear regression models with random effects is called lme, for linear mixed effects models. Its syntax is the same as lm except for an additional random argument in which the random effects specification is described.

library(nlme)

mod2.lme <- lme(lw.rat~type, random=~1|pot, data=plants)

Notice the syntax that appears in the random argument: ~1|pot.

The ~ signifies that what comes next is the right side of a formula.
The numeral 1 refers to the intercept, which is why this is sometimes called a random intercepts model.
A vertical bar separates the random coefficient from the grouping variable.
The grouping variable is the variable that identifies the blocks.

The anova function when applied to an lme object produces a fairly standard ANOVA table. The blocks don't appear in the table because they are part of random effects formulation of the model.

anova(mod2.lme)

            numDF denDF F-value p-value
(Intercept)     1    79 5170.002 <.0001
type            1    79 401.444 <.0001

The summary table for the model provides an estimate of the intercept β₀, the treatment effect β₁, the standard deviation of the error distribution, σ, and the standard deviation of the random effects distribution, τ.

summary(mod2.lme)

Linear mixed-effects model fit by REML
Data: plants
AIC BIC logLik
-76.08137 -65.90819 42.04068

Random effects:
Formula: ~1 | pot
(Intercept) Residual
StdDev: 0.08328042 0.1350398

Fixed effects: lw.rat ~ type
                Value Std.Error DF t-value p-value
(Intercept) 1.5191667 0.02851996 79 53.26679       0
typeN       0.5522917 0.02756488 79 20.03607       0
Correlation:
      (Intr)
typeN -0.483

Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-1.89568380 -0.65943997 0.02981137 0.62150485 3.04884738

Number of Observations: 96
Number of Groups: 16

I've highlighted the entries in the summary table corresponding to the parameter estimates.

The regression coefficient estimates are β₀ = 1.5192 and β₁ = 0.55229. The estimate of β₁ is identical to what we obtained using lm to fit the fixed effects RCBD model. The estimate of β₀ is different from the fixed effects model but that's because β₀ now represents something quite different. In the mixed effects model β₀ is the average length-width ratio for all globe plants (not just the average length-width ratio for globe plants in pot #16533).
The section labeled "Random effects" displays estimates of the parameters of the random effects distribution and the error distribution. The entry labeled "(Intercept)" is τ, the standard deviation of the random effects distribution, and the entry labeled "Residual" is σ, the standard deviation of the error distribution.

To extract the two regression coefficients β₀ and β₁ from the model object we can use the fixef function.

fixef(mod2.lme)

(Intercept) typeN
1.5191667 0.5522917

If we use the coef function, which serves this purpose for lm objects, we get a very different kind of output.

coef(mod2.lme)

      (Intercept)     typeN
16533    1.608180 0.5522917
16534    1.565303 0.5522917
16550    1.635993 0.5522917
16668    1.517791 0.5522917
16767    1.554874 0.5522917
16768    1.627881 0.5522917
16770    1.532856 0.5522917
16771    1.551397 0.5522917
16773    1.485343 0.5522917
16775    1.442466 0.5522917
16776    1.461007 0.5522917
16777    1.434354 0.5522917
16780    1.428559 0.5522917
16781    1.543285 0.5522917
16787    1.476072 0.5522917
16789    1.441307 0.5522917

Notice that we get 16 pairs of estimators for β₀ and β₁, one for each of the 16 blocks (pots). The value for β₁ is the same for each but the value of the intercept varies by block. What's being returned here are the random intercepts defined by

These consist of the overall population intercept plus a prediction of the individual block random effects. To see the predictions of the random effects alone use the ranef function.

ranef(mod2.lme)

(Intercept)
16533 0.089013757
16534 0.046136504
16550 0.116826030
16668 -0.001376128
16767 0.035706902
16768 0.108714117
16770 0.013688853
16771 0.032230368
16773 -0.033823779
16775 -0.076701033
16776 -0.058159518
16777 -0.084812945
16780 -0.090607169
16781 0.024118455
16787 -0.043094537
16789 -0.077859877

These random effects are not obtained as part of the ordinary estimation protocol that returns estimates of β₀, β₁, σ, and τ, but instead are obtained afterwards using these estimates. For that reason they are usually called predictions rather than estimates. They are variously called empirical Bayes predictions (estimates) or BLUPs (best linear unbiased predictors).

The predict function works with lme objects. If you apply the predict function to an lme model object you obtain the predicted means for each observation used to fit the model. There are 96 observations in the data set so below I just display the predictions for the first twelve.

predict(mod2.lme)[1:12]

16533 16533 16533 16533 16533 16533 16534 16534 16534 16534
1.608180 1.608180 1.608180 2.160472 2.160472 2.160472 1.565303 1.565303 1.565303 2.117595
16534 16534
2.117595 2.117595

Notice that the predicted mean changes when we switch blocks. By default the predict function returns an estimate for the mean that includes the block effects. Thus it returns for globe plants and for nominal plants. To obtain predictions of the mean that don't include the random effects we have to specify the level argument of predict (an argument that is only appropriate for lme objects) and set it to level=0.

predict(mod2.lme, level=0)[1:12]

16533 16533 16533 16533 16533 16533 16534 16534 16534 16534
1.519167 1.519167 1.519167 2.071458 2.071458 2.071458 1.519167 1.519167 1.519167 2.071458
16534 16534
2.071458 2.071458

Now we see that the globe and nominal means are the same for plants coming from different blocks.

Estimating a RCBD model using the lme4 package

The lme4 package is not part of the standard R installation so it must be downloaded first from the CRAN site. Because the lme4 package is still undergoing active development you'll probably want to make sure you have the latest version of R too. When you load lme4 into memory with the library function you'll obtain some warning messages about objects being masked.

library(lme4)

Attaching package: ‘lme4’

The following object(s) are masked from ‘package:nlme’:

lmList, VarCorr

The following object(s) are masked from ‘package:stats’:

AIC, BIC

Some of the functions in the lme4 and nlme packages have the same name. To ensure that the correct function is called when you need it you should unload the nlme package from memory by using the detach function.

detach(package:nlme)

The function in the lme4 package that fits linear mixed effects models is lmer. The lmer function does not have a random argument. Instead the random effects specification is included as part of the regression model as follows.

mod2.lmer <- lmer(lw.rat~type + (1|pot), data=plants)

Notice that in the lme function we specified: random=~1|pot. Here we include (1|pot) as a term in the model. Both the anova and summary functions work on lmer objects.

anova(mod2.lmer)

Analysis of Variance Table
Df Sum Sq Mean Sq F value
type 1 7.3206 7.3206 401.44

summary(mod2.lmer)

Linear mixed model fit by REML
Formula: lw.rat ~ type + (1 | pot)
   Data: plants
    AIC    BIC logLik deviance REMLdev
-76.08 -65.82 42.04   -94.99 -84.08
Random effects:
Groups   Name        Variance Std.Dev.
pot      (Intercept) 0.0069356 0.08328
Residual             0.0182357 0.13504
Number of obs: 96, groups: pot, 16

Fixed effects:
            Estimate Std. Error t value
(Intercept) 1.51917    0.02852   53.27
typeN        0.55229    0.02756   20.04

Correlation of Fixed Effects:
(Intr)
typeN -0.483

If you compare the estimates reported by lmer with those reported by lme, they are the same. One noticeable difference between the lme output and the lmer output is that lmer reports the values of test statistics, but it does not report p-values.

The p-value problem in mixed effects models

The absence of p-values in the output from lmer is by design. The author of the lme4 package, Douglas Bates, argues that there is no legitimate method using standard probability distributions for obtaining p-values for the F- and t-statistics of mixed effects models that will work in all cases. His explanation can be found here and is summarized briefly below.

With ordinary fixed effects models we can compare nested models using an F-test. Degrees of freedom are based on the sample size and number of estimated parameters. If we can assume that the errors have a normal distribution the F-distribution is exact.
With mixed effects models (models with one or more sets of random effects) we could try doing something similar. If we assume that the estimated parameters of the random effects distribution are the true values, then the mixed effects model reduces to a fixed effects model but with a different variance-covariance matrix for the errors. The usual F-tests then apply. This is the approach taken by the nlme package.

There are problems with the approach used by nlme.

Random effects parameters are not known; they’re estimated. So at best the F-distribution is an approximation.
The notion of degrees of freedom is no longer well-defined. We have multiple random effects but they are not getting counted as parameters. When we turn to hierarchical data (the primary kind of data for which mixed effects models get used) what constitutes an observation for a specific test will vary as will the sample size.

There have been lots of recommendations for obtaining correct p-values of test statistics in mixed effects models but none of these apply in general. For example, SAS offers the user a number of different ways to carry out tests and calculate p-values but leaves it up to the user to choose one. Some of these approaches are clearly good choices for a few special cases, e.g., balanced data with a simple structure, but none of these are general solutions.

We can make the following general recommendations for carrying out hypothesis testing in mixed effects models (Faraway 2010).

When samples are large it is legitimate to assume that the summary table statistics (t-values) are normally distributed. In this case use a likelihood ratio test rather than an F-test for comparing nested models that differ in more than one parameter.
Instead of appealing to some distributional assumption of the test statistic, generate an empirical distribution of the test statistic using a parametric bootstrap.
Use methods based on Markov chain Monte Carlo to obtain confidence (credible) intervals for parameters of interest.

The parametric bootstrap

The parametric bootstrap is a Monte Carlo method for obtaining the empirical distribution of a test statistic of interest. In this method we fit a model of interest to a sequence of simulated data sets where the simulation is carried out in such a way that the null hypothesis is true. For each simulation we calculate the test statistic whose distribution we don't know. The set of test statistics we obtain estimates the null distribution of our test statistic. We then determine the position of the actual test statistic, the one that was calculated using the real data, in this null distribution. If it is an extreme value in this distribution we have reason to reject the null hypothesis. Formally we can obtain a p-value for the test by counting up the number of simulated test statistics as large or larger than the actual test statistic (for a one-tailed test) and divide this by the total number of simulations. Usually the actual test statistic is counted as one of the simulations so the number of simulated test statistics as large as or larger than the actual test statistic is always at least one.

We can use the parametric bootstrap to obtain a p-value for the F-statistic for the variable type that is reported by the anova function when applied to an lmer object. The F-statistic lies in row 1, column 4 of the anova output.

anova(mod2.lmer)

Analysis of Variance Table
Df Sum Sq Mean Sq F value
type 1 7.3206 7.3206 401.44

anova(mod2.lmer)[1,4]

[1] 401.4437

Next we fit a mixed effects model to the plants data set but we don't include type as a predictor. The model includes only the random intercepts.

# fit a model to the data without type as a predictor

mod1.lmer <- lmer(lw.rat~(1|pot), data=plants)

We now use this model to generate new data sets. Data sets generated from this model will have the same random effects structure as do our actual data. By construction the variable type will not be a predictor of the response (because the response variable was generated without it). We fit a mixed effects model to each simulated data set in which we include type as a predictor. Because the simulated response does not depend on type the F-statistics we obtain from these fits provide a null distribution for the F-statistic for type.

In the code below I carry out 999 simulations each time extracting the F-statistic. I initialize a vector using the numeric function to store the results of the simulations. I use a for loop to perform the iterations and the simulate function to generate data from the lmer model. In the end I append the actual F-statistic as one additional simulation to yield a total of 1000.

nrep <- 999

# initialize storage vector

Fstat <- numeric(nrep+1)

# loop to carry out simulations

for(i in 1:nrep) {

# simulate data from model in which type has no effect

rmath <- unlist(simulate(mod1.lmer))

# estimate type model to these data

rmod <- lmer(rmath~(1|pot)+type, data=plants)

# extract statistic

Fstat[i] <- anova(rmod)[1,4]

}

Fstat[1000] <- anova(mod2.lmer)[1,4]

max(Fstat[1:999])

[1] 9.41978

Fstat[1000]

[1] 401.4437

The largest simulated F-statistic is 9.42 while the actual F-statistic exceeds 400. Since the actual F-statistic is the largest of the simulated F-statistics our p-value is 1 out of 1000, or .001. The F-test of type is statistically significant.

sum(Fstat>=Fstat[1000])/length(Fstat)

[1] 0.001

Markov chain Monte Carlo methods

We will discuss Bayesian estimation in greater detail later in this course, so at this point I give only a cursory introduction. The standard statistical methods we've covered so far in this course are called frequentist methods. The logical foundation of the frequentist approach is based on the notion of a sampling distribution. Parameters are thought to be fixed in a nature, but a statistic that is used to estimate that parameter being based on a sample varies. If we were to obtain a different sample of the same size from the same population the value of the statistic would probably be different. The set of statistics calculated from all possible samples from the population yields the sampling distribution of the statistic. The sampling distribution is the basis for constructing confidence intervals and carrying out statistical tests in the frequentist approach.

The Bayesian school of statistics approaches statistical estimation from a different philosophical viewpoint. Bayesians are indifferent to the notion of whether there exists a true value of a parameter in nature. From a Bayesian point of view what we know about a parameter is really just a matter of opinion. Being rational beings we should formulate our opinions as probability statements that quantify what are the likely values of the parameter. Before we collect any data we generally have some opinion about what values the parameter might take. This is called our prior distribution for the parameter. If we truly don't know anything about the parameter then we have what's called a flat or uninformative prior. After collecting data or carrying out an experiment we use the information contained in the data to update our prior opinion about the parameter. Our updated opinion is called the posterior distribution of the parameter.

The modern way of carrying out Bayesian estimation is via Markov chain Monte Carlo (MCMC) sampling. MCMC doesn't provide formulas for the posterior distributions of parameters, instead it yields samples from these distributions. Using these samples we can then calculate summary statistics to characterize the distributions of the parameters. For instance we could calculate means or medians of the distribution and use them as point estimates. For interval estimates we can calculate the .025 and .975 quantiles of the sampled posterior distributions to obtain 95% confidence intervals for the parameters (although Bayesians prefer to use the term credible intervals).

R provides the mcmcsamp function as a simple way to obtain Bayesian parameter estimates using a frequentist model fit with lmer. To use mcmcsamp we supply the name of the lmer model plus the number of desired samples from the posterior distribution. There are a number of diagnostics that one should look at to verify that the Markov chain has stabilized and is truly sampling from the posterior distribution. We'll ignore those issues for now and explore them at a later date when we formally consider Bayesian estimation. In the call below I use the lmer model mod2.lmer and request 10,000 samples.

# Bayesian estimation using MCMC

out.mc <- mcmcsamp(mod2.lmer, n=10000)

To organize the output from mcmcsamp in the form of a data frame we can use the as.data.frame function.

# convert to a data frame

ff <- as.data.frame(out.mc)

head(ff)

(Intercept)     typeN       ST1     sigma
1    1.519167 0.5522917 0.6167091 0.1350398
2    1.555116 0.5903976 0.3500714 0.1333496
3    1.522187 0.5537884 0.5679478 0.1316728
4    1.527750 0.5135399 0.2876111 0.1477140
5    1.492065 0.5687822 0.2561409 0.1378538
6    1.479926 0.6054857 0.2515148 0.1578332

The data frame contains samples from the posterior distribution of the intercept β₀, the treatment effect β₁ labeled "typeN", the standard deviation of the errors "sigma", and ST1 which is the ratio of the random effects standard deviation to the error standard deviation, τ/σ. To obtain point estimates of these parameters we can take the mean or median of each of the columns of the data frame. These values compare favorably with the point estimates returned by lmer.

apply(ff,2,median)

(Intercept) typeN ST1 sigma
1.5192098 0.5524536 0.4420691 0.1414697

apply(ff,2,mean)

(Intercept) typeN ST1 sigma
1.5190424 0.5524594 0.4460139 0.1421281

fixef(mod2.lmer)

(Intercept) typeN
1.5191667 0.5522917

To obtain 95% confidence (credible) intervals for the parameters we can extract the .025 and .975 quantiles of each column with the quantile function.

# 95% credible interval for treatment effect using percentile method

apply(ff, 2, function(x) quantile(x, c(.025,.975)))

      (Intercept)     typeN       ST1     sigma
2.5%     1.467927 0.4961414 0.1528084 0.1211936
97.5%    1.569291 0.6092351 0.7489960 0.1671108

Because the 95% credible interval for the treatment effect (typeN) does not include zero we can conclude that the length-width ratios of globe and nominal jimsonweed plants are significantly different. Bayesians generally prefer to calculate something called an HPD (highest probability density) interval instead of a percentile interval. The HPD intervals can be obtained by applying the HPDinterval function directly to the original mcmcsamp object. In the output below the differences between the HPD intervals and the percentile intervals obtained above are fairly small.

# 95% highest probability density interval

HPDinterval(out.mc)

$fixef
lower upper
(Intercept) 1.4689726 1.5701743
typeN 0.4956552 0.6084897
attr(,"Probability")
[1] 0.95

$ST
lower upper
[1,] 0.163902 0.7560056
attr(,"Probability")
[1] 0.95

$sigma
lower upper
[1,] 0.1196254 0.1654788
attr(,"Probability")
[1] 0.95

What's especially attractive about Bayesian estimation is that we can easily obtain interval estimates for functions of parameter estimates. In the mixed effects model β₀ is the mean length-width ratio of the globe plants while β₀ + β₁ is the mean length-width ratio of nominal plants. To obtain a sample from the posterior distribution of the mean length-width ratio of nominal plants we just add the sample from the posterior distribution of β₀ to the sample from the posterior distribution of β₁. Intervals estimates for the treatment means can be obtained by calculating quantiles of the posterior distributions.

# obtain posterior distributions of the treatment means

mean.post <- data.frame(mean.g=ff[,1], mean.n=ff[,1]+ff[,2])

mean.post[1:10,]

mean.g mean.n
1 1.519167 2.071458
2 1.555116 2.145513
3 1.522187 2.075975
4 1.527750 2.041290
5 1.492065 2.060847
6 1.479926 2.085411
7 1.524527 2.061974
8 1.491247 2.098950
9 1.531388 2.044900
10 1.566630 2.119074

# obtain 95% credible intervals for the individual treatment means

apply(mean.post, 2, function(x) quantile(x, c(.025,.975)))

mean.g mean.n
2.5% 1.467927 2.019288
97.5% 1.569291 2.122747

Because the 95% credible intervals don't overlap we can say that the mean length-width ratios of the two plant types are significantly different.

Cited references

Blakeslee, A. F. 1921. The globe mutant in the jimson weed (Datura stramonium). Genetics 6: 241–264.
Faraway, Julian. 2010. Changes to the mixed effects models chapters in Extending the Linear Model with R, 2006, CRC Press. www.maths.bath.ac.uk/~jjf23/ELM/mixchange.pdf
Sokal, R. R. and F. J. Rohlf. 1995. Biometry: The Principles and Practice of Statistics in Biological Research. 3rd edition. W. H. Freeman and Co.: New York.

R Code

A compact collection of all the R code displayed in this document appears here.

Course Home Page

Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum for the Environment and Ecology, Box 3275, University of North Carolina, Chapel Hill, 27599
Copyright © 2012
Last Revised--September 17, 2012
URL: https://sakai.unc.edu/access/content/group/3d1eb92e-7848-4f55-90c3-7c72a54e7e43/public/docs/lectures/lecture6.htm