Assignment 7—Solution

Part 1

fertilizer <- read.table( 'http://www.bio.ic.ac.uk/research/mjcraw/therbook/data/fertilizer.txt', header=T)

names(fertilizer)

[1] "root" "week" "plant" "fertilizer"

Question 1

The important characteristic of these data is that we have multiple measurements from the same plant (5 values per plant) as well as measurements made on different plants. Thus the data are heterogeneous. We can account for this by fitting a mixed effects model in which plant is identified as the grouping variable. Preliminary plots indicate that growth is nearly linear over the time period in question so a linear regression of root versus week treating week as a continuous variable makes sense here. Letting i denote the plant and j the observation on that plant we have

where . With only 12 plants it will be overkill to try to let both the intercepts and slopes be random. In fact as we'll see, except for an effect due to fertilizer, the slopes show almost no variability across plants. A minimal mixed effects model is a random intercepts model in which the intercepts are allowed to vary across plants but the slopes are treated as constant.

eqn 2

where . This model is fit as follows.

library(nlme)

out1 <- lme(root~week, random=~1|plant, data=fertilizer, method="ML")

anova(out1)

            numDF denDF   F-value p-value
(Intercept)     1    47 525.8152 <.0001
week            1    47 1726.7813 <.0001

The linear relationship between root size and time is highly significant.

Question 2

Fertilizer is a level-2 variable. It was randomly applied to whole plants, not to the individual times at which a plant was measured. As a level-2 variable fertilizer can affect the level-2 equation either by modifying the intercept, the slope, or both. We consider each possibility in turn.

#fertilizer modifies the intercepts

out2 <- lme(root~week+fertilizer, random=~1|plant, data=fertilizer, method="ML")

#fertilizer modifies the slopes

out3 <- lme(root~week+week:fertilizer, random=~1|plant, data=fertilizer, method="ML")

#fertilizer modifies the slopes and the intercepts.

out4 <- lme(root~ week*fertilizer, random=~1|plant, data=fertilizer, method="ML")

AIC(out1, out2, out3, out4)

df AIC
out1 4 121.2852
out2 5 105.7642
out3 5 106.3386
out4 6 103.2927

The model in which fertilizer affects both the intercepts and slopes ranks best. We carry out a formal significance test of the individual terms.

anova(out4)

                numDF denDF   F-value p-value
(Intercept)         1    46 2186.2403 <.0001
week                1    46 1830.0170 <.0001
fertilizer          1    10   37.0307 0.0001
week:fertilizer     1    46    4.3740 0.0420

There is a significant interaction between week and fertilizer (the slope effect) as well as a significant effect of fertilizer on the intercepts.

Question 3

I display the autocorrelation function of the residuals using a Bonferroni correction for four tests (corresponding to the four lags shown). None of the spikes achieves statistical significance so there is no need to consider a correlation model for the residuals.

plot(ACF(out4, form=~week), alpha=.05/4)

Fig. 1 ACF for the level-1 residuals

Question 4

I color the lines and points to indicate if fertilizer is present (red) or absent (blue).

xyplot(root~week|plant, data=fertilizer, panel=function(x,y,subscripts){

fert <- as.numeric(fertilizer$fertilizer)[subscripts][1]

panel.xyplot(x, y, col=c(2,4)[fert], pch=c(21,22)[fert])

panel.lines(x, predict(out4,level=0)[subscripts], col=c(2,4)[fert])

panel.lines(x, predict(out4,level=1)[subscripts], col=c(2,4)[fert], lty=2)},

key=list(x=.75, y=.89, corner=c(0,0), lines=list(lty=c(1,2), col=c(1,1), size=3), text=list(c('population average', 'subject-specific'), cex=.8) ))

Fig. 2 Plot of fitted model: population average (consisting of estimates of the fixed effects only) and subject specific (consisting of estimates of the fixed effects and predictions of the random effects) along with the data. The data are colored red (fertilizer added) and blue (fertilizer absent).

Additional comments

In Fig. 2 we see that a regression line whose slope varies only by fertilizer type seems to provide a reasonable fit to the data. There is no need for each plant to have its own slope. If one tries to fit a random slopes and intercepts model, the following message is obtained.

out1.5 <- lme(root~week,random=~week|plant, data=fertilizer, method="ML")

Error in lme.formula(root ~ week, random = ~week | plant, data = fertilizer, :
nlminb problem, convergence error code = 1
message = iteration limit reached without convergence (10)

It is possible by adjusting various control parameters to get a slopes and intercept model to converge here. The lme function has a control argument for which one can use the lmeControl function to modify the default settings of the optimization algorithm. In the run below I change the optimization function from the default nlminb to optim and increase the maximum number of iterations for various steps.

out1.5 <- lme(root~week, random=~week|plant, data=fertilizer, method="ML", control=lmeControl(maxIter=1000, msMaxIter=1000, niterEM=1000, opt='optim'))

AIC(out1,out1.5)

df AIC
out1 4 121.2852
out1.5 6 119.7192

out4.5 <- lme(root~week*fertilizer, random=~week|plant, data=fertilizer, method="ML", control=lmeControl(maxIter=1000, msMaxIter=1000, niterEM=1000, opt='optim'))

AIC(out4,out4.5)

df AIC
out4 6 103.2927
out4.5 8 106.2191

Notice that before including fertilizer in the model, a random slopes and intercepts model has a lower AIC than a random intercepts model, but once fertilizer is included as a predictor of the slopes there is no longer a need for the random slopes. Fitting the random slopes and intercepts model in this case increases the AIC over a random intercepts model.

Part 2

farms <- read.table( 'http://www.bio.ic.ac.uk/research/mjcraw/therbook/data/farms.txt', header=T)

names(farms)

[1] "N" "size" "farm"

Question 1

These data consist of a random sample of farms and then multiple measurements taken from the same farm. To investigate whether this structure matters we can fit a random intercepts model in which there are no predictors, just an intercept, and in which we allow that intercept to vary across farms.

out0 <- lme(size~1, random=~1|farm, data=farms, method='ML')

VarCorr(out0)

farm = pdLogChol(1)
Variance StdDev
(Intercept) 66.950405 8.182323
Residual 5.969213 2.443197

From the output we see that the variance within farms, σ², is 5.97 while the variance between between farms, τ², is 66.95 indicating that the variability of the response between farms is 11 times its variability within farms.

Question 2

The intra-class correlation compares the variability between farms to the total variability (sum of the variability between and within farms).

rho

as.numeric(VarCorr(out0)[1,1]) / (as.numeric(VarCorr(out0)[1,1]) + as.numeric(VarCorr(out0)[2,1]))

[1] 0.9181398

Question 3

I fit a random intercepts mode in which nitrogen is a level-1 predictor of size.

out1 <- lme(size~N, random=~1|farm, data=farms, method='ML')

summary(out1)

Linear mixed-effects model fit by maximum likelihood
Data: farms
AIC BIC logLik
614.3769 625.5269 -303.1885

Random effects:
Formula: ~1 | farm
(Intercept) Residual
StdDev: 8.316789 1.920269

Fixed effects: size ~ N
               Value Std.Error DF t-value p-value
(Intercept) 85.58120 2.5276865 95 33.85752       0
N            0.70806 0.0931177 95 7.60392       0
Correlation:
(Intr)
N -0.732

Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-2.80413111 -0.63410087 0.02949117 0.70724199 2.19176486

Number of Observations: 120
Number of Groups: 24

Other comments

Like the model in Part 1, in order to fit a random slopes and intercepts model here it is necessary to adjust the default optimization settings of the lme function. If we do so we fail to find evidence that both slopes and intercepts should be random. The AIC of this model is higher than the comparable random intercepts model.

out1.5 <- lme(size~N, random=~N|farm, data=farms, method='ML')

Error in lme.formula(size ~ N, random = ~N | farm, data = farms, method = "ML") :
nlminb problem, convergence error code = 1
message = iteration limit reached without convergence (10)

out1.5 <- lme(size~N, random=~N|farm, data=farms, method='ML', control=lmeControl(maxIter=1000, msMaxIter=1000, niterEM=1000, opt='optim'))

AIC(out1, out1.5)

df AIC
out1 4 614.3769
out1.5 6 617.7898

HW scores

Course Home Page

Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum for the Environment and Ecology, Box 3275, University of North Carolina, Chapel Hill, 27599
Copyright © 2012
Last Revised--March 13, 2012
URL: https://sakai.unc.edu/access/content/group/2842013b-58f5-4453-aa8d-3e01bacbfc3d/public/Ecol562_Spring2012/docs/solutions/assign7.htm