Final Exam—Solution

Problem 1

Question 1

We need to fit a Poisson random intercepts model so we can adapt the comparable code from lecture 25. The variable Season is the grouping variable for the random effects. I read in the data and create the objects needed for WinBUGS and JAGS.

bycatch <- read.csv('ecol 563/bycatch.csv')
y <- bycatch$Bycatch
area <- as.numeric(bycatch$Area=='North')
gear <- as.numeric(bycatch$Gear.Type=='Bottom')
time <- as.numeric(bycatch$Time=='Day')
logtows <- log(bycatch$Tows)
season <- as.numeric(bycatch$Season)
J <- length(unique(season))
n <- length(y)

The BUGS program is shown below.

BUGS model (manlybugs.txt)
model{
for(i in 1:n) {
y[i]~dpois(mu.hat[i])
log.mu[i] <- a[season[i]] + b1*area[i] + b2*gear[i] + b3*time[i] + b4*logtows[i]
mu.hat[i] <- exp(log.mu[i])
}
#level-2 model
for(j in 1:J){
a[j]~dnorm(mu.a,tau.a)
}
#priors
mu.a~dnorm(0,.000001)
tau.a <- pow(sigma.a,-2)
sigma.a~dunif(0,10000)

b1~dnorm(0,.000001)
b2~dnorm(0,.000001)
b3~dnorm(0,.000001)
b4~dnorm(0,.000001)
}


setwd("/Users/jackweiss/Documents/Ecol 563 Fall 2012 stuff/models")
manly.data <- list("y", "area", "gear", "time", "logtows", "n", "season", "J")
manly.inits <- function() {list(a=rnorm(J), b1=rnorm(1), b2=rnorm(1), b3=rnorm(1), b4=rnorm(1), mu.a=rnorm(1), sigma.a=runif(1))}
manly.parms <- c("a", "b1", "b2", "b3", "b4", "mu.a", "sigma.a")
# WinBUGS run
library(arm)
manly1.bugs <- bugs(manly.data, manly.inits, manly.parms, "manlybugs.txt", bugs.directory='C:/WinBUGS14', n.chains=3, n.iter=100, debug=T)
# JAGS run: a short trial run first, then a longer run for inference
library(R2jags)
manly1.jags <- jags(manly.data, manly.inits, manly.parms, "manlybugs.txt", n.chains=3, n.iter=100)
manly1.jags <- jags(manly.data, manly.inits, manly.parms, "manlybugs.txt", n.chains=3, n.iter=10000)
# trace plots of the chains
library(lattice)
xyplot(as.mcmc(manly1.jags), layout=c(3,5))

Fig. 1   Trace plots of individual Markov chains for each parameter.

round(manly1.jags$BUGSoutput$summary,3)
           mean    sd   2.5%    25%    50%    75%  97.5%  Rhat n.eff
a[1]      0.013 0.886 -1.849 -0.554  0.041  0.617  1.672 1.001  3000
a[2]     -3.919 1.915 -8.744 -4.840 -3.625 -2.653 -0.957 1.020   480
a[3]      1.259 0.555  0.121  0.898  1.266  1.644  2.301 1.001  2900
a[4]     -0.318 0.754 -1.824 -0.814 -0.303  0.197  1.140 1.001  3000
a[5]     -1.021 0.872 -2.771 -1.582 -0.995 -0.419  0.602 1.001  2200
a[6]     -0.173 0.869 -1.883 -0.733 -0.149  0.417  1.462 1.001  3000
b1       -1.957 0.420 -2.826 -2.229 -1.935 -1.660 -1.203 1.002  1500
b2       -1.952 0.408 -2.807 -2.215 -1.926 -1.678 -1.202 1.002  1600
b3       -2.161 0.432 -3.047 -2.442 -2.143 -1.850 -1.362 1.004   530
b4        0.668 0.196  0.305  0.533  0.664  0.792  1.072 1.001  3000
deviance 87.857 4.796 80.649 84.383 87.209 90.595 99.131 1.001  3000
mu.a     -0.660 1.445 -3.760 -1.443 -0.562  0.194  1.963 1.003  3000
sigma.a   2.506 1.507  0.890  1.535  2.110  2.979  6.588 1.002  1500

Question 2

The percentile and HPD 95% credible intervals are shown below. I've highlighted the intervals that correspond to the coefficient of log(Tows).

round(manly1.jags$BUGSoutput$summary[,c(3,5,7)],3)
           2.5%    50%  97.5%
a[1]     -1.849  0.041  1.672
a[2]     -8.744 -3.625 -0.957
a[3]      0.121  1.266  2.301
a[4]     -1.824 -0.303  1.140
a[5]     -2.771 -0.995  0.602
a[6]     -1.883 -0.149  1.462
b1       -2.826 -1.935 -1.203
b2       -2.807 -1.926 -1.202
b3       -3.047 -2.143 -1.362
b4        0.305  0.664  1.072
deviance 80.649 87.209 99.131
mu.a     -3.760 -0.562  1.963
sigma.a   0.890  2.110  6.588
HPDinterval(as.mcmc(manly1.jags$BUGSoutput$sims.matrix))
              lower      upper
a[1]     -1.6373366  1.7858370
a[2]     -8.2517138 -0.7911186
a[3]      0.1994334  2.3578090
a[4]     -1.7349306  1.1848399
a[5]     -2.7347817  0.6374401
a[6]     -1.8827512  1.4618864
b1       -2.7452448 -1.1350550
b2       -2.8159875 -1.2146354
b3       -3.0467614 -1.3610107
b4        0.2739653  1.0199939
deviance 79.7408560 97.1228256
mu.a     -3.5632133  2.0624371
sigma.a   0.6721666  5.4892025
attr(,"Probability")
[1] 0.95

When we include log(Tows) as an offset in a model we are fixing its coefficient to be 1 rather than estimating it. To determine if this is a good idea we can carry out a hypothesis test.

H0: b4 = 1
H1: b4 ≠ 1

An equivalent way of carrying out this test is to construct a confidence interval for b4. We reject H0 at α = .05 if 1 is not contained in a 95% confidence (here, credible) interval for b4; otherwise we fail to reject. The reported 95% credible intervals are percentile: (0.305, 1.072) and HPD: (0.274, 1.020). Because the value 1 is contained in both the percentile and HPD 95% credible intervals, we fail to reject 1 as a possible value for the coefficient of log(Tows) in the model. Therefore treating log(Tows) as an offset with a coefficient of 1 is a reasonable thing to do here.
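
For reference, treating log(Tows) as an offset amounts to deleting b4 (and its prior) from the BUGS program and adding logtows directly to the level-1 equation with its coefficient fixed at 1. A sketch of the modified level-1 lines is shown below.

# offset version of the level-1 model (sketch): the coefficient of logtows is fixed at 1
log.mu[i] <- a[season[i]] + b1*area[i] + b2*gear[i] + b3*time[i] + logtows[i]
mu.hat[i] <- exp(log.mu[i])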



Problem 2

Question 1

The three design features that are deviations from an ordinary two-factor design with crossed factors are the following.

  1. This is a split plot design for the temperature treatment.
  2. There are repeated measures for the time effect.
  3. We need to include a covariate to control for differences in initial size.

Split plot design. Although the growth of individual animals was tracked over time, the main treatment of interest, temperature, was randomly applied to nine different aquaria, not to the 9 × 18 = 162 individual animals in the study. If this had been a true factorial design then each coral would have been placed in an individual jar with its own thermostat. 54 of the jars would have been randomly assigned to the 25°C treatment, 54 jars would have been randomly assigned to the 28°C treatment, and 54 jars would have been randomly assigned to the 32°C treatment. For logistical reasons this wasn't done and treatments were randomly assigned to aquaria. Corals from the same tank would be expected to be more similar to each other than to corals from different tanks, not only because they share a common temperature treatment but because they share a common aquarium environment. As a result the observations used in the study are not all equivalent, something we've called data heterogeneity. In the language of split plot designs the aquarium is the whole plot unit, the individual corals are the split plot units, and temperature is the whole plot treatment. Technically there is no split plot treatment here. Analytically we can account for the split plot design by fitting a mixed effects model. The grouping variable in the mixed effects model is tank.grp, a variable that uniquely identifies each of the nine aquaria. (Note: Technically because there is no split plot treatment we would usually call the temperature experiment a nested design rather than a split plot design but the method of analysis is the same.)

Repeated measures design. There are 486 observations in the data frame coming from 162 different animals with three measurements per animal. Thus we don't have a random sample of animal times. If this had been a random sample of animal-times then there would have been 486 animals of which 162 would have been randomly chosen to be measured at time 1, a different 162 chosen to be measured at time 2, and the remaining 162 chosen to be measured at time 3. Instead we measured each animal three times. Thus we have measurements from the same animal as well as measurements from different animals. We would expect measurements made on the same animal to be more similar to each other than they are to measurements made on different animals. Thus we have observational heterogeneity. The grouping variable here is the individual animal, recorded as the variable id in the data frame. A repeated measures design can be analyzed like a split plot design. The animal (id) is the whole plot unit and the individual time measurements on that animal (the rows of the data frame) are the split plot units. There are no treatments per se unless time is viewed as a treatment.

The temperature experiment combined with the repeated measures together can be viewed as a split-split plot design. In this perspective aquaria are whole plot units, individual corals are split plot units, and the individual time measurements are split-split plot units. Temperature is the whole plot treatment and time can be viewed as a split-split plot "treatment", but that's a bit of a stretch. Once again there is no split plot treatment.

Controlling for a nuisance covariate. Because the animals did not all have the same initial size, it is necessary to include the variable surf.area as a covariate to control for differences in size. In its simplest form this would convert a two-factor analysis of variance model to test for the effects of temperature, time, and their interaction, into an analysis of covariance model that also includes the additive effect of surf.area. It is possible that the effect of treatment also varies with the value of the covariate. This should be investigated by fitting a preliminary model in which surface area is allowed to interact with the treatment variables.

Question 2

The grouping variables are tank.grp, the unique identifier for the nine aquaria, and id, the unique identifier for the individual coral animals. Using the nlme package we would represent this structure with random=~1|tank.grp/id. Using the lme4 package we just need to include the terms (1|tank.grp) + (1|id). Because this is a controlled experiment where all detected effects should be only due to the administered treatments we should start with the most complicated model, one that includes an interaction between temp, time.period, and surf.area. To avoid confusion I convert both temp and time.period to factors before fitting the model.

corals <- read.table('corals.txt', header=T)
corals[1:4,]
  tank temp      incr  id surf.area tank.grp inctime     rate time.period
1    3   32 0.1656716  A1     7.084     32.3      35 4.733473           1
2    3   32 0.0271845  A1     7.084     32.3      25 1.087380           2
3    3   32 0.1051476  A1     7.084     32.3      35 3.004217           3
4    1   25 0.3333948 A13     7.050     25.1      35 9.525565           1
# convert temp and time.period to factors
corals$temp <- factor(corals$temp)
corals$time.period <- factor(corals$time.period)
# lme version of the model
library(nlme)
out.lme <- lme(rate ~ temp*time.period*surf.area, random=~1|tank.grp/id, data=corals)

Using the lme4 package, the model would be written as follows.

library(lme4)
out.lmer <- lmer(rate ~ temp*time.period*surf.area + (1|tank.grp) + (1|id), data=corals)

Question 3

I use the nlme package for the analysis because it's easier to work with for normal-response mixed effects models.

anova(out.lme)
                           numDF denDF  F-value p-value
(Intercept)                    1   312 379.2266  <.0001
temp                           2     6  55.7383  0.0001
time.period                    2   312  24.2382  <.0001
surf.area                      1   150  77.8080  <.0001
temp:time.period               4   312  62.3850  <.0001
temp:surf.area                 2   150   8.7679  0.0003
time.period:surf.area          2   312   0.3115  0.7326
temp:time.period:surf.area     4   312   1.1847  0.3175

According to the ANOVA table we can drop the three-factor interaction. I refit the model with all two-factor interactions.

out.lme2 <- lme(rate ~ (temp + time.period + surf.area)^2, random=~1|tank.grp/id, data=corals)
anova(out.lme2)
                      numDF denDF  F-value p-value
(Intercept)               1   316 379.2267  <.0001
temp                      2     6  55.7384  0.0001
time.period               2   316  24.1817  <.0001
surf.area                 1   150  77.8080  <.0001
temp:time.period          4   316  62.2395  <.0001
temp:surf.area            2   150   8.7679  0.0003
time.period:surf.area     2   316   0.3107  0.7331

When time.period:surf.area is added last it is not significant so we can drop it.

out.lme3 <- update(out.lme2, .~. - time.period:surf.area)
anova(out.lme3)
                 numDF denDF  F-value p-value
(Intercept)          1   318 379.2268  <.0001
temp                 2     6  55.7384   1e-04
time.period          2   318  24.2870  <.0001
surf.area            1   150  77.8080  <.0001
temp:time.period     4   318  62.5105  <.0001
temp:surf.area       2   150   8.7679   3e-04

Changing the order of the two-factor interaction terms in the sequential ANOVA table does not affect our conclusions. All of the remaining effects are statistically significant.

out.lme3a <- lme(rate~temp*surf.area + temp*time.period, random=~1|tank.grp/id, data=corals)
anova(out.lme3a)
                 numDF denDF  F-value p-value
(Intercept)          1   318 379.2268  <.0001
temp                 2     6  55.7384   1e-04
surf.area            1   150  77.8080  <.0001
time.period          2   318  24.2870  <.0001
temp:surf.area       2   150   8.7679   3e-04
temp:time.period     4   318  62.5105  <.0001

Note: There was actually no need to refit the model. We could have used anova(out.lme3, type='marginal') to obtain marginal tests of each effect.

Question 4

There are three different ways to obtain treatment means, their standard errors, and confidence intervals. We can

  1. reparameterize the model to return means directly and then use the intervals function from the nlme package,
  2. use the effects package with the original model, or
  3. construct a contrast matrix to obtain the means and standard errors.

I illustrate each method in turn.

Method 1: Reparameterize the model and use the intervals function

The first two hints for this question suggest reparameterizing the model so that we can get the means we want directly rather than their effects. The means of interest correspond to the temp:time.period interaction. Following the hint I fit a model with only the two 2-factor interactions and no intercept. I also center surface area at 10 using the I function of R.

out.lme4 <- lme(rate ~ temp:time.period + temp:I(surf.area-10) - 1, data=corals, random=~1|tank.grp/id)

The AIC of this model matches the AIC of the model in its original parameterization indicating that they're equivalent.

AIC(out.lme3, out.lme4)
         df      AIC
out.lme3 15 2886.079
out.lme4 15 2886.079

If we examine the summary table of the model we can see that we've obtained what we want.

round(summary(out.lme4)$tTable,4)
                           Value Std.Error  DF t-value p-value
temp25:time.period1      16.0675    1.4411 316 11.1497  0.0000
temp28:time.period1      18.9906    1.4368 316 13.2174  0.0000
temp32:time.period1       7.8468    1.4869 316  5.2772  0.0000
temp25:time.period2      19.1829    1.4411 316 13.3116  0.0000
temp28:time.period2      26.9032    1.4368 316 18.7246  0.0000
temp32:time.period2       3.1679    1.4869 316  2.1305  0.0339
temp25:time.period3      21.5174    1.4411 316 14.9316  0.0000
temp28:time.period3      26.9933    1.4368 316 18.7873  0.0000
temp32:time.period3       2.4784    1.4869 316  1.6668  0.0965
temp25:I(surf.area - 10)  2.2798    0.3545 151  6.4304  0.0000
temp28:I(surf.area - 10)  2.7244    0.3741 151  7.2822  0.0000
temp32:I(surf.area - 10)  0.4252    0.4333 151  0.9815  0.3279

When we plug in surface area = 10, the last three terms in the model will drop out. The first nine terms are the means at the individual temperatures and time periods. To get the confidence intervals I use the intervals function on this model with the which="fixed" argument so that we only get confidence intervals for the fixed effect estimates. The entries we want are in the first nine rows.

out.int <- intervals(out.lme4, which='fixed')
names(out.int)
[1] "fixed"
out.ci <- out.int$fixed[1:9,]
out.ci
                         lower      est.     upper
temp25:time.period1 13.2321571 16.067453 18.902749
temp28:time.period1 16.1636829 18.990559 21.817435
temp32:time.period1  4.9212827  7.846818 10.772354
temp25:time.period2 16.3476520 19.182948 22.018244
temp28:time.period2 24.0763344 26.903210 29.730086
temp32:time.period2  0.2423577  3.167893  6.093429
temp25:time.period3 18.6821261 21.517422 24.352718
temp28:time.period3 24.1663868 26.993263 29.820139
temp32:time.period3 -0.4470924  2.478443  5.403978
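
As a quick sanity check, the first of these means can be recovered from the original parameterization of out.lme3: for temp = 25 in time period 1 the mean at surface area 10 is just the intercept plus 10 times the surf.area coefficient.

# check: temp 25, time period 1 mean from the original parameterization
sum(fixef(out.lme3)[c('(Intercept)', 'surf.area')] * c(1, 10))
# -6.7305414 + 10*2.2797995 = 16.06745, matching the first row above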

Method 2: Use the effects package

To use the effects package to obtain the means we can use the original model. The syntax for obtaining the means at each temperature and time combination when surface area = 10 is the following.

library(effects)
out.eff <- effect("temp:time.period", out.lme3, given.values = c(surf.area = 10))

The terms we want are scattered in different components of the object returned by the effect function.

names(out.eff)
 [1] "term"             "formula"          "response"         "variables"      
 [5] "fit"              "x"                "model.matrix"     "data"           
 [9] "discrepancy"      "offset"           "vcov"             "se"             
[13] "lower"            "upper"            "confidence.level" "transformation" 
out.ci2 <- cbind(lower=out.eff$lower, est.=out.eff$fit, upper=out.eff$upper)
out.ci2
        [,1]      [,2]      [,3]
1 13.2430163 16.067453 18.891890
2 16.1745099 18.990559 21.806608
3  4.9324875  7.846818 10.761149
4 16.3585112 19.182948 22.007385
5 24.0871613 26.903210 29.719259
6  0.2535625  3.167893  6.082224
7 18.6929853 21.517422 24.341859
8 24.1772137 26.993263 29.809312
9 -0.4358876  2.478443  5.392774

Notice that the confidence bounds are slightly different from what we obtained using the first method. That's because the intervals function uses a t-distribution with 316 degrees of freedom (see the summary table above) in calculating the confidence bounds while the effects package uses a normal distribution. We can easily demonstrate this fact.

# normal distribution: used by effect
out.eff$fit + qnorm(.025)*out.eff$se
        [,1]
1 13.2430163
2 16.1745099
3  4.9324875
4 16.3585112
5 24.0871613
6  0.2535625
7 18.6929853
8 24.1772137
9 -0.4358876
# t-distribution: used by intervals
out.eff$fit + qt(.025,316)*out.eff$se
        [,1]
1 13.2321571
2 16.1636829
3  4.9212827
4 16.3476520
5 24.0763344
6  0.2423577
7 18.6821261
8 24.1663868
9 -0.4470924

Method 3: Hand calculation

This method was outlined in lecture 4. To obtain the means at the nine different combinations of temperature and time we form the matrix product Cβ, where the nine rows of C contain the values of the regressors in the model at each of those means and β is the vector of fixed effect estimates from the model, fixef(out.lme3). The variance-covariance matrix of the means is then the matrix product CΣβC', where Σβ is the variance-covariance matrix of the parameter estimates obtained with vcov(out.lme3) and C' is the transpose of C. To obtain the standard errors of the means we extract the diagonal of this matrix and take its square root.

I start by using expand.grid to obtain the nine combinations of temperature and time that correspond to the nine means.

temp.time <- expand.grid(temp=levels(corals$temp), time.period=levels(corals$time.period))
temp.time
  temp time.period
1   25           1
2   28           1
3   32           1
4   25           2
5   28           2
6   32           2
7   25           3
8   28           3
9   32           3

I next create the dummy variables for temperature and time that correspond to these nine observations. An easy way to get them is with the outer function, which can be used to carry out element-wise calculations on two sets of vectors. Here I test separately whether the temperature is equal to 28 or to 32.

part.temp <- outer(temp.time[,1], c(28,32), '==') + 0
part.temp
      [,1] [,2]
 [1,]    0    0
 [2,]    1    0
 [3,]    0    1
 [4,]    0    0
 [5,]    1    0
 [6,]    0    1
 [7,]    0    0
 [8,]    1    0
 [9,]    0    1

I do a similar calculation for time.

part.time <- outer(temp.time[,2], c(2,3), '==') + 0
part.time
      [,1] [,2]
 [1,]    0    0
 [2,]    0    0
 [3,]    0    0
 [4,]    1    0
 [5,]    1    0
 [6,]    1    0
 [7,]    0    1
 [8,]    0    1
 [9,]    0    1

I examine the entries of the coefficient vector from the model.

fixef(out.lme3)
        (Intercept)              temp28              temp32
         -6.7305414          -1.5227216          10.3250530
       time.period2        time.period3           surf.area
          3.1154949           5.4499690           2.2797995
temp28:time.period2 temp32:time.period2 temp28:time.period3
          4.7971566          -7.7944198           2.5527349
temp32:time.period3    temp28:surf.area    temp32:surf.area
        -10.8183440           0.4445827          -1.8545688

We need to construct a matrix C with 12 columns so that each column corresponds to a different regressor. I start by adding the intercept, the dummy vectors for temperature, the dummy vectors for time, and a column for surface area = 10.

cmat <- cbind(rep(1, nrow(part.temp)), part.temp, part.time, rep(10, nrow(part.time)))
colnames(cmat) <- c('i', 't28', 't32', 't2', 't3', 's')
cmat
      i t28 t32 t2 t3  s
 [1,] 1   0   0  0  0 10
 [2,] 1   1   0  0  0 10
 [3,] 1   0   1  0  0 10
 [4,] 1   0   0  1  0 10
 [5,] 1   1   0  1  0 10
 [6,] 1   0   1  1  0 10
 [7,] 1   0   0  0  1 10
 [8,] 1   1   0  0  1 10
 [9,] 1   0   1  0  1 10

Next I create the interactions by multiplying the corresponding dummy vectors together. I do this so that the results match the order shown in the coefficient vector.

cmat.int <- cbind(cmat[,2]*cmat[,4], cmat[,3]*cmat[,4], cmat[,2]*cmat[,5], cmat[,3]*cmat[,5], cmat[,2]*cmat[,6], cmat[,3]*cmat[,6])
cmat <- cbind(cmat, cmat.int)
colnames(cmat)[7:12] <- c('t28.t2', 't32.t2', 't28.t3', 't32.t3', 't28.s', 't32.s')
cmat
      i t28 t32 t2 t3  s t28.t2 t32.t2 t28.t3 t32.t3 t28.s t32.s
 [1,] 1   0   0  0  0 10      0      0      0      0     0     0
 [2,] 1   1   0  0  0 10      0      0      0      0    10     0
 [3,] 1   0   1  0  0 10      0      0      0      0     0    10
 [4,] 1   0   0  1  0 10      0      0      0      0     0     0
 [5,] 1   1   0  1  0 10      1      0      0      0    10     0
 [6,] 1   0   1  1  0 10      0      1      0      0     0    10
 [7,] 1   0   0  0  1 10      0      0      0      0     0     0
 [8,] 1   1   0  0  1 10      0      0      1      0    10     0
 [9,] 1   0   1  0  1 10      0      0      0      1     0    10

Next I use the matrix C to calculate the means and their standard errors. I then obtain the lower and upper 95% confidence intervals for the means using a t-distribution for the multiplier just like the intervals function does.

est.model <- cmat %*% fixef(out.lme3)
se.model <- sqrt(diag(cmat %*% vcov(out.lme3) %*% t(cmat)))
lower.model <- est.model + qt(.025, 316)*se.model
upper.model <- est.model + qt(.975,316)*se.model
out.ci3 <- data.frame(lower=lower.model, est.=est.model, upper=upper.model)
out.ci3
       lower      est.     upper
1 13.2321571 16.067453 18.902749
2 16.1636829 18.990559 21.817435
3  4.9212827  7.846818 10.772354
4 16.3476520 19.182948 22.018244
5 24.0763344 26.903210 29.730086
6  0.2423577  3.167893  6.093429
7 18.6821261 21.517422 24.352718
8 24.1663868 26.993263 29.820139
9 -0.4470924  2.478443  5.403978

The matrix of results matches what we obtained using the intervals function in Method 1. 
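
As an aside, the C matrix can also be generated automatically rather than by hand. The following is a sketch (not something we did in lecture) that assumes formula applied to an lme object returns the fixed effects formula, so that model.matrix can build the rows of C from a prediction data frame.

# sketch: let model.matrix construct C from a prediction data frame
newdat <- expand.grid(temp=levels(corals$temp), time.period=levels(corals$time.period))
newdat$surf.area <- 10
Cmat <- model.matrix(formula(out.lme3)[-2], data=newdat)  # [-2] drops the response from the formula
est <- Cmat %*% fixef(out.lme3)
se <- sqrt(diag(Cmat %*% vcov(out.lme3) %*% t(Cmat)))
data.frame(lower=est + qt(.025, 316)*se, est.=est, upper=est + qt(.975, 316)*se)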

Question 5

Because this is a repeated measures design I place time on the x-axis and draw separate time profiles for each temperature. There are two sensible graphs we could produce here. We can put the time profiles for each temperature in a single panel or in different panels. I illustrate both approaches.

Method 1: One-panel graph

I modify the code from Lecture 8 that uses a group panel function. In lecture 8 there were only two groups. Here we have three. I start by collecting the factor levels and estimates in a data frame.

fac.vals <- data.frame(time=factor(rep(1:3,each=3)), temp=factor(rep(c(25,28,32),3)), est=out.ci[,"est."], low95=out.ci[,"lower"], up95=out.ci[,"upper"])
fac.vals
                    time temp       est      low95      up95
temp25:time.period1    1   25 16.067453 13.2321571 18.902749
temp28:time.period1    1   28 18.990559 16.1636829 21.817435
temp32:time.period1    1   32  7.846818  4.9212827 10.772354
temp25:time.period2    2   25 19.182948 16.3476520 22.018244
temp28:time.period2    2   28 26.903210 24.0763344 29.730086
temp32:time.period2    2   32  3.167893  0.2423577  6.093429
temp25:time.period3    3   25 21.517422 18.6821261 24.352718
temp28:time.period3    3   28 26.993263 24.1663868 29.820139
temp32:time.period3    3   32  2.478443 -0.4470924  5.403978

Because the graph is rather full, I don't do a legend but instead use the panel.text function to place an identifier for each temperature at the end of its corresponding profile. To get the correct temperature to show up on the correct profile with a degree symbol, I use the substitute function, something that we didn't cover in the course. The substitute function allows one to insert the values of variables into mathematical expressions.
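
To see what substitute is doing here, a minimal illustration: it splices the value of a variable into an unevaluated plotmath expression, which panel.text can then draw.

# minimal illustration of substitute(): splice a temperature value into a plotmath label
substitute(paste(a*degree, 'C'), list(a=28))
# paste(28 * degree, "C")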

my.panel <- function(x, y, subscripts, col, pch, group.number, ...) {
#subset variables for the current panel
low95 <- fac.vals$low95[subscripts]
up95 <- fac.vals$up95[subscripts]
myjitter <- c(-.05,.05,0)
temp <- c(25,28,32)
col2 <- c('tomato', 'grey60', 'dodgerblue1')
#95% confidence interval
panel.arrows(x+myjitter[group.number], low95, x+myjitter[group.number], up95, angle=90, code=3, length=.05, col=col)
# connect means with line segments
panel.lines(x+myjitter[group.number], y, col=col)
# add point estimates for the means
panel.xyplot(x+myjitter[group.number], y, col='white', pch=16)
panel.xyplot(x+myjitter[group.number], y, col=col, pch=pch, ...)
# add temperature labels at ends of profiles
last.y <- y[x==3]
panel.text(3+myjitter[group.number], last.y, labels=substitute(paste(a*degree, 'C'), list(a = temp[group.number])), pos=4, cex=0.9)
# grid lines
panel.abline(h=seq(0,30,5), col='lightgrey', lty=3)
}
prepanel.ci2 <- function(x, y, ly, uy, subscripts, ...) {
list(ylim=range(y, uy[subscripts], ly[subscripts], finite=TRUE))
}
xyplot(est~time, xlab ='Time period', ylab='Estimated mean rate', groups=temp, data=fac.vals, col=c(1,2,4), pch=c(1,16,15), prepanel=prepanel.ci2, ly=fac.vals$low95, uy=fac.vals$up95, panel.groups=my.panel, panel="panel.superpose")

Fig. 2   Interaction graph in a single panel

Method 2: Three-panel graph

For this method I place each time profile in a separate panel according to its temperature.

prepanel.ci2 <- function(x, y, ly, uy, subscripts, ...) {
list(ylim=range(y, uy[subscripts], ly[subscripts], finite=TRUE))
}
xyplot(est~time|temp, xlab ='Time period', ylab='Estimated mean rate', data=fac.vals, prepanel=prepanel.ci2, ly=fac.vals$low95, uy=fac.vals$up95, layout=c(3,1), panel=function(x, y, subscripts) {
panel.xyplot(x, y, col='dodgerblue4', type='l')
panel.arrows(x, fac.vals$low95[subscripts], x, fac.vals$up95[subscripts], lineend=1, col='dodgerblue1', angle=90, code=3, length=.1)
panel.xyplot(x, y, col='dodgerblue4', pch=16)})

To get degree symbols in the panel strips takes a bit of work. If we're sure of which label goes where we can hardwire it in with the strip.custom function in the strip argument. I use the factor.levels argument to specify formatted values for the temperature in a specific order.

# hard-wiring the temperature labels
xyplot(est~time|temp, xlab ='Time period', ylab='Estimated mean rate', data=fac.vals, prepanel=prepanel.ci2, ly=fac.vals$low95, uy=fac.vals$up95, layout=c(3,1), panel=function(x, y, subscripts) {
panel.xyplot(x, y, col='dodgerblue4',type='l')
panel.arrows(x, fac.vals$low95[subscripts], x, fac.vals$up95[subscripts], lineend=1, col='dodgerblue1', angle=90, code=3, length=.1)
panel.xyplot(x, y, col='dodgerblue4', pch=16)}, strip=strip.custom(par.strip.text = list(cex=0.9), factor.levels= c(expression(paste(25*degree, 'C')), expression(paste(28*degree, 'C')), expression(paste(32*degree, 'C')))))

A more elegant approach is to do this dynamically and let the xyplot function figure out what temperature labels go where. For this we have to write a function that substitutes a value of temperature into a math expression. Then we have to use the strip argument with the strip.custom function to apply this function to the value for each panel separately.

#function for panel labels
stuff.func3 <- function(x) substitute(expression(paste(z*degree, 'C', sep='')), list(z=x))
# dynamic labeling
xyplot(est~time|temp, xlab ='Time period', ylab='Estimated mean rate', data=fac.vals, prepanel=prepanel.ci2, ly=fac.vals$low95, uy=fac.vals$up95, layout=c(3,1), panel=function(x,y,subscripts) {
panel.arrows(x, fac.vals$low95[subscripts], x, fac.vals$up95[subscripts], lineend=1, col='dodgerblue1', angle=90, code=3, length=.1)
panel.xyplot(x, y, col='dodgerblue4', pch=16, type='o')},
strip=strip.custom(par.strip.text = list(cex=0.9), factor.levels= as.expression(unlist(sapply(levels(factor(fac.vals$temp)), function(x) eval(stuff.func3(x)))))))

Fig. 3   Interaction graph in three panels



Problem 3

Question 1

Define a dummy variable for Morph.

z = 1 if Morph = 'light', z = 0 if Morph = 'dark'

The regression model that answers the researcher's question is a model that includes distance, the dummy variable for Morph, and the interaction between the two of them. Written in terms of the log odds of removal (the scale we adopt once logistic regression is chosen in Question 3), the model is

logit(p) = β0 + β1*z + β2*Distance + β3*z*Distance

Question 2

The goal is to find evidence of natural selection on moth color morphs based on the background color of tree bark. Distance from Liverpool is the proxy for tree background color. The greater the distance the lighter the bark. If we can show that the distance from Liverpool affects the removal rate of the two color morphs differently that would be evidence that the combination of moth color and bark color matters.

We have two regression models depending on the Morph. When Morph = 'dark' (z = 0), we have

logit(p) = β0 + β2*Distance

When Morph = 'light' (z = 1), we have

logit(p) = (β0 + β1) + (β2 + β3)*Distance

On the scale of the linear predictor (the log odds) these expressions represent two straight lines with different slopes and different intercepts. The relationship with distance is the same for the two morphs only if the two slopes are equal, which happens if β3 = 0. Thus our null hypothesis stated in terms of the regression model is simply H0: β3 = 0.

Question 3

The response variable is the number of moths removed out of a fixed total. A suitable probability model is a binomial distribution and hence logistic regression is appropriate.

moths <- read.csv('moths.csv')
moths
        Location Distance Morph Num_moths Num_removed
1    Sefton Park      0.0 light        56          17
2    Sefton Park      0.0  dark        56          14
3  Eastham Ferry      7.2 light        80          28
4  Eastham Ferry      7.2  dark        80          20
5       Hawarden     24.1 light        52          18
6       Hawarden     24.1  dark        52          22
7    Loggerheads     30.2 light        60           9
8    Loggerheads     30.2  dark        60          16
9       Llanbedr     36.4 light        60          16
10      Llanbedr     36.4  dark        60          23
11     Pwyllglas     41.5 light        84          20
12     Pwyllglas     41.5  dark        84          40
13   Clergy Mawr     51.2 light        92          24
14   Clergy Mawr     51.2  dark        92          39
out1 <- glm(cbind(Num_removed, Num_moths-Num_removed) ~ Morph*Distance, data=moths, family=binomial)
summary(out1)
Call:
glm(formula = cbind(Num_removed, Num_moths - Num_removed) ~ Morph *
    Distance, family = binomial, data = moths)

Deviance Residuals:
     Min        1Q    Median        3Q       Max 
-2.21183  -0.39883   0.01155   0.68292   1.31242 

Coefficients:
                     Estimate Std. Error z value Pr(>|z|)   
(Intercept)         -1.128987   0.197906  -5.705 1.17e-08 ***
Morphlight           0.411257   0.274490   1.498 0.134066   
Distance             0.018502   0.005645   3.277 0.001048 **
Morphlight:Distance -0.027789   0.008085  -3.437 0.000588 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 35.385  on 13  degrees of freedom
Residual deviance: 13.230  on 10  degrees of freedom
AIC: 83.904

Number of Fisher Scoring iterations: 4

The interaction is significant so we have evidence that the predation rate varies with distance in a different manner for the two morphs. From the output the predation rate for the dark morph increases with distance. The significant interaction tells us that the change in the predation rate with distance for the light morph is significantly less than it is for the dark morph. To understand this better we can force R to return separate slopes for each Morph type by refitting the model without the main effect of Distance.

out1a <- glm(cbind(Num_removed, Num_moths-Num_removed) ~ Morph:Distance + Morph, data=moths, family=binomial)
round(summary(out1a)$coefficients,5)
                    Estimate Std. Error  z value Pr(>|z|)
(Intercept)         -1.12899    0.19791 -5.70466  0.00000
Morphlight           0.41126    0.27449  1.49826  0.13407
Morphdark:Distance   0.01850    0.00565  3.27744  0.00105
Morphlight:Distance -0.00929    0.00579 -1.60439  0.10863

Now we see that for the dark morph the predation rate increases with distance (slope = 0.0185) and the relationship is significant (p = 0.001). For the light morph the predation rate decreases with distance (slope = –0.0093), although the relationship is not statistically significant (p = 0.109). The signs of the coefficients are consistent with theory. The further away from Liverpool we are the lighter the trees are thus making the dark morph more vulnerable to predation and the light morph less so.
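
The light morph slope and its standard error can also be assembled by hand from the original fit using Var(β2 + β3) = Var(β2) + Var(β3) + 2Cov(β2, β3). The following sketch uses coef and vcov on out1 and should reproduce the refit values above (slope ≈ -0.0093, SE ≈ 0.0058).

# check: light morph slope and standard error computed from the original model
b <- coef(out1)
V <- vcov(out1)
slope.light <- b['Distance'] + b['Morphlight:Distance']
se.light <- sqrt(V['Distance','Distance'] + V['Morphlight:Distance','Morphlight:Distance'] + 2*V['Distance','Morphlight:Distance'])
c(slope=unname(slope.light), se=unname(se.light))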

Question 4

The residual deviance can be used as a goodness of fit statistic if the predicted counts are large enough, greater than 5. Checking this we find that all the predicted counts are greater than 5.

fitted(out1)*moths$Num_moths
       1        2        3        4        5        6        7
18.36202 13.68350 25.06645 21.58191 14.59064 17.44984 16.15784
       8        9       10       11       12       13       14
21.67119 15.48714 23.28317 20.92891 34.49737 21.40701 41.83303

I compare the residual deviance to a chi-squared distribution with degrees of freedom equal to the residual degrees of freedom.

1-pchisq(deviance(out1), df.residual(out1))
[1] 0.2111003

The residual deviance is not significant. Therefore we fail to find evidence of lack of fit.
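
An alternative goodness of fit check (a sketch, using the same logic) is to use the Pearson statistic in place of the residual deviance; it should lead to the same conclusion.

# Pearson chi-squared goodness of fit statistic
X2 <- sum(residuals(out1, type='pearson')^2)
1 - pchisq(X2, df.residual(out1))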

Question 5

The experiment is set up in the form of a randomized block design. At each of the seven locations we have a pair of observations, the light morph binomial result and the dark morph binomial result, for a total of 14 measurements. Contrast this with an ordinary crossed design, in which there would be 14 locations and 14 measurements. The fact that we have measurements from the same location as well as measurements from different locations means that the data are heterogeneous. We should account for the pairing of observations in the analysis.

Question 6

I fit a logistic regression random intercepts model treating Location as the grouping variable. Because Distance varies across Location (but is constant within a Location), Distance is a level-2 predictor. Because Morph varies within a Location, Morph is a level-1 predictor.

library(lme4)
# glmer fits generalized linear mixed models (older versions of lme4 also accepted lmer(..., family=binomial))
out1.lmer <- glmer(cbind(Num_removed, Num_moths-Num_removed) ~ Morph*Distance + (1|Location), data=moths, family=binomial)
summary(out1.lmer)
Generalized linear mixed model fit by the Laplace approximation
Formula: cbind(Num_removed, Num_moths - Num_removed) ~ Morph * Distance +      (1 | Location)
   Data: moths
 AIC  BIC logLik deviance
  23 26.2 -6.501       13
Random effects:
 Groups   Name        Variance Std.Dev.
 Location (Intercept) 0.011483 0.10716
Number of obs: 14, groups: Location, 7

Fixed effects:
                     Estimate Std. Error z value Pr(>|z|)   
(Intercept)         -1.130918   0.212723  -5.316 1.06e-07 ***
Morphlight           0.411129   0.274625   1.497 0.134379   
Distance             0.018477   0.006150   3.005 0.002659 **
Morphlight:Distance -0.027819   0.008089  -3.439 0.000584 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) Mrphlg Distnc
Morphlight  -0.672             
Distance    -0.871  0.581      
Mrphlght:Ds  0.570 -0.859 -0.643

Other than a slight increase in the standard errors of the parameter estimates, the results from the mixed effects model are virtually identical to the results from the ordinary logistic regression model.

Question 7

Analysis of deviance

Ordinarily to carry out an analysis of deviance we would subtract the deviances of the two models and compare the result to a chi-squared distribution with 1 degree of freedom (the number of parameters by which the two models differ). The parameter in question is the variance of the random effects, τ2, which is set to zero in the ordinary logistic regression model. Because 0 is a boundary value for a variance, the usual chi-squared distribution with 1 degree of freedom is not the correct reference distribution for the test statistic. What we saw in lecture 18 is that the correct null distribution is a 50:50 mixture of two chi-squared distributions, one with 0 degrees of freedom and one with 1 degree of freedom.

0.5 χ2(df = 0) + 0.5 χ2(df = 1)

This translates into dividing the usual p-value of our test by 2.

deviance(out1)-deviance(out1.lmer)
[1] 0.2277714
0.5*(1-pchisq(deviance(out1)-deviance(out1.lmer),1))
[1] 0.3165906

The test statistic is not significant so we conclude that the two models are not significantly different. Thus there doesn't appear to be a need for the random effects.
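
If we want a check on the halved p-value, we can simulate directly from the 50:50 mixture null distribution. This is just a sketch; the seed and number of simulations are arbitrary.

# Monte Carlo version of the mixture null: half the draws are 0, half are chi-squared(1)
set.seed(10)
null.stat <- ifelse(runif(10000) < 0.5, 0, rchisq(10000, 1))
# Monte Carlo p-value for the observed likelihood ratio statistic; should be close to 0.32
mean(null.stat >= (deviance(out1) - deviance(out1.lmer)))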

Using AIC

Alternatively we can calculate the AIC of the models. The values returned by AIC and logLik for an lmer model cannot be compared to those same values from an ordinary logistic regression model. Following the hint for this question I calculate the log-likelihood of a saturated model. The saturated model has one parameter per observation. I create a factor variable with 14 levels and fit the logistic regression model to it.

sat.fact <- factor(1:14)
out3 <- glm(cbind(Num_removed, Num_moths-Num_removed) ~ sat.fact, data=moths, family=binomial)
logLik(out3)
'log Lik.' -31.33724 (df=14)

Alternatively the formula for the deviance tells us the following.

Deviance = 2 [logLik(saturated model) − logLik(fitted model)]

Solving for the log-likelihood of the saturated model we obtain the following.

logLik(saturated model) = logLik(fitted model) + Deviance/2

Using this formula with the glm model we obtain the following.

logLik(out1) + 0.5*deviance(out1)
'log Lik.' -31.33724 (df=4)

which is the same answer we obtained above. Using the deviance expression again, this time I solve for the log-likelihood of the model of interest.

logLik(fitted model) = logLik(saturated model) − Deviance/2

For the mixed effects model we estimated four regression parameters plus the variance of the random effects for a total of 5 parameters. Using this and the correct log-likelihood for the lmer model we obtain the following value of AIC.

LL.out.lmer <- logLik(out3)-.5*deviance(out1.lmer)
LL.out.lmer
'log Lik.' -37.8383 (df=14)
data.frame(model=c('glm','lmer'), LL=c(logLik(out1), LL.out.lmer), df=c(4,5), AIC=c(AIC(out1), -2*LL.out.lmer+2*5))
  model        LL df      AIC
1   glm -37.95219  4 83.90438
2  lmer -37.83830  5 85.67661

The AIC of the random effects model is larger than the AIC of the glm model. Once again we find little support for the random effects.

Question 8

Consider two moths both of the dark morph that are 50 km apart. The predicted logits for the two moths are the following.

logit(p1) = β0 + β2*x
logit(p2) = β0 + β2*(x + 50)

where x is the distance of the first moth from Liverpool.

Subtracting the two expressions yields the following.

logit(p2) − logit(p1) = 50*β2, so the odds ratio is OR(dark) = exp(50*β2)

If we turn to the light morphs we get the same kind of expression except for light morphs the effect of distance is given by β2 + β3.

logit(p1) = (β0 + β1) + (β2 + β3)*x
logit(p2) = (β0 + β1) + (β2 + β3)*(x + 50)

Subtracting the two expressions and exponentiating the result yields the following.

logit(p2) − logit(p1) = 50*(β2 + β3), so OR(light) = exp(50*(β2 + β3))

Plugging in the data we obtain the odds ratios.

fixef(out1.lmer)
        (Intercept)          Morphlight            Distance
        -1.13091787          0.41112874          0.01847748
Morphlight:Distance
        -0.02781868
# dark morph
exp(50*fixef(out1.lmer)[3])
Distance
 2.51903
# light morph
exp(50*(fixef(out1.lmer)[3] + fixef(out1.lmer)[4]))
 Distance
0.6268426

So for the dark morph the odds of being removed in North Wales (roughly 50 km from Liverpool) are about 2.5 times its odds of being removed in Liverpool. For the light morph the odds of being removed in North Wales are only about 63% of its odds of being removed in Liverpool.
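
If confidence intervals for these odds ratios are desired, they can be constructed on the log odds scale with the usual normal-theory formulas and then exponentiated. A sketch follows; it assumes vcov applied to the fitted mixed model returns the variance-covariance matrix of the fixed effects.

# sketch: approximate 95% confidence intervals for the two odds ratios
b <- fixef(out1.lmer)
V <- as.matrix(vcov(out1.lmer))
# dark morph: log odds ratio is 50*β2 (Distance, element 3 of the fixed effects)
se.dark <- 50*sqrt(V[3,3])
exp(50*b[3] + c(-1.96, 0, 1.96)*se.dark)
# light morph: log odds ratio is 50*(β2 + β3) (elements 3 and 4)
se.light <- 50*sqrt(V[3,3] + V[4,4] + 2*V[3,4])
exp(50*(b[3] + b[4]) + c(-1.96, 0, 1.96)*se.light)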

Question 9

mycol <- c('tomato', 'dodgerblue3')
plot(Num_removed/Num_moths~Distance, data=moths, col=mycol[as.numeric(moths$Morph)], pch=1, ylab='Probability of removal', xlab='Distance from Liverpool (km)')
# fixed effects curve
my.prob <- function(morph,x) {
linpred <- fixef(out1.lmer)[1] + fixef(out1.lmer)[2]*(morph=='light') + fixef(out1.lmer)[3]*x + fixef(out1.lmer)[4]*(morph=='light')*x
exp(linpred)/(1+exp(linpred))}
curve(my.prob('dark',x), add=T, col=mycol[1], lty=2)
curve(my.prob('light',x), add=T, col=mycol[2], lty=2)
# subject-specific estimates
points(moths[moths$Morph=='dark', 'Distance'], fitted(out1.lmer)[moths$Morph=='dark'], pch=8, cex=.8, type='o', col=mycol[1])
points(moths[moths$Morph=='light', 'Distance'], fitted(out1.lmer)[moths$Morph=='light'], pch=8, cex=.8, type='o', col=mycol[2])
legend('bottomleft', rep(c('observed', 'fixed only', 'fixed + random'),2), col=rep(mycol, each=3), pch=rep(c(1,NA,8), 2), lty=rep(c(NA,2,1), 2), cex=.9, bty='n', ncol=2, title=c(expression(bold('Dark morph      Light morph'))))

Fig. 4   Effect of distance on the probability of removal of individual morphs

The curves labeled "fixed only" correspond to the estimated regression model from the mixed effects model that includes only the fixed effects terms. The trajectories labeled "fixed + random" are the location-specific probabilities obtained by adding the random effects to the already displayed fixed effects curves.

Question 10

The four basic assumptions of the binomial distribution are:

  1. The experiment must consist of a sequence of trials in which each trial is a Bernoulli trial, meaning only one of two outcomes with probabilities p and 1 – p can occur.
  2. The number of trials is fixed ahead of time at n, a value that is known to us.
  3. The probability p is the same on each Bernoulli trial.
  4. The Bernoulli trials are independent.

Clearly the first two assumptions still hold. Independence is always an issue with binomial trials, but we need to assume that the researchers did not blatantly violate this assumption. If you look at the paper you'll see that the protocol was to put one moth out for a fixed length of time and record whether it was removed during that time. If it was not removed, the researchers collected it and put out a fresh moth. This was repeated until the moth supply was exhausted.

If birds developed a search image over time then perhaps independence might have been an issue in the original design. With multiple trees the same protocol was followed. If anything the use of multiple trees rather than a single tree might actually make it more likely the individual Bernoulli trials were independent.

What the introduction of multiple trees does do is introduce observational heterogeneity. Given that different trees are not likely to be identical in their bark color and other characteristics, the probability of a moth being detected and eaten probably varied from tree to tree. We've been assuming that except for the effect of morph type the probability of being removed was the same for each moth at a location. This might not be the case for different moths if they were on different trees.

Question 11

A simple rule is that the normal approximation to the binomial distribution is likely to hold if both np > 5 and n(1 – p) > 5. I check this for our data. np and n(1 – p) are just the number of moths removed and the number of moths that are left behind.

sum(moths$Num_removed<=5) + sum((moths$Num_moths-moths$Num_removed) <= 5)
[1] 0

So both of these values are greater than 5 for every observation. The normal approximation should be reasonable. I analyze the proportions using a randomized block design treating block (location) as a random effect.

moths$p <- moths$Num_removed/moths$Num_moths
detach(package:lme4)
library(nlme)
out.lme <- lme(p~Morph*Distance, random=~1|Location, data=moths)
summary(out.lme)
Linear mixed-effects model fit by REML
 Data: moths
       AIC      BIC   logLik
  2.149637 3.965148 4.925181

Random effects:
 Formula: ~1 | Location
        (Intercept)  Residual
StdDev:  0.05410956 0.0378578

Fixed effects: p ~ Morph * Distance
                          Value  Std.Error DF   t-value p-value
(Intercept)          0.24394502 0.04712545  5  5.176503  0.0035
Morphlight           0.08102398 0.03820586  5  2.120721  0.0874
Distance             0.00401661 0.00146803  5  2.736044  0.0410
Morphlight:Distance -0.00590237 0.00119017  5 -4.959245  0.0043
 Correlation:
                    (Intr) Mrphlg Distnc
Morphlight          -0.405             
Distance            -0.848  0.344      
Morphlight:Distance  0.344 -0.848 -0.405

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max
-0.83172423 -0.66326402 -0.01896693  0.52633610  1.12747315

Number of Observations: 14
Number of Groups: 7

Our conclusions are the same. The probability of being removed increases with distance for the dark morph. This change with distance is significantly less for the light morph where it actually decreases with distance (although not significantly different from zero).

Question 12

I obtain the pairwise differences in the proportion of moths removed at each location, light morph minus dark morph (the order in which the two morphs appear within each location in the data frame), and regress them against the location's distance from Liverpool.

out.diff <- tapply(moths$Num_removed/moths$Num_moths, moths$Location, function(x) x[1]-x[2])
out.dist <- tapply(moths$Distance, moths$Location, function(x) x[1])
out.reg <- lm(out.diff~out.dist)
summary(out.reg)
Call:
lm(formula = out.diff ~ out.dist)

Residuals:
  Clergy Mawr Eastham Ferry      Hawarden      Llanbedr   Loggerheads
      0.05813       0.06147      -0.01570       0.01716      -0.01944
    Pwyllglas   Sefton Park
     -0.07417      -0.02745

Coefficients:
             Estimate Std. Error t value Pr(>|t|)  
(Intercept)  0.081024   0.038206   2.121  0.08743 .
out.dist    -0.005902   0.001190  -4.959  0.00425 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.05354 on 5 degrees of freedom
Multiple R-squared: 0.831,  Adjusted R-squared: 0.7973
F-statistic: 24.59 on 1 and 5 DF,  p-value: 0.004251

Observe that the two rows of the summary table from this model match the two rows in the summary table of the randomized block design of Question 11 labeled Morphlight and Morphlight:Distance. There Morphlight represents the difference in removal probability between the light and dark morphs (light minus dark) at distance 0 (Liverpool). The interaction in that model represents how much the change in removal probability with distance differs between the two morphs. Both of these are difference estimates. It is always the case that when you analyze paired data as a randomized block design or as a regression on the paired differences (thus removing the block effects) you get identical results for the differences. Of course with the paired difference analysis you can't predict the actual removal probabilities at each site, only the difference between them.

The point of Questions 11 and 12 is to note that it is not always necessary to carry out a complicated analysis of data. The old-fashioned methods that were popular before the advent of sophisticated statistical software often work just as well and can be easier for the lay reader to understand.


