Assignment 8—Solution

Question 1

stirrett <- read.delim('ecol 563/Stirrett.txt')
stirrett
   Count Trt1 Trt2 Trt3 Trt4
1      0   19   24   43   47
2      1   12   16   35   23
3      2   18   16   17   27
4      3   18   18   11    9
5      4   11   15    5    7
6      5   12    9    4    3
7      6    7    6    1    1
8      7    8    5    2    1
9      8    4    3    2    0
10     9    4    4    0    0
11    10    1    3    0    1
12    11    0    0    0    1
13    12    1    1    0    0
14    13    1    0    0    0
15    15    1    0    0    0
16    17    1    0    0    0
17    19    1    0    0    0
18    26    1    0    0    0

These are the tabulated data. I begin by generating the raw data. I do this separately by treatment, using the rep function to repeat the values in the Count column the number of times indicated in each of the treatment columns.

trt1 <- rep(stirrett$Count, stirrett$Trt1)
trt2 <- rep(stirrett$Count, stirrett$Trt2)
trt3 <- rep(stirrett$Count, stirrett$Trt3)
trt4 <- rep(stirrett$Count, stirrett$Trt4)

Next we concatenate the counts contained in these three treatment vectors and add a second column that records the treatment to which they correspond.

borer <- data.frame(count = c(trt1, trt2, trt3, trt4), treatment = rep(1:4, c(length(trt1), length(trt2), length(trt3), length(trt4))))
dim(borer)
[1] 480 2

We're told that the four treatments are connected in a special way. They correspond to combining the levels of two binary factor variables in all possible ways. The correspondence is detailed in Table 1 where we see that treatments 3 and 4 received an early application of fungal spores, while treatments 2 and 4 received a late application of fungal spores. I add these two factors as two additional columns of the data frame.

borer$early <- factor(ifelse(borer$treatment %in% c(3,4), 'present', 'absent'))
borer$late <- factor(ifelse(borer$treatment %in% c(2,4), 'present', 'absent'))

When treatments are created by combining factors it is always advantageous to analyze the experiment as a factorial design in this case with two factors rather than as a one-way design with a single treatment. The two-factor interaction model yields a model that is identical to the one-way design. It's just parameterized differently. If the two-factor interaction is significant then nothing has been accomplished by analyzing things as a factorial design. If the interaction is not significant then we can then proceed to simplify the model and perhaps end up with a much simpler interpretation of the effect of treatment. I fit a two-factor interaction model assuming a negative binomial distribution for the response. I use the anova function to obtain sequential likelihood ratio tests.

library(MASS)
out.NB1 <- glm.nb(count~early*late, data=borer)
anova(out.NB1)
Analysis of Deviance Table

Model: Negative Binomial(1.4714), link: log

Response: count

Terms added sequentially (first to last)

           Df Deviance Resid. Df Resid. Dev Pr(>Chi)   
NULL                         479     623.56            
early       1   81.157       478     542.41   <2e-16 ***
late        1    1.927       477     540.48   0.1651   
early:late  1    1.738       476     538.74   0.1874   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In anova.negbin(out.NB1) : tests made without re-estimating 'theta'

Reading from the bottom of the table up we see that the two-factor interaction between early and late is not significant. Furthermore we also see the late main effect is also not significant given that the early effect is already in the model. The early effect is highly significant. I refit the model with just an early effect.

out.NB2 <- glm.nb(count~early, data=borer)

Question 2

To carry out the goodness of fit tests we need to work with the tabulated data. As we've seen, not all of the count categories are shown in the table. Intermediate categories that were not recorded for any of the four treatments are not included. For the goodness of fit test we need to include these missing categories and assign them a frequency of zero. Following the hint for this problem we can carry this out by creating a dummy data frame with just the full set of count categories and then merge this data frame with the tabulated data.

junk <- data.frame(Count=0:26)
borer.table <- merge(stirrett, junk, all=T)
borer.table
   Count Trt1 Trt2 Trt3 Trt4
1      0   19   24   43   47
2      1   12   16   35   23
3      2   18   16   17   27
4      3   18   18   11    9
5      4   11   15    5    7
6      5   12    9    4    3
7      6    7    6    1    1
8      7    8    5    2    1
9      8    4    3    2    0
10     9    4    4    0    0
11    10    1    3    0    1
12    11    0    0    0    1
13    12    1    1    0    0
14    13    1    0    0    0
15    14   NA   NA   NA   NA
16    15    1    0    0    0
17    16   NA   NA   NA   NA
18    17    1    0    0    0
19    18   NA   NA   NA   NA
20    19    1    0    0    0
21    20   NA   NA   NA   NA
22    21   NA   NA   NA   NA
23    22   NA   NA   NA   NA
24    23   NA   NA   NA   NA
25    24   NA   NA   NA   NA
26    25   NA   NA   NA   NA
27    26    1    0    0    0
temp <- sapply(2:5, function(x) ifelse(is.na(borer.table[,x]), 0, borer.table[,x]))
dim(temp)
[1] 27  4
temp
      [,1] [,2] [,3] [,4]
 [1,]   19   24   43   47
 [2,]   12   16   35   23
 [3,]   18   16   17   27
 [4,]   18   18   11    9
 [5,]   11   15    5    7
 [6,]   12    9    4    3
 [7,]    7    6    1    1
 [8,]    8    5    2    1
 [9,]    4    3    2    0
[10,]    4    4    0    0
[11,]    1    3    0    1
[12,]    0    0    0    1
[13,]    1    1    0    0
[14,]    1    0    0    0
[15,]    0    0    0    0
[16,]    1    0    0    0
[17,]    0    0    0    0
[18,]    1    0    0    0
[19,]    0    0    0    0
[20,]    1    0    0    0
[21,]    0    0    0    0
[22,]    0    0    0    0
[23,]    0    0    0    0
[24,]    0    0    0    0
[25,]    0    0    0    0
[26,]    0    0    0    0
[27,]    1    0    0    0
borer.table[,2:5] <- temp
borer.table
   Count Trt1 Trt2 Trt3 Trt4
1      0   19   24   43   47
2      1   12   16   35   23
3      2   18   16   17   27
4      3   18   18   11    9
5      4   11   15    5    7
6      5   12    9    4    3
7      6    7    6    1    1
8      7    8    5    2    1
9      8    4    3    2    0
10     9    4    4    0    0
11    10    1    3    0    1
12    11    0    0    0    1
13    12    1    1    0    0
14    13    1    0    0    0
15    14    0    0    0    0
16    15    1    0    0    0
17    16    0    0    0    0
18    17    1    0    0    0
19    18    0    0    0    0
20    19    1    0    0    0
21    20    0    0    0    0
22    21    0    0    0    0
23    22    0    0    0    0
24    23    0    0    0    0
25    24    0    0    0    0
26    25    0    0    0    0
27    26    1    0    0    0

Next we need to repeat the Count column four times, stack the four treatment columns, and add a vector that identifies the treatments.

final.borer <- data.frame(count=rep(borer.table$Count,4), freq=unlist(borer.table[,2:5]), treatment=rep(1:4,each=nrow(borer.table)))

I create the factor variables from before.

final.borer$early <- factor(ifelse(final.borer$treatment %in% c(3,4), 'present', 'absent'))
final.borer$late <- factor(ifelse(final.borer$treatment %in% c(2,4), 'present', 'absent'))
final.borer[1:8,]
      count freq treatment  early   late
Trt11     0   19         1 absent absent
Trt12     1   12         1 absent absent
Trt13     2   18         1 absent absent
Trt14     3   18         1 absent absent
Trt15     4   11         1 absent absent
Trt16     5   12         1 absent absent
Trt17     6    7         1 absent absent
Trt18     7    8         1 absent absent

Add model results

I next add the predicted mean from the negative binomial model.

final.borer$mu <- predict(out.NB2, type='response', newdata=data.frame(early=final.borer$early))
final.borer[1:8,]
      count freq treatment  early   late  mu
Trt11     0   19         1 absent absent 3.6
Trt12     1   12         1 absent absent 3.6
Trt13     2   18         1 absent absent 3.6
Trt14     3   18         1 absent absent 3.6
Trt15     4   11         1 absent absent 3.6
Trt16     5   12         1 absent absent 3.6
Trt17     6    7         1 absent absent 3.6
Trt18     7    8         1 absent absent 3.6
final.borer$p <- dnbinom(final.borer$count,mu=final.borer$mu,size=out.NB2$theta)
final.borer[1:8,]
      count freq treatment  early   late  mu          p
Trt11     0   19         1 absent absent 3.6 0.16416474
Trt12     1   12         1 absent absent 3.6 0.16930752
Trt13     2   18         1 absent absent 3.6 0.14770788
Trt14     3   18         1 absent absent 3.6 0.12104009
Trt15     4   11         1 absent absent 3.6 0.09598139
Trt16     5   12         1 absent absent 3.6 0.07458538
Trt17     6    7         1 absent absent 3.6 0.05716880
Trt18     7    8         1 absent absent 3.6 0.04338662
final.borer$p <- final.borer$p + ifelse(final.borer$count==26, 1-pnbinom(26, mu=final.borer$mu, size=out.NB2$theta),0)
final.borer[1:8,]
      count freq treatment  early   late  mu          p
Trt11     0   19         1 absent absent 3.6 0.16416474
Trt12     1   12         1 absent absent 3.6 0.16930752
Trt13     2   18         1 absent absent 3.6 0.14770788
Trt14     3   18         1 absent absent 3.6 0.12104009
Trt15     4   11         1 absent absent 3.6 0.09598139
Trt16     5   12         1 absent absent 3.6 0.07458538
Trt17     6    7         1 absent absent 3.6 0.05716880
Trt18     7    8         1 absent absent 3.6 0.04338662
tapply(final.borer$p, final.borer$treatment, sum)
1 2 3 4
1 1 1 1
tapply(final.borer$freq, final.borer$treatment, sum)
  1   2   3   4
120 120 120 120
n <- tapply(final.borer$freq, final.borer$treatment, sum)
final.borer$exp <- final.borer$p * n[final.borer$treatment]
final.borer[1:8,]
      count freq treatment  early   late  mu          p       exp
Trt11     0   19         1 absent absent 3.6 0.16416474 19.699769
Trt12     1   12         1 absent absent 3.6 0.16930752 20.316902
Trt13     2   18         1 absent absent 3.6 0.14770788 17.724946
Trt14     3   18         1 absent absent 3.6 0.12104009 14.524810
Trt15     4   11         1 absent absent 3.6 0.09598139 11.517767
Trt16     5   12         1 absent absent 3.6 0.07458538  8.950245
Trt17     6    7         1 absent absent 3.6 0.05716880  6.860256
Trt18     7    8         1 absent absent 3.6 0.04338662  5.206394

Generate graph

library(lattice)
xyplot(freq~count|early+late, data=final.borer, type='h', lwd=8, col='grey', lineend=1, panel=function(x, y, subscripts){
panel.xyplot(x, y, type='h', lwd=8, col='grey', lineend=1)
panel.xyplot(x, final.borer$exp[subscripts], pch=16, cex=.8, type='o')})

I improve the labeling on the factors defining the panels and add a legend.

xyplot(freq~count|factor(early, levels=levels(early), labels=paste('Early = ', levels(early), sep=''))+factor(late, levels=levels(late), labels=paste('Late = ', levels(late), sep='')), data=final.borer, type='h', lwd=8, col='grey', lineend=1, ylab='frequency', panel=function(x,y,subscripts){
panel.xyplot(x, y, type='h', lwd=8, col='grey',lineend=1)
panel.xyplot(x, final.borer$exp[subscripts], pch=16, cex=.7, type='o', col=1)},
key=list(x=.7, y=.85, corner=c(0,0), lines=list(col=c('grey',1), type=c('p','o'), pch=c(15,16), cex=c(1.2,.7), size=2), text=list(c('observed', 'early main effect'), cex=.9), divide=1)) -> my.graph
library(latticeExtra)
useOuterStrips(my.graph)

fig. 1

Parametric bootstrap

borer.p <- split(final.borer$p, final.borer$treatment)
myfunc <- function() {
out.obs <- as.vector(sapply(1:4, function(x) rmultinom(1, size=n[x], prob=borer.p[[x]])))
sum((out.obs-final.borer$exp)^2/final.borer$exp)
}
set.seed(50)
sim.data <- replicate(9999, myfunc())
actual <- sum((final.borer$freq-final.borer$exp)^2/final.borer$exp)
actual
[1] 74.0738
pearson <- c(sim.data, actual)
pval <- sum(pearson>=actual)/length(pearson)
pval
[1] 0.4984

Question 3

The unconstrained model is the two-factor interaction model, or equivalently, a one-way model fit to the original treatment variable. We need to calculate the means, probabilities, and expected counts and add these to the graph of Question 2.

final.borer$mu2 <- predict(out.NB1, type='response', newdata=data.frame(early=final.borer$early, late=final.borer$late))
final.borer$p2 <- dnbinom(final.borer$count, mu=final.borer$mu2, size=out.NB1$theta)
final.borer$p2 <- final.borer$p2 + ifelse(final.borer$count==26, 1-pnbinom(26, mu=final.borer$mu2, size=out.NB1$theta), 0)
final.borer$exp2 <- final.borer$p2 * n[final.borer$treatment]
final.borer[1:8,]
      count freq treatment  early   late  mu          p       exp      mu2         p2      exp2
Trt11     0   19         1 absent absent 3.6 0.16416474 19.699769 4.033333 0.14350498 17.220598
Trt12     1   12         1 absent absent 3.6 0.16930752 20.316902 4.033333 0.15471639 18.565967
Trt13     2   18         1 absent absent 3.6 0.14770788 17.724946 4.033333 0.14008190 16.809828
Trt14     3   18         1 absent absent 3.6 0.12104009 14.524810 4.033333 0.11876694 14.252032
Trt15     4   11         1 absent absent 3.6 0.09598139 11.517767 4.033333 0.09727647 11.673177
Trt16     5   12         1 absent absent 3.6 0.07458538  8.950245 4.033333 0.07799453  9.359343
Trt17     6    7         1 absent absent 3.6 0.05716880  6.860256 4.033333 0.06163656  7.396387
Trt18     7    8         1 absent absent 3.6 0.04338662  5.206394 4.033333 0.04820245  5.784294
xyplot(freq~count|factor(early,levels=levels(early), labels=paste('Early = ',levels(early),sep=''))+factor(late,levels=levels(late), labels=paste('Late = ', levels(late),sep='')), data=final.borer, type='h', lwd=8, col='grey', lineend=1, ylab='frequency', panel=function(x,y,subscripts){
panel.xyplot(x, y, type='h', lwd=8, col='grey', lineend=1)
panel.xyplot(x, final.borer$exp[subscripts], pch=16, cex=.6, type='o', col=1)
panel.xyplot(x, final.borer$exp2[subscripts], pch=22, cex=1, type='o', col='dodgerblue')
}, key=list(x=.7, y=.85, corner=c(0,0), lines=list(col=c('grey',1,'dodgerblue'), type=c('p','o','o'), pch=c(15,16,22), cex=c(1.2,.6, 1), size=2), text=list(c('observed','early main effect','interaction'), cex=.9), divide=1)
)->my.graph2
useOuterStrips(my.graph2)

fig. 2


hw8

Course Home Page

Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum for Ecology and the Environment, Box 3275, University of North Carolina, Chapel Hill, 27599
Copyright © 2012
Last Revised--November 13, 2012
URL: https://sakai.unc.edu/access/content/group/3d1eb92e-7848-4f55-90c3-7c72a54e7e43/public/docs/solutions/assign8.htm