Assignment 6—Solution

Question 1

This is a simple split plot design with two levels. Pond is the whole plot unit and the six observations obtained from each pond (the clutches of eggs) are the split plot units. The variable "species" only varies between ponds and hence is a whole plot treatment. Depth and season vary within a pond and are split plot treatments. At the split plot level we have a crossed two-factor design in which depth and season are paired in all possible (six) ways. We can imagine that potential locations in a given pond were classified as deep and shallow. Three of the deep sites were randomly selected and three of the shallow sites were randomly selected. The three sites at each depth were then randomly assigned to season. (In a true crossed experimental design we would randomly choose six observations and these would then be assigned one of the six unique treatment combinations of depth and season.)

Without species this is just a randomized block design in which pond is the blocking variable where within each block we have six treatments obtained by combining the two factors season and depth in all possible combinations. When we add species, a variable that only varies between blocks (ponds), we have a split plot design.

The only other intelligible interpretation of this design is as a split plot design with a repeated measures design at the split plot level. For this design to hold you'd have to assume that there were only two locations used in each pond, one deep and one shallow. These two locations were then followed over time being measured repeatedly at season = 'early', 'middle', and 'late'. The hints to this problem essentially tell you to reject this as a possible design.

Question 2

fisheggs <- read.delim("fisheggs.txt")

fisheggs[1:4,]

sodium pond species season   depth
1    254    1       1 early shallow
2    257    1       1 early    deep
3    249    1       1 middle shallow
4    246    1       1 middle    deep

library(nlme)

model1 <- lme(sodium~factor(species)*season*depth, random=~1|pond, data=fisheggs)

anova(model1)

                             numDF denDF F-value p-value
(Intercept)                      1    20 3600.999 <.0001
factor(species)                  1     4   11.080 0.0291
season                           2    20   36.168 <.0001
depth                            1    20    5.067 0.0358
factor(species):season           2    20    3.960 0.0356
factor(species):depth            1    20    0.126 0.7260
season:depth                     2    20    0.285 0.7547
factor(species):season:depth     2    20    0.117 0.8902

The three-factor interaction is not significant. Of the two-factor interactions it appears that only the species × season interaction is significant. We can explore this further by dropping the three-factor interaction and then obtaining the marginal tests from lme.

model1a <- update(model1, .~. - factor(species):season:depth)

anova(model1a, type='marginal')

                       numDF denDF   F-value p-value
(Intercept)                1    22 1955.8191 <.0001
factor(species)            1     4   13.6370 0.0210
season                     2    22   25.8803 <.0001
depth                      1    22    2.3846 0.1368
factor(species):season     2    22    4.3054 0.0264
factor(species):depth      1    22    0.1374 0.7145
season:depth               2    22    0.3103 0.7364

In truth because of the balanced design the marginal tests of the two-factor interactions are the same as the sequential tests. In any case it appears we can drop all of the two-factor interactions except species × season.

model2 <- lme(sodium~factor(species)*season + depth, random=~1|pond, data=fisheggs)

anova(model2)

                       numDF denDF F-value p-value
(Intercept)                1    25 3600.999 <.0001
factor(species)            1    4   11.080 0.0291
season                     2    25   43.199 <.0001
depth                      1    25    6.052 0.0211
factor(species):season     2    25    4.730 0.0181

So our final model includes the main effect of depth and an interaction between season and species.

Question 3

The set-up using aov is nearly the same. We just replace random=~1|pond with Error(factor(pond)). It's important that we use factor(pond) because otherwise aov treats pond as continuous when setting up tests and interactions.

out.aov <- aov(sodium~factor(species)*season*depth + Error(factor(pond)), data=fisheggs)

summary(out.aov)

Error: factor(pond)
                Df Sum Sq Mean Sq F value Pr(>F)
factor(species) 1   6110    6110   11.08 0.0291 *
Residuals        4   2206     551
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: Within
                             Df Sum Sq Mean Sq F value   Pr(>F)
season                        2 1288.4   644.2 36.168 2.27e-07 ***
depth                         1   90.2    90.2   5.067   0.0358 *
factor(species):season        2 141.1    70.5   3.960   0.0356 *
factor(species):depth         1    2.3     2.3   0.126   0.7260
season:depth                  2   10.2     5.1   0.285   0.7547
factor(species):season:depth 2    4.2    2.1   0.117   0.8902
Residuals                    20 356.2    17.8
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Question 4

I start by obtaining the population average estimates of the means.

fisheggs$pred <- predict(model2, level=0)

Because season is an ordinal variable I choose to display it on the x-axis. I reset the levels of the season variable so that they're in time order.

levels(fisheggs$season)

[1] "early" "late" "middle"

fisheggs$season <- factor(fisheggs$season, levels=c('early','middle','late'))

levels(fisheggs$season)

[1] "early" "middle" "late"

Much of the code from lecture 10 can be used here with minimal modification. For the conditioning variables I specify them in the order depth + species so that species varies within a row to facilitate detecting the nature of the species × season interaction. I also improve the labeling of the species variable in the plot and add appropriate labels.

library(lattice)

xyplot(sodium~season|factor(species, levels=1:2, labels=paste("Species ", 1:2, sep='')) + depth, ylab=expression(paste('Sodium content ', (paste(mu, 'mol')/'g'), sep='')), xlab='Season', data=fisheggs, groups=pond, type='o', panel=function(x, y, subscripts, groups, ...){

panel.superpose(x, y, subscripts, groups, col='grey70', ...)

panel.xyplot(x[1:3], fisheggs$pred[subscripts][1:3], col=2, pch=8, type='o')})

library(latticeExtra)

panel.superpose(x, y, subscripts, groups, col='grey70', ...)

panel.xyplot(x[1:3], fisheggs$pred[subscripts][1:3], col=2, pch=8, type='o')}) -> graph1

useOuterStrips(graph1)

Fig. 1 Population average mean response (

) and raw data (–o–) from the split plot design.

While the interaction between season and species is more or less apparent in Fig. 1, it might be preferable to show both species profiles for a given depth in the same panel. The following code only conditions on depth and plots the profiles for both species in the same panel. It accomplishes this by subsetting the data by species within the panel function.

xyplot(sodium~season|depth, ylab=expression(paste('Sodium content ', (paste(mu,'mol')/'g'), sep='')), xlab='Season', data=fisheggs, groups=pond, type='o', panel=function(x, y, subscripts, groups, ...){

panel.superpose(x, y, subscripts, groups, col=c(rep('grey70',3), rep('pink2',3)),...)

# select predictions and species observations for the current panel

cur.pred <- fisheggs$pred[subscripts]

cur.sp <- fisheggs$species[subscripts]

# select species 1

p1 <- cur.pred[cur.sp==1]

panel.xyplot(x[1:3], p1[1:3], col=1, pch=8, type='o', lwd=2)

# select species 2

p1 <- cur.pred[cur.sp==2]

panel.xyplot(x[1:3], p1[1:3], col=2, pch=8, type='o', lwd=2)},

key=list(x=.75, y=.85, corner=c(0,0), lines=list(lty=1, col=c(1,2), size=2, lwd=2), text=list(c('Species 1', 'Species 2'), cex=.9)))

Fig. 2 Population average mean response (

) and raw data (–o–) from the split plot design.

Question 5

I convert the season variable to a numeric variable (the factor levels are coded 1, 2, and 3) and refit the model.

model1.num <- lme(sodium~factor(species)*as.numeric(season)*depth, random=~1|pond, data=fisheggs)

anova(model1.num)

                                         numDF denDF F-value p-value
(Intercept)                                  1    24 3600.999 <.0001
factor(species)                              1     4   11.080 0.0291
as.numeric(season)                           1    24   78.435 <.0001
depth                                        1    24    5.611 0.0262
factor(species):as.numeric(season)           1    24    8.715 0.0070
factor(species):depth                        1    24    0.140 0.7117
as.numeric(season):depth                     1   24    0.508 0.4830
factor(species):as.numeric(season):depth     1    24    0.259 0.6154

We get the same pattern as before, a significant main effect due to depth and a significant interaction between season and species.

model2.num <- lme(sodium~factor(species)*as.numeric(season) + depth, random=~1|pond, data=fisheggs)

anova(model2.num)

                                   numDF denDF F-value p-value
(Intercept)                            1    27 3600.999 <.0001
factor(species)                        1     4   11.080 0.0291
as.numeric(season)                     1    27   85.027 <.0001
depth                                  1    27    6.083 0.0203
factor(species):as.numeric(season)     1    27    9.447 0.0048

I obtain the predictions from this model and add them to the graph of Fig. 1

panel.superpose(x, y, subscripts, groups, col='grey70', ...)

# predictions of factor season model

panel.xyplot(x[1:3], fisheggs$pred[subscripts][1:3], col=2, pch=8, type='o')

# predictions of linear season model

panel.xyplot(x[1:3], fisheggs$pred2[subscripts][1:3], col=4, lty=2, type='l')},

key=list(x=.7, y=.84, corner=c(0,0), lines=list(lty=c(1,2), col=c(2,4), size=2), text=list(c('factor season', 'linear season'), cex=.9))) -> mygraph2

useOuterStrips(mygraph2)

Fig. 3 Comparing the population average mean response for a factor season model and a numeric season model.

Question 6

I redo Fig. 2 using the linear model instead of the factor model.

xyplot(sodium~season|depth, data=fisheggs, ylab=expression(paste('Sodium content ', (paste(mu,'mol')/'g'), sep='')), xlab='Season', groups=pond, type='o', panel=function(x, y, subscripts, groups,...){

panel.superpose(x, y, subscripts, groups, col=c(rep('grey70',3), rep('pink2',3)), ...)

# subset variables for current panel

cur.pred <- fisheggs$pred2[subscripts]

cur.sp <- fisheggs$species[subscripts]

cur.depth <- fisheggs$depth[subscripts]

# select species 1

p1 <- cur.pred[cur.sp==1]

panel.xyplot(x[1:3], p1[1:3], col=1, type='l', lwd=2)

# select species 2

p2 <- cur.pred[cur.sp==2]

panel.xyplot(x[1:3], p2[1:3], col=2, type='l', lwd=2)

# label the distance between the mean profiles

if(cur.depth[1]=='deep') {

# season = early

panel.arrows(as.numeric(x[1])-.05, p1[1]-.9, as.numeric(x[1])-.05, p2[1]+.9, angle=45, length=0.075, code=3, col='dodgerblue')

panel.text(x[1], (p1[1]+p2[1])/2, labels=30.89, col='dodgerblue',cex=.75, pos=2, offset=.65)

# season = late

panel.arrows(as.numeric(x[3])+.05, p1[3]-.75, as.numeric(x[3])+.05, p2[3]+.75, angle=45, length=0.075, code=3, col='dodgerblue')

panel.text(x[3],(p1[3]+p2[3])/2, labels=21.22, col='dodgerblue', cex=.75, pos=4, offset=.65)}

}, key=list(x=.75,y=.85,corner=c(0,0), lines=list(lty=1, col=c(1,2), size=2, lwd=2), text=list(c('Species 1','Species 2'), cex=.9)))

Fig. 4 Population average mean responses (—) and raw data (–o–) for a model in which season is treated as a numeric variable with equally spaced categories.

From the graph we see that there are two differences between the species.

The eggs of species 2 have consistently lower sodium levels than do those of species 1.
The rate of decline of egg sodium level over the course of the season is greater for species 1 (steeper slope) than it is for species 2.

The summary table of the model is shown below.

round(summary(model2.num)$tTable,3)

                                      Value Std.Error DF t-value p-value
(Intercept)                         268.806     5.999 27 44.805   0.000
factor(species)2                    -35.722     8.436 4 -4.235   0.013
as.numeric(season)                   -9.667     1.112 27 -8.694   0.000
depthshallow                         -3.167     1.284 27 -2.466   0.020
factor(species)2:as.numeric(season)   4.833     1.572 27   3.074   0.005

The estimated model is

Because the first numerical value of season is 1 the estimates of the intercept and the species main effect are not directly interpretable in this model. The intercept and the species main effect correspond to when season = 0, a value that is outside the range of the data. Even worse it is unclear what season = 0 could possibly mean given that season actually represents "early", "middle", and "late" , values that we are assuming are equally spaced.

To make the intercept interpretable we should code the numeric season variable so that zero corresponds to the level season = 'early'.

model2a.num <- lme(sodium~factor(species)*I(as.numeric(season)-1) + depth, random=~1|pond, data=fisheggs)

round(summary(model2a.num)$tTable,3)

                                             Value Std.Error DF t-value p-value
(Intercept)                                259.139     5.682 27 45.608   0.000
factor(species)2                           -30.889     7.984 4 -3.869   0.018
I(as.numeric(season) - 1)                   -9.667     1.112 27 -8.694   0.000
depthshallow                                -3.167     1.284 27 -2.466   0.020
factor(species)2:I(as.numeric(season) - 1)   4.833     1.572 27   3.074   0.005

Now we see that when season = 1 ('early') the sodium content of species 2 eggs is 30.9 units below that of species 1 eggs (the value of β₁). During the course of the season the sodium content of species 1 eggs decreases at a rate of 9.67 µmol/g per unit time (β₂), but the sodium content of species 2 eggs decreases at a rate of 4.83 µmol/g per unit time, obtained by adding the season effect to the season × species interaction (β₂ + β₄). (Note: it is entirely fortuitous that the rate for species 2, β₂ + β₄, and the difference in the rates between species 1 and species 2, β₂, are the same.)

fixef(model2a.num)[3] + fixef(model2.num)[5]

I(as.numeric(season) - 1)
-4.833333

When season = 1 ('early') the mean sodium content of the eggs of the two species differs by β₁ = 30.89 µmol/g. When season = 3 ('late') the mean sodium content of the eggs differs by only β₁ + (3 – 1)β₄ = 21.22 µmol/g. We can also calculate this difference by subtracting the prediction of species 2 from species 1 when season = 3.

stuff <- data.frame(season=c(3,3), depth=c('shallow','shallow'), species=c(1,2))

out.late <- predict(model2a.num, newdata=stuff, level=0)

out.late

[1] 236.6389 215.4167
attr(,"label")
[1] "Predicted values"

out.late[2]-out.late[1]

[1] -21.22222

Table 1 summarizes the estimated differences between the species.

Table 1 Estimated species effects on egg sodium content

Characteristic		Difference: Species 2 – Species 1	Species 1	Species 2
Rate of change of egg sodium content over time (μmol/g/time)		4.83	–9.67	–4.83
Egg sodium content (μmol/g )	Season = 'early'	–30.89	varies with depth
Egg sodium content (μmol/g )	Season = 'late'	–21.22	varies with depth

hw 6

Course Home Page

Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum for Ecology and the Environment, Box 3275, University of North Carolina, Chapel Hill, 27599
Copyright © 2012
Last Revised--October 21, 2012
URL: https://sakai.unc.edu/access/content/group/3d1eb92e-7848-4f55-90c3-7c72a54e7e43/public/docs/solutions/assign6.htm