Final Part 3—Solution

fish <- read.table( 'ecol 562/fish.txt', sep='\t', header=T)

names(fish)

[1] "trial" "day" "position" "treatment" "freq"

Question 1

There are many ways to do this. They all involve two operations that we then combine into a single step. Here are a few possibilities.

table(tapply(fish$treatment, fish$trial, function(x) as.character(x)[1]))

CN GR LF RH SM SQ YH YT
14 14 14 7 6 7 7 7

table(tapply(fish$treatment, fish$trial, function(x) unique(as.character(x))))

CN GR LF RH SM SQ YH YT
14 14 14 7 6 7 7 7

apply(table(fish$trial, fish$treatment), 2, function(x) sum(x>0))

CN GR LF RH SM SQ YH YT
14 14 14 7 6 7 7 7

Question 2

We have a categorical response with three categories. I fit a multinomial model with treatment as the predictor. I then compare it to a model containing only an intercept using a likelihood ratio test.

library(nnet)

mult1 <- multinom(position~treatment, data=fish, weights=freq)

# weights: 27 (16 variable)
initial value 2877.265584
iter 10 value 2746.459908
iter 20 value 2730.572541
final value 2730.564623
converged

mult0 <- multinom(position~1, data=fish, weights=freq)

# weights: 6 (2 variable)
initial value 2877.265584
final value 2759.967497
converged

anova(mult0,mult1)

Likelihood ratio tests of Multinomial Models

Response: position
      Model Resid. df Resid. Dev   Test    Df LR stat.      Pr(Chi)
1         1       454   5519.935
2 treatment       440   5461.129 1 vs 2    14 58.80575 1.898106e-07

The test is highly significant. There is evidence of a treatment effect.

Question 3

A goodness of fit test can be obtained by comparing the current model to a saturated model, a model that estimates separate parameters for each multinomial observation. The variable trial in the data set identifies the 76 different multinomial observations in this experiment. I enter it as the sole predictor in a multinomial model.

mult.S <- multinom(position~trial, data=fish, weights=freq)

# weights: 231 (152 variable)
initial value 2877.265584
iter 10 value 2574.368825
iter 20 value 2567.289077
iter 30 value 2567.225977
final value 2567.225814
converged

anova(mult1,mult.S)

Likelihood ratio tests of Multinomial Models

Response: position
      Model Resid. df Resid. Dev   Test    Df LR stat. Pr(Chi)
1 treatment       440   5461.129
2     trial       304   5134.452 1 vs 2   136 326.6776       0

So it appears that we have a significant lack of fit. This test does have some minimal cell size restrictions that must be met in order for the chi-square distribution of the test statistic to be valid. The general rule is that the number of expected counts less than 5 should not exceed 20%. I obtain the totals for each trial and the predicted probabilities on each trial and then use these to obtain the expected counts.

out.n <- tapply(fish$freq,fish$trial,sum)

out.treat <- factor(tapply(fish$treatment, fish$trial, function(x) as.character(x[1])))

out.p <- predict(mult1, type="probs", newdata=data.frame(treatment=out.treat))

exp.counts <- out.p*as.vector(out.n)

sum(exp.counts<5)/length(as.vector(exp.counts))

[1] 0.01754386

So only about 2% of the expected counts are less than 5. The goodness of fit test is valid.

Question 4

Different fish were used on different days and there may have been changes in protocol from day to day. Thus it makes sense to treat day as a blocking variable and include day as a factor in the model. The experiment was carried out over 14 days with not every treatment tested on each day. So, it's not possible to include an interaction between treatment and day with these data. Day has numeric values so I have to explicitly declare it as a factor when fitting the model.

mult2 <- multinom(position~treatment + factor(day), data=fish, weights=freq)

# weights: 66 (42 variable)
initial value 2877.265584
iter 10 value 2703.954220
iter 20 value 2662.880771
iter 30 value 2661.383908
iter 40 value 2661.317292
final value 2661.314489
converged

anova(mult2, mult.S)

Likelihood ratio tests of Multinomial Models

Response: position
                    Model Resid. df Resid. Dev   Test    Df LR stat.      Pr(Chi)
1 treatment + factor(day)       414   5322.629
2                   trial       304   5134.452 1 vs 2   110 188.1774 5.046884e-06

The fit has improved but there is still a significant lack of fit.

Question 5

Two assumptions of a multinomial experiment are the following.

The observations making up the individual multinomial trials are independent.
The probabilities of the categories are constant except as dictated by predictors in the model.

Independence is clearly violated. We are told that this is a repeated measures design. The same fish were observed 12 times and their positions at these times were recorded. Given this it's pretty clear the observations for a given trial are not independent because we are repeatedly using observations from the same fish as well as different fish. Furthermore the three fish used on each trial can observe and respond to each other. This also violates independence.

It's also possible that the position selection probabilities of individual fish are not constant because if habituation occurs. If the fish grow accustomed to the presence of the predator fish during a trial their behavior may change. We're also told the same fish were reused during the course of the day, so it's possible that the fish had different selection probabilities early in the day than late in the day.

Question 6

To fit a multinomial model as a Poisson model we need to include specific variables that enforce the multinomial constraints. Here's a portion of the what the raw data look like when organized as multinomial trials.

xtabs(freq~trial+position, data=fish)[1:10,]

       position
trial    1 2 3
CN.1   6 5 25
CN.10 11 15 10
CN.11 8 21 7
CN.12 2 8 14
CN.13 6 15 15
CN.14 12 10 14
CN.2 12 12 12
CN.3 13 14 9
CN.4   7 18 11
CN.5 12 10 14

To obtain a multinomial model we need to include the variable trial as a predictor (to fix the row sums) and the variable position as a predictor (to fix the column sums). The null multinomial model when fit as a Poisson model is the following.

glm0 <- glm(freq~factor(position) + trial, data=fish, family=poisson)

This is called the independence model. It claims that trial and position are unrelated. So this is our "nothing's going on" model equivalent to the intercept-only multinomial model above. If we look at the predicted counts (fitted values) from this model we'd see that each trial is predicted to have the same fraction of 1s, 2s, and 3s.

out.tab <- xtabs(fitted(glm0)~fish$trial + fish$position)[1:10,]

out.tab

          fish$position
fish$trial        1        2         3
     CN.1 8.591065 17.08591 10.323024
     CN.10 8.591065 17.08591 10.323024
     CN.11 8.591065 17.08591 10.323024
     CN.12 5.727377 11.39061 6.882016
     CN.13 8.591065 17.08591 10.323024
     CN.14 8.591065 17.08591 10.323024
     CN.2 8.591065 17.08591 10.323024
     CN.3 8.591065 17.08591 10.323024
     CN.4 8.591065 17.08591 10.323024
     CN.5 8.591065 17.08591 10.323024

out.tab/apply(out.tab, 1, sum)

          fish$position
fish$trial         1         2         3
     CN.1 0.2386407 0.4746086 0.2867507
     CN.10 0.2386407 0.4746086 0.2867507
     CN.11 0.2386407 0.4746086 0.2867507
     CN.12 0.2386407 0.4746086 0.2867507
     CN.13 0.2386407 0.4746086 0.2867507
     CN.14 0.2386407 0.4746086 0.2867507
     CN.2 0.2386407 0.4746086 0.2867507
     CN.3 0.2386407 0.4746086 0.2867507
     CN.4 0.2386407 0.4746086 0.2867507
     CN.5 0.2386407 0.4746086 0.2867507

The two additive terms in the model just define the constraints of the experimental design so that our predictions have the same row and column totals as the data have. When you fit the multinomial model with the multinom function these constraints are already built in so you just need to include an intercept as the only predictor.

The independence model is uninteresting. If we add trial:factor(position) to the independence model we let every trial have its own values for position (still subject to the constraints of the experiment). The fractions of 1s, 2s and 3s are now different in every trial. This is also an uninteresting model. It says everything is different from everything else. It is the saturated model. It estimates one parameter for each cell of the table, a total of 228 parameters, and fits the data perfectly.

glm.S <- glm(freq~factor(position) + trial + trial*factor(position), data=fish, family=poisson

xtabs(fitted(glm.S)~fish$trial + fish$position)[1:10,]

          fish$position
fish$trial 1 2 3
     CN.1   6 5 25
     CN.10 11 15 10
     CN.11 8 21 7
     CN.12 2 8 14
     CN.13 6 15 15
     CN.14 12 10 14
     CN.2 12 12 12
     CN.3 13 14 9
     CN.4   7 18 11
     CN.5 12 10 14

Our goal is to replace trial:factor(position) with something more interesting and still have a model that fits the data. Rather than letting each trial have its own unique fractions for each position, we could force those trials experiencing the same treatment to have the same fractions but different from the fractions in trials receiving different treatments. To fit this model we replace the term trial:factor(position) with treatment:factor(position).

glm1 <- glm(freq~factor(position) + trial + treatment:factor(position), data=fish, family=poisson)

To include the blocking variable day in the model I add the interaction term factor(day):factor(position) to the model.

glm2 <- glm(freq~factor(position) + trial + treatment:factor(position) + factor(day):factor(position), data=fish, family=poisson)

We can verify these models are correct by carrying out the goodness of fit test we did previously and observe that we get the same likelihood ratio statistic here as we did using the multinom function.

anova(glm2, glm.S, test='Chisq')

Analysis of Deviance Table

Model 1: freq ~ factor(position) + trial + treatment:factor(position) +
    factor(day):factor(position)
Model 2: freq ~ factor(position) + trial + trial * factor(position)
Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1       110     188.18
2         0       0.00 110   188.18 5.047e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Finally then, to fit the desired quasi-Poisson model I change the family argument of this last model to quasipoisson.

glm.q <- glm(freq~factor(position) + trial + treatment:factor(position) + factor(day):factor(position), data=fish, family=quasipoisson)

Question 7

To generate the estimates with the correct reference group for treatment we need to redefine treatment as a factor specifying "LF" as the reference group. As was explained in the hints it is unnecessary to to do this for position. The default order of position will surprisingly give us what we want.

levels(fish$treatment)

[1] "CN" "GR" "LF" "RH" "SM" "SQ" "YH" "YT"

fish$treat <- factor(fish$treatment, levels=levels(fish$treatment)[c(3, 1:2, 4:8)])

levels(fish$treat)

[1] "LF" "CN" "GR" "RH" "SM" "SQ" "YH" "YT"

I refit the model using the new treatment variable.

glm.q2 <- glm(freq~factor(position) + trial + treat:factor(position) + factor(day):factor(position), data=fish, family=quasipoisson)

We need the estimates that involve the interaction with treatment. I use the grep function to locate those rows in the coefficient table that contain the string 'treat'.

rownums <- grep('treat', rownames(summary(glm.q2)$coefficients))

rownums

[1] 79 80 81 82 83 84 85 86 87 88 89 90 91 92

summary(glm.q2)$coefficients[rownums,]

Estimate Std. Error t value Pr(>|t|)
factor(position)1:treatCN -0.152556414 0.2211594 -0.68980314 0.49177071
factor(position)2:treatCN -0.387261747 0.1979612 -1.95625106 0.05297074
factor(position)1:treatGR 0.184645284 0.2334866 0.79081744 0.43075227
factor(position)2:treatGR 0.344445238 0.2029892 1.69686476 0.09255017
factor(position)1:treatRH 0.004268110 0.3366681 0.01267750 0.98990806
factor(position)2:treatRH 0.550672237 0.2684590 2.05123385 0.04261984
factor(position)1:treatSM -0.176331045 0.3563426 -0.49483572 0.62170302
factor(position)2:treatSM 0.432508433 0.2768132 1.56245581 0.12105311
factor(position)1:treatSQ 0.116005526 0.3016724 0.38454138 0.70131966
factor(position)2:treatSQ 0.145488079 0.2556889 0.56900438 0.57051315
factor(position)1:treatYH 0.036769045 0.2833507 0.12976513 0.89698925
factor(position)2:treatYH -0.009626351 0.2597037 -0.03706666 0.97049904
factor(position)1:treatYT -0.077797133 0.3065899 -0.25374983 0.80016252
factor(position)2:treatYT 0.074804924 0.2542904 0.29417126 0.76918142

I extract just this portion of the coefficient summary table and the corresponding variance-covariance matrix.

out.coef <- summary(glm.q2)$coefficients[rownums,]

out.vcov <- vcov(glm.q2)[rownums, rownums]

The panel graph we want displays the two position comparisons in separate panels so it will be convenient to have things sorted so the position 1 estimates are together followed by the position 2 estimates.

neworder <- order(rownames(out.coef))

out.coef <- out.coef[neworder,]

out.vcov <- out.vcov[neworder, neworder]

out.coef

Estimate Std. Error t value Pr(>|t|)
factor(position)1:treatCN -0.152556414 0.2211594 -0.68980314 0.49177071
factor(position)1:treatGR 0.184645284 0.2334866 0.79081744 0.43075227
factor(position)1:treatRH 0.004268110 0.3366681 0.01267750 0.98990806
factor(position)1:treatSM -0.176331045 0.3563426 -0.49483572 0.62170302
factor(position)1:treatSQ 0.116005526 0.3016724 0.38454138 0.70131966
factor(position)1:treatYH 0.036769045 0.2833507 0.12976513 0.89698925
factor(position)1:treatYT -0.077797133 0.3065899 -0.25374983 0.80016252
factor(position)2:treatCN -0.387261747 0.1979612 -1.95625106 0.05297074
factor(position)2:treatGR 0.344445238 0.2029892 1.69686476 0.09255017
factor(position)2:treatRH 0.550672237 0.2684590 2.05123385 0.04261984
factor(position)2:treatSM 0.432508433 0.2768132 1.56245581 0.12105311
factor(position)2:treatSQ 0.145488079 0.2556889 0.56900438 0.57051315
factor(position)2:treatYH -0.009626351 0.2597037 -0.03706666 0.97049904
factor(position)2:treatYT 0.074804924 0.2542904 0.29417126 0.76918142

I begin to assemble the data frame we need for plotting purposes.

out.mod <- data.frame(expand.grid(1:7, 1:2), est=out.coef[,1], se=out.coef[,2])

out.mod

                          Var1 Var2          est        se
factor(position)1:treatCN    1    1 -0.152556414 0.2211594
factor(position)1:treatGR    2    1 0.184645284 0.2334866
factor(position)1:treatRH    3    1 0.004268110 0.3366681
factor(position)1:treatSM    4    1 -0.176331045 0.3563426
factor(position)1:treatSQ    5    1 0.116005526 0.3016724
factor(position)1:treatYH    6    1 0.036769045 0.2833507
factor(position)1:treatYT    7    1 -0.077797133 0.3065899
factor(position)2:treatCN    1    2 -0.387261747 0.1979612
factor(position)2:treatGR    2    2 0.344445238 0.2029892
factor(position)2:treatRH    3    2 0.550672237 0.2684590
factor(position)2:treatSM    4    2 0.432508433 0.2768132
factor(position)2:treatSQ    5    2 0.145488079 0.2556889
factor(position)2:treatYH    6    2 -0.009626351 0.2597037
factor(position)2:treatYT    7    2 0.074804924 0.2542904

I add the 95% confidence intervals.

out.mod$low95 <- out.mod$est - qnorm(.975) * out.mod$se

out.mod$up95 <- out.mod$est + qnorm(.975) * out.mod$se

Technically we should use a t-distribution with degrees of freedom taken from the model, but the difference here is slight.

qt(.975,glm.q2$df.residual)

[1] 1.981765

qnorm(.975)

[1] 1.959964

Next we need to compute the confidence levels for the pairwise comparisons. I collect the functions we've previously written for this purpose.

nor.func1 <- function(alpha, model, sig) 1 - pt(-qt(1-alpha/2, model$df.residual) * sum(sqrt(diag(sig))) / sqrt(c(1,-1) %*% sig %*%c (1,-1)), model$df.residual) - pt(qt(1-alpha/2, model$df.residual) * sum(sqrt(diag(sig))) / sqrt(c(1,-1) %*% sig %*% c(1,-1)), model$df.residual, lower.tail=F)

nor.func2 <- function(a,model,sigma) nor.func1(a, model, sigma)-.95

ci.func <- function(rowvals, glm.model, glm.vmat) {

n <- length(rowvals)

xvec1b <- numeric(n*(n-1)/2)

vmat <- glm.vmat[rowvals,rowvals]

ind <- 1

for(i in 1:(n-1)) {

for(j in (i+1):n){

sig <- vmat[c(i,j), c(i,j)]

#solve for alpha

xvec1b[ind] <- uniroot(function(x) nor.func2(x, glm.model, sig), c(.001,.999))$root

ind <- ind+1

}}

1-xvec1b

}

We need to use this function one panel at a time specifying the corresponding rows from the coefficient matrix, first 1:7 and then 8:14. Each run involves pairwise comparisons between 7 estimates.

vals1 <- ci.func(1:7, glm.q2, out.vcov)

vals2 <- ci.func(8:14, glm.q2, out.vcov)

vals1

[1] 0.6728149 0.7590149 0.7701199 0.7359922 0.7199309 0.7393951 0.7610261 0.7709923
[9] 0.7405050 0.7245067 0.7436374 0.7573342 0.7402503 0.7936049 0.7418959 0.7487930
[17] 0.7981490 0.7501636 0.7853993 0.7247910 0.7865266

vals2

[1] 0.6772997 0.7418073 0.7482880 0.7317290 0.7324057 0.7304830 0.7436115 0.7492921
[9] 0.7340044 0.7339015 0.7329391 0.7381925 0.7283715 0.7896563 0.7277297 0.7330560
[17] 0.7915383 0.7326971 0.7868835 0.7187160 0.7865723

The confidence levels range from 0.67 to 0.79 so we're going to need to do some experimenting in setting the levels. I start by using the lowest confidence level in each panel.

ci.val <- c(rep(min(vals1),7), rep(min(vals2),7))

out.mod$low50 <- out.mod$est + out.mod$se * qnorm((1-ci.val)/2)

out.mod$up50 <- out.mod$est + out.mod$se * qnorm(1-(1-ci.val)/2)

We're ready to produce the panel graph. First I collect the names of the predator fish in the order they're being presented in the coefficients table, omitting the lionfish LF.

preds <- levels(fish$treatment)[c(1:2,4:8)]

Next I produce the graph using the current choices for the pairwise confidence levels.

myprepanel.ci <- function(x,y,lx,ux,subscripts,...) {

list(xlim=range(x, ux[subscripts], lx[subscripts], finite=TRUE))

}

dotplot(factor(Var1, levels=1:7, labels=paste(preds, ' vs LF', sep='')) ~ est|factor(Var2, levels=1:2, labels=paste('Position ',1:2,' vs 3', sep='')), data=out.mod, xlab='log odds ratio', prepanel=myprepanel.ci, lx=out.mod$low95, ux=out.mod$up95, layout=c(2,1), panel=function(x, y, subscripts){

panel.dotplot(x, y, col=4, pch='+', cex=.6)

panel.segments(out.mod$low95[subscripts], y, out.mod$up95[subscripts], y, lwd=3, col='dodgerblue4', lineend=1)

panel.segments(out.mod$low50[subscripts], y, out.mod$up50[subscripts], y, lwd=6, col='dodgerblue1', lineend=1)

panel.points(x, y, col='white', pch=16, cex=1.1)

panel.points(x, y, col='dodgerblue4', pch=1, cex=1.2)

panel.abline(v=0, col=2, lty=2)

}, scales=list(x='free'),strip=strip.custom(par.strip.text = list(cex=0.9)))


Fig. 1 Preliminary graph of the log odds ratios using the minimum confidence level for the pairwise comparisons confidence intervals.

Everything overlaps in the first panel so we have nothing more to do there. If we increase the confidence levels the overlap will just increase. In the second panel we see that the first log odds ratio looks different from the next four. The confidence level we're using, 0.67, is only correct for the first comparison. The comparison that looks closest to being not significant is 1 vs 5. According to the vals1 vector the confidence level for this comparison should be 0.73, so I try that next.

ci.val <- c(rep(min(vals1),7), rep(vals2[4],7))

out.mod$low50 <- out.mod$est + out.mod$se * qnorm((1-ci.val)/2)

out.mod$up50 <- out.mod$est + out.mod$se * qnorm(1-(1-ci.val)/2)

ci.val

[1] 0.6728149 0.6728149 0.6728149 0.6728149 0.6728149 0.6728149 0.6728149 0.7317290
[9] 0.7317290 0.7317290 0.7317290 0.7317290 0.7317290 0.7317290

The graph using this confidence level is shown below.


Fig. 2 Point estimates, 95% confidence intervals (thin bar), and variable level confidence intervals suitable for making all pairwise comparisons of displayed estimates. A 67% confidence level is used for the position 1 vs 3 comparison and a 73% confidence level is used for the position 2 versus 3 comparison.

With that choice nothing else looks different that would require raising the confidence level except perhaps a comparison of the RH vs LF and the YH vs LF intervals. If we look at the table of confidence levels (below I've labeled then by what comparison they correspond) we see that the confidence level for the comparison of 3 versus 6 is 0.79 and is in position 14 of the vector.

round(rbind(rep(1:6,6:1), unlist(sapply(2:7,function(i) i:7)), vals2), 2)

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
      1.00 1.00 1.00 1.00 1.00 1.00 2.00 2.00 2.00 2.00 2.00 3.00 3.00 3.00 3.00
      2.00 3.00 4.00 5.00 6.00 7.00 3.00 4.00 5.00 6.00 7.00 4.00 5.00 6.00 7.00
vals2 0.68 0.74 0.75 0.73 0.73 0.73 0.74 0.75 0.73 0.73 0.73 0.74 0.73 0.79 0.73
      [,16] [,17] [,18] [,19] [,20] [,21]
       4.00 4.00 4.00 5.00 5.00 6.00
       5.00 6.00 7.00 6.00 7.00 7.00
vals2 0.73 0.79 0.73 0.79 0.72 0.79

If we change all the levels to 0.79 then the CN vs LF and SQ vs LF comparison is no longer significant. I instead change only the levels for the two observations involved in the comparison, RH and YH.

ci.val <- c(rep(min(vals1),7), vals2[4], rep(vals2[14],3), vals2[4], rep(vals2[14],2))

out.mod$low50 <- out.mod$est + out.mod$se * qnorm((1-ci.val)/2)

out.mod$up50 <- out.mod$est + out.mod$se * qnorm(1-(1-ci.val)/2)

The final graph is shown below.


Fig. 3 Point estimates, 95% confidence intervals (thin bar), and variable level confidence intervals (thick bars) that are suitable for making all pairwise comparisons of displayed estimates. A 67% confidence level is used for all the position 1 vs 3 comparisons and a 73% confidence level is used for the position 2 versus 3 comparisons. The only exceptions are for RH vs LF and YH vs LF intervals for which 79% confidence levels are shown.

Fig. 3 Point estimates, 95% confidence intervals (thin bar), and variable level confidence intervals (thick bars) that are suitable for making all pairwise comparisons of displayed estimates. A 67% confidence level is used for all the position 1 vs 3 comparisons and a 73% confidence level is used for the position 2 versus 3 comparisons. The only exceptions are for RH vs LF and YH vs LF intervals for which 79% confidence levels are shown.

Question 8

I exponentiate the point estimate and the confidence intervals and generate the new graph.

dotplot(factor(Var1, levels=1:7, labels=paste(preds, ' vs LF', sep='')) ~ exp(est)|factor(Var2, levels=1:2, labels=paste('Position ',1:2,' vs 3', sep='')), data=out.mod, xlab='Odds ratio', prepanel=myprepanel.ci, lx=exp(out.mod$low95), ux=exp(out.mod$up95), layout=c(2,1), panel=function(x, y, subscripts){

panel.dotplot(x, y, col=4, pch='+', cex=.6)

panel.segments(exp(out.mod$low95)[subscripts], y, exp(out.mod$up95)[subscripts], y, lwd=3, col='dodgerblue4', lineend=1)

panel.segments(exp(out.mod$low50)[subscripts], y, exp(out.mod$up50)[subscripts], y, lwd=6, col='dodgerblue1', lineend=1)

panel.points(x, y, col='white', pch=16, cex=1.1)

panel.points(x, y, col='dodgerblue4', pch=1, cex=1.2)

panel.abline(v=1, col=2, lty=2)

}, scales=list(x='free'),strip=strip.custom(par.strip.text = list(cex=0.9)))


Fig. 4 Odds ratios estimates for different predators versus lionfish separately for position 1 versus position 3 and position 2 versus position 3.

Question 9

None of the odds ratios for the position 1 versus position 3 comparison are significant. All of the 95% confidence intervals include an odds ratio of 1. Thus the ratio of times prey turn toward a predator or turn away from a predator are the same for any predator when compared to lionfish, even for the control in which no predator was present at all. The only significant odds ratios were found for the position 2 versus position 3 comparisons. If position 2 indicates that the prey are taking greater notice of the predator, then lionfish are significantly (in truth the p-value just exceeds 0.05) scarier than a control (a cage with nothing in it) while the red hind is significantly scarier than a lionfish. No other odds ratios were statistically significant.

It's worth noting that if we had used a Poisson rather than a quasi-Poisson distribution here, we would have found the odds ratios of LF with CN, GR, RH, and SM to all be statistically significant for position 2 versus position 3. The quasi-Poisson gives us more conservative results, which is probably justified here given that we know the model has a significant lack of fit.

Apparently the assumption that the fish would turn toward the predator when frightened is incorrect. The ratio of times a fish is in position 1 versus position 3 is not affected by the predator's identity (at least when using lionfish as the reference group). This is true even when there is no predator present at all. What we see instead is that a frightened fish turns perpendicular to the predator (position 2). This makes sense if the eyes of this fish are laterally placed so that the fish lacks good binocular vision. What supports this is that the only time the empty cage (control) versus lionfish comparison yielded an .odds ratio that was less than 1 was for the position 2 versus position 3 comparison (and not for the position 1 versus position 3 comparison).

The pairwise comparisons are not particularly interesting here. We see from the pairwise comparisons that individually the odds ratios for the different predators cannot be distinguished from each other. Still, four of them are significantly greater than the lionfish vs control odds ratio (red hind, grouper, schoolmaster, and squirrelfish) but two were not (yellowtail and yellowhead). So even though only the red hind was significantly scarier than the lionfish, there is a weak suggestion that the predators perhaps fall into two groups: lionfish, yellowhead, yellowtail in one group and grouper, red hind, schoolmaster, and squirrelfish in a second group.

Of the fish used in this experiment the red hind is the most imposing and was expected to be the scariest predator. It is interesting that although none of the other predator ORs are significant, their point estimates do exceed 1. The only exception to this is the yellowhead (OR = 0.99) which mostly eats invertebrates anyway. The rankings are close to what the researchers would have guessed based on their knowledge of these fish. The results suggest that while lionfish are not ignored by their potential prey they do currently rank at the low end of their concerns. The lack of significance of most of the ORs probably results from the inefficient experimental design (evidence for which is found in the persistent lack of fit). Fortunately the researchers measured other behavioral attributes that confirmed the weak trends that are shown here.

HW scores

Course Home Page

Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum for the Environment and Ecology, Box 3275, University of North Carolina, Chapel Hill, 27599
Copyright © 2012
Last Revised--May 4, 2012
URL: https://sakai.unc.edu/access/content/group/2842013b-58f5-4453-aa8d-3e01bacbfc3d/public/Ecol562_Spring2012/docs/solutions/finalpart3.htm