Assignment 9—Solution

Question 1

birdbeetles <- read.delim('ecol 563/birdbeetles.txt')
birdbeetles[1:4,]
  grid beetles birds relief latitude longitude
1    1       3   261   7.77       36        73
2    2       5   224   4.27       34        71
3    3       6   171   5.27       34        75
4    4       5   326   6.80       34        78

My intention was that you just fit the six models treating beetle richness and relief as predictors in an additive model. The models in question are the following.

mod1 <- lm(birds~beetles+relief, data=birdbeetles)
mod2 <- glm(birds~beetles+relief, data=birdbeetles, family=poisson)
library(MASS)
mod3 <- glm.nb(birds~beetles+relief, data=birdbeetles)
library(gamlss)
mod4 <- gamlss(birds~beetles+relief, data=birdbeetles, family=NBII)
mod5 <- lm(log(birds)~beetles+relief, data=birdbeetles)
mod6 <- lm(sqrt(birds)~beetles+relief, data=birdbeetles)

Question 2

Models mod1, mod2, mod3, and mod4 can be compared directly using log-likelihood and AIC. Models mod5 and mod6 involve a transformed response variable and therefore require some manipulation in order to compare them with the untransformed response models. The functions for this purpose were presented in lecture 20.

# lognormal log-likelihood (discretized so it is comparable to the count models)
lognorm.LL <- function(y, model, c=0) {
  # ML estimate of the error variance on the log scale
  sigma2 <- sum(residuals(model)^2)/length(residuals(model))
  # probability of each observed count: P(y-0.5 < Y <= y+0.5) computed on the
  # log scale, with the lower limit dropped when y = 0
  prob <- ifelse(y==0,
                 pnorm(log(y+c+0.5), mean=predict(model), sd=sqrt(sigma2)),
                 pnorm(log(y+c+0.5), mean=predict(model), sd=sqrt(sigma2)) -
                 pnorm(log(y+c-0.5), mean=predict(model), sd=sqrt(sigma2)))
  LL <- sum(log(prob))
  # degrees of freedom: regression coefficients plus the error variance
  df <- length(coef(model))+1
  AIC <- -2*LL + 2*df
  out <- data.frame(LL=LL, df=df, AIC=AIC)
  out}

# square root normal log-likelihood
sqrtnorm.LL <- function(y, model, c=0) {
  sigma2 <- sum(residuals(model)^2)/length(residuals(model))
  prob <- ifelse(y==0,
                 pnorm(sqrt(y+c+0.5), mean=predict(model), sd=sqrt(sigma2)),
                 pnorm(sqrt(y+c+0.5), mean=predict(model), sd=sqrt(sigma2)) -
                 pnorm(sqrt(y+c-0.5), mean=predict(model), sd=sqrt(sigma2)))
  LL <- sum(log(prob))
  df <- length(coef(model))+1
  AIC <- -2*LL + 2*df
  out <- data.frame(LL=LL, df=df, AIC=AIC)
  out}

I obtain the log-likelihoods and AICs and assemble the results in a table.

part1 <- data.frame(LL=sapply(list(mod1, mod2, mod3, mod4), logLik), AIC(mod1, mod2, mod3, mod4))
# add lognormal and square root normal
my.out <- rbind(part1, lognorm.LL(birdbeetles$birds, mod5), sqrtnorm.LL(birdbeetles$birds, mod6))
rownames(my.out)[5:6] <- c('mod5', 'mod6')
cbind(model=c('normal', 'Poisson', 'NB-2', 'NB-1', 'lognormal', 'square root normal'), my.out)
                  model        LL df       AIC
mod1             normal -323.9595  4  655.9191
mod2            Poisson -498.0946  3 1002.1892
mod3               NB-2 -313.7719  4  635.5438
mod4               NB-1 -315.8734  4  639.7468
mod5          lognormal -313.7582  4  635.5165
mod6 square root normal -315.9871  4  639.9741

The lognormal and negative binomial (NB-2) models are essentially tied, although technically the lognormal model has a slightly lower AIC. At this point either model could be chosen. I would probably choose the negative binomial model, if only because it is a discrete probability model and therefore a more natural choice for a response variable that is a discrete count.
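Since the top two models are so close, it can help to express the comparison as Akaike differences and weights. The short calculation below is a sketch that is not part of the original solution; it uses only the AIC column of the my.out table assembled above.

# Akaike differences and weights from the AIC values assembled above
# (a sketch, not part of the original solution)
delta <- my.out$AIC - min(my.out$AIC)
w <- exp(-delta/2)/sum(exp(-delta/2))
data.frame(model=rownames(my.out), delta=round(delta,2), weight=round(w,3))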

Variations on this answer

Many people considered transformations of the predictors. I didn't expect you to explore this possibility, but it is certainly a legitimate thing to pursue. There is no theory to motivate the choice here. If we plot the response variable against each predictor separately, both plots reveal what appear to be linear relationships (Fig. 1).

par(mar=c(5.1,4.1,2.1,1.1))
par(mfrow=c(1,2))
plot(birds~beetles, data=birdbeetles)
plot(birds~relief, data=birdbeetles)

Fig. 1   Scatter plots of bird richness versus beetle richness and relief

Plots of the residuals of the NB-2 and lognormal models versus the predictors don't reveal anything especially unusual (Figs. 2 and 3).

par(mar=c(5.1,4.1,2.1,1.1))
par(mfrow=c(1,2))
plot(residuals(mod3)~beetles, data=birdbeetles)
lines(lowess(residuals(mod3)~birdbeetles$beetles, f=.9), col=2)

Fig. 2   Plot of residuals from the negative binomial regression model (mod3) versus beetle richness with a lowess curve superimposed.

plot(residuals(mod5)~relief, data=birdbeetles)
lines(lowess(residuals(mod5)~birdbeetles$relief, f=.9), col=2)

Fig. 3   Plot of residuals from the lognormal regression model (mod5) versus relief with a lowess curve superimposed.

Nonetheless, it turns out that if you include log(beetles) as a predictor you do get a better model in terms of AIC.

mod1a <- lm(birds~log(beetles)+relief, data=birdbeetles)
mod2a <- glm(birds~log(beetles)+relief, data=birdbeetles, family=poisson)
mod3a <- glm.nb(birds~log(beetles)+relief, data=birdbeetles)
mod4a <- gamlss(birds~log(beetles)+relief, data=birdbeetles, family=NBII)
mod5a <- lm(log(birds)~log(beetles)+relief, data=birdbeetles)
mod6a <- lm(sqrt(birds)~log(beetles)+relief, data=birdbeetles)
part1a <- data.frame(LL=sapply(list(mod1a, mod2a, mod3a, mod4a), logLik), AIC(mod1a, mod2a, mod3a, mod4a))
my.out1a <- rbind(part1a, lognorm.LL(birdbeetles$birds,mod5a), sqrtnorm.LL(birdbeetles$birds,mod6a))
rownames(my.out1a)[5:6] <- c('mod5a','mod6a')
my.out1a
             LL df      AIC
mod1a -325.0018  4 658.0036
mod2a -455.8539  3 917.7078
mod3a -308.2056  4 624.4112
mod4a -310.6032  4 629.2064
mod5a -307.6964  4 623.3928
mod6a -313.5032  4 635.0064

Once again the lognormal and negative binomial (NB-2) models are extremely competitive, with the lognormal model ranking slightly ahead of the negative binomial. A residual plot shows a slight improvement (Fig. 4).

plot(residuals(mod5a)~beetles, data=birdbeetles)
lines(lowess(residuals(mod5a)~birdbeetles$beetles), col=2)

Fig. 4   Plot of residuals from a lognormal regression model with log(beetles) as a predictor (mod5a) versus beetle richness with a lowess curve superimposed.

Given that log-transforming the predictor permits interpreting the bird–beetle relationship as a power law, one could argue that log-transforming beetles also yields a model that is easier to interpret.
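To make the power law concrete, here is a small sketch (not part of the original solution). With log(beetles) in the model the fitted mean is μ = exp(β0) · beetles^β1 · exp(β2·relief), so multiplying beetle richness by any factor k multiplies the mean by k^β1; for example, doubling beetle richness multiplies the mean by 2^β1 ≈ 1.14, about a 14% increase.

# doubling beetle richness multiplies the fitted mean by 2^beta1
# (a sketch, not part of the original solution)
2^coef(mod3a)['log(beetles)']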

Question 3

Negative binomial model with beetles and relief as predictors

The fitted model is

log(μ) = β0 + β1·beetles + β2·relief

Exponentiating both sides yields the following.

μ = exp(β0 + β1·beetles + β2·relief) = exp(β0) · exp(β1·beetles) · exp(β2·relief)

So beetle richness has a multiplicative effect on the mean. If beetle richness is increased by one unit we find

μ_new = exp(β0) · exp(β1·(beetles + 1)) · exp(β2·relief) = exp(β1) · exp(β0) · exp(β1·beetles) · exp(β2·relief) = exp(β1) · μ_old

Raising beetle richness by one unit multiplies the mean by exp(β1). Similarly increasing relief by one unit multiplies the mean by the factor exp(β2).

coef(mod3)
(Intercept)     beetles      relief
 4.88846537  0.01365378  0.09552305
exp(coef(mod3)[2:3])
 beetles   relief
1.013747 1.100234

So a one-unit increase in beetle richness multiplies the mean by 1.013747, roughly a 1.4% increase. A one-unit increase in relief multiplies the mean by 1.1, a 10% increase.
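As a quick numerical check (a sketch, not part of the original solution; the values beetles = 10 and relief = 5 are arbitrary), the ratio of predicted means at two beetle richness values one unit apart reproduces exp(β1).

# ratio of predicted means for a one-unit increase in beetles at fixed relief;
# beetles = 10 and relief = 5 are arbitrary illustrative values
p0 <- predict(mod3, newdata=data.frame(beetles=10, relief=5), type='response')
p1 <- predict(mod3, newdata=data.frame(beetles=11, relief=5), type='response')
p1/p0   # equals exp(coef(mod3)['beetles']) = 1.013747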

Negative binomial model with log(beetles) and relief as predictors

The fitted model is

log(μ) = β0 + β1·log(beetles) + β2·relief

The interpretation for a change in relief is the same as was described above for the previous model. With a log-transformed predictor the proper way to think of effects is in multiplicative terms. Thus rather than adding one unit to beetle richness, we can think in terms of, say, a 1% increase in beetle richness. To obtain a 1% increase we multiply beetles by 1.01.

log(μ_new) = β0 + β1·log(1.01·beetles) + β2·relief = β0 + β1·log(1.01) + β1·log(beetles) + β2·relief

Subtracting the log of the old mean from the log of the new mean yields the following.

log(μ_new) − log(μ_old) = β1·log(1.01), so that μ_new = 1.01^β1 · μ_old

So a 1% increase in beetle richness multiplies mean bird richness by 1.01^β1 = 1.00192, i.e., roughly a 0.19% increase in the mean.

coef(mod3a)
 (Intercept) log(beetles)       relief
   4.6450623    0.1927450    0.1075616
1.01^0.1927450
[1] 1.00192

We can obtain a simpler interpretation of the effect using the binomial series, a generalization of the binomial theorem:

(1 + Δx)^β1 = 1 + β1·Δx + [β1(β1 − 1)/2!]·(Δx)² + ⋯ ≈ 1 + β1·Δx

for Δx small. Therefore we have

μ_new = 1.01^β1 · μ_old = (1 + 0.01)^β1 · μ_old ≈ (1 + 0.01·β1) · μ_old

So a 1% increase in beetle richness leads to approximately a β1% increase in the mean, which again is approximately a 0.19% increase.

coef(mod3a)[2]*.01
log(beetles)
  0.00192745

Lognormal models

The fitted model with beetles and relief as predictors is

log(birds) = β0 + β1·beetles + β2·relief + ε,   ε ~ N(0, σ²)

As was discussed in lecture 21, the mean of a log is not the same as the log of a mean. If we exponentiate the fitted value on the left-hand side we don't get the mean of the lognormal distribution; we get its median. Thus the interpretations described above for the negative binomial model apply to the lognormal model, except that they refer to the median rather than the mean: a one-unit increase in beetles multiplies the median by exp(β1), while a one-unit increase in relief multiplies the median by exp(β2).

coef(mod5)
(Intercept)     beetles      relief
 4.86905760  0.01356396  0.09554528
exp(coef(mod5)[2:3])
 beetles   relief
1.013656 1.100259
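To see the distinction numerically, recall that a lognormal distribution with log-scale mean m and log-scale variance σ² has median exp(m) but mean exp(m + σ²/2). The sketch below (not part of the original solution) computes both sets of estimates from mod5.

# median versus mean of the fitted lognormal distribution for each grid cell
# (a sketch, not part of the original solution)
sigma2.5 <- sum(residuals(mod5)^2)/length(residuals(mod5))  # ML estimate of sigma^2
med.birds <- exp(fitted(mod5))                # estimated medians
mean.birds <- exp(fitted(mod5) + sigma2.5/2)  # estimated means (always larger)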

The fitted model with log(beetles) and relief as predictors is the following.

log(birds) = β0 + β1·log(beetles) + β2·relief + ε,   ε ~ N(0, σ²)

Now a 1% increase in beetles leads to approximately a β1% increase in the median, i.e., roughly a 0.19% increase.

1.01^coef(mod5a)[2]
log(beetles)
    1.001887
coef(mod5a)[2]
log(beetles)
   0.1894447

Question 4

I plot the mean from the negative binomial model, mod3, but all of the predicted surfaces look about the same. To plot the regression surface we need to determine the range of the data and plot the surface only over that range.

range(birdbeetles$beetles)
[1]  1 58
range(birdbeetles$relief)
[1] 0.18 8.78

I create a grid of values over this range, construct a function that defines the regression surface, and then evaluate the function over this grid.

g <- expand.grid(beetles=seq(1,59,3), relief=seq(0.15,8.9,.6))
surf <- function(p) exp(coef(mod3)[1]+coef(mod3)[2]*p[1]+coef(mod3)[3]*p[2])
g$z<-apply(g,1,surf)

The persp function requires that the z values be organized in a matrix that has the same dimensions as the grid over which we'll plot.

z.matrix <- matrix(g$z, nrow=length(seq(1,59,3)))
dim(z.matrix)
[1] 20 15
length(seq(1,59,3))
[1] 20
length(seq(0.15,8.9,.6))
[1] 15
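An equivalent way to construct this matrix (a sketch, not the approach used in the original solution) is with outer, which evaluates the fitted mean at every combination of the two sequences and returns the result already in matrix form.

# alternative construction of z.matrix using outer (a sketch only)
z.matrix2 <- outer(seq(1,59,3), seq(0.15,8.9,.6),
   function(b, r) exp(coef(mod3)[1] + coef(mod3)[2]*b + coef(mod3)[3]*r))
all.equal(z.matrix, z.matrix2)   # should be TRUE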

I play with the angle settings to get a surface in the most useful orientation.

out.persp <- persp(seq(1,59,3), seq(0.15,8.9,.6), z.matrix, xlab="beetles", ylab="relief", zlab="birds", phi=0, theta=30, ticktype="detailed", zlim=c(10,750), border='grey70')

To plot the points we need model predictions at each of the observed values. The trans3d function converts the 3-dimensional coordinates into the projected two-dimensional coordinates for plotting.

z.est <- predict(mod3,type='response')
vals2 <- trans3d(birdbeetles$beetles, birdbeetles$relief, z.est, out.persp)
vals1 <- trans3d(birdbeetles$beetles, birdbeetles$relief, birdbeetles$birds, out.persp)

I use segments to draw the line segments and points to draw the points. For the colors I test whether each observed point lies above the surface, convert the result to a number (0 for FALSE, 1 for TRUE), and add 1 to the result so that we obtain 1 or 2. I then use this to select the first or second component of my color vector, c(4,2).

segments(vals1[[1]], vals1[[2]], vals2[[1]], vals2[[2]], col=c(4,2)[as.numeric(vals1[[2]] > vals2[[2]])+1])
points(trans3d(birdbeetles$beetles, birdbeetles$relief, birdbeetles$birds, out.persp), pch=16, col=c(4,2)[as.numeric(vals1[[2]]>vals2[[2]])+1], cex=.7)

Fig. 5   Plot of the regression surface of the negative binomial regression model (mod3).

Question 5

The code from lecture 20 can be used almost as is. The conditioning variable here is grid and there are 61 observations, so I arrange the panels in a 4 x 4 layout spread over several pages. The code below graphically examines the fit of the negative binomial model mod3.

birdbeetles$mu <- fitted(mod3)
birdbeetles$p <- dnbinom(birdbeetles$birds, mu=birdbeetles$mu, size=mod3$theta)
max(dnbinom(1000, mu=birdbeetles$mu, size=mod3$theta))
[1] 1.409487e-05
out.p <- lapply(1:61, function(x) dnbinom(0:1000, mu=birdbeetles$mu[x], size=mod3$theta))
birdbeetles$ux <- sapply(1:61, function(x) max((0:1000)[out.p[[x]]>1e-5]))
birdbeetles$lx <- sapply(1:61, function(x) min((0:1000)[out.p[[x]]>1e-5]))
birdbeetles$uy <- sapply(out.p,max)
birdbeetles$ly<-rep(0,61)
# prepanel function to set limits for each graph
prepanel.ci2 <- function(x, y, ly, lx, uy, ux, subscripts, ...) {
list(ylim=range(y, uy[subscripts], ly[subscripts], finite=T), xlim=range(x, ux[subscripts], lx[subscripts], finite=T))}
library(lattice)
xyplot(p~birds|factor(grid), data=birdbeetles, ux=birdbeetles$ux, lx=birdbeetles$lx, uy=birdbeetles$uy, ly=birdbeetles$ly, layout=c(4,4), prepanel=prepanel.ci2, panel=function(x,y,subscripts){
panel.xyplot(0:1000, dnbinom(0:1000, mu=birdbeetles$mu[subscripts], size=mod3$theta), type='h', col='grey')
panel.abline(v=x, col=2, lty=2)},
scale=list(x='free', y='free'))

Fig. 6   (a) Predicted negative binomial distributions with observed bird richness values superimposed. Grid cells 1–16.
Fig. 6   (b) Predicted negative binomial distributions with observed bird richness values superimposed. Grid cells 17–32.
Fig. 6   (c) Predicted negative binomial distributions with observed bird richness values superimposed. Grid cells 33–48.
Fig. 6   (d) Predicted negative binomial distributions with observed bird richness values superimposed. Grid cells 49–61.

From the graphs the obvious instances of lack of fit are grid cells 61, 33, and perhaps 23.

Analytical test

I calculate upper- and lower-tailed p-values for each observed value and count the number that are less than .025 (a two-tailed test at the 5% level). I display the results in a graph (Fig. 7).

upper.pval <- 1-pnbinom(birdbeetles$birds-1, mu=birdbeetles$mu, size=mod3$theta)
lower.pval <- pnbinom(birdbeetles$birds, mu=birdbeetles$mu, size=mod3$theta)
sum(c(upper.pval,lower.pval)<.025)
[1] 3
# choose upper or lower p-value to plot
p.to.plot <- ifelse(pnbinom(birdbeetles$birds, mu=birdbeetles$mu, size=mod3$theta) < .5, pnbinom(birdbeetles$birds, mu=birdbeetles$mu, size=mod3$theta), pnbinom(birdbeetles$birds-1, mu=birdbeetles$mu, size=mod3$theta))
library(grid)
dotplot(birdbeetles$grid~p.to.plot, xlab=' ', xlim=c(-0.03,1.03), ylab='grid', panel=function(x,y){
panel.dotplot(x,y)
panel.abline(v=c(.025,.975), col=2, lty=2)
panel.abline(v=0.5, col='grey70', lwd=2)}, scales=list(y=list(at=seq(2,60,2),cex=.7), x=list(at=seq(0,1,.2), labels=c(0, 0.2, 0.4, 0.4, 0.2, 0))), page = function(n) {
grid.text("upper-tailed p-value", x = .85, y = 0.025, default.units = "npc", just = c("right", "bottom"))
grid.text("lower-tailed p-value", x = .45, y = 0.025, default.units = "npc", just = c("right", "bottom"))})

Fig. 7   Lower- and upper-tailed p-values for the observed bird richness values with respect to the negative binomial regression model. Vertical lines represent the lower and upper 2.5% quantiles.

Three observations out of 61 are clear outliers, which is less than 5% of the values and thus no more than we'd expect to obtain by chance.
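One informal way to quantify "what we'd expect by chance" (a sketch, not part of the original solution) is to note that the expected number of flagged cells under a correctly specified model is 61 × 0.05 ≈ 3, and to compare the observed count against a binomial reference distribution.

# informal check: is 3 flagged cells out of 61 more than a 5% rate would predict?
# (a sketch, not part of the original solution)
binom.test(3, 61, p=0.05, alternative='greater')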

Fit of lognormal model

To examine the fit of the lognormal model the same code will work except we need to use the dlnorm function rather than dnbinom. The arguments of dlnorm are meanlog, which we obtain as the fitted values of our model, and sdlog, which is the square root of the average of the squared residuals. In the graph I display the fit for only the first 16 grid cells for the lognormal model with log(beetles) and relief as predictors.

birdbeetles$logmu <- fitted(mod5a)
mod5a.sd <- sqrt(sum(residuals(mod5a)^2)/length(residuals(mod5a)))
birdbeetles$p <- dlnorm(birdbeetles$birds, meanlog=birdbeetles$logmu, sdlog=mod5a.sd)
max(dlnorm(1000, meanlog=birdbeetles$logmu, sdlog=mod5a.sd))
[1] 7.528603e-06
out.p <- lapply(1:61, function(x) dlnorm(0:1000, meanlog=birdbeetles$logmu[x], sdlog=mod5a.sd))
birdbeetles$ux <- sapply(1:61, function(x) max((0:1000)[out.p[[x]]>1e-5]))
birdbeetles$lx <- sapply(1:61, function(x) min((0:1000)[out.p[[x]]>1e-5]))
birdbeetles$uy <- sapply(out.p,max)
birdbeetles$ly<-rep(0,61)
# prepanel function to set limits for each graph
prepanel.ci2 <- function(x, y, ly, lx, uy, ux, subscripts, ...) {
list(ylim=range(y, uy[subscripts], ly[subscripts], finite=T), xlim=range(x, ux[subscripts], lx[subscripts], finite=T))}
xyplot(p~birds|factor(grid), data=birdbeetles, ux=birdbeetles$ux, lx=birdbeetles$lx, uy=birdbeetles$uy, ly=birdbeetles$ly, layout=c(4,4), prepanel=prepanel.ci2, panel=function(x,y,subscripts){
panel.xyplot(0:1000, dlnorm(0:1000, meanlog=birdbeetles$logmu[subscripts],sdlog=mod5a.sd), type='h', col='grey')
panel.abline(v=x, col=2, lty=2)}, scale=list(x='free', y='free'))

Fig. 8   Predicted lognormal distributions with observed bird richness values superimposed. Grid cells 1–16.

upper.pval2 <- 1-plnorm(birdbeetles$birds-.5,meanlog=birdbeetles$logmu, sdlog=mod5a.sd)
lower.pval2 <- plnorm(birdbeetles$birds+.5,meanlog=birdbeetles$logmu, sdlog=mod5a.sd)
sum(c(upper.pval2,lower.pval2)<.025)
[1] 5
p.to.plot2 <- ifelse(plnorm(birdbeetles$birds+0.5, meanlog=birdbeetles$logmu, sdlog=mod5a.sd) < .5, plnorm(birdbeetles$birds+0.5, meanlog=birdbeetles$logmu, sdlog=mod5a.sd), plnorm(birdbeetles$birds-.5, meanlog=birdbeetles$logmu, sdlog=mod5a.sd))
library(grid)
dotplot(birdbeetles$grid~p.to.plot2, xlab=' ', xlim=c(-0.03,1.03), ylab='grid', panel=function(x,y){
panel.dotplot(x,y)
panel.abline(v=c(.025,.975), col=2, lty=2)
panel.abline(v=0.5, col='grey70', lwd=2)}, scales=list(y=list(at=seq(2,60,2),cex=.7),x=list(at=seq(0,1,.2), labels=c(0, 0.2, 0.4, 0.4, 0.2, 0))), page = function(n) {
grid.text("upper-tailed p-value", x = .85, y = 0.025, default.units = "npc", just = c("right", "bottom"))
grid.text("lower-tailed p-value", x = .45, y = 0.025, default.units = "npc", just = c("right", "bottom"))})

Fig. 9   Lower and upper-tailed p-values for the observed bird richness values with respect to a lognormal regression model with log(beetles) and relief as predictors. Vertical lines represent lower and upper 2.5% quantiles.

Five observations out of 61 are clear outliers, which is about 8% of the values, more than we'd expect to obtain by chance. So, although the lognormal model with log(beetles) and relief as predictors had a lower AIC than the negative binomial model with beetles and relief as predictors, the lognormal model appears to fit fewer of the individual grid cells adequately.

Question 6

# negative binomial
plot(latitude~longitude, data=birdbeetles, pch=16, col=c(2,4)[as.numeric(residuals(mod3)<0)+1])

Fig. 10   Negative binomial residuals, colored by sign (red = positive, blue = negative), plotted at their geographic locations.
# lognormal
plot(latitude~longitude, data=birdbeetles, pch=16, col=c(2,4)[as.numeric(residuals(mod5a)<0)+1])

Fig. 11   Lognormal residuals, colored by sign (red = positive, blue = negative), plotted at their geographic locations.

Both models show that residuals of the same sign are clustered geographically, suggesting that the residuals are not independent. This violates one of the assumptions of the regression model. We can assess this correlation formally with a Mantel test. A Mantel test calculates the correlation between the pairwise geographic distances of the points and the corresponding pairwise distances between their residuals. It assesses the significance of the correlation with a randomization test: the residuals are repeatedly reassigned at random to the geographic locations, and the correlations from these randomized data sets are used to construct a null distribution for the statistic.

# Mantel test for negative binomial residuals
dist1 <- dist(cbind(birdbeetles$latitude, birdbeetles$longitude))
dist2 <- dist(residuals(mod3))
library(vegan)
mantel(dist1, dist2)
Mantel statistic based on Pearson's product-moment correlation

Call:
mantel(xdis = dist1, ydis = dist2)

Mantel statistic r: 0.335
      Significance: 0.001

Empirical upper confidence limits of r:
   90%    95%  97.5%    99%
0.0765 0.1040 0.1199 0.1418

Based on 999 permutations
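To make the randomization explicit, here is a hand-rolled sketch of the same permutation idea (this is not the vegan implementation, and the number of permutations is arbitrary).

# hand-rolled Mantel-style permutation test for the NB residuals (a sketch only)
obs.r <- cor(as.vector(dist1), as.vector(dist2))
perm.r <- replicate(999, {
  shuffled <- sample(residuals(mod3))   # randomly reassign residuals to locations
  cor(as.vector(dist1), as.vector(dist(shuffled)))
})
(sum(perm.r >= obs.r) + 1)/(999 + 1)    # one-tailed permutation p-value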

# Mantel test for lognormal residuals
dist2a <- dist(residuals(mod5a))
mantel(dist1, dist2a)
Mantel statistic based on Pearson's product-moment correlation

Call:
mantel(xdis = dist1, ydis = dist2a)

Mantel statistic r: 0.2382
      Significance: 0.001

Empirical upper confidence limits of r:
  90%   95% 97.5%   99%
0.080 0.101 0.122 0.143

Based on 999 permutations

Both tests indicate a significant degree of spatial correlation in the residuals. A solution would be to add a spatial correlation structure for the residuals. This is easy to do for the lognormal model using generalized least squares (the gls function of the nlme package), but more difficult for the negative binomial model. At this point I would switch exclusively to the lognormal model and add a model for the residual correlation component using gls.
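A sketch of what that might look like is shown below. This is not part of the original solution: the exponential spatial correlation structure (corExp) is just one plausible choice, and other corStruct classes (e.g., corSpher, corGaus) could be tried and compared.

# refit the log-scale model with gls, with and without a spatial correlation
# structure for the errors (a sketch, not part of the original solution)
library(nlme)
gls0 <- gls(log(birds) ~ log(beetles) + relief, data=birdbeetles)
gls1 <- update(gls0, correlation=corExp(form=~longitude+latitude))
anova(gls0, gls1)   # likelihood ratio test for the spatial correlation component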

