Final Exam—Part 3

Due Date

Monday, April 30, 2012

Instructions

The same rules and policies apply to this "exam" as to ordinary assignments.

Data Source

The file fish.txt contains the data for this assignment. This is a tab-delimited text file in which the variable names appear in the first row.
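A minimal sketch of reading a file laid out this way in R. So that the snippet runs on its own, it first writes a tiny stand-in file to a temporary location. The three data rows are invented, `demo_path` and `fish_demo` are hypothetical names, and the column order is an assumption based on the variable descriptions in this assignment. With the real data, the path to fish.txt would be passed to `read.table` instead.

```r
# Write a small stand-in file (invented values; column order is assumed).
demo_path <- tempfile(fileext = ".txt")
writeLines(c("trial\tday\tposition\ttreatment\tfreq",
             "1\t1\t1\tLF\t10",
             "1\t1\t2\tLF\t14",
             "1\t1\t3\tLF\t12"),
           demo_path)

# header = TRUE takes the variable names from the first row;
# sep = "\t" splits the fields on tabs.
fish_demo <- read.table(demo_path, header = TRUE, sep = "\t")
str(fish_demo)
```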

Background

An experiment was carried out to compare the behavior of a native Caribbean fish species, Haemulon plumieri, toward different native fish species (mostly natural predators) with its behavior toward an invasive predator (lionfish). Three individuals of Haemulon plumieri were placed in a cage and a "treatment" predator was placed in an adjacent cage. The three "prey" fish were videotaped and twelve pictures, spaced at regular time intervals, were created from the video. In each picture the positions of the three prey fish were noted relative to the predator fish. It was decided to broadly classify position into the three categories shown in the figure below: facing the predator (category 1), turned to one side (category 2), and turned away from the predator (category 3). Only in positions 1 and 2 can the prey fish see the predator in the adjacent cage.

[Figure: the three position categories of the prey fish relative to the predator (lionfish)]

It was not possible to distinguish the individual fish in the photos, so no attempt was made to track individual fish over time. Instead, in each picture the three fish were individually classified as being in one of the three position categories (ambiguous cases were assigned to the most similar category). These values were then summed over all 12 pictures to yield the total number of times each of the three positions occurred. This constitutes a single trial in the experiment. For most trials n = 36 (three fish × twelve pictures), but a few of the trials used only two fish, yielding n = 24. Other random anomalies occurred, so not all of the remaining totals were 36.
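The arithmetic behind n can be checked by summing the position counts within each trial. The sketch below uses invented counts for two hypothetical trials, one with three fish and one with two. The data frame `toy` and the vector `n_per_trial` are illustrative names only.

```r
# Invented counts for two trials: one with three fish, one with two.
toy <- data.frame(
  trial    = rep(1:2, each = 3),
  position = rep(1:3, times = 2),
  freq     = c(10, 14, 12,   # 3 fish x 12 pictures = 36 classifications
                8,  9,  7)   # 2 fish x 12 pictures = 24 classifications
)

# Sum the position counts within each trial to recover n for that trial.
n_per_trial <- tapply(toy$freq, toy$trial, sum)
n_per_trial
```

Applied to the real data, the same kind of summary would show which trials depart from 36.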

A total of 76 trials were conducted over 14 days, divided unequally among eight different predator treatments. The hope was that the typical position assumed by the prey fish would indicate how scary they found a potential predator to be. It was thought that the position order 1, 2, 3 might correspond to decreasing scariness, but this ranking was by no means certain. The treatments were coded by the type of fish used: CN (control), GR (grouper), LF (lionfish), RH (red hind), SM (schoolmaster), SQ (squirrelfish), YH (yellowhead), YT (yellowtail). The control (CN) was an empty predator cage. On any given day between four and seven of the treatments were administered. For all of the trials on that day the same three prey fish were reused as much as possible. Different predator and prey fish individuals were used on different days.

The lionfish is a recent introduction into the Caribbean, so native fish may not be fully accustomed to its presence and may not view it as very frightening. Because the focus of this study is lionfish, lionfish should be used as the reference group for all treatment comparisons in your analysis. You should also use position 3 as the reference group for all comparisons of the three positions.

The variables contained in fish.txt are described in the table below.

Variable	Description
trial	A single run of the experiment in which typically three prey fish were exposed to a predator (treatment) and their positions in twelve photographs recorded. Each trial corresponds to three lines of data in the file corresponding to the three possible positions.
day	The day on which the experiment was conducted.
position	One of the three recorded positions of the prey fish: 1, 2, or 3.
treatment	The predator that the prey fish were exposed to: CN, GR, LF, RH, SM, SQ, YH, or YT.
freq	The number of times in a trial that a given position was observed.
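Because each trial occupies three rows (one per position), it can help to view the data with one row per trial and one column per position. A minimal sketch with invented values follows. `toy` and `wide` are hypothetical names, and the day codes shown are an assumption about how day is recorded.

```r
# Invented data in the long layout described in the table: three rows per trial.
toy <- data.frame(
  trial     = rep(1:2, each = 3),
  day       = rep(1, 6),
  position  = rep(1:3, times = 2),
  treatment = rep(c("LF", "CN"), each = 3),
  freq      = c(10, 14, 12, 5, 11, 20)
)

# One row per trial, one column per position.
wide <- xtabs(freq ~ trial + position, data = toy)
wide

# Row proportions: share of observations in each position within a trial.
round(prop.table(wide, margin = 1), 2)
```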

Questions

1. R coding question: Using one line of R code, determine how many trials there were for each of the eight treatments.
2. Fit a model that is appropriate for assessing whether predator treatment had an effect on prey position.
3. Carry out an appropriate goodness of fit test for your model of Question 2.
4. To improve the fit of your model, add day as an additional predictor to the model of Question 2 and check the fit again.
5. Provide a possible explanation of the persistent lack of fit by describing two characteristics of this experiment that may violate assumptions of the multinomial distribution. In each case clearly state the nature of the violation.
6. Refit the model of Question 4 as a quasi-Poisson model as a last-ditch effort to deal with the model's lack of fit.
7. Produce a graphical summary of the results of this experiment using your quasi-Poisson model. Plot the different log odds ratios for treatment and position as a panel graph. Each panel should display all seven of the treatment comparisons. Different panels should depict different position comparisons. In the graph display point estimates and 95% confidence intervals for the individual log odds ratios. Also include a second set of confidence intervals that can be used to make pairwise comparisons between all the treatment effects that are displayed in that panel. Add a vertical line to each panel that permits testing whether each of the individual log odds ratios is statistically significant.
8. Repeat the graph from Question 7, but this time display odds ratios rather than log odds ratios.
9. What are your conclusions? How scary are lionfish?
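The step from the graph of Question 7 to the graph that repeats it on the odds ratio scale is a back-transformation: exponentiate the point estimate and both interval endpoints. The sketch below uses an invented estimate and standard error, and it uses a normal quantile for simplicity; with a quasi-Poisson fit a t quantile may be the more appropriate choice.

```r
# Invented log odds ratio and standard error, for illustration only.
log_or <- 0.50
se     <- 0.20

z      <- qnorm(0.975)                 # about 1.96 for a 95% interval
ci_log <- log_or + c(-1, 1) * z * se   # interval on the log odds ratio scale

or    <- exp(log_or)                   # point estimate on the odds ratio scale
ci_or <- exp(ci_log)                   # endpoints transform the same way

# On the log scale "no effect" is 0; on the odds ratio scale it is exp(0) = 1.
c(estimate = or, lower = ci_or[1], upper = ci_or[2])
```

The reference line moves with the scale: 0 for log odds ratios, 1 for odds ratios. The back-transformed interval is no longer symmetric about the estimate.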

Hints

Question 3

The saturated model for this problem fits a separate parameter for each position at each trial. To verify that your saturated model is correct you can fit the Poisson version of it and verify that the residual deviance is zero.
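The check described in this hint can be sketched on a small invented data set (four trials, three positions; none of these numbers come from fish.txt). The saturated Poisson model has one parameter per trial-by-position cell, so its residual deviance should be numerically zero. The residual deviance of a smaller model can then be compared to a chi-squared distribution. The names `toy`, `sat`, `small`, and `gof_p` are hypothetical.

```r
# Invented counts: 4 trials x 3 positions (not the real data).
toy <- data.frame(
  trial    = factor(rep(1:4, each = 3)),
  position = factor(rep(1:3, times = 4)),
  freq     = c(10, 14, 12,  9, 15, 12,  5, 11, 20,  7, 10, 19)
)

# Saturated Poisson model: one parameter per trial-by-position cell.
sat <- glm(freq ~ trial * position, family = poisson, data = toy)
deviance(sat)      # numerically zero
df.residual(sat)   # 0

# A smaller model with the same position proportions in every trial.
small <- glm(freq ~ trial + position, family = poisson, data = toy)

# Its residual deviance measures lack of fit relative to the saturated model.
gof_p <- pchisq(deviance(small), df = df.residual(small), lower.tail = FALSE)
c(deviance = deviance(small), df = df.residual(small), p = gof_p)
```

The chi-squared reference distribution is a large-sample approximation and is less trustworthy when expected counts are small.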

Question 4

Don't just blindly put day in the model. Think about what you're trying to accomplish here. Day is a nuisance variable that you're including only because you think the results may have changed depending on the day of the experiment. In the language of experimental design day is a blocking variable in this experiment.
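One reading of this hint (an interpretation, not something the assignment states) is that a blocking variable should enter the model as a factor rather than as a number. The sketch below compares the two codings on an invented design with three days. It assumes day is stored as a number, which may not match the real file.

```r
# Invented design: three days, coded 1, 2, 3.
toy <- data.frame(day = c(1, 1, 2, 2, 3, 3))

# Numeric day: a single slope, i.e. a linear trend across days.
X_num <- model.matrix(~ day, data = toy)

# Factor day: one indicator per day beyond the first, i.e. a block effect.
X_fac <- model.matrix(~ factor(day), data = toy)

ncol(X_num)  # 2: intercept + slope
ncol(X_fac)  # 3: intercept + two day indicators
```

With the numeric coding the model would force responses to change steadily from one day to the next, which is rarely what a block effect is meant to capture.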

Question 5

The assumptions of the multinomial distribution are the same as the assumptions of the binomial distribution.

Question 6

The null Poisson model requires that the variables trial and position be included as covariates. To add a predictor to the null model you need to add it as an interaction with the position variable.
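The structure described here can be sketched on invented data (six trials, two treatments; the counts are made up and the object names are hypothetical). The null model contains trial and position; treatment is then added only through its interaction with position.

```r
# Invented counts: trials 1-3 are "LF", trials 4-6 are "CN".
toy <- data.frame(
  trial     = factor(rep(1:6, each = 3)),
  position  = factor(rep(1:3, times = 6)),
  treatment = factor(rep(c("LF", "CN"), each = 9), levels = c("LF", "CN")),
  freq      = c(10, 14, 12,  9, 15, 12,  13, 11, 12,
                 5, 11, 20,  7, 10, 19,   4, 14, 18)
)

# Null model: trial reproduces each trial's total, position gives the
# overall position proportions.
null_fit <- glm(freq ~ trial + position, family = quasipoisson, data = toy)

# Treatment is added only through its interaction with position.
trt_fit <- glm(freq ~ trial + position + position:treatment,
               family = quasipoisson, data = toy)

summary(trt_fit)$dispersion   # estimated dispersion (1 would match Poisson)
coef(trt_fit)                 # one interaction coefficient is NA (aliased with trial)
```

A treatment main effect would be redundant here because treatment does not vary within a trial, so it is already absorbed by the trial terms.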

To make "LF" the reference group for treatment you will need to change the reference level of treatment using the levels argument of factor. Set 3 as the reference group of position for the multinomial model but not for the Poisson model. Because of the weirdness of the Poisson model we're fitting, weird because it looks like there are interactions in the model without all of the component main effects, R mixes up the order of the levels of position in portions of the model. (The truth is trial is playing the role of the main effect of the two factors treatment and day in this model, but R doesn't know that.) If you leave the levels of position in their default order 1, 2, 3 when you fit the glm model, R will estimate the 2 and 3 effects of position relative to 1 in the main effect terms, which we don't care about, but R will estimate the 1 and 2 effects using 3 as the reference group in the interaction of position with treatment, which is what we want.
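A minimal sketch of setting reference levels with the levels argument of factor. The short vectors are invented, and `trt_codes`, `treatment_f`, and `position_f` are hypothetical names; only the treatment codes themselves come from the assignment.

```r
treatment <- c("CN", "GR", "LF", "RH", "SM", "SQ", "YH", "YT", "LF", "CN")

# Default: levels are sorted alphabetically, so "CN" would be the reference.
levels(factor(treatment))

# Listing "LF" first in the levels argument makes it the reference group.
trt_codes   <- c("LF", "CN", "GR", "RH", "SM", "SQ", "YH", "YT")
treatment_f <- factor(treatment, levels = trt_codes)
levels(treatment_f)

# Position with 3 as the reference (for the multinomial model only).
position_f <- factor(c(1, 2, 3, 1, 2, 3), levels = c(3, 1, 2))
levels(position_f)
```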

Question 7

When there are this many comparisons it can be difficult to find a single confidence level that works for all of them. You may need to use different confidence levels for different estimates and finding the right combination (if one exists) may be more trouble than it's worth. For this problem most of the pairwise comparisons are clearly not significant so it is doable.

Start with the confidence level that yields the shortest confidence intervals in each panel. Any pairwise comparison not significantly different at this setting will not be different for any other choice either. Create the graph and note whether the pair for which this is the correct setting turn out to be significantly different. If they're not then you can use the next highest level instead. If they are significantly different it may be possible to increase the level and still have them display as being different. Note which other pairs are shown as significantly different. For each of these significant pairs change the confidence level to a value that is actually appropriate for comparing them and recreate the graph to see if they're still significant. If the pair is no longer significant you're going to need to play with the levels individually so that the pairs that should be significantly different are shown as such in the final display.
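One common way to choose a pairwise confidence level, sketched here under stated assumptions, is to pick the level at which two intervals just stop overlapping exactly when the difference between the two estimates is significant at the 5% level. For estimates with standard errors se1 and se2 and covariance cov12, this gives a multiplier of 1.96 × sqrt(se1² + se2² − 2·cov12) / (se1 + se2). The function `pair_level` is a hypothetical helper, not a function from the course. The standard errors are invented, normal quantiles are used, and covariances are set to zero for illustration (in a fitted model they would come from the estimated variance-covariance matrix).

```r
# Level at which two intervals just fail to overlap exactly when the
# difference between the two estimates is significant at level alpha.
pair_level <- function(se1, se2, cov12 = 0, alpha = 0.05) {
  se_diff <- sqrt(se1^2 + se2^2 - 2 * cov12)          # SE of the difference
  z_adj   <- qnorm(1 - alpha / 2) * se_diff / (se1 + se2)
  2 * pnorm(z_adj) - 1                                 # two-sided confidence level
}

# Invented standard errors for three treatment effects in one panel.
se <- c(GR = 0.20, RH = 0.25, SM = 0.40)

# Level for every pair (covariances taken as zero here, which is an assumption).
pairs <- combn(names(se), 2)
levels_by_pair <- apply(pairs, 2,
                        function(p) pair_level(se[[p[1]]], se[[p[2]]]))
names(levels_by_pair) <- apply(pairs, 2, paste, collapse = "-")
round(levels_by_pair, 3)

# The smallest level gives the shortest intervals, the suggested starting point.
min(levels_by_pair)
```

With equal standard errors and zero covariance the level works out to roughly 83%, which is why pairwise intervals are noticeably shorter than the 95% intervals for the individual estimates.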


Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum in Ecology and the Environment, Box 3275, University of North Carolina, Chapel Hill, 27599
Copyright © 2012
Last Revised--April 28, 2012
URL: https://sakai.unc.edu/access/content/group/2842013b-58f5-4453-aa8d-3e01bacbfc3d/public/Ecol562_Spring2012/docs/assignments/finalpart3.htm