Assignment 9

Due Date

Friday, March 23, 2012

Data Source

The data set for this assignment is ozone.txt, a tab-delimited text file.
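
A tab-delimited file of this kind can be read with read.delim. The sketch below is only an illustration: the helper name read_ozone is hypothetical, the presence of a header row is an assumption, and the demonstration file holds placeholder values rather than real measurements.

```r
# Hypothetical helper: read a tab-delimited text file.
# Assumption: the first row of the file holds the variable names.
read_ozone <- function(path = "ozone.txt") {
  read.delim(path, header = TRUE, stringsAsFactors = FALSE)
}

# Demonstration on a throwaway file with placeholder values (not real data)
tmp <- tempfile(fileext = ".txt")
writeLines(c("O3\ttemp\tibh", "1\t2\t3", "4\t5\t6"), tmp)
demo <- read_ozone(tmp)
str(demo)
```

With the actual file in the working directory, the call would simply be read_ozone("ozone.txt").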

Overview

The data set comes from a study of the relationship between atmospheric ozone concentration and meteorology in the Los Angeles basin. It was first presented by Breiman and Friedman (1985) and was further analyzed by Faraway (2006). The data consist of daily measurements of ozone concentration (maximum one hour average) and eight meteorological quantities for 330 days of 1976. The variables are listed below.

Dependent Variable

  O3: Upland ozone concentration (ppm)

Predictors

  1. temp: Sandburg Air Force Base temperature (°C)
  2. ibh: inversion base height (ft.)
  3. dpg: Daggett pressure gradient (mm Hg)
  4. vis: visibility (miles)
  5. vh: Vandenburg 500 millibar height (in)
  6. humidity: humidity (percent)
  7. ibt: inversion base temperature (°F)
  8. wind: wind speed (mph)
  9. day: day of the year to account for possible seasonal effects not captured by the meteorological variables

Questions

  1. Fit an additive model that includes O3 as the response along with separate smooths of each of the nine predictor variables.
  2. According to the Wald tests in the summary table, two of the smooths may not be statistically significant. The reported Wald tests are somewhat unreliable, though, and better statistical tests can be obtained by fitting with method="ML". Refit the model using method="ML" and try dropping the two variables in question one at a time, using the anova function with test='F' to compare two nested models at a time. Remove those variables whose smooths are not significant according to the anova function.
  3. Plot the smooths and determine which of the smooths are roughly linear. Replace the smooths for those variables with parametric linear terms.
  4. Plots of the four remaining smooths suggest that a piecewise linear curve with two pieces might approximate the pattern displayed by each smooth. Separately try replacing each of the four smooths with an appropriate piecewise linear curve. Estimate the location of the breakpoint (knot) by fitting a large number of models, each with a different candidate breakpoint, and selecting the model that provides the best fit.
  5. Argue that it is statistically defensible to replace only one of the smooths with a piecewise linear curve. (Remember that the estimated breakpoint location should count as one of the estimated parameters.)
  6. For this variable, plot the smooth along with the replacement breakpoint model on the same graph so that the two functions can be readily compared.
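
The mechanics behind questions 2, 4, and 5 can be sketched on simulated data. Everything in the block below is an assumption made for illustration: the data frame sim, the variables x1, x2, and x3, the knot grid, and the true breakpoint at 4 are all invented and have nothing to do with the ozone data. The sketch assumes the mgcv package, compares nested models fit by ML, and then uses AIC (with 2 added for the estimated knot) as one possible way of judging a breakpoint model against a smooth.

```r
library(mgcv)

set.seed(42)
n  <- 300
x1 <- runif(n, 0, 10)   # simulated: broken-stick effect with a knot at 4
x2 <- runif(n, 0, 10)   # simulated: smooth nonlinear effect
x3 <- runif(n, 0, 10)   # simulated: no effect on the response
sim <- data.frame(x1, x2, x3)
sim$y <- 1.5 * x1 - 2 * pmax(x1 - 4, 0) + sin(x2) + rnorm(n, sd = 0.5)

# Compare two nested additive models, both fit with method = "ML"
full    <- gam(y ~ s(x1) + s(x2) + s(x3), data = sim, method = "ML")
reduced <- gam(y ~ s(x1) + s(x2),         data = sim, method = "ML")
cmp <- anova(reduced, full, test = "F")
print(cmp)

# Grid search for the breakpoint: replace s(x1) with a linear term plus a
# hinge term pmax(x1 - k, 0), refit for each candidate knot k, record AIC
knots <- seq(1, 9, by = 0.1)
aic_by_knot <- sapply(knots, function(k) {
  sim$x1_hinge <- pmax(sim$x1 - k, 0)   # modifies a local copy of sim
  AIC(gam(y ~ x1 + x1_hinge + s(x2), data = sim, method = "ML"))
})
best_knot <- knots[which.min(aic_by_knot)]

# Refit at the selected knot
sim$x1_hinge <- pmax(sim$x1 - best_knot, 0)
broken <- gam(y ~ x1 + x1_hinge + s(x2), data = sim, method = "ML")

# The knot was estimated from the data, so charge one extra parameter
aic_broken <- AIC(broken) + 2
aic_smooth <- AIC(reduced)
cat("estimated knot:", best_knot, "\n")
cat("AIC, breakpoint model (knot counted):", aic_broken, "\n")
cat("AIC, smooth model:", aic_smooth, "\n")
```

For a comparison plot like the one asked for in question 6, the fitted values from the smooth model and the breakpoint model can be predicted over a grid of the variable and drawn on the same axes.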

Jack Weiss
Phone: (919) 962-5930
E-Mail: jack_weiss@unc.edu
Address: Curriculum in Ecology and the Environment, Box 3275, University of North Carolina, Chapel Hill, 27599
Copyright © 2012
Last Revised--March 21, 2012
URL: https://sakai.unc.edu/access/content/group/2842013b-58f5-4453-aa8d-3e01bacbfc3d/public/Ecol562_Spring2012/docs/assignments/assign9.htm