In lecture 11 we gave some basic information about the negative binomial distribution and used it as the probability model for the response in a regression problem. Here we restate the basic properties of this distribution, give some examples of its flexibility, and provide some motivation for why the negative binomial distribution is particularly suitable as a probability model for count data in ecology.
A negative binomial (NB) random variable is discrete. Like the Poisson distribution it is bounded below by 0 but is theoretically unbounded above. A typical use of the negative binomial distribution is as a model for count data. An example would be in describing the number of cases of Lyme disease in a North Carolina county in a year.
There are two distinct parameterizations of the negative binomial. One parameterization appears in most introductory probability texts and the second appears in most ecology texts. In the ecological parameterization there are two parameters: μ and θ. μ is the mean and θ is the dispersion parameter. θ is also called the size (in R) or the scale parameter. θ is defined differently in different statistical packages. For example in SAS the dispersion parameter for the negative binomial is the reciprocal of the one reported by R.
The mean of the negative binomial distribution is μ. The variance of the negative binomial distribution is . Thus the variance is a quadratic function of the mean. Contrast this with the variance-mean relationship for the Poisson where the variance is equal to the mean. Using the R parameterization, if we let the dispersion parameter go to infinity the random variable that results is a Poisson random variable with parameter μ, i.e., if
then
. The R negative binomial probability functions are dnbinom, pnbinom, qnbinom, rnbinom.
To appreciate the flexibility of the negative binomial distribution and to understand the role θ plays we graph a number of negative binomial distributions that have the same mean but have different values of θ. In the example below I fix the mean at 10 and set θ equal to 10, 1, and 0.1. For comparison I also show a Poisson distribution with a mean of 10. The plots are arranged in a 2 × 2 grid using the mfrow argument of par.
![]() |
Fig. 1 Three negative binomial distributions with μ = 10 but with θ = 10, 1, and 0.1. A Poisson distribution with λ = 10 is included for comparison. |
As the plots reveal varying θ can have a dramatic effect on the shape of the distribution. As θ gets larger the negative binomial distribution more closely approaches a Poisson distribution with the same mean. One thing that is perhaps not apparent from the graphs is how much more probable extreme values are under a negative binomial distribution. The code below calculates the probability of observing a count of 30 under the three negative binomial models in Fig. 1 and also under the Poisson distribution. Notice that for all three negative binomial distributions observing a count of 30 is at least 1000 times more likely than it is for a Poisson distribution with the same mean.
Obtaining both very large and very small counts is quite likely with a negative binomial distribution. With a Poisson distribution such a combination is essentially impossible.
Boswell and Patil (1970) list 12 distinct ways in which a negative binomial distribution can arise in practice. Of these only three appear to have much relevance to environmental science. Each is a consequence of violating a specific assumption of the Poisson distribution. The assumptions of the Poisson distribution were given in lecture 10 and are repeated below.
Here are three ways the negative binomial distribution can arise from violations of a Poisson distribution.
It turns out there are two ways the conditioning can be done and they yield different mean-variance relationships. One method yields a linear mean-variance relationship, , where k is constant. Notice that the Poisson is a special case of this with k = 1. This is called the NB1 model in the econometrics literature. The second method yields the mean-variance relationship given above,
. This is sometimes called the NB2 model.
It's hard to know how useful this result is in practice. What is important is that the Polya-Eggenberger scheme clearly violates the independent increments hypothesis that was one of the three basic assumptions of the Poisson model. Thus we see that if either one of the major assumptions of the Poisson model, homogeneity or independence, is violated we can be led, under certain circumstances, to a negative binomial model.
It is easy to imagine each of these three mechanisms occurring in practice.
A further application of the negative binomial distribution is with zero-inflated count data (meaning that we see more zeros than a fitted Poisson model can predict). The negative binomial distribution can often be successfully fit to zero-inflated data.
The negative binomial distribution is useful as a model for count data whenever we suspect heterogeneous rates of occurrence, lack of independence, clustering, or zero inflation.
Generalized linear models are regression models that include the normal regression model as a special case. They extend linear regression to probability distributions that are members of the exponential family of distributions. These include the normal, Poisson, binomial, and gamma distributions, among others. The negative binomial distribution is technically not a member of this group but it is possible to extend the methodology to include it. Generalized linear models were developed to fix the various problems we've already encountered with normal regression models.
Generalized linear models (GLIMs) were designed to overcome the limitations described above. Proposed in the 1970s by McCullagh and Nelder (McCullagh and Nelder, 1989), GLIMs serve to unify a number of disparate modeling protocols under a single umbrella. While some revisionist statisticians think the significance of the GLIM paradigm is overstated (Lindsey 2004), I think it is fair to say that generalized linear models did for statistical modeling what plate tectonics did for geology. GLIMs provided a unifying vision to statistical modeling and brought a degree of order to what was otherwise a set of rather chaotic and seemingly arbitrary collection of procedures.
Formally a generalized linear model consists of the following three components.
Because we require g to be monotonic, it follows that g is invertible. Thus we can also write this last expression in terms of the inverse link function, .
Written this way we can see that GLIMs have the potential of solving the range restriction problem described above. Because the linear predictor is connected to the mean through , a judicious choice of link function can constrain the predictions to map onto a desired range. The so-called canonical link functions for the normal, Poisson, binomial, and gamma distributions are respectively the identity, log, logit, and reciprocal links. In most software packages a log link is used for the negative binomial distribution.
To see that normal regression models are also generalized linear models, we identify the random component, the systematic component, and the link function.
Table 1 lists the probability distributions typically used as the random component in generalized linear models along with their canonical and usual link functions. When a probability distribution is written in its exponential form the mean parameter can appear as part of a functional expression. This function of μ is referred to as the canonical link. Historically, it was useful to use the canonical link as the link function because it made things simpler. This is no longer necessary and a number of other link functions are available in R. For instance, although the reciprocal function is the canonical link for the gamma distribution, the log link is more commonly used because it guarantees that the mean as a function of the linear predictor will be positive. The canonical link function for the negative binomial distribution is rarely used because it is difficult to interpret.
Table 1 Canonical links for members of the exponential family |
Probability Distribution | Canonical Link g(μ) | Usual link | Links available in R |
---|---|---|---|
Poisson |
![]() |
![]() |
log, identity, sqrt |
Normal |
![]() |
![]() |
identity, log, inverse |
Binomial |
![]() |
![]() |
logit, probit, cauchit, log, cloglog (complementary log-log) |
Gamma |
![]() |
![]() |
inverse, identity, log |
Inverse Gaussian |
![]() |
![]() |
1/mu^2, inverse, identity, log |
Negative binomial |
![]() |
![]() |
log, sqrt, identity |
Jack Weiss Phone: (919) 962-5930 E-Mail: jack_weiss@unc.edu Address: Curriculum for the Environment and Ecology, Box 3275, University of North Carolina, Chapel Hill, 27599 Copyright © 2012 Last Revised--February 13, 2012 URL: https://sakai.unc.edu/access/content/group/2842013b-58f5-4453-aa8d-3e01bacbfc3d/public/Ecol562_Spring2012/docs/lectures/lecture12.htm |