Gamma distribution and Poisson distribution

The gamma distribution is important in many statistical applications. This post discusses two connections between the gamma distribution and the Poisson distribution. The following is the probability density function of the gamma distribution.

    \displaystyle (1) \ \ \ \ \ f(x)=\frac{1}{\Gamma(\beta)} \ \alpha^\beta \ x^{\beta-1} \ e^{-\alpha x} \ \ \ \ \ \ x>0

The numbers \alpha and \beta, both positive, are fixed constants and are the parameters of the distribution. The symbol \Gamma(\cdot) is the gamma function. This density function describes how the potential gamma observations distribute across the positive x-axis. Baked into this gamma probability density function are two pieces of information about the Poisson distribution. We can simply read off the information from the parameters \alpha and \beta in the density function.

Poisson-Gamma Mixture

For a reason that will be given shortly, the parameters \alpha and \beta in (1) give a negative binomial distribution. The following is the probability function of this negative binomial distribution.

    \displaystyle (2) \ \ \ \ \ P(N=n)=\binom{\beta+n-1}{n} \ \biggl(\frac{\alpha}{1+\alpha} \biggr)^\beta \ \biggl(\frac{1}{1+\alpha} \biggr)^n \ \ \ \ \ \ n=0,1,2,3,\cdots

If the parameter \beta is a positive integer, then (2) has a nice interpretation in the context of a series of independent Bernoulli trials. Consider performing a series of independent trials where each trial has one of two distinct outcomes (called success or failure). Assume that the probability of a success in each trial is p=\alpha/(1+\alpha). In this case, the quantity in (2) is the probability of having n failures before the occurrence of the \betath success.

Note that when \beta is a positive integer, the binomial coefficient \binom{\beta+n-1}{n} has the usual calculation \frac{(\beta+n-1)!}{n! (\beta-1)!}. When \beta is not an integer but is only a positive number, the binomial coefficient \binom{\beta+n-1}{n} is calculated as follows:

    \displaystyle \binom{\beta+n-1}{n}=\frac{(\beta+n-1) (\beta+n-2) \cdots (\beta+1) \beta}{n!}

For this new calculation to work, \beta does not have to be a positive integer. It only needs to be a positive real number. In that case the distribution in (2) does not have a natural interpretation in terms of performing a series of independent Bernoulli trials. It is simply a counting distribution, i.e. the distribution of a discrete random variable modeling the number of occurrences of a type of random event.
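The product formula is easy to check numerically. The following is a minimal sketch, not a library implementation; the function names generalized_binom and neg_binomial_pmf are made up for this illustration. It evaluates (2) for any positive real \beta and confirms that the product formula agrees with the factorial formula when \beta is an integer.

```python
import math


def generalized_binom(beta: float, n: int) -> float:
    """Binomial coefficient C(beta + n - 1, n) for any real beta > 0.

    Uses the product (beta + n - 1)(beta + n - 2)...(beta) / n!,
    which has n factors in the numerator.
    """
    result = 1.0
    for j in range(n):
        # Pair each numerator factor (beta + j) with a denominator factor (j + 1).
        result *= (beta + j) / (j + 1)
    return result


def neg_binomial_pmf(n: int, alpha: float, beta: float) -> float:
    """P(N = n) as in (2), with shape beta > 0 and rate alpha > 0."""
    p = alpha / (1.0 + alpha)  # probability of a success
    q = 1.0 / (1.0 + alpha)    # probability of a failure
    return generalized_binom(beta, n) * p ** beta * q ** n


if __name__ == "__main__":
    # Integer beta: the product formula agrees with the factorial formula.
    print(generalized_binom(3, 4), math.comb(3 + 4 - 1, 4))
    # Non-integer beta: the probabilities still sum to approximately 1.
    print(sum(neg_binomial_pmf(n, alpha=1.5, beta=2.5) for n in range(200)))
```

The sum is truncated at 200 terms only because the remaining tail is negligible for these illustrative parameter values.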

What does this have to do with the gamma distribution and the Poisson distribution? When a conditional random variable X \lvert \lambda has a Poisson distribution such that its mean \lambda is an unknown random quantity but follows a gamma distribution with shape parameter \beta and rate parameter \alpha as described in (1), the unconditional distribution of X is a negative binomial distribution as described in (2). In other words, the mixture of Poisson distributions with gamma mixing weights is a negative binomial distribution.
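The mixture claim can be verified by integrating the conditional Poisson probability against the gamma density in (1). The following is a sketch of that calculation. It uses the gamma integral \int_0^\infty \lambda^{s-1} e^{-c \lambda} \ d\lambda=\Gamma(s)/c^s and the fact that \Gamma(\beta+n) / (n! \ \Gamma(\beta)) is the binomial coefficient \binom{\beta+n-1}{n} as calculated above.

    \displaystyle P(X=n)=\int_0^\infty \frac{\lambda^n \ e^{-\lambda}}{n!} \ \frac{1}{\Gamma(\beta)} \ \alpha^\beta \ \lambda^{\beta-1} \ e^{-\alpha \lambda} \ d\lambda

    \displaystyle \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ =\frac{\alpha^\beta}{n! \ \Gamma(\beta)} \int_0^\infty \lambda^{\beta+n-1} \ e^{-(1+\alpha) \lambda} \ d\lambda

    \displaystyle \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ =\frac{\alpha^\beta}{n! \ \Gamma(\beta)} \ \frac{\Gamma(\beta+n)}{(1+\alpha)^{\beta+n}}

    \displaystyle \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ =\binom{\beta+n-1}{n} \ \biggl(\frac{\alpha}{1+\alpha} \biggr)^\beta \ \biggl(\frac{1}{1+\alpha} \biggr)^n

The last expression is the probability function in (2).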

There is an insurance interpretation of the Poisson-gamma mixture. Suppose that in a large pool of insureds, the annual claim frequency of an insured is a Poisson distribution with mean \lambda. The quantity \lambda varies from insured to insured but is supposed to follow a gamma distribution. Here we have a family of conditional Poisson distributions where each one is conditional on the characteristics of the particular insured in question (a low risk insured has a low \lambda value and a high risk insured has a high \lambda value). The “average” of the conditional Poisson distributions will be a negative binomial distribution (using the gamma distribution to weight the parameter \lambda). Thus the claim frequency for an “average” insured in the pool should be modeled by a negative binomial distribution. For a randomly selected insured from the pool, if we do not know anything about this insured, we can use the unconditional negative binomial distribution to model the claim frequency.
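The insurance interpretation can be illustrated with a small simulation. The following is a sketch under stated assumptions: the pool parameters (shape \beta=2, rate \alpha=1) are invented for illustration, the helper names are hypothetical, and the Poisson count is drawn by counting exponential arrivals in a unit interval, a standard technique that is not taken from this post. Each simulated insured first receives a \lambda from the gamma distribution and then a claim count from the Poisson distribution with that mean. The observed frequencies are then compared with (2).

```python
import math
import random


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw a Poisson(lam) count by counting exponential arrivals in [0, 1]."""
    count, elapsed = 0, rng.expovariate(lam)
    while elapsed <= 1.0:
        count += 1
        elapsed += rng.expovariate(lam)
    return count


def simulate_pool(alpha: float, beta: float, size: int, seed: int = 0) -> list[int]:
    """Simulate annual claim counts for `size` insureds (illustrative only)."""
    rng = random.Random(seed)
    claims = []
    for _ in range(size):
        # random.gammavariate takes (shape, scale); the scale is 1 / rate.
        lam = rng.gammavariate(beta, 1.0 / alpha)
        claims.append(sample_poisson(lam, rng))
    return claims


def neg_binomial_pmf(n: int, alpha: float, beta: float) -> float:
    """P(N = n) as in (2), computed on the log scale via log-gamma."""
    log_coeff = math.lgamma(beta + n) - math.lgamma(beta) - math.lgamma(n + 1)
    log_p = beta * math.log(alpha / (1.0 + alpha)) - n * math.log(1.0 + alpha)
    return math.exp(log_coeff + log_p)


if __name__ == "__main__":
    alpha, beta, size = 1.0, 2.0, 50_000  # assumed values, for illustration
    claims = simulate_pool(alpha, beta, size, seed=1)
    for n in range(6):
        empirical = claims.count(n) / size
        print(n, round(empirical, 4), round(neg_binomial_pmf(n, alpha, beta), 4))
```

With a large enough pool, the empirical column should track the negative binomial column closely, up to simulation noise.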

A detailed discussion of the negative binomial distribution is found here and here. The notion of mixture distributions and Poisson-gamma mixture in particular are discussed here. Many distributions applicable in actuarial applications are mixture distributions (see here for examples).

Waiting Times in a Poisson Process

The parameter \beta in (1) is called the shape parameter while the parameter \alpha is called the rate parameter. The reason for the name rate parameter will be made clear shortly. For this interpretation of the gamma distribution to work, the shape parameter \beta must be a positive integer.

The gamma distribution as described in (1) is intimately connected to a Poisson process through the rate parameter \alpha. In a Poisson process, an event of particular interest occurs at random at the rate of \alpha per unit time. Two questions are of interest. How many times will this event occur in a given time interval? How long do we have to wait to observe the first occurrence, the second occurrence and so on? The gamma distribution as described in (1) gives an answer to the second question when \beta is a positive integer.

A counting process is a random experiment that observes the random occurrences of a certain type of events of interest in a time period. A Poisson process is a counting process that satisfies three simplifying assumptions that deal with independence and uniformity in time.

A good example of a Poisson process is the well known experiment in radioactivity conducted by Rutherford and Geiger in 1910. In this experiment, \alpha-particles were emitted from a polonium source and the number of \alpha-particles was counted in each of 2,608 time intervals of 7.5 seconds (here \alpha names the type of particle and is unrelated to the rate parameter in (1)). In these 2,608 intervals, a total of 10,097 particles were observed. Thus the mean count per period of 7.5 seconds is 10097 / 2608 = 3.87. In this experiment, \alpha-particles are observed at a rate of 3.87 per unit time (7.5 seconds).

In general, consider a Poisson process with a rate of \lambda per unit time, i.e. the event that is of interest occurs at the rate of \lambda per unit time. There are two random variables of interest. One is N_t which is the number of occurrences of the event of interest from time 0 to time t. The other is X_n which is the waiting time until the nth occurrence of the event of interest where n is a positive integer.

What can we say about the random variables N_t? First, the random variable N_1, the number of occurrences of the event of interest in a unit time interval, has a Poisson distribution with mean \lambda. The following is the probability function.

    \displaystyle (3) \ \ \ \ \ P(N_1=k)=\frac{1}{k!} \ \lambda^k \ e^{-\lambda} \ \ \ \ \ \ \ \ \ k=0,1,2,3,\cdots

For any positive real number t, the random variable N_t, which is a discrete random variable, follows a Poisson distribution with mean \lambda t. The following is the probability function.

    \displaystyle (4) \ \ \ \ \ P(N_t=k)=\frac{1}{k!} \ (\lambda t)^k \ e^{-\lambda t} \ \ \ \ \ k=0,1,2,3,\cdots
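As a small numerical illustration of (3) and (4), the following sketch evaluates the Poisson probability function using the rate estimated from the Rutherford and Geiger data above. The function name poisson_pmf is made up for this example, and the unit of time is one 7.5-second interval.

```python
import math


def poisson_pmf(k: int, rate: float, t: float = 1.0) -> float:
    """P(N_t = k) as in (4): Poisson with mean rate * t."""
    mean = rate * t
    return mean ** k * math.exp(-mean) / math.factorial(k)


# Rate per unit time (one 7.5-second interval), about 3.87.
rate = 10097 / 2608

# Probability of observing k particles in one unit interval, as in (3).
for k in range(6):
    print(k, poisson_pmf(k, rate))

# Probability of no particle in two unit intervals (15 seconds), as in (4).
print(poisson_pmf(0, rate, t=2.0))
```

Setting t=1 recovers (3), so a single function covers both probability functions.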

For a more detailed discussion of why P(N_1=k) and P(N_t=k) are Poisson, see here.

We now discuss the distributions for X_1 and X_n. The random variable X_1 is the waiting time until the first occurrence. It has an exponential distribution with mean 1/\lambda. The following is the probability density function.

    \displaystyle (5) \ \ \ \ \ f_{X_1}(t)=\lambda \ e^{-\lambda t} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ t>0

The random variable X_n has a gamma distribution with shape parameter \beta=n and rate parameter \alpha=\lambda. The following is the probability density function.

    \displaystyle (6) \ \ \ \ \ f_{X_n}(t)=\frac{1}{(n-1)!} \ \lambda^n \ t^{n-1} \ e^{-\lambda t} \ \ \ \ \ \ t>0

The density functions in (5) and (6) are derived from the Poisson distributions in (3) and (4). For example, for X_1>t (the first occurrence taking place after time t) to happen, there must be no occurrence in the time interval (0,t). Thus P(X_1>t)=P(N_t=0). Note that N_t is a Poisson random variable with mean \lambda t. The following derives the density function in (5).

    \displaystyle P(X_1>t)=P(N_t=0)=e^{-\lambda t}

    \displaystyle P(X_1 \le t)=1-e^{-\lambda t}

Note that the cumulative distribution function P(X_1 \le t) is that for an exponential distribution. Taking the derivative gives the density function in (5). Extending the same reasoning, for X_n>t (the nth occurrence taking place after time t) to happen, there must be at most n-1 occurrences in the time interval (0,t). Once again, we are relating X_n to the Poisson random variable N_t. More specifically P(X_n>t)=P(N_t \le n-1).

    \displaystyle (7) \ \ \ \ \ P(X_n>t)=P(N_t \le n-1)=\sum \limits_{k=0}^{n-1} \biggl[ \frac{(\lambda t)^k}{k!} \ e^{-\lambda t} \biggr]

    \displaystyle (8) \ \ \ \ \  P(X_n \le t)=1-\sum \limits_{k=0}^{n-1} \biggl[ \frac{(\lambda t)^k}{k!} \ e^{-\lambda t} \biggr]=\sum \limits_{k=n}^{\infty} \biggl[ \frac{(\lambda t)^k}{k!} \ e^{-\lambda t} \biggr]

Taking the derivative of (8) gives the density function in (6). See here for a more detailed discussion of the relation between gamma distribution and Poisson distribution in a Poisson process.
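The differentiation step can be spelled out as follows. This is a sketch of the calculation: differentiate the first expression in (8) term by term using the product rule, and observe that the resulting sums telescope so that only the k=n-1 term survives.

    \displaystyle \frac{d}{dt} P(X_n \le t)=-\sum \limits_{k=0}^{n-1} \frac{d}{dt} \biggl[ \frac{(\lambda t)^k}{k!} \ e^{-\lambda t} \biggr]

    \displaystyle \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ =\sum \limits_{k=0}^{n-1} \lambda \ \frac{(\lambda t)^k}{k!} \ e^{-\lambda t}-\sum \limits_{k=1}^{n-1} \lambda \ \frac{(\lambda t)^{k-1}}{(k-1)!} \ e^{-\lambda t}

    \displaystyle \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ =\lambda \ \frac{(\lambda t)^{n-1}}{(n-1)!} \ e^{-\lambda t}=\frac{1}{(n-1)!} \ \lambda^n \ t^{n-1} \ e^{-\lambda t}

The second sum, after shifting the index by one, cancels every term of the first sum except the last one. The result is the density function in (6).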

Evaluating Gamma Survival Function

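The relation (7) shows that the survival function of a gamma distribution with positive integer shape parameter n and rate parameter \lambda, evaluated at t, equals the cumulative distribution function (CDF) of the Poisson distribution with mean \lambda t, evaluated at n-1. Consider the following integral.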

    \displaystyle (9) \ \ \ \ \ \int_t^\infty \frac{1}{(n-1)!} \ \lambda^n \ x^{n-1} \ e^{-\lambda x} \ dx

The integrand in the above integral is the density function of a gamma distribution (with the shape parameter being a positive integer). The limits of the integral are from t to \infty. Conceptually the integral is identical to (7) since the integral gives P(X_n>t), which is the survival function of the gamma distribution in question. According to (7), the integral in (9) has a closed form as follows.

    \displaystyle (10) \ \ \ \ \ \int_t^\infty \frac{1}{(n-1)!} \ \lambda^n \ x^{n-1} \ e^{-\lambda x} \ dx=\sum \limits_{k=0}^{n-1} \biggl[ \frac{(\lambda t)^k}{k!} \ e^{-\lambda t} \biggr]

The above is a combination of (7) and (9) and is a way to evaluate the right tail of the gamma distribution when the shape parameter is a positive integer n. Instead of memorizing it, we can focus on the thought process.

Given the integral in (9), note the shape parameter n and the rate parameter \lambda. Consider a Poisson process with rate parameter \lambda. In this Poisson process, the random variable N_1, the number of occurrences of the event of interest in a unit time interval, has a Poisson distribution with mean \lambda. The random variable N_t, the number of occurrences of the event of interest in a time interval of length t, has a Poisson distribution with mean \lambda t. Then the integral in (9) is equivalent to P(N_t \le n-1), the probability that there are at most n-1 occurrences in the time interval (0,t).

The above thought process works even if the gamma distribution has no relation to any Poisson process, e.g. it is just the model for size of insurance losses. We can still pretend there is a Poisson relationship and obtain a closed form evaluation of the survival function. It must be emphasized that this evaluation of the gamma survival function in closed form is possible only when the shape parameter is a positive integer. If it is not, we then need to evaluate the gamma survival function using software.
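The identity in (10) can be checked numerically. The following is a minimal sketch with made-up function names: it evaluates the right side of (10) as a finite Poisson sum and compares it with the left side, approximated by composite Simpson's rule on a truncated interval. The truncation width and step count are assumptions chosen for illustration, not tuned values.

```python
import math


def gamma_survival_closed_form(t: float, n: int, lam: float) -> float:
    """Right side of (10): P(N_t <= n - 1) for a Poisson mean lam * t."""
    mean = lam * t
    term, total = math.exp(-mean), 0.0
    for k in range(n):
        total += term
        term *= mean / (k + 1)  # next Poisson term from the previous one
    return total


def gamma_density(x: float, n: int, lam: float) -> float:
    """Integrand in (9): gamma density with integer shape n and rate lam."""
    return lam ** n * x ** (n - 1) * math.exp(-lam * x) / math.factorial(n - 1)


def gamma_survival_numeric(t: float, n: int, lam: float,
                           width: float = 60.0, steps: int = 20000) -> float:
    """Left side of (10) by Simpson's rule on [t, t + width / lam].

    `steps` must be even; the tail beyond the upper limit is ignored,
    which is reasonable only for small to moderate n.
    """
    a, b = t, t + width / lam
    h = (b - a) / steps
    total = gamma_density(a, n, lam) + gamma_density(b, n, lam)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * gamma_density(a + i * h, n, lam)
    return total * h / 3


if __name__ == "__main__":
    # Illustrative values: shape n = 3, rate 2 per unit time, t = 1.5.
    print(gamma_survival_closed_form(1.5, 3, 2.0))
    print(gamma_survival_numeric(1.5, 3, 2.0))
```

The closed form needs only n terms, which is why it is convenient for hand calculation when the shape parameter is a small positive integer.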


The discussion here presents two relationships between the gamma distribution and the Poisson distribution. In the first one, mixing conditional Poisson distributions with gamma mixing weights produces a negative binomial distribution. Taking the shape parameter \beta and rate parameter \alpha in (1) produces the negative binomial distribution as in (2). This is one interpretation of the two gamma parameters.

In the second interpretation, the rate parameter of the gamma model is intimately tied to the rate parameter of a Poisson process. The interplay between the number of occurrences of the event of interest and the waiting time until the nth occurrence connects the gamma distribution with the Poisson distribution. For a fuller discussion of this interplay, see here.

The gamma distribution is mathematically derived from the gamma function. See here for an introduction to the gamma distribution.

The three assumptions for a Poisson process that are alluded to above allow us to obtain a binomial distribution when dividing a time interval into n subintervals (for a sufficiently large n). As the subintervals get more and more granular, the binomial distributions have the Poisson distribution as the limiting distribution. The derivation from binomial to Poisson is discussed here, here and here.

Practice problems can be found here in a companion blog.

\copyright 2018 – Dan Ma

Parametric severity models

Modeling the severity of losses is an important part of actuarial modeling (severity is the dollar value per claim). One approach is to employ parametric models in the modeling process. For example, the process may involve using claims data to estimate the parameters of the fitted model and then using the fitted model for estimation of future claim costs. A companion blog called Topics in Actuarial Modeling introduces a catalog of parametric severity models (the catalog is found here).

The catalog lists the models according to their mathematical properties (e.g. how they are derived and how they are related to the gamma distribution). The models have been discussed quite extensively in the blog Topics in Actuarial Modeling. The catalog puts all the models in one place and links to the blog posts that describe them, so a reader interested in a particular parametric model can follow the corresponding link. The parametric models in the catalog include the following:

  • Gamma distribution
  • Erlang distribution
  • Exponential distribution
  • Chi-squared distribution
  • Hypo-Exponential distribution
  • Hyper-Exponential distribution
  • Lognormal distribution
  • Weibull distribution
  • Transformed exponential distribution
  • Inverse exponential distribution
  • Inverse transformed exponential distribution
  • Transformed gamma distribution
  • Inverse gamma distribution
  • Inverse transformed gamma distribution
  • Transformed Pareto distribution = Burr distribution
  • Inverse Pareto distribution
  • Inverse transformed Pareto distribution = Inverse Burr distribution
  • Paralogistic distribution
  • Inverse paralogistic distribution
  • Pareto distribution
  • Generalized Pareto distribution
  • Loglogistic distribution
  • Student t distribution
  • Beta distribution
  • Generalized beta distribution

This list is by no means comprehensive or exhaustive. It should be a good resource to begin the modeling process (or the study of the actuarial modeling process). In fact, the modeling process is part of the exam syllabus of the Society of Actuaries. The catalog of models is found here.

\copyright 2017 – Dan Ma