Suppose $X$ is a mixture distribution that results from mixing a family of conditional distributions indexed by a parameter random variable $\Theta$. The uncertainty in the parameter variable $\Theta$ has the effect of increasing the unconditional variance of the mixture $X$. Thus, $\operatorname{Var}[X]$ is not simply the weighted average of the conditional variances $\operatorname{Var}[X \mid \Theta]$. The unconditional variance $\operatorname{Var}[X]$ is the sum of two components:

$$\operatorname{Var}[X] = E[\operatorname{Var}[X \mid \Theta]] + \operatorname{Var}[E[X \mid \Theta]]$$

The above relationship is called the law of total variance, which is the proper way of computing the unconditional variance $\operatorname{Var}[X]$. The first component $E[\operatorname{Var}[X \mid \Theta]]$ is called the expected value of conditional variances, which is the weighted average of the conditional variances. The second component $\operatorname{Var}[E[X \mid \Theta]]$ is called the variance of the conditional means, which represents the additional variance resulting from the uncertainty in the parameter $\Theta$.
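For a parameter that takes only finitely many values, the two components can be computed directly from the mixing weights, the conditional means, and the conditional variances. The following is a minimal sketch under that assumption; the function name `total_variance` and its argument names are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def total_variance(
    weights: Sequence[float],
    cond_means: Sequence[float],
    cond_vars: Sequence[float],
) -> Tuple[float, float, float]:
    """Return (EV, VCM, total) for a finite mixture.

    EV  = expected value of the conditional variances
    VCM = variance of the conditional means
    total = EV + VCM, the unconditional variance
    Assumption: the parameter takes finitely many values, one per entry.
    """
    if not (len(weights) == len(cond_means) == len(cond_vars)):
        raise ValueError("all inputs must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing weights must sum to 1")

    # Weighted average of the conditional variances.
    ev = sum(w * v for w, v in zip(weights, cond_vars))
    # Unconditional mean is the weighted average of the conditional means.
    mean = sum(w * m for w, m in zip(weights, cond_means))
    # Variance of the conditional means around the unconditional mean.
    vcm = sum(w * (m - mean) ** 2 for w, m in zip(weights, cond_means))
    return ev, vcm, ev + vcm


# Two equally likely classes with means 0 and 2 and conditional variance 1 each.
ev, vcm, total = total_variance([0.5, 0.5], [0.0, 2.0], [1.0, 1.0])
print(ev, vcm, total)  # 1.0 1.0 2.0
```

In this illustration the weighted average of the conditional variances is 1, but the unconditional variance is 2 because the class means differ.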
We use an example of a two-point mixture to illustrate the law of total variance. The example is followed by a proof of the law of total variance.
Example
Let $U$ be the uniform distribution on the unit interval $(0, 1)$. Suppose that a large population of insureds is composed of “high risk” and “low risk” individuals. The proportion of insureds classified as “low risk” is $p$, where $0 < p < 1$. The random loss amount $X$ of a “low risk” insured is $U$. The random loss amount $X$ of a “high risk” insured is $U$ shifted by a positive constant $w > 0$, i.e. $w + U$. What is the variance of the loss amount of an insured randomly selected from this population?
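For convenience, we use $\Theta$ as a parameter to indicate the risk class ($\Theta = 1$ is “low risk” and $\Theta = 2$ is “high risk”). The following shows the relevant conditional distributional quantities of $X$.

$$E[X \mid \Theta = 1] = \frac{1}{2}, \qquad E[X \mid \Theta = 2] = w + \frac{1}{2}$$

$$\operatorname{Var}[X \mid \Theta = 1] = \frac{1}{12}, \qquad \operatorname{Var}[X \mid \Theta = 2] = \frac{1}{12}$$

The unconditional mean loss is the weighted average of the conditional mean loss amounts:

$$E[X] = p \cdot \frac{1}{2} + (1 - p)\left(w + \frac{1}{2}\right) = \frac{1}{2} + (1 - p)\,w$$

However, the same idea does not work for variance.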
The conditional variance is the same for both risk classes since the “high risk” loss is a shifted version of the “low risk” loss. However, the unconditional variance is more than $\frac{1}{12}$ since the mean losses for the two classes are different (heterogeneous risks across the classes). The uncertainty in the risk classes (i.e. uncertainty in the parameter $\Theta$) introduces additional variance in the loss for a randomly selected insured. The unconditional variance $\operatorname{Var}[X]$ is the sum of the following two components:

$$E[\operatorname{Var}[X \mid \Theta]] = p \cdot \frac{1}{12} + (1 - p) \cdot \frac{1}{12} = \frac{1}{12}$$

$$\operatorname{Var}[E[X \mid \Theta]] = p\left(\frac{1}{2}\right)^2 + (1 - p)\left(w + \frac{1}{2}\right)^2 - \left(\frac{1}{2} + (1 - p)\,w\right)^2 = p(1 - p)\,w^2$$

The additional variance is in the amount of $p(1 - p)\,w^2$. This is the variance of the conditional means of the risk classes. Note that $w$ is the additional mean loss for a “high risk” insured. The higher the additional mean loss $w$, the more heterogeneous the risk between the two classes, and hence the larger the dispersion in the unconditional loss.
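The result of the example can be checked by simulation. The sketch below draws losses from the two-class population and compares the sample variance with the sum of the two components. The names `simulate_losses`, `sample_variance`, and `theoretical_variance` are hypothetical, and the values 0.3 for the low-risk proportion and 2.0 for the shift are illustrative assumptions that do not come from the example.

```python
import random
from typing import List


def simulate_losses(p: float, w: float, n: int, seed: int = 0) -> List[float]:
    """Draw n losses: with probability p a low-risk loss U, else a high-risk loss w + U."""
    rng = random.Random(seed)  # local generator so the run is reproducible
    losses = []
    for _ in range(n):
        u = rng.random()              # uniform loss on the unit interval
        is_low_risk = rng.random() < p
        losses.append(u if is_low_risk else u + w)
    return losses


def sample_variance(xs: List[float]) -> float:
    """Population-style variance (divide by n) of the simulated losses."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def theoretical_variance(p: float, w: float) -> float:
    """Expected value of conditional variances plus variance of conditional means."""
    return 1.0 / 12.0 + p * (1.0 - p) * w ** 2


# Illustrative parameter values (assumptions, not taken from the example).
p, w = 0.3, 2.0
losses = simulate_losses(p, w, n=200_000, seed=1)
print("simulated  :", sample_variance(losses))
print("theoretical:", theoretical_variance(p, w))
```

With a large number of draws the two printed values should be close, while both stay well above the common conditional variance of 1/12.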
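The law of total variance gives the unconditional variance of a random variable $X$ that is indexed by another random variable $\Theta$. The unconditional variance of $X$ is the sum of two components, namely, the expected value of conditional variances and the variance of the conditional means. The formula is:

$$\operatorname{Var}[X] = E[\operatorname{Var}[X \mid \Theta]] + \operatorname{Var}[E[X \mid \Theta]]$$

The following is the derivation of the formula:

$$\begin{aligned}
\operatorname{Var}[X] &= E[X^2] - E[X]^2 \\
&= E[E[X^2 \mid \Theta]] - \left(E[E[X \mid \Theta]]\right)^2 \\
&= E\left[\operatorname{Var}[X \mid \Theta] + E[X \mid \Theta]^2\right] - \left(E[E[X \mid \Theta]]\right)^2 \\
&= E[\operatorname{Var}[X \mid \Theta]] + E\left[E[X \mid \Theta]^2\right] - \left(E[E[X \mid \Theta]]\right)^2 \\
&= E[\operatorname{Var}[X \mid \Theta]] + \operatorname{Var}[E[X \mid \Theta]]
\end{aligned}$$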
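One way to sanity-check the formula is to compute the variance of the two-class example directly as the second moment minus the squared mean and compare it with the sum of the two components. The sketch below does this in exact rational arithmetic. Here `p` stands for the low-risk proportion and `w` for the shift of the high-risk loss; the function names and the values 3/10 and 2 are illustrative assumptions.

```python
from fractions import Fraction


def direct_variance(p: Fraction, w: Fraction) -> Fraction:
    """Var[X] = E[X^2] - E[X]^2 for the two-class uniform mixture."""
    # Conditional second moments: E[U^2] = 1/3 and E[(w + U)^2] = w^2 + w + 1/3.
    second_low = Fraction(1, 3)
    second_high = w ** 2 + w + Fraction(1, 3)
    second_moment = p * second_low + (1 - p) * second_high
    # Unconditional mean: weighted average of 1/2 and w + 1/2.
    mean = p * Fraction(1, 2) + (1 - p) * (w + Fraction(1, 2))
    return second_moment - mean ** 2


def decomposed_variance(p: Fraction, w: Fraction) -> Fraction:
    """Expected conditional variance (1/12) plus variance of conditional means."""
    return Fraction(1, 12) + p * (1 - p) * w ** 2


# Illustrative values (assumptions): 30% low risk, shift of 2.
p, w = Fraction(3, 10), Fraction(2)
print(direct_variance(p, w))      # 277/300
print(decomposed_variance(p, w))  # 277/300
```

The two routes agree exactly, which is what the derivation predicts for any admissible values of the proportion and the shift.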
Additional Practice
See this blog post for practice problems on mixture distributions.