What is a Probability Generating Function (PGF)? Definition, Examples, and Properties

Probability generating function (PGF) explained with Poisson example, derivatives, and coefficient extraction

What is a Probability Generating Function?

A probability generating function for a non-negative integer-valued, discrete random variable is a power series whose coefficients are the probabilities that it takes different values.

A power series is like an infinite polynomial, where the powers can continue increasing forever.

For instance, consider the power series

$$1+z+z^2+z^3 \dots $$

You should recognise this as a geometric series. If it so happens that $-1 \lt z \lt 1$, this defines a function, $f(z)=\frac{1}{1-z}$. But often it is useful to look at power series even if the sum doesn’t converge like this.

Given a sequence $(a_0,a_1, a_2, \dots)$, the ordinary power series of this sequence is one where the $n^{th}$ term of the sequence is the coefficient multiplying the $n^{th}$ power of $z$.

$$ a_0 + a_1 z + a_2 z^2 + \dots $$

Given a random variable $X$, taking values in $\mathbb{N}$, its probability generating function (PGF) is the ordinary power series of the sequence of the probabilities $\mathbb{P}(X=0), \mathbb{P}(X=1), \mathbb{P}(X=2), \dots $.

That is, the PGF of $X$ is a power series such that the coefficient of $z^n$ is the probability that $X=n$:

$$P_X(z) = \mathbb{P}(X=0) + \mathbb{P}(X=1) z + \mathbb{P}(X=2) z^2 + \mathbb{P}(X=3) z^3 + \dots $$

This will always define a function $P_X : [-1,1] \to \mathbb{R}$, though often it can be defined on a larger domain including $[-1,1]$.

Looking at the series, we see that it can also be thought of as an expected value:

$$P_X(z)= \mathbb{E}(z^X)$$

A PGF completely determines the distribution of a random variable. That is, once we know the PGF, this pins down uniques what random variable we have.

What are Probability Generating Functions For?

We call it a probability generating function because we can use it to “generate” or “find” the probabilities that $X$ takes any value $n$.

For instance, if we set $z=0$ in the above formula, this clears out all the terms except the first one, giving:

$$P_X(0) = \mathbb{P}(X=0)$$

To find other probabilities for $X$, we first differentiate to move the term we want into the “constant” position, then apply the same trick.

For instance, suppose we want to find $\mathbb{P}(X=2)$. Then we first differentiate:

$$ \frac{ d P_X(z)}{dz} = \mathbb{P}(X=1) + 2 \ \mathbb{P}(X=2) z + 3 \ \mathbb{P}(X=3) z^2 + 4 \ \mathbb{P}(X=4) z^3 + \dots $$

Differentiation of power series indeed works in the obvious way like this; i.e. we can proceed term by term.

Then, we differentiate again:

$$ \frac{ d^2 P_X(z)}{dz^2} = 2 \ \mathbb{P}(X=2) + 6 \ \mathbb{P}(X=3) z + 12 \ \mathbb{P}(X=4) z^2 + 20 \ \mathbb{P}(X=5) z^3 + \dots $$

Finally, we set $z=0$:

$$ \frac{ d^2 P_X(0)}{dz^2} = 2 \ \mathbb{P}(X=2) $$

From here we can divide by $2$ to get the probability we want.

For instance, if $X \sim Poisson(\lambda)$, it is possible to show that its PGF is:

$$P_X(z) = e^{\lambda(z-1)}$$

To find $\mathbb{P}(X=2)$, we again differentiate twice:

$$P’_X(z) = \lambda e^{\lambda(z-1)}, \quad P’’_X(z) = \lambda^2 e^{\lambda(z-1)} $$

Then we set $z=0$ and divide by $2$:

$$ \mathbb{P}(X=2) = \frac{1}{2} \lambda^2 e^{\lambda(-1)} = \frac{e^{-\lambda}\lambda^2} {2!} $$

In general, we differentiate $n$ times, set $z=0$, and divide by the $n!$ that will appear from bringing down powers. That is:

$$ \mathbb{P}(X=n) = \frac{1}{n!} \frac{ d^n P_X(0)}{dz^n} $$

From this formula:

$$ \frac{ d P_X(z)}{dz} = \mathbb{P}(X=1) + 2 \ \mathbb{P}(X=2) z + 3 \ \mathbb{P}(X=3) z^2 + \dots $$

We can also set $z=1$, giving

$$ P’_Z(z) = \mathbb{P}(X=1) + 2 \ \mathbb{P}(X=2) + 3 \ \mathbb{P}(X=3) + \dots = \mathbb{E}(X) $$

So, amongst another things, PGFs can be used to find expected values.

How Do We Find Probability Generating Functions?

One way to find the probability generating function of a random variable is by summing the infinite series.

For instance, suppose that $X$ is geometric with parameter $p$; the number of trials required to get a success. Then we have:

$$P_X(z) = \sum_{n=0}^{\infty} \mathbb{P}(X=n) \ z^n = \sum_{n=1}^{\infty} (1-p)^{n-1}p z^n = pz \sum_{m=0}^{\infty} (1-p)^m z^m = \frac{pz}{1-(1-p)z} $$

Here we found the sum by using the standard formula, noting that $X \ge 1$, then pulling $pz$ out, reindexing the sum with $m=n-1$, and using the “geometric series” formula given above.

So far this may not seem terribly useful, since in this case we needed to already know the probabilities for $X$ anyway in order to find the PGF. However, once we have the PGFs of some basic distributions like this to hand, we can use these to find the PGFs of more exotic ones.

For instance, if $X$ and $Y$ are independent, then to get the PGF of their sum we can just multiply together their individual PGFs:

$$P_{X+Y}(z) = P_{X}(z) P_{Y}(z) $$

We see this in detail on the next slide. This means we can find the PGF of a negative binomial distribution, since this is the sum of some independent geometric ones. From these we can extract its probabilities, and expected values as above.

If we take a linear transformation from $X$ to $aX+b$, the PGF of $X$ also scales nicely:

$$P_{aX+b}(z) = z^{b}\ P_X(z^a) $$

This is easily seen from the $P_X(z)= \mathbb{E}(z^X)$ formula above, which also be used to show the previous result.

Combining these two properties is very useful when working with averages, where we first add up some independent random variables and then divide by $n$.

In general, PGFs are useful because in some cases we know a lot about how to manipulate series, or the closed-form expressions we get from summing them.

For instance, it is often quite easy to multiply functions together, so if we know the PGFs of $X$ and $Y$ we can easily find the PGF of $X+Y$, and perhaps from there identify what kind of random variable it is.