Probability Generating Function

Have you ever wanted to know what the probability of extinction is for a population? Hopefully, your answer is yes! Statisticians use methods in stochastic processes involving the use of the probability generating function (PGF) of a distribution to find the extinction probability of certain populations. For example, they use PGFs to work out the probability that an infectious disease (like Covid-19) dies out before it reaches the level of an epidemic. Now, we won't get that far in this article but we can appreciate just how useful probability generating functions are in analysing distributions. 

What is the Probability Generating Function?

In statistics, the probability distribution of a discrete random variable can be specified by the probability mass function, or by the cumulative distribution function. Another way to specify the distribution of a discrete random variable is by its probability generating function. The probability generating function is a power series representation of the random variable's probability mass function. These generating functions have interesting properties and can often reduce the amount of work involved in analysing a distribution.

The probability generating function (PGF) of a discrete random variable is given by:

$$G_X(t)=\mathbb{E}\left(t^X\right)=\sum_{x} t^x\mathbb{P}(X=x)$$

where \(t\) is known as a dummy variable.

This comes from the formula of the expectation of a function of a discrete random variable:

$$\mathbb{E}(g(X))=\sum_{x} g(x)\mathbb{P}(X=x)$$

where \(g(X)=t^X\).

From the formula, you can see that each term of the PGF is a \(t^x\) term with a coefficient. The exponent, \(x\), is a value that the random variable can take, and the coefficient of each \(t^x\) term is the probability that the random variable takes that value.

Find the probability generating function for the distribution given by:

\[\begin{array}{c|cccc} x & -2 & 0 & 1 & 3 \\ \hline \mathbb{P}(X=x) & \frac{1}{6} & \frac{5}{12} & \frac{1}{3} & \frac{1}{12} \end{array}\]

Solution:

Using the formula

\[G_X(t)=\mathbb{E}\left(t^X\right)=\sum_{x} t^x\mathbb{P}(X=x)\]

you have

\[\begin{align} G_X(t)&=\frac{1}{6}t^{-2}+\frac{5}{12}t^0+\frac{1}{3}t^1+\frac{1}{12}t^3 \\ &=\frac{1}{6}t^{-2}+\frac{5}{12}+\frac{1}{3}t+\frac{1}{12}t^3. \end{align}\]
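As a quick check of this worked example, the sketch below stores the distribution as a Python dictionary and evaluates the PGF as a sum. The names `pmf` and `pgf` are illustrative choices for this sketch and do not come from any library.

```python
from fractions import Fraction

# Distribution from the example: value x -> probability P(X = x)
pmf = {
    -2: Fraction(1, 6),
    0: Fraction(5, 12),
    1: Fraction(1, 3),
    3: Fraction(1, 12),
}

def pgf(pmf, t):
    """Evaluate G_X(t) = sum over x of t**x * P(X = x)."""
    return sum(p * Fraction(t) ** x for x, p in pmf.items())

# The probabilities sum to 1, so G_X(1) should equal 1.
print(pgf(pmf, 1))
# Evaluate the PGF at another point, here t = 2, with exact fractions.
print(pgf(pmf, 2))
```

Because this distribution includes the negative value \(-2\), the sketch should not be evaluated at \(t=0\), where \(t^{-2}\) is undefined.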

Naturally, you will want to use the properties of the PGF to make your work quicker.

Probability Generating Function: Properties

The probability generating functions have interesting properties that can often reduce the amount of work needed to analyse a distribution. For example, as you will see, the PGF can make it easier to work out the expectation or the variance.

For a discrete random variable you have:

1. \(G_X(t)=\mathbb{E}(t^X)=\sum_{x} t^x\mathbb{P}(X=x).\)

2. For any PGF of a discrete random variable: \[\begin{align} G_X(1)&=\sum_{x} 1^x\mathbb{P}(X=x) \\ &=\sum_{x}\mathbb{P}(X=x)\\ &=1. \end{align}\]

Suppose the discrete random variable \(X\) has a PGF given by

$$G_X(t)=\frac{1}{8}(1+t)^3.$$

Then,

$$G_X(1)=\frac{1}{8}(2)^3=1.$$

3. \(\begin{align} G_X'(t)&=\frac{\mathrm{d} }{\mathrm{d} t} G_X(t) \\ &= \frac{\mathrm{d} }{\mathrm{d} t} \mathbb{E}\left(t^X\right) \\ &=\mathbb{E}\left(Xt^{X-1}\right) \end{align}\)

4. \(G_X'(1)=\mathbb{E}(X)\)

Let \(X\) have a PGF given by

$$G_X(t)=\frac{1}{8}(1+t)^3.$$

Then

\[\begin{align} G_X'(t)&=\frac{3}{8}(1+t)^2 \\ G_X'(1)&=\frac{3}{8}(2)^2=\frac{3}{2} .\end{align}\]

Therefore

\[ \mathbb{E}(X)=\frac{3}{2}.\]

5. \(\begin{align} G_X''(t)&=\frac{\mathrm{d}^2 }{\mathrm{d} t^2} G_X(t) \\ &= \mathbb{E}\left(X(X-1)t^{X-2}\right) \end{align}\)

6. \(\begin{align} G_X''(1) &=\mathbb{E}(X(X-1)) \\ &=\mathbb{E}\left(X^2-X\right) \\ &=\mathbb{E}\left(X^2\right)-\mathbb{E}(X)\end{align}\)

7. \(\begin{align}\text{Var}(X) &=\mathbb{E}\left(X^2\right)-(\mathbb{E}(X))^2 \\ &=G_X''(1)+G_X'(1)-(G_X'(1))^2 \end{align}\)

Let \(X\) have a PGF given by

$$G_X(t)=\frac{1}{8}(1+t)^3.$$

Then

\[\begin{align} G_X'(1)&=\frac{3}{2} \\ G_X''(t)&=\frac{3}{4}(1+t) \\ G_X''(1)&=\frac{3}{2} \end{align}\]

Therefore

\[ \begin{align}\text{Var}(X)&=G_X''(1)+G_X'(1)-(G_X'(1))^2 \\ &=\frac{3}{2}+\frac{3}{2}-\left(\frac{3}{2}\right)^2\\ &=\frac{3}{4} .\end{align}\]
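When the PGF is a polynomial, properties 4 and 7 can be checked mechanically. The following is a minimal sketch, assuming the PGF is stored as a list of coefficients (entry \(k\) multiplies \(t^k\)); the helper names `derivative` and `evaluate` are hypothetical and written only for this illustration.

```python
from fractions import Fraction

def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...], where c_k multiplies t**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, t):
    """Evaluate the polynomial at t."""
    return sum(c * t ** k for k, c in enumerate(coeffs))

# G_X(t) = (1 + t)^3 / 8 = (1 + 3t + 3t^2 + t^3) / 8
G = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]

G1 = derivative(G)   # coefficients of G_X'(t)
G2 = derivative(G1)  # coefficients of G_X''(t)

mean = evaluate(G1, 1)                           # property 4: G_X'(1)
variance = evaluate(G2, 1) + mean - mean ** 2    # property 7

print(mean, variance)
```

Exact fractions are used so that the results can be compared directly with the values \(\frac{3}{2}\) and \(\frac{3}{4}\) found above.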

8. If the random variables \(X\) and \(Y\) are discrete and independent with PGFs given by \(G_X(t)\) and \(G_Y(t)\) respectively, then the PGF of \(Z=X+Y\) is given by \(G_Z(t)=G_X(t) \cdot G_Y(t)\).

Suppose a discrete random variable \(X\) has a PGF given by

$$G_X(t)=\frac{t^2}{(2-t)^5}$$

and a discrete random variable \(Y\) has a PGF given by

$$G_Y(t)=\frac{t}{(4-3t)^2}.$$

Given that \(X\) and \(Y\) are independent, find the PGF of \(Z=X+Y\).

Solution:

Using property 8,

\[\begin{align} G_Z(t)&=G_X(t) \cdot G_Y(t) \\ &=\frac{t^2}{(2-t)^5} \cdot\frac{t}{(4-3t)^2} = \frac{t^3}{(2-t)^5(4-3t)^2}. \end{align}\]
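For PGFs that are polynomials, property 8 amounts to multiplying the coefficient lists, and the coefficients of the product are the probabilities of the sum. The sketch below uses a made-up example (two independent fair coin flips scored 0 or 1) rather than the rational functions above; the `multiply` helper is a hypothetical name for this illustration.

```python
from fractions import Fraction

def multiply(a, b):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# Assumed example: X and Y are independent and each takes the value 0 or 1
# with probability 1/2, so G_X(t) = G_Y(t) = 1/2 + t/2.
G_X = [Fraction(1, 2), Fraction(1, 2)]
G_Y = [Fraction(1, 2), Fraction(1, 2)]

# Coefficients of G_Z(t) = G_X(t) * G_Y(t); entry k is P(Z = k) for Z = X + Y.
G_Z = multiply(G_X, G_Y)
print(G_Z)
```

The product has coefficients \(\frac{1}{4}, \frac{1}{2}, \frac{1}{4}\), which are the probabilities of getting 0, 1 or 2 in total.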

Probability Generating Function Examples

These are some examples using the different properties of the PGF:

The probability generating function of a discrete random variable \(X\) is given by

$$G_X(t)=z(1+2t+2t^2)^2.$$

a) Find the value of \(z\).

b) Give the probability distribution of \(X\).

Solution:

a) Using property 2 above, you know that for any PGF,

\[\begin{align} G_X(1) &=\sum_{x} 1^x\mathbb{P}(X=x) \\ &=\sum_{x}\mathbb{P}(X=x) \\ &=1, \end{align}\]

so

\[\begin{align} G_X(1)&=z(1+2(1)+2(1)^2)^2 \\ 1&=z(1+2+2)^2 \\ z&=\frac{1}{25}. \end{align}\]

b) You have \(G_X(t)=\frac{1}{25}(1+2t+2t^2)^2.\)

Expanding brackets gives

\[\begin{align} G_X(t)&=\frac{1}{25}(1+2t+2t^2)(1+2t+2t^2) \\ &=\frac{1}{25}(1+2t+2t^2+2t+4t^2+4t^3+2t^2+4t^3+4t^4) \\ &=\frac{1}{25}(1+4t+8t^2+8t^3+4t^4) \\ &=\frac{1}{25}+\frac{4t}{25}+\frac{8t^2}{25}+\frac{8t^3}{25}+\frac{4t^4}{25} \\ &=\frac{1t^0}{25}+\frac{4t^1}{25}+\frac{8t^2}{25}+\frac{8t^3}{25}+\frac{4t^4}{25}. \end{align}\]

Now you have a function from which you can read off the values of \(x\) and their corresponding probabilities, using the fact that the coefficients of \(t^x\) are the probabilities \(\mathbb{P}(X = x)\). Therefore, the probability distribution of \(X\) is:

\[\begin{array}{c|ccccc} x & 0 & 1 & 2 & 3 & 4 \\ \hline \mathbb{P}(X=x) & \frac{1}{25} & \frac{4}{25} & \frac{8}{25} & \frac{8}{25} & \frac{4}{25} \end{array}\]

A good way to check your answer is to make sure that \(\sum_{x}\mathbb{P}(X=x)=1\).
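The expansion and the normalising constant in this example can also be checked numerically. This is a rough sketch under the assumption that the polynomial \(1+2t+2t^2\) is stored as the coefficient list `[1, 2, 2]`; the variable names are illustrative only.

```python
from fractions import Fraction

base = [1, 2, 2]  # coefficients of 1 + 2t + 2t^2

# Square the polynomial by collecting the coefficient of each power of t.
squared = [0] * (2 * len(base) - 1)
for i, a in enumerate(base):
    for j, b in enumerate(base):
        squared[i + j] += a * b

# G_X(1) = 1 forces z to be 1 divided by the sum of the coefficients.
z = Fraction(1, sum(squared))

# The coefficient of t^x, scaled by z, is P(X = x).
pmf = {x: z * c for x, c in enumerate(squared)}

print(z)
print(pmf)
print(sum(pmf.values()))
```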

Let's take a look at another example.

Suppose a random variable \(X\) has a PGF given by

$$G_X(t)=\frac{1}{10}(4t+3t^2+2t^3+t^4).$$

Find the variance of \(X\).

Solution:

Using property 7 above, you have

\[\begin{align} \text{Var}(X) &=\mathbb{E}\left(X^2\right)-(\mathbb{E}(X))^2 \\ &=G_X''(1)+G_X'(1)-(G_X'(1))^2, \end{align}\]

so

\[\begin{align} G_X'(t)&=\frac{\mathrm{d} }{\mathrm{d} t} G_X(t) \\ &= \frac{1}{10}\left(4+6t+6t^2+4t^3\right) \\ G_X'(1)&=2 \\ G_X''(t)&=\frac{\mathrm{d}^2 }{\mathrm{d} t^2} G_X(t) \\ &= \frac{3\left(2t^2+2t+1\right)}{5} \\ G_X''(1)&=3. \end{align}\]

Therefore

\[\begin{align} \text{Var}(X)&=G_X''(1)+G_X'(1)-(G_X'(1))^2 \\ &=3+2-2^2\\ &=1. \end{align}\]
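As a cross-check that does not use derivatives, you can read the probabilities off the coefficients of the PGF, namely \(\mathbb{P}(X=1)=\frac{4}{10}\), \(\mathbb{P}(X=2)=\frac{3}{10}\), \(\mathbb{P}(X=3)=\frac{2}{10}\) and \(\mathbb{P}(X=4)=\frac{1}{10}\), and compute the moments directly:

\[\begin{align} \mathbb{E}(X)&=\frac{1(4)+2(3)+3(2)+4(1)}{10}=2 \\ \mathbb{E}\left(X^2\right)&=\frac{1(4)+4(3)+9(2)+16(1)}{10}=5 \\ \text{Var}(X)&=\mathbb{E}\left(X^2\right)-(\mathbb{E}(X))^2=5-2^2=1, \end{align}\]

which agrees with the value found using property 7.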

Let's look at the PGFs of some of the standard distributions.

PGF of the Poisson Distribution

The Poisson distribution is a discrete distribution that is used for modelling the number of times that a random event occurs in a fixed interval of time or space, assuming that the events occur independently and happen at a constant rate.

If a discrete random variable \(X\sim \text{Poi}(\lambda)\) the PGF of \(X\) is given by

$$G_X(t)=e^{\lambda(t-1)}.$$

Now let's see how to use it.

Visitors arrive at a website at an average rate of \(4\) per hour. Given that the random variable \(X\) is the number of visitors to the website in a randomly chosen hour, and that the visits are independent and random, show from first principles that the probability generating function for \(X\) is

$$G_X(t)=e^{4(t-1)}.$$

Solution:

From the description, you can see that \(X\sim \text{Poi}(4)\), since the visits are independent of one another and occur in a fixed time period (an hour) at a constant average rate of \(4\).

Therefore,

\[\mathbb{P}(X=x)=\frac{e^{-4}4^x}{x!},\]

so

\[\begin{align} G_X(t)&=\mathbb{E}(t^X)=\sum_{x} t^x\mathbb{P}(X=x) \\ &=\sum_{x=0}^{\infty} t^x\frac{e^{-4}4^x}{x!} \\ &=e^{-4}\sum_{x=0}^{\infty}\frac{t^x 4^x}{x!} \\ &=e^{-4} \sum_{x=0}^{\infty}\frac{(4t)^x}{x!} \\ &=e^{-4}e^{4t}\\ &=e^{-4+4t} \\ &=e^{4(t-1)} . \end{align}\]

The step \(\sum_{x=0}^{\infty}\frac{(4t)^x}{x!}=e^{4t}\) follows from the Maclaurin expansion of \(e^x\) with \(x=4t\). Equivalently, you can use the exponential summation:

\[\sum_{k=0}^{\infty} \frac{a^k}{k!} =e^a.\]
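The derivation can be sanity-checked by comparing a truncated version of the infinite sum with the closed form. The sketch below is only a numerical illustration; the function names and the choice of 100 terms are assumptions made for this example.

```python
import math

def poisson_pgf_series(lam, t, terms=100):
    """Partial sum of t**x * e**(-lam) * lam**x / x! over x = 0, ..., terms - 1."""
    return sum(
        t ** x * math.exp(-lam) * lam ** x / math.factorial(x)
        for x in range(terms)
    )

def poisson_pgf_closed(lam, t):
    """Closed form G_X(t) = e**(lam * (t - 1))."""
    return math.exp(lam * (t - 1))

lam = 4  # rate from the website example
for t in (0.0, 0.5, 1.0, 1.5):
    print(t, poisson_pgf_series(lam, t), poisson_pgf_closed(lam, t))
```

For these values of \(t\), the two columns should agree to many decimal places, since the terms of the series shrink very quickly.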

PGF of the Binomial Distribution

You will have come across the binomial distribution. Suppose that you perform an experiment that consists of independently repeating the same trial \(n\) times. Each trial results in one of two possible outcomes, success or failure. Let \(p\) be the probability of success; then \(X\sim \text{Bin}(n,p)\) denotes the number of successes in \(n\) trials.

Now, let's see the PGF of the binomial distribution.

If a discrete random variable \(X\sim \text{Bin}(n,p)\) the PGF of \(X\) is given by

$$G_X(t)=(1-p+pt)^n.$$

Prove from first principles that the PGF of \(X\sim \text{Bin}(n,p)\) is given by

$$G_X(t)=(1-p+pt)^n.$$

Solution:

\[\begin{align} G_X(t)&=\mathbb{E}(t^X)\\ &=\sum_{k=0}^{n}t^k\binom{n}{k}p^k(1-p)^{(n-k)} \\ &=\sum_{k=0}^{n}\binom{n}{k}(tp)^k(1-p)^{(n-k)} \\ &=(tp+(1-p))^n, \end{align}\]

where the final equality follows from the binomial summation:

\[(a+b)^n=\sum_{k=0}^{n}\binom{n}{k}a^k b^{(n-k)}.\]

Let's look at an example.

The probability that a seed will germinate is \(0.35\). Let the random variable \(X\) denote the number of seeds that have germinated out of \(4\) planted seeds.

a) Show, from first principles, that the probability generating function for \(X\) is

$$G_X(t)=(0.65+0.35t)^4.$$

b) Using your answer to a), determine the mean of \(X\).

Solution:

a) Observe that \(X ∼ \text{Bin}(4, 0.35)\), so

\[\mathbb{P}(X=x)= \binom{4}{x}0.35^x(1-0.35)^{4-x}\]

for \(x=0,\dots ,4\).

Using the formula for the probability generating function:

\[\begin{align} G_X(t)&=\sum_{x=0}^{4}t^x\mathbb{P}(X=x) \\ &=\sum_{x=0}^{4}t^x\binom{4}{x}0.35^x(1-0.35)^{4-x} \\ &=(0.65)^4+4(0.35)(0.65)^3t+6(0.35)^2(0.65)^2t^2+4(0.35)^3(0.65)t^3+(0.35)^4t^4 \\ &=(0.65)^4+4(0.65)^3(0.35t)+6(0.65)^2(0.35t)^2+4(0.65)(0.35t)^3+(0.35t)^4 \\ &=(0.65+0.35t)^4 ,\end{align}\]

where the last equality follows from the binomial formula:

\[(a+b)^n=\sum_{k=0}^{n}\binom{n}{k}a^k b^{(n-k)}.\]

Hence, the probability generating function for \(X\) is:

$$G_X(t)=(0.65+0.35t)^4.$$

b) Using property 4 above, you have that \(G_X'(1)=\mathbb{E}(X)\), so

\[\begin{align} G_X'(t)&=1.4(0.65+0.35t)^3 \\ G_X'(1)&=1.4 .\end{align}\]

Therefore \(\mathbb{E}(X)=1.4\).
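Both parts of the seed example can be checked numerically for \(X\sim\text{Bin}(4, 0.35)\). The sketch below builds the probabilities with `math.comb` (available in Python 3.8 and later) and compares the summed PGF with the closed form; the function names and the test point \(t=0.3\) are arbitrary choices for this illustration.

```python
import math

n, p = 4, 0.35

# P(X = x) for x = 0, ..., n
pmf = [math.comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

def pgf_sum(t):
    """G_X(t) computed term by term from the probabilities."""
    return sum(t ** x * px for x, px in enumerate(pmf))

def pgf_closed(t):
    """Closed form (1 - p + p t)^n."""
    return (1 - p + p * t) ** n

# Mean computed directly from the distribution, for comparison with G_X'(1).
mean = sum(x * px for x, px in enumerate(pmf))

print(pgf_sum(0.3), pgf_closed(0.3))
print(mean)
```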

PGF of the Geometric Distribution

If a discrete random variable \(X\sim \text{Geo}(p)\), the PGF of \(X\) is given by

$$G_X(t)=\frac{pt}{1-(1-p)t}.$$

Suppose a random variable \(X \sim \text{Geo}(p)\). Show from first principles that the PGF of \(X\) is

$$G_X(t)=\frac{tp}{1-(1-p)t}.$$

Solution:

Since \( X \sim \text{Geo}(p)\) you have that \(\mathbb{P}(X=x)=(1-p)^{x-1}p\) for \(x=1,2,3,\dots\), so

\[\begin{align} G_X(t)&=\sum_{x} t^x\mathbb{P}(X=x) \\ &=\sum_{x=1}^{\infty} t^x(1-p)^{x-1}p \\ &=tp\sum_{x=1}^{\infty} [t(1-p)]^{x-1} \\ &=tp\sum_{i=0}^{\infty} [t(1-p)]^{i} , \end{align} \]

where the substitution \( i=x-1 \) was made in the last line. Therefore

\[ G_X(t) =\frac{tp}{1-t(1-p)},\]

where the equality follows from the geometric summation, which is valid for \(|a|<1\) (here \(a=t(1-p)\)):

\[\sum_{k=0}^{\infty} a^k = \frac{1}{1-a}.\]

Remember that if the random variable \(X\) has a Geometric distribution i.e. \(X\sim \text{Geo}(p)\), then, assuming independent trials with a constant probability of success \(p\), \(X\) denotes the number of trials until a success occurs. With this in mind, let's take a look at a couple of examples.

Becky rolls a fair six-sided dice. Let the random variable \(X\) denote the number of rolls it takes for her to get a multiple of \(2\). Given that each roll is independent, find the probability generating function of \(X\).

Solution:

Let \(p\) be the probability that Becky rolls an even number. Then \(p=0.5\) and the random variable \(X\sim \text{Geo}(0.5).\)

Therefore, using the formula given above, the probability generating function of \(X\) is

\[\begin{align} G_X(t)&=\frac{pt}{1-(1-p)t} \\ &=\frac{0.5t}{1-0.5t}. \end{align}\]
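To see that this closed form matches the defining sum, you can compare it with a truncated series for \(p=0.5\). This is a minimal numerical sketch; the function names and the cut-off of 200 terms are assumptions for this example, and the series only converges when \(|t(1-p)|<1\).

```python
import math

def geometric_pgf_series(p, t, terms=200):
    """Partial sum of t**x * (1 - p)**(x - 1) * p over x = 1, ..., terms."""
    return sum(t ** x * (1 - p) ** (x - 1) * p for x in range(1, terms + 1))

def geometric_pgf_closed(p, t):
    """Closed form p t / (1 - (1 - p) t), valid when |t (1 - p)| < 1."""
    return p * t / (1 - (1 - p) * t)

p = 0.5  # probability of rolling an even number
for t in (0.0, 0.5, 1.0, 1.5):
    print(t, geometric_pgf_series(p, t), geometric_pgf_closed(p, t))
```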

Let's see another example.

Let the random variable \(X\sim \text{Geo}(p)\). Use the PGF of \(X\) to show that \(\mathbb{E}(X)=\dfrac{1}{p}\) and \(\text{Var}(X)=\dfrac{1-p}{p^2}.\)

Solution:

Using properties 4 and 7 you have that \(G_X'(1)=\mathbb{E}(X)\) and

\[\text{Var}(X)=G_X''(1)+G_X'(1)-(G_X'(1))^2.\]

From the definition above, a random variable \(X\sim \text{Geo}(p)\) has the PGF given by

\[G_X(t)=\frac{pt}{1-(1-p)t}.\]

So, using the quotient rule, you have that,

\[\begin{align} G_X'(t)&=\frac{(1-(1-p)t)(p)-(pt)(-(1-p))}{(1-(1-p)t)^2} \\ &=\frac{p(1-(1-p)t+(1-p)t)}{(1-(1-p)t)^2} \\ &=\frac{p}{(1-(1-p)t)^2} \\ G_X'(1)&=\frac{p}{(1-(1-p))^2} \\ &=\frac{p}{p^2} \\ &=\frac{1}{p}, \end{align} \]

therefore

\[\mathbb{E}(X)=\frac{1}{p} .\]

Using the chain rule, you have that,

\[\begin{align} G_X''(t)&=\frac{-2(-(1-p))p}{(1-(1-p)t)^3} \\ &=\frac{2p(1-p)}{(1-(1-p)t)^3} \\ G_X''(1)&=\frac{2p(1-p)}{p^3} \\ &=\frac{2(1-p)}{p^2} .\end{align}\]

Therefore,

\[\begin{align} \text{Var}(X)&=G_X''(1)+G_X'(1)-(G_X'(1))^2 \\ &=\frac{2(1-p)}{p^2}+\frac{1}{p}-\left(\frac{1}{p}\right)^2 \\ &=\frac{2(1-p)+p-1}{p^2} \\ &=\frac{1-p}{p^2}.\end{align}\]
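These two formulas can be checked numerically by truncating the sums that define \(\mathbb{E}(X)\) and \(\mathbb{E}(X^2)\). The sketch below assumes \(p=0.2\) and a cut-off of 2000 terms, both chosen only for illustration; `geometric_moments` is a hypothetical helper name.

```python
import math

def geometric_moments(p, terms=2000):
    """Approximate E(X) and Var(X) for X ~ Geo(p) by truncating the sums over x = 1, 2, ..."""
    mean = 0.0
    second_moment = 0.0
    for x in range(1, terms + 1):
        prob = (1 - p) ** (x - 1) * p  # P(X = x)
        mean += x * prob
        second_moment += x * x * prob
    return mean, second_moment - mean ** 2

p = 0.2
mean, variance = geometric_moments(p)

print(mean, 1 / p)                  # truncated sum vs 1/p
print(variance, (1 - p) / p ** 2)   # truncated sum vs (1-p)/p^2
```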

Probability Generating Functions - Key takeaways

  • The probability generating function (PGF) of a discrete random variable is given by \(G_X(t)=\mathbb{E}(t^X)=\sum_{x} t^x\mathbb{P}(X=x),\) where \(t\) is known as a dummy variable.
  • Many tasks in analysing random variables, such as finding the variance or expectation, are simplified by using the random variable's PGF.
  • If a discrete random variable \(X\sim \text{Poi}(\lambda)\) the PGF of \(X\) is given by \(G_X(t)=e^{\lambda(t-1)}.\)

  • If a discrete random variable \(X\sim \text{Bin}(n,p)\) the PGF of X is given by \(G_X(t)=(1-p+pt)^n.\)

  • If a discrete random variable \(X\sim \text{Geo}(p)\), the PGF of X is given by \(G_X(t)=\frac{pt}{1-(1-p)t}.\)
