T-distribution

Suppose you wanted to know how big the average dog is. To do this with statistics and confidence intervals, you would generally need to know something about the variance of the sizes of dogs overall. But in general, you won't know the variance of your population, so what can you do? Well, you could increase your sample size, but that can take time and money you might not have. So if you only have a small sample, and you don't know the variance of the population, Student's \(t\)-distribution comes to the rescue!

Fig. 1 - There is quite a bit of variation in the size of dogs!

Definition of the t-distribution

You might be familiar with the normal distribution as a bell-shaped curve, but it is not the only bell-shaped distribution out there!

There are many others that share this shape, one of which is the \(t\)-distribution. While these two distributions are very similar, they are used in different situations.

You would use a normal distribution if you were making a confidence interval or hypothesis test where:

  • the populations are normally distributed and have equal variance;

  • the population variance is known; or

  • the sample size is large.

On the other hand, you would use a \(t\)-distribution if you were making a confidence interval or hypothesis test where:

  • populations are normally distributed and you don't know the population variances; or

  • the population is normally distributed but the sample size is small.

Remember that for a normally distributed random variable \(X\) with mean \(\mu\) and variance \(\sigma^2\), the mean of a sample of size \(n\) satisfies

\[\bar{X} \sim \text{N}\left(\mu, \dfrac{\sigma ^2}{n}\right)\]

so if you know the population variance, or have a sufficiently large sample, you can construct a confidence interval or a hypothesis test.

In reality, you are not likely to know the actual population variance just as you don't generally know the population mean, which is often what you are testing for.

When the sample size \(n\) is large enough, you can use the sample standard deviation \(S\) in place of the population standard deviation \(\sigma\). In this instance, the Central Limit Theorem gives you that

\[\dfrac{\bar{X}-\mu}{\dfrac{S}{\sqrt{n}}}\]

is approximately normal, and

\[\frac{\bar{X}-\mu}{\dfrac{S}{\sqrt{n}}} \approx \text{N}(0,1^2).\]

When \(n\) is small, rather than use the normal distribution, you can use the \(t\)-distribution. The value of \(t\) is given by

\[t=\frac{\bar{X}-\mu}{\dfrac{S}{\sqrt{n}}}.\]

Below you can see the graph of the standard normal distribution as compared to the \(t\)-distribution for various values of \(n\).

Fig. 2 - Standard normal distribution as compared to the \(t\)-distribution for various values of \(n\).

As you can see in the graph above, as \(n\) increases the \(t\)-distribution gets closer to the standard normal distribution. This is one of the reasons statisticians will say that a sample size of \(20\) is often sufficiently large to switch from using a \(t\)-distribution to a normal distribution.
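A small simulation can make this concrete. The sketch below draws many samples from a normal population, computes \(t\) for each one, and counts how often \(|t|\) exceeds \(1.96\), a cutoff that a standard normal variable exceeds about \(5\%\) of the time. The population parameters, the sample sizes \(4\) and \(50\), the number of trials, and the function names are all illustrative assumptions rather than anything prescribed by the article.

```python
import math
import random

def t_statistic(sample, mu):
    """Return (x_bar - mu) / (S / sqrt(n)) for one sample."""
    n = len(sample)
    x_bar = sum(sample) / n
    # Sample variance with the n - 1 denominator (the unbiased estimator).
    s_squared = sum((x - x_bar) ** 2 for x in sample) / (n - 1)
    return (x_bar - mu) / math.sqrt(s_squared / n)

def tail_fraction(n, trials=10_000, cutoff=1.96, seed=0):
    """Fraction of simulated t statistics with |t| > cutoff."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    mu, sigma = 0.0, 1.0           # hypothetical population parameters
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(t_statistic(sample, mu)) > cutoff:
            hits += 1
    return hits / trials

small = tail_fraction(n=4)    # t has 3 degrees of freedom
large = tail_fraction(n=50)   # t has 49 degrees of freedom
print(f"n = 4:  fraction with |t| > 1.96 is {small:.3f}")
print(f"n = 50: fraction with |t| > 1.96 is {large:.3f}")
```

For \(n=4\) the fraction should come out well above \(5\%\), because the \(t\)-distribution with few degrees of freedom has heavier tails than the normal distribution; for \(n=50\) it should be much closer to \(5\%\).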

Since the sample size has an important part to play in \(t\)-distributions, it is given a special name, as you will see in the next section.

Degrees of freedom in the t-distribution

Just like with the chi-squared distribution and \(F\)-distribution, the sample size \(n\) determines the number of degrees of freedom. The sample size tells you two things about the degrees of freedom of the \(t\)-distribution:

  1. The number of degrees of freedom, \(\upsilon\), is determined by the sample size minus \(1\): \(\upsilon = n-1\).

  2. As \(\upsilon \to \infty\), the \(t\)-distribution approaches \(\text{N}(0,1^2)\).

Indeed, the normal and \(t\)-distributions are pretty similar. Both are symmetrical and exhibit a bell-curve shape, and they have the same end behaviour.

To indicate the number of degrees of freedom of a \(t\)-distribution, you can write \(t_\upsilon\)-distribution.
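The convergence to \(\text{N}(0,1^2)\) can also be checked numerically. The sketch below uses the standard density formula for the \(t\)-distribution, which is not given in this article and is brought in here as an assumption, and measures the largest gap between the \(t_\upsilon\) density and the standard normal density on a grid over \([-5, 5]\). The function names and the grid are illustrative choices.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    # lgamma keeps the normalising constant stable for large df.
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard normal distribution N(0, 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

dfs = (1, 3, 10, 100, 1000)
grid = [i / 10 for i in range(-50, 51)]   # -5.0, -4.9, ..., 5.0

# Largest absolute difference between the two densities for each df.
gaps = [max(abs(t_pdf(x, df) - normal_pdf(x)) for x in grid) for df in dfs]

for df, gap in zip(dfs, gaps):
    print(f"df = {df:>4}: largest gap on [-5, 5] is {gap:.5f}")
```

The gap shrinks as the number of degrees of freedom grows, which is the behaviour described in point 2 above.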

The t-distribution formula

The following is the formula you'll need for the \(t\)-distribution.

If a random sample \(X_1,X_2,X_3, \dots,X_n\) is selected from a normal distribution with an unknown variance \(\sigma ^2\), then

\[t=\dfrac{\bar{X}-\mu}{\dfrac{S}{\sqrt{n}}}\]

where \(t\) has a \(t_{n-1}\)-distribution and \(S^2\) is an unbiased estimator of \(\sigma^2\).

For a reminder of what it means to be unbiased, see the article Estimator Bias.
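As a minimal sketch of the formula in use, the snippet below computes \(S^2\) and \(t\) for a small sample. The five measurements and the hypothesised mean \(\mu = 6\) are made-up numbers chosen only to keep the arithmetic easy to follow.

```python
import math
import statistics

sample = [4.0, 6.0, 8.0, 10.0, 12.0]   # hypothetical measurements
mu_0 = 6.0                              # hypothetical value of the population mean

n = len(sample)
x_bar = statistics.mean(sample)
# statistics.variance divides by n - 1, which makes it unbiased for sigma^2.
s_squared = statistics.variance(sample)
t = (x_bar - mu_0) / math.sqrt(s_squared / n)

print(f"x_bar = {x_bar}, S^2 = {s_squared}, t = {t:.4f}, degrees of freedom = {n - 1}")
```

Here \(\bar{X} = 8\) and \(S^2 = 40/4 = 10\), so \(t = 2/\sqrt{10/5} = \sqrt{2}\), and this value would be compared against a \(t_4\)-distribution.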

Just like with the standard normal distribution, there are tables of values you can use with the \(t\)-distribution.

Tables for the t-distribution

The table below is a section of a \(t\)-distribution probability table.

Table 1. \(t\)-distribution probability table

\[\begin{array}{c|ccc}
\upsilon & 0.100 & 0.050 & 0.025 \\ \hline
1 & 3.0777 & 6.3138 & 12.7062 \\
2 & 1.8856 & 2.9200 & 4.3027 \\
3 & 1.6377 & 2.3534 & 3.1824
\end{array}\]

Each entry in the table is the value that a random variable with the given number of degrees of freedom exceeds with the probability shown at the top of the column.

For example, suppose that \(X\) has a \(t\)-distribution with \(3\) degrees of freedom. The number \(3.1824\) in the lower right corner of the table above means that:

  • \(P(X>3.1824) = 0.025\); and

  • \(P(X<3.1824) = 1-0.025=0.975\).

Since the \(t\)-distribution is symmetric for any degrees of freedom, you also know that

  • \(P(X<-3.1824) = 0.025\); and

  • \(P(X>-3.1824) = 1-0.025=0.975\).

The area \(P(X>3.1824)=0.025\) for a \(t\)-distribution curve with \(3\) degrees of freedom is shaded in the graph below. Remember that when \(\upsilon = 3\) the sample size is \(n=4\).

Fig. 3 - \(t_3\)-distribution with the shaded area equaling \(0.025\).
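If no table or statistical calculator is at hand, the entries of Table 1 can be approximated numerically. The sketch below is one possible approach, not the method used to build the table: it integrates the standard \(t\) density (a formula assumed here, since the article does not state it) with Simpson's rule and then searches for the critical value by bisection. All function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(x, df, steps=1000):
    """P(X > x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry, half of the probability lies to the right of zero.
    return 0.5 - total * h / 3

def critical_value(p, df):
    """Value exceeded with probability p (0 < p < 0.5), found by bisection."""
    lo, hi = 0.0, 1.0
    while upper_tail(hi, df) > p:   # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(40):
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (1, 2, 3):
    row = [critical_value(p, df) for p in (0.100, 0.050, 0.025)]
    print(df, "  ".join(f"{v:.4f}" for v in row))
```

Because the distribution is symmetric, the value with probability \(p\) below it is just the negative of `critical_value(p, df)`.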

Let's take a look at an example.

Suppose \(X\) is a random variable that has a \(t\)-distribution with \(\upsilon = 3\) degrees of freedom. Find the value of \(s\) for which \(P(|X|<s)=0.80\).

Solution

Notice that \(P(|X|<s)=0.80\) is the same as \(P(|X|>s)=0.20\), since the two probabilities must add up to \(1\). Because the \(t\)-distribution is symmetric, that \(0.20\) is split evenly between the two tails, so \(P(X<-s)=0.1\) and \(P(X>s)=0.1\). It can often help to draw a picture of what you are looking for.

Fig. 4 - The total shaded area is \(0.2\).

You can use the \(t\)-distribution table or a calculator to find that the value of \(s\) that gives you \(P(X>s)=0.1\) is \(s=1.6377 \).
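The answer can be double-checked without a table. For exactly \(3\) degrees of freedom the cumulative distribution function has a closed form, which is a standard result that this article does not derive, so treat it as an assumption here. The sketch below plugs \(s = 1.6377\) into it; the function name is hypothetical.

```python
import math

def t3_cdf(x):
    """P(X < x) for a t-distribution with 3 degrees of freedom (closed form)."""
    u = x / math.sqrt(3)
    return 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi

s = 1.6377                          # value read from Table 1 (row 3, column 0.100)
central = t3_cdf(s) - t3_cdf(-s)    # P(|X| < s)
upper = 1 - t3_cdf(s)               # P(X > s)

print(f"P(|X| < {s}) = {central:.4f}")
print(f"P(X > {s})   = {upper:.4f}")
```

The central probability should agree with \(0.80\) and each tail with \(0.1\), up to the rounding of the table entry.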

Critical values for the t-distribution

Critical values are used when constructing confidence intervals, and they depend on the confidence level you are using. Remember that the confidence limits for a \(100(1-\alpha)\%\) confidence interval always have the form

sample statistic \(\pm\) (\(t\)-critical value)(standard error).

In the case of the \(t\)-distributions, the standard error is given by

\[ \text{standard error} = \frac{s}{\sqrt{n}},\]

and the \(t\)-critical value is

\[ \text{critical value} =t^*= t_{n-1}\left(\frac{\alpha}{2}\right) .\]
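To see how the pieces fit together, here is a minimal sketch that builds a \(95\%\) confidence interval for a mean from a sample of size \(n=3\). The three measurements are invented for illustration, and the critical value \(t_2(0.025) = 4.3027\) is simply read from Table 1 rather than computed.

```python
import math
import statistics

weights = [10.0, 12.0, 14.0]   # hypothetical sample, n = 3
t_star = 4.3027                # t_2(0.025), taken from Table 1

n = len(weights)
x_bar = statistics.mean(weights)
s = statistics.stdev(weights)            # sample standard deviation (n - 1 denominator)
standard_error = s / math.sqrt(n)
margin = t_star * standard_error         # (t-critical value)(standard error)

lower, upper = x_bar - margin, x_bar + margin
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

With such a small sample the critical value is large, so the interval is wide compared with the spread of the data.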

Suppose you have a \(t_2\)-distribution. Find the critical values for the \(90\%\), \(95\%\), and \(99\%\) confidence levels.

Solution:

For the \(90\%\) confidence level, the first goal is to find \(\alpha\). Here

\[ 90\% = 100\%(1-\alpha) \]

so

\[ 0.9 = 1 - \alpha\]

and

\[ \alpha = 0.1.\]

Then for the \(t\)-critical value,

\[\begin{align} t^*& = t_{n-1}\left(\frac{\alpha}{2}\right) \\ & = t_2\left(\frac{0.10}{2}\right) \\ &= t_2(0.05) \\&= 2.92 . \end{align}\]

Similarly, for the \(95\%\) confidence level the \(t\)-critical value is

\[\begin{align} t^*& = t_{n-1}\left(\frac{\alpha}{2}\right) \\ & = t_2\left(\frac{0.05}{2}\right) \\ &= t_2(0.025) \\&= 4.3027, \end{align} \]

and for the \(99\%\) confidence level the \(t\)-critical value is

\[\begin{align} t^*& = t_{n-1}\left(\frac{\alpha}{2}\right) \\ & = t_2\left(\frac{0.01}{2}\right) \\ &= t_2(0.005) \\&= 9.925 . \end{align}\]
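These three values can be reproduced with a few lines of code. For \(2\) degrees of freedom the cumulative distribution function has the closed form \(F(t) = \frac{1}{2} + \frac{t}{2\sqrt{t^2+2}}\), a standard result that is assumed here rather than taken from this article. Solving \(F(t^*) = 1 - \frac{\alpha}{2}\) gives \(t^* = c\sqrt{2/(1-c^2)}\), where \(c = 1-\alpha\) is the confidence level. The function name below is hypothetical.

```python
import math

def t2_critical(confidence):
    """Two-sided critical value for the t-distribution with 2 degrees of freedom.

    Uses the closed-form CDF F(t) = 1/2 + t / (2 * sqrt(t^2 + 2)),
    solved for the t that leaves (1 - confidence) / 2 in each tail.
    """
    c = confidence
    return c * math.sqrt(2 / (1 - c * c))

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: t* = {t2_critical(level):.4f}")
```

This shortcut only works for \(\upsilon = 2\); for other degrees of freedom you would go back to a table or a numerical routine.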

Notice that as the confidence level increases the \(t\)-critical value does as well, meaning that your confidence interval gets wider. That makes sense for two main reasons:

  • the more confident you want to be that the interval captures the population parameter, the wider the interval has to be; and

  • the \(t\)-critical value is related to the area under the \(t\)-distribution curve.

For example, at the \(80\%\) confidence level you are actually asking for \(80\%\) of the area under the curve to be captured in the shaded area. The higher your confidence level, the larger the shaded area!

Fig. 5 - \(t\)-distribution showing how confidence level relates to the area under the curve.

This is one of the reasons it can be helpful to draw a picture of what you are trying to find before you reach for a calculator or \(t\)-distribution table!

T-Distribution - Key takeaways

  • If the random sample \(X_1,X_2,X_3, \dots,X_n\) is normally distributed with an unknown variance, \(\sigma ^2\), then you have \[t=\dfrac{\bar{X}-\mu}{\dfrac{S}{\sqrt{n}}}\] where \(t\) has a \(t_{n-1}\)-distribution and \(S^2\) is an unbiased estimator for \(\sigma ^2\).
  • The number of degrees of freedom is determined by the sample size minus \(1\): \(\upsilon = n-1\).
  • As \(\upsilon \to \infty\), the \(t\)-distribution approaches \(\text{N}(0,1^2)\).
  • The critical value, \(t^*\), for the \(100(1-\alpha)\%\) confidence level can be found with the formula \[ t^*= t_{n-1}\left(\frac{\alpha}{2}\right). \]

Frequently Asked Questions about T-distribution

We use the \(t\)-distribution when we do not know the population variance and the sample size is small.

The \(t\)-distribution is similar to the normal distribution, but it is used when the population variance is unknown and the sample size is small.

Like the normal distribution, the \(t\)-distribution is a symmetric bell curve.
