Degrees of Freedom

Your life is made up of constraints on your time. When you go to work, how much time you spend studying, and the amount of sleep you need are all examples of constraints placed on you. You can think about how free you are in terms of how many constraints are placed upon you. 

In statistics, there are constraints as well. Chi-Squared tests use degrees of freedom to describe how free a test is, given the constraints placed on it. Read on to figure out how free the Chi-Squared test really is!

Degrees of freedom meaning

Many tests use degrees of freedom, but here you will see degrees of freedom as it relates to Chi-Squared tests. In general, the degrees of freedom count how many values in your data are still free to vary once the constraints have been taken into account. The more quantities you have fixed or estimated using your sample, the less freedom you have to make choices with your data. Of course, there is a more formal way to describe these constraints as well.

A constraint, also called a restriction, is a requirement placed on the data by the model for the data.

Let's look at an example to see what that means in practice.

Suppose you are doing an experiment where you roll a four-sided die \(200\) times. Then the sample size is \(n=200\). One constraint is that your experiment needs the sample size to be \(200\), so the frequencies for the four sides must add up to \(200\).

The number of constraints will also depend on the number of parameters you need to describe a distribution, and whether or not you know what these parameters are.

Next, let's look at how the constraints relate to degrees of freedom.

Degrees of freedom formula

For most cases, the formula

degrees of freedom = number of observed frequencies - number of constraints

can be used. If you go back to the example with the four-sided die above, there was one constraint. The number of observed frequencies is \(4\) (the number of sides on the die). So the degrees of freedom would be \(4-1 = 3\).

There is a more general formula for the degrees of freedom:

degrees of freedom = number of cells (after combining) - number of constraints.

You are probably wondering what a cell is and why you might combine it. Let's look at an example.

You send out a survey to \(200\) people asking how many pets people have. You get back the following table of responses.

Table 1. Responses from pet ownership survey.

| Pets | \(0\) | \(1\) | \(2\) | \(3\) | \(4\) | \(>4\) |
|---|---|---|---|---|---|---|
| Expected | \(60\) | \(72\) | \(31\) | \(20\) | \(7\) | \(10\) |

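However, suppose the model you are using is only a good approximation if none of the expected values falls below \(15\). (This threshold is used here for illustration; the usual rule for Chi-Squared tests is that no expected frequency should fall below \(5\).) The columns for \(4\) and \(>4\) pets have expected values of \(7\) and \(10\), so you could combine these last two columns of data (known as cells) to get the table below.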

Table 2. Responses from pet ownership survey with combined cells.

| Pets | \(0\) | \(1\) | \(2\) | \(3\) | \(>3\) |
|---|---|---|---|---|---|
| Expected | \(60\) | \(72\) | \(31\) | \(20\) | \(17\) |

Then there are \(5\) cells, and one constraint (that the total of the expected values is \(200\)). So the degrees of freedom is \(5 - 1= 4\).
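The combining step and the resulting count can be sketched in Python. This is a minimal illustration rather than a standard routine: the function names `combine_adjacent_cells` and `degrees_of_freedom` are made up for this example, and the greedy left-to-right merging rule is an assumption (there are other reasonable ways to choose which adjoining cells to merge).

```python
def combine_adjacent_cells(expected, minimum):
    """Merge adjoining cells, left to right, until each is at least `minimum`.

    Assumption: a simple greedy rule. Any leftover at the right-hand end
    is merged into the last cell that was completed.
    """
    combined = []
    running = 0
    for value in expected:
        running += value
        if running >= minimum:
            combined.append(running)
            running = 0
    if running > 0:
        if combined:
            combined[-1] += running
        else:
            combined.append(running)
    return combined


def degrees_of_freedom(num_cells, num_constraints):
    """degrees of freedom = number of cells (after combining) - number of constraints"""
    if num_constraints >= num_cells:
        raise ValueError("need more cells than constraints")
    return num_cells - num_constraints


# Expected values from Table 1, with the threshold of 15 used in the text.
expected = [60, 72, 31, 20, 7, 10]
cells = combine_adjacent_cells(expected, minimum=15)
print(cells)                               # [60, 72, 31, 20, 17]
print(degrees_of_freedom(len(cells), 1))   # 4
```

In a real test, the observed frequencies would need to be merged in exactly the same way as the expected ones.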

You will usually only combine adjoining cells in your tables of data. Next, let's look at the official definition of degrees of freedom with the Chi-Squared distribution.

Degrees of freedom definition

If you have observed frequencies and want to measure how far they are from the frequencies a model expects, you calculate the statistic \(X^2\), whose distribution is approximated by the \(\chi^2\) family of distributions. This is written as

\[\begin{align} X^2 &= \sum \frac{(O_t - E_t)^2}{E_t} \\ &= \sum \frac{O_t ^2}{E_t} -N \\ & \sim \chi^2, \end{align}\]

where \(O_t\) is the observed frequency, \(E_t\) is the expected frequency, and \(N\) is the total number of observations. Remember that the Chi-Squared tests are only a good approximation if none of the expected frequencies is below \(5\).
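The second line of the formula follows from expanding the square and using the fact that the observed and expected frequencies both add up to \(N\):

\[\sum \frac{(O_t - E_t)^2}{E_t} = \sum \frac{O_t^2}{E_t} - 2\sum O_t + \sum E_t = \sum \frac{O_t^2}{E_t} - 2N + N.\]

The short Python sketch below checks that the two forms agree. The observed counts are invented purely for illustration (they are not data from this article), and the expected counts assume the four-sided die is fair, so each side is expected \(200/4 = 50\) times. The function names are hypothetical.

```python
def chi_squared_statistic(observed, expected):
    """X^2 as the sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_squared_shortcut(observed, expected):
    """Equivalent form: sum of O^2 / E minus N, the total number of observations."""
    n = sum(observed)
    return sum(o ** 2 / e for o, e in zip(observed, expected)) - n


# Invented counts for 200 rolls of a four-sided die (illustration only).
observed = [44, 56, 52, 48]
# Assumption: the die is fair, so each side is expected 50 times.
expected = [50, 50, 50, 50]

x2 = chi_squared_statistic(observed, expected)
x2_short = chi_squared_shortcut(observed, expected)
print(round(x2, 6), round(x2_short, 6))  # both forms give 1.6
```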

For a reminder of this test and how to use it, see Chi Squared Tests.

The \(\chi^2\) distributions are actually a family of distributions that depend on the degrees of freedom. The degrees of freedom for this kind of distribution are written using the variable \(\nu\). Since you may need to combine cells when using \(\chi^2\) distributions, you would use the definition below.

For the \(\chi^2\) distribution, the number of degrees of freedom, \(\nu\) is given by

\[ \nu = \text{number of cells after combining}-1.\]

There will be cases where cells won't be combined, and in that case, you can simplify things a bit. If you go back to the four-sided die example, there are \(4\) possible outcomes on the die, so there are \(4\) cells, and none of them needs to be combined. So for this example \(\nu = 4 - 1 = 3\) when you use a Chi-Squared distribution to model it.

To be sure you know how many degrees of freedom you have when using the Chi-Squared distribution, it is written as a subscript: \(\chi^2_\nu \).

Degrees of freedom table

Once you know that you are using a Chi-Squared distribution with \(\nu\) degrees of freedom, you will need to use a degrees of freedom table so that you can do hypothesis tests. Here is a section out of a Chi-Squared table.

Table 3. Chi-Squared table.

| degrees of freedom | \(0.99\) | \(0.95\) | \(0.9\) | \(0.1\) | \(0.05\) | \(0.01\) |
|---|---|---|---|---|---|---|
| \(2\) | \(0.020\) | \(0.103\) | \(0.211\) | \(4.605\) | \(5.991\) | \(9.210\) |
| \(3\) | \(0.115\) | \(0.352\) | \(0.584\) | \(6.251\) | \(7.815\) | \(11.345\) |
| \(4\) | \(0.297\) | \(0.711\) | \(1.064\) | \(7.779\) | \(9.488\) | \(13.277\) |

The first column of the table contains the degrees of freedom, and the first row contains the significance levels, that is, the areas to the right of the critical value.

The notation for a critical value of \(\chi^2_\nu\) which is exceeded with probability \(a\%\) is \(\chi^2_\nu(a\%)\) or \(\chi^2_\nu(a/100)\).

Let's take an example using the Chi-Squared table.

Find the critical value for \(\chi^2_3(0.01)\).

Solution:

The notation for \(\chi^2_3(0.01)\) tells you that there are \(3\) degrees of freedom and you are interested in the \(0.01\) column of the table. Looking at the intersection of the row and column in the table above, you get \(11.345\). So

\[\chi^2_3(0.01) = 11.345 . \]

There is a second use for the table, as demonstrated in the next example.

Find the smallest value of \(y\) such that \(P(\chi^2_3 > y) = 0.95\).

Solution:

Remember that the significance level is the probability that the distribution exceeds the critical value. So asking for the smallest value \(y\) where \(P(\chi^2_3 > y) = 0.95\) is the same as asking what \(\chi^2_3(0.95)\) is. Using the Chi-Squared table you can see that \(\chi^2_3(0.95) =0.352 \), so \(y=0.352\).

Of course, a table can't list all of the possible values. If you need a value which is not in the table, there are many different statistics packages or calculators that can give you Chi-Squared table values.
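As a rough illustration of what such software does, the sketch below computes critical values using only Python's standard library. The function names `chi2_upper_tail` and `chi2_critical_value` are hypothetical, and the method (a series for the regularized incomplete gamma function, followed by bisection) is just one simple option chosen for this example, not necessarily what any particular package uses. In practice, a library routine such as `scipy.stats.chi2.isf` would be the usual choice.

```python
import math


def chi2_upper_tail(x, nu):
    """P(chi^2_nu > x), using a series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, half_x = nu / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    # Sum half_x^n / (a (a+1) ... (a+n)) until the terms become negligible.
    while term > 1e-15 * total:
        n += 1
        term *= half_x / (a + n)
        total += term
    log_prefactor = a * math.log(half_x) - half_x - math.lgamma(a)
    return 1.0 - math.exp(log_prefactor) * total


def chi2_critical_value(p, nu, tol=1e-9):
    """Find y with P(chi^2_nu > y) = p by bisection (the upper tail decreases in y)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_upper_tail(hi, nu) > p:   # widen the bracket until it contains y
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, nu) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(chi2_critical_value(0.01, 3), 3))  # about 11.345, as in the first example
print(round(chi2_critical_value(0.95, 3), 3))  # about 0.352, as in the second example
```

The same function can be called with significance levels or degrees of freedom that the printed table does not list.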

Degrees of freedom t-test

The way the degrees of freedom in a \(t\)-test is calculated depends on whether or not you are using paired samples. For more information on these topics, see the articles T-distribution and Paired t-test.

Degrees of Freedom - Key takeaways

  • A constraint, also called a restriction, is a requirement placed on the data by the model for the data.
  • In most cases, degrees of freedom = number of observed frequencies - number of constraints.
  • A more general formula for degrees of freedom is: degrees of freedom = number of cells (after combining) - number of constraints.
  • For the \(\chi^2\) distribution, the number of degrees of freedom, \(\nu\) is given by

    \[ \nu = \text{number of cells after combining}-1.\]

Frequently Asked Questions about Degrees of Freedom

The number of degrees of freedom depends on the kind of test you are doing. Sometimes it is the sample size minus 1, sometimes it is the sample size minus 2. For example, in a paired t-test the degrees of freedom is the sample size minus 1.

In statistics, the degrees of freedom tells you how many independent values can vary without breaking any constraints in the problem.
