Confidence Interval for Slope of Regression Line

How confident would you be in saying that the hours of sleep you get at night and your success in school are related, and that the relationship between them is linear?


In this article, you will learn about the confidence interval for the slope of a regression model: its meaning, the conditions needed to construct one, the formula, and how to calculate it. For information on drawing conclusions about a population from the confidence interval, see the article Justifying Claims Based on the Confidence Interval for the Slope of a Regression Model.

Meaning of Confidence Interval for Slope of Regression Line

By now you know that when there is a linear relationship between a variable \(x\) and a variable \(y\) – the linear correlation coefficient \(r\) is non-zero – you can model it with a linear regression. This regression consists of:

\[\hat{y}=\beta_0+\beta_1x\]

where:

  • \(\beta_0\) is the y-intercept;

  • \(\beta_1\) is the slope of the regression;

  • \(x\) is the independent variable; and

  • \(\hat{y}\) is the predicted value of the dependent variable.

For a better reminder of this topic, see our article Least-Squares Regression. Remember that the correlation coefficient \(r\) tells you how much of a correlation there is between the two variables. If \(r\) is close to zero, then there is little to no correlation between the variables, while \(r\) values close to \(-1\) or \(1\) indicate that there is a strong correlation between the two variables.

On the other hand, the slope \(\beta_1\) represents how much \(\hat{y}\) changes in response to changes in \(x\): for each one-unit increase in \(x\), \(\hat{y}\) changes by \(\beta_1\) units.

Suppose you suspect that an increase in book price means that fewer books will be sold. You collect data, and find the line of best fit to be:

\[\hat{y}=3500-10x\]

where \(x\) is the price of the book and \(\hat{y}\) is the predicted number of books sold. What does a \(\$1\) increase in \(x\) mean for the number of books you predict will sell?

Solution:

From the equation given you can see that \(\beta_0 = 3500\) and \(\beta_1 = -10\). Notice that the slope of the regression model is negative. That means an increase of \(\$1\) in the book price corresponds to a predicted increase of \(-10\) books sold, or in other words you can predict that 10 fewer books will be sold for every dollar increase in book price.
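The same reading of the slope can be checked numerically. The short sketch below hard-codes the line from this example; the function name `predicted_sales` and the two prices are illustrative choices, not part of the original problem.

```python
def predicted_sales(price):
    """Predicted number of books sold at a given price, using y-hat = 3500 - 10x."""
    intercept = 3500  # beta_0 from the example
    slope = -10       # beta_1 from the example
    return intercept + slope * price

# Raising the price by $1 changes the prediction by exactly the slope.
change = predicted_sales(21) - predicted_sales(20)
print(change)  # -10
```

Any two prices that differ by one dollar give the same difference, which is what it means for the slope to be constant along a line.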

By calculating a confidence interval with a high confidence level, say \(c\%\), for the slope \(\beta_1\), you get two values that define the limits of a range of values in which you can find the slope. You can say with \(c\%\) confidence that the value of the slope will be between those two values.

Furthermore, you can say that the method used to construct the interval is successful in capturing the actual slope of the linear regression model about \(c\%\) of the time.

Conditions for Confidence Interval for the Slope of a Regression Line

The conditions for constructing a confidence interval for the slope of a linear regression are the same as for constructing a linear regression. These conditions are:

  1. Quantitative variable condition: Correlation only applies if both variables are quantitative.

  2. Straight enough condition: Look at the scatter plot and make sure your data has an approximately linear relationship. Correlation only measures the strength in a linear association. This can also be done by looking at the correlation coefficient of the data.

  3. Independence: The observations should be independent of one another. Data should be collected randomly, and if sampling is done without replacement, the sample size should be less than or equal to \(10\%\) of the total population.

  4. Normal: For any fixed value of \(x\), the values of \(y\) (equivalently, the residuals) are approximately normally distributed.

Formula of Confidence Interval for Slope of Regression Line

Like any confidence interval you have studied so far, a confidence interval for the slope \(\beta_1\) of the least squares regression line has the following structure:

sample statistic – margin of error \(\le \beta_1\le\) sample statistic + margin of error,

where margin of error = critical value \(\times\) standard error.

Now, you just have to understand what each of those three elements is for the slope \(\beta_1\):

  • The sample statistic will be \(\hat{\beta}_1\), the point estimator of the slope \(\beta_1\);

  • For the margin of error:

    • this time the critical value will be of a \(t\)-distribution with \(n-2\) degrees of freedom, i.e., \(t\) with \(df=n-2\);

    • the standard error for the slope, written \(SE_{\beta_1}\), will be:\[SE_{\beta_1}=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}\]where \(s\) is the sample standard deviation of the residuals, calculated as:\[s=\sqrt{\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{n-2}}\]

The reason why you'll be using a critical \(t\) value instead of a critical \(z\) value is that the standard error of the slope \(\hat{\beta}_1\) is an estimate. You might not actually know the standard deviation of the sampling distribution.

Thus, the formula for a confidence interval for the slope \(\beta_1\) is:

\[\hat{\beta}_1- t\cdot SE_{\beta_1}\le \beta_1\le \hat{\beta}_1+ t\cdot SE_{\beta_1}\]

or an even shorter version:

\[\hat{\beta}_1\pm t\cdot SE_{\beta_1}\]

This confidence interval is for any confidence level, but confidence levels that you will see most often are \(90\%\), \(95\%\), and \(99\%\). These are the values you should consider when calculating the critical value \(t\).
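The formula translates fairly directly into code, and a small simulation can illustrate the claim that the method captures the true slope about \(c\%\) of the time. The following is a minimal sketch using only Python's standard library. The function names, the "true" line \(y = 2 + 1.5x\), the noise level, and the \(x\)-values are all made-up assumptions for illustration; \(t = 3.182\) is the \(95\%\) critical value for \(df = 3\).

```python
import math
import random

def slope_interval(xs, ys, t_crit):
    """Return (lower, upper) for the slope, computed as b1 +/- t * SE."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    # Least-squares estimates of slope and intercept
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    # s: standard deviation of the residuals, with n - 2 degrees of freedom
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    se = s / math.sqrt(sxx)
    return b1 - t_crit * se, b1 + t_crit * se

def coverage(true_b0, true_b1, sigma, xs, t_crit, trials, seed=0):
    """Fraction of simulated samples whose interval contains the true slope."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Simulate y-values around the assumed true line with normal noise
        ys = [true_b0 + true_b1 * x + rng.gauss(0, sigma) for x in xs]
        lower, upper = slope_interval(xs, ys, t_crit)
        if lower <= true_b1 <= upper:
            hits += 1
    return hits / trials

xs = [1, 2, 2, 3, 5]  # illustrative x-values (n = 5, so df = 3)
rate = coverage(true_b0=2.0, true_b1=1.5, sigma=1.0, xs=xs, t_crit=3.182, trials=5000)
print(f"Fraction of intervals containing the true slope: {rate:.3f}")
```

With normally distributed errors, the printed fraction should land close to \(0.95\). Any single interval either contains the true slope or it does not; the confidence level describes how often the procedure succeeds over many samples.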

Calculations for Confidence Interval for Slope of Regression Line

From what you have read so far, the formula for a confidence interval for the slope suggests a set of steps you should follow when you want to find it.

Step 1: Find the sample statistic \(\hat{\beta}_1\).

You get the value of the point estimator \(\hat{\beta}_1\) by constructing the regression line for the data set you are working with.
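As a reminder of what constructing the regression line involves, here is a minimal sketch of the usual least-squares formulas, \(\hat{\beta}_1 = \sum (x_i-\bar{x})(y_i-\bar{y}) / \sum (x_i-\bar{x})^2\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}\). The function name is illustrative, and the data are the five points used in the worked example later in this article.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

b0_hat, b1_hat = least_squares_line([1, 2, 2, 3, 5], [3, 4, 7, 8, 9])
print(f"y-hat = {b0_hat:.2f} + {b1_hat:.2f}x")  # y-hat = 2.41 + 1.46x
```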

Step 2: Select a confidence level \(c\%\).

The confidence level describes the uncertainty of a sampling method. You will most often be asked for a confidence level of \(90\%\), \(95\%\), or \(99\%\).

The purpose of knowing the confidence level is to be able to find the critical value \(t\), by consulting a \(t\) table, with two bits of information:

  1. the degrees of freedom, given by\[ \text{sample size } -2 = n-2\]where \(n\) is the sample size; and

  2. the confidence level adjusted for the table you are using.

Depending on the table you consult, the confidence level may have to be adjusted to \(1-\tfrac{\alpha}{2}\) or to \(\tfrac{\alpha}{2} \).

For example, for a confidence level of \(99\%\), you know that \(c=100(1-\alpha)\%\) and so:

\[\begin{align} 99\%&=100\%(1-\alpha) \\ 0.99&=1-\alpha \\ \alpha&=0.01 .\end{align}\]

Now, depending on the table you consult, you'll do:

\[1-\frac{\alpha}{2}=1-\frac{0.01}{2}=0.995\]

or

\[\frac{\alpha}{2} = \frac{0.01}{2}=0.005\]
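If no \(t\) table is at hand, the critical value can also be approximated numerically. The sketch below is one possible approach using only Python's standard library: it integrates the \(t\) density with Simpson's rule and then searches for the value whose cumulative probability is \(1-\tfrac{\alpha}{2}\). The helper names are illustrative assumptions; in practice a statistics library's inverse CDF would normally be used instead.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The distribution is symmetric, so P(T <= 0) = 0.5
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-sided critical value: the t with P(T <= t) = 1 - alpha/2."""
    alpha = 1 - confidence
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(60):            # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_critical(0.95, 3), 3))   # approximately 3.182
print(round(t_critical(0.99, 11), 3))  # approximately 3.106
```

Passing the confidence level as a proportion (for example `0.99`) keeps the \(\alpha\) adjustment inside the function, so there is no need to remember which form a particular table expects.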

Step 3: Find the margin of error \(t\cdot SE_{\beta_1}\).

As you already know, the margin of error is the product of the critical value \(t\) with the value of the standard error. The formula for the standard error is:

\[SE_{\beta_1}=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}\]

where \(s\) is the sample standard deviation.

Step 4: Find the confidence interval.

Here you just have to replace the values you got in the previous step in the formula:

\[\hat{\beta}_1\pm t\cdot SE_{\beta_1}\]

Let's look at an example where you can apply the steps by hand.

Given the data set in the table below,

\(x\)    \(y\)
1    3
2    4
2    7
3    8
5    9

Table 1. Example data.

find a confidence interval of \(95\%\) for the slope knowing that the least squares regression line of this data is:

\[\hat{y}=2.41+1.46x\]

the sample variance is \(s^2=2.39\) and \(t=3.182\).

Solution:

Step 1: Find the sample statistic \(\hat{\beta}_1\)

You were given the equation of the regression line, so you know that \(\hat{\beta}_1=1.46\).

Step 2: Select a confidence level \(c\%\)

The confidence level is given: \(c=95\%\). You’re also given the critical value \(t=3.182\).

If you had to consult a \(t\) table, you would first see that \(df=5-2=3\), second that \(95\%=100\%(1-\alpha)\) if and only if \(0.95=1-\alpha\) if and only if \(\alpha=0.05\), and then that \(1-\alpha/2=1-0.05/2=0.975\).

Step 3: Find the margin of error \(t\cdot SE_{\beta_1}\).

You know that:

\[SE_{\beta_1}=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}\]

You know \(s^2=2.39\), so the sample standard deviation is \(s=1.55\).

For the sum in the denominator, you first need the sample mean of the \(x\)-values.

\[\bar{x}=\frac{1+2+2+3+5}{5}=2.6\]

Now the sum:

\[\begin{align} \sum_{i=1}^{n}(x_i-\bar{x})^2&=(1-2.6)^2+(2-2.6)^2+(2-2.6)^2\\&\quad+(3-2.6)^2+(5-2.6)^2 \\ &=9.2 \end{align}\]

Finally, for the margin of error:

\[\begin{align} t\cdot SE_{\beta_1}&=3.182\left( \frac{1.55}{\sqrt{9.2}}\right)\\ &=3.182(0.51)\\ &=1.62282. \end{align} \]

Step 4: Find the confidence interval

Now just substitute the values you determined in the previous steps into the formula:

\[\hat{\beta}_1\pm t\cdot SE_{\beta_1}= 1.46\pm 1.62282\]

which gives you

\[ -0.16282\le \beta_1 \le 3.08282\ \]

If you have satisfied the conditions for doing a confidence interval for the slope of a regression model, you can say with \(95\%\) confidence that the true value of the slope \(\beta_1\) is between \(-0.16282\) and \(3.08282\).
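The hand calculation can be double-checked with a few lines of code. This sketch takes the slope estimate, \(s^2\), and \(t\) exactly as given in the problem and repeats Steps 3 and 4; the variable names are illustrative. Because it does not round intermediate values, the endpoints may differ from the hand-computed ones in the third decimal place.

```python
import math

xs = [1, 2, 2, 3, 5]  # x-values from Table 1
slope_hat = 1.46      # from the given line y-hat = 2.41 + 1.46x
s_squared = 2.39      # sample variance given in the problem
t_crit = 3.182        # critical value given in the problem (95%, df = 3)

# Step 3: standard error and margin of error
x_bar = sum(xs) / len(xs)
sxx = sum((x - x_bar) ** 2 for x in xs)
se = math.sqrt(s_squared) / math.sqrt(sxx)
margin = t_crit * se

# Step 4: confidence interval
lower, upper = slope_hat - margin, slope_hat + margin
print(f"x_bar = {x_bar:.1f}, sum of squares = {sxx:.1f}")
print(f"SE = {se:.4f}, margin of error = {margin:.4f}")
print(f"interval = ({lower:.4f}, {upper:.4f})")
```

The result is roughly \((-0.16, 3.08)\), in agreement with the interval found by hand.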

Example of Confidence Interval for Slope of Regression Line

Let's look at an example of doing the calculations necessary for finding the confidence interval for the slope of a regression line.

Between \(2010\) and \(2022\), data was collected on the average cost of college textbooks required for a semester that year. That data is in the table below. Find the confidence interval for the slope of the regression line at a \(99\%\) confidence level.

Year    Average Book Cost (in \(\$\))
2010    660
2011    678
2012    596
2013    550
2014    770
2015    790
2016    860
2017    1125
2018    1100
2019    1300
2020    1320
2021    1369
2022    1400

Table 2. Data sample.

Solution:

First, draw a scatter plot of the data.

Figure 1. Scatter plot of average book cost vs. year, showing an approximately linear, increasing relationship.

It certainly looks reasonable to consider a linear regression model, and there are no obvious outliers. Assume year \(2010\) corresponds to \(x=1\). You can find the correlation coefficient \(r = 0.96\) and the line of best fit \(\hat{y} = 79.9x+ 458.1\). With the correlation coefficient being close to \(1\) you can see there is a strong linear relationship between the year and the average book cost.

For a reminder of how to find the correlation coefficient and the line of best fit, see Linear Regression and Least-Squares Regression.

In fact if you graph the line of best fit you can see immediately that there is a strong linear relationship.

Figure 2. Scatter plot of average book cost vs. year with the line of best fit.

Now let's follow the steps to find the confidence interval for the slope of the regression line.

Step 1: Find the sample statistic \(\hat{\beta}_1\).

The line of best fit is \(\hat{y} = 79.9x + 458.1\), so \(\hat{\beta}_1 = 79.9\). This is the point estimate of the slope.

Step 2: Select a confidence level \(c\%\).

The confidence level for this problem is \(99\%\). There are \(13\) data points, so the degrees of freedom are \(13-2=11\). Consulting a \(t\)-table then gives a critical value of approximately \(3.11\), so \(t = 3.11\).

Step 3: Find the margin of error \(t\cdot SE_{\beta_1}\).

To do this you first need to calculate \(s^2\). Given the equation for the line:

\[ y_i-\hat{y}_i = y_i - (79.9x_i + 458.1 ) \]

To make the calculations for \(s\) a little easier to follow it can help to make a table.

\(x_i\)    \(y_i\)    \(\hat{y}_i\)    \((y_i-\hat{y}_i)^2\)
1      660     538       14884
2      678     617.9     3612.01
3      596     697.8     10363.24
4      550     777.7     51847.29
5      770     857.6     7673.76
6      790     937.5     21756.25
7      860     1017.4    24774.76
8      1125    1097.3    767.29
9      1100    1177.2    5959.84
10     1300    1257.1    1840.41
11     1320    1337      289
12     1369    1416.9    2294.41
13     1400    1496.8    9370.24

Table 3. Predicted values and squared residuals.

Using the formula and the information in the table above:

\[\begin{align} s &=\sqrt{\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{n-2}} \\ &= \sqrt{\frac{\sum_{i=1}^{13}(y_i-\hat{y}_i)^2}{11}} \\ &= \sqrt{\frac{155432.5 }{11}} \\ &\approx 118.9 \end{align}\]

For the denominator of the standard error, the mean of the \(x\)-values \(1, 2, \dots, 13\) is \(\bar{x}=7\), so \(\sum_{i=1}^{13}(x_i-\bar{x})^2 = 182\). Then you have:

\[\begin{align} SE_{\beta_1}&=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}} \\ &= \frac{118.9}{\sqrt{182}} \\ &\approx 8.81 \end{align} \]

You have already found the critical value \(t = 3.11\), so:

\[ \begin{align} \text{margin of error} &= t\cdot SE_{\beta_1} \\ &= (3.11)(8.81 ) \\ &\approx 27.4 \end{align}\]

Step 4: Find the confidence interval

Substituting the values you found in the previous steps into the formula:

\[\hat{\beta}_1\pm t\cdot SE_{\beta_1}= 79.9\pm 27.4\]

which gives you a confidence interval of \( (52.5, 107.3) \).

If you have satisfied the conditions for doing a confidence interval for the slope of a regression model, you can say with \(99\%\) confidence that the true value of the slope \(\beta_1\) is between \(52.5 \) and \(107.3 \).
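Building the residual table by hand is error-prone, so it can help to let a short script do the arithmetic. The sketch below takes the line \(\hat{y} = 79.9x + 458.1\) and the critical value \(t = 3.11\) as stated in this example, rebuilds the table of predicted values and squared residuals from the data in Table 2, and carries the calculation through to the interval. Variable names are illustrative, and intermediate values are not rounded.

```python
import math

# Average book cost for 2010..2022 (Table 2); 2010 corresponds to x = 1
costs = [660, 678, 596, 550, 770, 790, 860, 1125, 1100, 1300, 1320, 1369, 1400]
xs = list(range(1, len(costs) + 1))

intercept, slope_hat = 458.1, 79.9  # line of best fit as stated in the example
t_crit = 3.11                       # 99% critical value for df = 11, as stated

# One row per data point: x, y, predicted y, squared residual
rows = []
for x, y in zip(xs, costs):
    y_hat = intercept + slope_hat * x
    rows.append((x, y, y_hat, (y - y_hat) ** 2))

n = len(xs)
sse = sum(row[3] for row in rows)         # sum of squared residuals
s = math.sqrt(sse / (n - 2))              # standard deviation of the residuals
x_bar = sum(xs) / n
sxx = sum((x - x_bar) ** 2 for x in xs)   # sum of squared deviations of x
se = s / math.sqrt(sxx)                   # note the square root in the denominator
margin = t_crit * se

for x, y, y_hat, sq in rows:
    print(f"{x:>2} {y:>5} {y_hat:>8.1f} {sq:>10.2f}")
print(f"s = {s:.1f}, SE = {se:.2f}, margin of error = {margin:.1f}")
print(f"interval = ({slope_hat - margin:.1f}, {slope_hat + margin:.1f})")
```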

Confidence Intervals for the Slope of a Regression Model – Key takeaways

  • By calculating a confidence interval with a high confidence level, say \(c\%\), for the slope \(\beta_1\), you get two values that define the limits of a range of values in which you can find the slope. You can say with \(c\%\) confidence that the value of the slope will be between those two values.
  • You can say that the method used to construct the interval is successful in capturing the actual slope of the linear regression model about \(c\%\) of the time.
  • The formula for the confidence interval for the slope of a regression model is \[\hat{\beta}_1\pm t\cdot SE_{\beta_1}\, ,\] where
    • \(\hat{\beta}_1\) is the estimate of the slope \(\beta_1\)
    • \(t\cdot SE_{\beta_1}\) is the margin of error
    • \(t\) is the critical value from the \(t\)-distribution with parameter \(df=n-2\) (\(n-2\) degrees of freedom)
    • \(SE_{\beta_1}\) is the standard error for the slope

Frequently Asked Questions about Confidence Interval for Slope of Regression Line

About c% of the time, an interval constructed with this method will contain the true value of the slope β1 that you're estimating.

It is a range of values, centered on the slope estimate β1*, in which you have c% confidence that the true slope β1 lies.

For a small data set like

x  1  2  2  3  5

y  3  4  7  8  9

the 95% confidence interval for the slope is

-0.16282 ≤ β1 ≤ 3.08282

To calculate the confidence interval for the slope, follow these steps:

   Step 1: Find the slope estimate, β1*

   Step 2: Select a confidence level c%

   Step 3: Find the margin of error t×SEβ1

   Step 4: Find the confidence interval

The formula is β1* ± t×SEβ1, where β1* is the slope estimate, t is the critical value, and SEβ1 is the standard error of the slope.

