Sample Mean

You are about to finish high school and have decided it is time for a change of scenery, so you want to attend a university in another city, say San Francisco, California. Among your considerations are how much you will pay in rent for an apartment and how much you will spend on public transportation. So, you decide to ask some acquaintances who live there how much they spend on average.


This process is called taking a sample mean. In this article you will find its definition, how to calculate it, its standard deviation and variance, its sampling distribution, and examples.

Definition of Sample Means

The mean of a set of numbers is just the average, that is, the sum of all the elements in the set divided by the number of elements in the set.

The sample mean is the average of the values obtained in the sample.

It is easy to see that if two sets are different, they will most likely also have different means.

Calculation of Sample Means

The sample mean is denoted by \(\overline{x}\), and is calculated by adding up all the values obtained from the sample and dividing by the total sample size \(n\). The process is the same as averaging a data set. Therefore, the formula is \[\overline{x}=\frac{x_1+\ldots+x_n}{n},\]

where \(\overline{x}\) is the sample mean, \(x_i\) is each element in the sample and \(n\) is the sample size.

Let's go back to the San Francisco example. Suppose you asked \(5\) of your acquaintances how much they spend on public transport per week, and they said \(\$20\), \(\$25\), \(\$27\), \(\$43\), and \(\$50\). So, the sample mean is calculated by:

\[\overline{x}=\frac{20+25+27+43+50}{5}=\frac{165}{5}=33.\]

Therefore, for this sample, the average amount spent on public transportation in a week is \(\$33\).
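The same calculation can be sketched in a few lines of Python. The variable names are illustrative only; the data are the five values from the example.

```python
# Weekly public transport spending (in dollars) reported by the five acquaintances.
weekly_costs = [20, 25, 27, 43, 50]

# Sample mean: the sum of the values divided by the sample size n.
n = len(weekly_costs)
sample_mean = sum(weekly_costs) / n

print(sample_mean)  # 33.0
```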

Standard Deviation and Variance of the Sample Mean

Since the variance is the square of the standard deviation, to calculate either value, two cases must be considered:

1. You know the population standard deviation.

2. You do not know the population standard deviation.

The following section shows how to calculate this value for each case.

The Mean and Standard Deviation Formula for Sample Means

The mean of the sample mean, denoted by \(\mu_\overline{x}\), is equal to the population mean. That is, if \(\mu\) is the population mean, then \[\mu_\overline{x}=\mu.\]

To calculate the standard deviation of the sample mean (also called the standard error of the mean (SEM)), denoted by \(\sigma_\overline{x}\), the two previous cases must be considered. Let's explore them in turn.

Calculating the Sample Mean Standard Deviation using the Population Standard Deviation

If the sample of size \(n\) is drawn from a population whose standard deviation \(\sigma\) is known, then the standard deviation of the sample mean will be given by \[\sigma_\overline{x}=\frac{\sigma}{\sqrt{n}}.\]

A sample of \(81\) people was taken from a population with standard deviation \(45\). What is the standard deviation of the sample mean?

Solution:

Using the formula stated before, the standard deviation of the sample mean is \[\sigma_\overline{x}=\frac{45}{\sqrt{81}}=\frac{45}{9}=5.\]

Note that to calculate this, you do not need to know anything about the sample besides its size.
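As a minimal sketch, the formula can be wrapped in a small helper. The function name `standard_error` is a hypothetical choice for this illustration, not something defined in the article.

```python
from math import sqrt


def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean when the population sigma is known."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / sqrt(n)


# Values from the example: population standard deviation 45, sample size 81.
sem = standard_error(sigma=45, n=81)
print(sem)  # 5.0
```

Only `sigma` and `n` are passed in, which mirrors the remark that nothing else about the sample is needed.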

Calculating the Sample Mean Standard Deviation without using the Population Standard Deviation

Sometimes, when you want to estimate the mean of a population, the only information you have is the data from the sample you took. Fortunately, if the sample is large enough (a common rule of thumb is a sample size greater than \(30\)), the standard deviation of the sample mean can be approximated using the sample standard deviation. Thus, for a sample of size \(n\), the standard deviation of the sample mean is \[\sigma_\overline{x}\approx\frac{s}{\sqrt{n}},\] where \(s\) is the sample standard deviation (see the article Standard Deviation for more information) calculated by:

\[s=\sqrt{\frac{(x_1-\overline{x})^2+\ldots+(x_n-\overline{x})^2}{n-1}},\]

where \(x_i\) is each element in the sample and \(\overline{x}\) is the sample mean.

❗❗ The sample standard deviation measures the dispersion of data within the sample, while the sample mean standard deviation measures the dispersion between the means from different samples.
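The sketch below applies both formulas to a sample of \(36\) weekly transport costs. The data are made up for illustration and do not come from the article; the variable names are also only assumptions.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of 36 weekly transport costs (made-up values, in dollars).
sample = [
    20, 25, 27, 43, 50, 31, 38, 29, 35, 41, 33, 36,
    28, 45, 39, 30, 34, 42, 26, 37, 32, 40, 24, 44,
    36, 33, 29, 38, 35, 31, 47, 27, 39, 34, 30, 41,
]

n = len(sample)
x_bar = mean(sample)

# Sample standard deviation with the n - 1 denominator, written out as in the formula.
s = sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))

# statistics.stdev uses the same n - 1 denominator, so the two values should agree.
assert abs(s - stdev(sample)) < 1e-9

# Estimated standard deviation of the sample mean (standard error of the mean).
sem_estimate = s / sqrt(n)

print(f"n = {n}, sample mean = {x_bar:.2f}, s = {s:.2f}, SEM is about {sem_estimate:.2f}")
```

Here `s` describes how spread out the individual costs are, while `sem_estimate` describes how much the sample mean itself would be expected to vary from one sample of this size to another.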

Sampling Distribution of the Mean

Recall the sampling distribution definition.

The distribution of the sample mean (or sampling distribution of the mean) is the distribution obtained by considering all the means that can be obtained from fixed-size samples in a population.

If \(\overline{x}\) is the sample mean of a sample of size \(n\) from a population with mean \(\mu\) and standard deviation \(\sigma\), then the sampling distribution of \(\overline{x}\) has mean and standard deviation given by \[\mu_\overline{x}=\mu\,\text{ and }\,\sigma_\overline{x}=\frac{\sigma}{\sqrt{n}}.\]
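One way to see where these two formulas come from is the short derivation below. It assumes that the \(n\) observations are independent and that each has mean \(\mu\) and variance \(\sigma^2\); this assumption is not spelled out in the statement above. Here \(E[\cdot]\) denotes the expected value and \(\operatorname{Var}(\cdot)\) the variance.

\[\begin{align} \mu_{\overline{x}} &= E\left[\frac{x_1+\ldots+x_n}{n}\right] = \frac{E[x_1]+\ldots+E[x_n]}{n} = \frac{n\mu}{n} = \mu, \\ \sigma_{\overline{x}}^2 &= \operatorname{Var}\left(\frac{x_1+\ldots+x_n}{n}\right) = \frac{\operatorname{Var}(x_1)+\ldots+\operatorname{Var}(x_n)}{n^2} = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}. \end{align}\]

Taking the square root of the variance \(\sigma^2/n\) gives the standard deviation \(\sigma/\sqrt{n}\).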

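Furthermore, if the population distribution is normal, then the sampling distribution of \(\overline{x}\) is also normal. If the population is not normal but the sample size is large enough, the Central Limit Theorem says that the sampling distribution of \(\overline{x}\) is approximately normal; \(n\geq 30\) is a common rule of thumb for "large enough".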

When the sampling distribution is normal, you can calculate probabilities using the standard normal distribution table. To do this, convert the sample mean \(\overline{x}\) into a \(z\)-score using the following formula:

\[z=\frac{\overline{x}-\mu_\overline{x}}{\sigma_\overline{x}}=\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}.\]

You may be wondering, what happens when the population distribution is not normal and the sample size is small? Unfortunately, for those cases, there is no general way to obtain the shape of the sampling distribution.

Let's see an example of a graph of a sampling distribution of the mean.

Going back to the example of public transportation in San Francisco, suppose you had managed to survey thousands of people, split them into groups of size \(10\), averaged the spending within each group, and plotted the resulting group means to obtain the following graph.

Figure 1. Relative frequency histogram of 360 sample means from samples of size 10 for the public transport example.

This graph approximates the graph of the sampling distribution of the mean. Based on the graph, you can estimate that an average of about \(\$37\) is spent on public transportation in San Francisco.
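A graph of this kind can be imitated with a small simulation. This is only a sketch under stated assumptions: the survey data are not available, so the code invents \(3600\) weekly costs from a made-up right-skewed distribution with a mean of \(\$37\), splits them into \(360\) groups of \(10\), and prints a text histogram of the group means.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical survey: 3600 weekly costs from a made-up right-skewed distribution
# (gamma with shape 9.25 and scale 4, so its theoretical mean is 37 dollars).
population = [rng.gammavariate(9.25, 4.0) for _ in range(3600)]

# Split the survey into consecutive groups of 10 people and average each group.
group_size = 10
sample_means = [
    mean(population[i:i + group_size])
    for i in range(0, len(population), group_size)
]

# Count how many sample means fall into each 2-dollar-wide bin.
bin_width = 2
counts = {}
for m in sample_means:
    left_edge = int(m // bin_width) * bin_width
    counts[left_edge] = counts.get(left_edge, 0) + 1

# Print the relative frequency of each bin as a simple text histogram.
for left_edge in sorted(counts):
    rel_freq = counts[left_edge] / len(sample_means)
    bar = "#" * round(rel_freq * 100)
    print(f"{left_edge:>3} to {left_edge + bin_width:<3} {rel_freq:.3f} {bar}")

print(f"mean of the sample means:   {mean(sample_means):.2f}")
print(f"spread of the sample means: {stdev(sample_means):.2f}")
print(f"spread of individual costs: {stdev(population):.2f}")
```

Because every group has the same size, the mean of the group means matches the mean of all surveyed costs, while the group means are much less spread out than the individual costs.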

Examples of Sample Means

Let's see an example of how to calculate probabilities.

It is assumed that the human body temperature distribution has a mean of \(98.6\, °F\) with a standard deviation of \(2\, °F\). If a sample of \(49\) people is taken at random, calculate the following probabilities:

(a) the average temperature of the sample is less than \(98\), that is, \(P(\overline{x}<98)\).

(b) the average temperature of the sample is greater than \(99\), that is, \(P(\overline{x}>99)\).

(c) the average temperature is between \(98\) and \(99\), that is, \(P(98<\overline{x}<99)\).

Solution:

1. Since the sample size is \(n=49>30\), you can assume the sampling distribution is approximately normal.

2. Calculating the mean and the standard deviation of the sample mean. Using the formulas stated before, \(\mu_\overline{x}=98.6\) and the standard deviation \(\sigma_\overline{x}=2/\sqrt{49}=2/7\).

3. Converting the values into \(z\)-scores and using the standard normal table (see the article Standard Normal Distribution for more information), you'll have for (a):

\[\begin{align} P(\overline{x}<98) &=P\left(z<\frac{98-98.6}{\frac{2}{7}}\right) \\ &= P(z<-2.1) \\ &=0.0179. \end{align}\]

For (b) you'll have:

\[\begin{align} P(\overline{x}>99) &=P\left(z>\frac{99-98.6}{\frac{2}{7}}\right) \\ &= P(z>1.4) \\ &=1-P(z<1.4) \\ &=1-0.9192 \\ &= 0.0808. \end{align}\]

Finally, for (c):

\[\begin{align} P(98<\overline{x}<99) &=P(\overline{x}<99)-P(\overline{x}<98) \\ &= P(z<1.4)-P(z<-2.1) \\ &= 0.9192-0.0179 \\ &=0.9013. \end{align}\]
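The three probabilities can be cross-checked without a table. The sketch below uses `NormalDist` from Python's standard `statistics` module and relies on the same assumption as the solution, namely that the sampling distribution is normal; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 98.6, 2.0, 49

# Standard deviation of the sample mean: sigma / sqrt(n) = 2 / 7.
sem = sigma / sqrt(n)

# z-scores for the two cut-off temperatures.
z_98 = (98 - mu) / sem
z_99 = (99 - mu) / sem

# Sampling distribution of the mean, assumed normal with mean mu and spread sem.
sampling_dist = NormalDist(mu=mu, sigma=sem)

p_a = sampling_dist.cdf(98)                          # (a) P(mean < 98)
p_b = 1 - sampling_dist.cdf(99)                      # (b) P(mean > 99)
p_c = sampling_dist.cdf(99) - sampling_dist.cdf(98)  # (c) P(98 < mean < 99)

print(f"z-scores: {z_98:.2f} and {z_99:.2f}")
print(f"(a) {p_a:.4f}  (b) {p_b:.4f}  (c) {p_c:.4f}")
```

The results should agree with the table-based values to about three decimal places. The last digit of (c) may differ slightly from \(0.9013\), because the table values are rounded before they are subtracted.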

Sample Mean - Key takeaways

  • The sample mean allows you to estimate the population mean.
  • The sample mean \(\overline{x}\) is calculated as an average, that is, \[\overline{x}=\frac{x_1+\ldots+x_n}{n},\] where \(x_i\) is each element in the sample and \(n\) is the sample size.
  • The sampling distribution of the mean \(\overline{x}\) has mean and standard deviation given by \[\mu_\overline{x}=\mu\,\text{ and }\,\sigma_\overline{x}=\frac{\sigma}{\sqrt{n}}.\]
  • When the sample size is large enough (a common rule of thumb is greater than \(30\)), the Central Limit Theorem says that the sampling distribution of the mean is approximately normal.

Frequently Asked Questions about Sample Mean

The sample mean is the average of the values obtained in the sample.

The sample mean is found by adding up all the values obtained from a sample and dividing by the number of values in the sample.

The formula for calculating the sample mean is \(\overline{x}=\frac{x_1+\ldots+x_n}{n}\), where \(x_i\) is each element in the sample and \(n\) is the sample size.

The most obvious benefit of computing the sample mean is that it provides an estimate that can be applied to the larger group or population. This is significant because it allows for statistical analysis without the often impossible task of polling every person involved.

The main disadvantage of the sample mean is that it hides extreme values, whether very high or very low, because averaging pulls them toward the middle of the data. Another disadvantage is that it is sometimes difficult to select good samples, so there is a possibility of getting biased results.
