You are about to finish high school, and you have decided it is time for a change of scenery, so you want to go to a university in another city, let's say San Francisco, California. Among your considerations are, how much will I pay for the rent of an apartment, or how much will I spend on public transportation? So, you decide to ask some of your acquaintances who live over there to see how much they spend on average.
This process is called taking a sample mean. In this article you will find its definition, how to calculate it, its standard deviation and variance, its sampling distribution, and worked examples.
The mean of a set of numbers is just the average, that is, the sum of all the elements in the set divided by the number of elements in the set.
The sample mean is the average of the values obtained in the sample.
It is easy to see that if two sets are different, they will most likely also have different means.
The sample mean is denoted by \(\overline{x}\), and is calculated by adding up all the values obtained from the sample and dividing by the total sample size \(n\). The process is the same as averaging a data set. Therefore, the formula is \[\overline{x}=\frac{x_1+\ldots+x_n}{n},\]
where \(\overline{x}\) is the sample mean, \(x_i\) is each element in the sample and \(n\) is the sample size.
Let's go back to the San Francisco example. Suppose you asked \(5\) of your acquaintances how much they spend on public transport per week, and they said \(\$20\), \(\$25\), \(\$27\), \(\$43\), and \(\$50\). So, the sample mean is calculated by:
\[\overline{x}=\frac{20+25+27+43+50}{5}=\frac{165}{5}=33.\]
Therefore, for this sample, the average amount spent on public transportation in a week is \(\$33\).
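The same arithmetic can be sketched in a few lines of Python. The names `weekly_spending` and `sample_mean` are illustrative choices, not part of the original example; the five values are the ones quoted above.

```python
# Weekly public-transport spending reported by the five acquaintances.
weekly_spending = [20, 25, 27, 43, 50]


def sample_mean(values):
    """Return the sum of the values divided by the number of values."""
    return sum(values) / len(values)


x_bar = sample_mean(weekly_spending)
print(x_bar)  # 33.0
```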
Since the variance is the square of the standard deviation, to calculate either value, two cases must be considered:
1. You know the population standard deviation.
2. You do not know the population standard deviation.
The following section shows how to calculate this value for each case.
The mean of the sample mean, denoted by \(\mu_\overline{x}\), equals the population mean. That is, if \(\mu\) is the population mean, then \[\mu_\overline{x}=\mu.\]
To calculate the standard deviation of the sample mean (also called the standard error of the mean (SEM)), denoted by \(\sigma_\overline{x}\), the two previous cases must be considered. Let's explore them in turn.
If the sample of size \(n\) is drawn from a population whose standard deviation \(\sigma\) is known, then the standard deviation of the sample mean will be given by \[\sigma_\overline{x}=\frac{\sigma}{\sqrt{n}}.\]
A sample of \(81\) people was taken from a population with standard deviation \(45\). What is the standard deviation of the sample mean?
Solution:
Using the formula stated before, the standard deviation of the sample mean is \[\sigma_\overline{x}=\frac{45}{\sqrt{81}}=\frac{45}{9}=5.\]
Note that to calculate this, you do not need to know anything about the sample besides its size.
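As a quick check of this calculation, here is a minimal Python sketch. The function name `standard_error_known_sigma` is a hypothetical helper introduced only for illustration.

```python
import math


def standard_error_known_sigma(sigma, n):
    """Standard deviation of the sample mean when the population sigma is known."""
    return sigma / math.sqrt(n)


# Values from the example: population standard deviation 45, sample size 81.
sem = standard_error_known_sigma(45, 81)
print(sem)  # 5.0
```

Only `sigma` and `n` enter the function, which mirrors the remark that nothing else about the sample is needed.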
Sometimes, when you want to estimate the mean of a population, you do not have any information other than just the data from the sample you took. Fortunately, if the sample is large enough (greater than \(30\)), the standard deviation of the sample mean can be approximated using the sample standard deviation. Thus, for a sample of size \(n\), the standard deviation of the sample mean is \[\sigma_\overline{x}\approx\frac{s}{\sqrt{n}},\] where \(s\) is the sample standard deviation (see the article Standard Deviation for more information) calculated by:
\[s=\sqrt{\frac{(x_1-\overline{x})^2+\ldots+(x_n-\overline{x})^2}{n-1}},\]
where \(x_i\) is each element in the sample and \(\overline{x}\) is the sample mean.
❗❗ The sample standard deviation measures the dispersion of data within the sample, while the sample mean standard deviation measures the dispersion between the means from different samples.
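The two formulas above can be sketched in Python with the standard library. This reuses the five San Francisco values purely to show the arithmetic: with \(n=5\) the sample is well below the size of \(30\) mentioned above, so the resulting estimate should not be read as a reliable approximation. The variable names are illustrative.

```python
import math
import statistics

# The five weekly amounts from the public transportation example.
weekly_spending = [20, 25, 27, 43, 50]
n = len(weekly_spending)

# statistics.stdev uses the n - 1 denominator, matching the formula for s.
s = statistics.stdev(weekly_spending)

# Estimated standard deviation of the sample mean: s / sqrt(n).
sem_estimate = s / math.sqrt(n)

print(round(s, 2), round(sem_estimate, 2))  # 12.83 5.74
```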
Recall the sampling distribution definition.
The distribution of the sample mean (or sampling distribution of the mean) is the distribution obtained by considering all the means that can be obtained from fixed-size samples in a population.
If \(\overline{x}\) is the sample mean of a sample of size \(n\) from a population with mean \(\mu\) and standard deviation \(\sigma\), then the sampling distribution of \(\overline{x}\) has mean and standard deviation given by \[\mu_\overline{x}=\mu\,\text{ and }\,\sigma_\overline{x}=\frac{\sigma}{\sqrt{n}}.\]
Furthermore, if the distribution of the population is normal, then the sampling distribution of \(\overline{x}\) is also normal. If the population is not normal but the sample size is large enough, the Central Limit Theorem says that the sampling distribution of \(\overline{x}\) is approximately normal; \(n\geq 30\) is the usual rule of thumb.
When the sampling distribution is normal, you can calculate probabilities using the standard normal distribution table. To do this, you need to convert the sample mean \(\overline{x}\) into a \(z\)-score using the following formula:
\[z=\frac{\overline{x}-\mu_\overline{x}}{\sigma_\overline{x}}=\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}.\]
You may be wondering, what happens when the population distribution is not normal and the sample size is small? Unfortunately, for those cases, there is no general way to obtain the shape of the sampling distribution.
Let's see an example of a graph of a sampling distribution of the mean.
Going back to the example of public transportation in San Francisco, let's suppose you had managed to survey thousands of people, grouped the people into groups of size \(10\), averaged them in each group and obtained the following graph.
This graph approximates the graph of the sampling distribution of the mean. Based on the graph, you can deduce that an average of \(\$37\) is spent on public transportation in San Francisco.
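The grouping procedure described above can be mimicked with simulated data. The sketch below is only an illustration: the population is synthetic, drawn from an exponential distribution whose mean of \(37\) is an assumed value chosen to echo the example, and it is not real survey data. It checks that the group means centre on the population mean and spread out by roughly \(\sigma/\sqrt{n}\).

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# ASSUMPTION: synthetic, deliberately skewed population (not real survey data).
assumed_mean = 37.0
population = [random.expovariate(1 / assumed_mean) for _ in range(50_000)]

# Split the population into groups of 10 and average each group.
group_size = 10
group_means = [
    statistics.fmean(population[i:i + group_size])
    for i in range(0, len(population), group_size)
]

mu = statistics.fmean(population)      # mean of the simulated population
sigma = statistics.pstdev(population)  # standard deviation of the simulated population

# Mean of the group means versus the population mean.
print(statistics.fmean(group_means), mu)
# Spread of the group means versus sigma / sqrt(n).
print(statistics.stdev(group_means), sigma / group_size ** 0.5)
```

A histogram of `group_means` would play the role of the graph in this example; plotting is left out to keep the sketch dependency-free.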
Let's see an example of how to calculate probabilities.
It is assumed that the human body temperature distribution has a mean of \(98.6\, °F\) with a standard deviation of \(2\, °F\). If a sample of \(49\) people is taken at random, calculate the following probabilities:
(a) the average temperature of the sample is less than \(98\), that is, \(P(\overline{x}<98)\).
(b) the average temperature of the sample is greater than \(99\), that is, \(P(\overline{x}>99)\).
(c) the average temperature is between \(98\) and \(99\), that is, \(P(98<\overline{x}<99)\).
Solution:
1. Since the sample size is \(n=49>30\), you can assume the sampling distribution is normal.
2. Calculating the mean and the standard deviation of the sample mean. Using the formulas stated before, \(\mu_\overline{x}=98.6\) and the standard deviation \(\sigma_\overline{x}=2/\sqrt{49}=2/7\).
3. Converting the values into \(z\)-scores and using the standard normal table (see the article Standard Normal Distribution for more information), you'll have for (a):
\[\begin{align} P(\overline{x}<98) &=P\left(z<\frac{98-98.6}{\frac{2}{7}}\right) \\ &= P(z<-2.1) \\ &=0.0179. \end{align}\]
For (b) you'll have:
\[\begin{align} P(\overline{x}>99) &=P\left(z>\frac{99-98.6}{\frac{2}{7}}\right) \\ &= P(z>1.4) \\ &=1-P(z<1.4) \\ &=1-0.9192 \\ &= 0.0808. \end{align}\]
Finally, for (c):
\[\begin{align} P(98<\overline{x}<99) &=P(\overline{x}<99)-P(\overline{x}<98) \\ &= P(z<1.4)-P(z<-2.1) \\ &= 0.9192-0.0179 \\ &=0.9013. \end{align}\]
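The three probabilities can also be checked without a table. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library (available from Python 3.8); the variable names are illustrative, and the sampling distribution is assumed to be normal as in step 1.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: population mean, population standard deviation, sample size.
mu, sigma, n = 98.6, 2.0, 49

# Sampling distribution of the mean: normal with mean mu and standard deviation sigma / sqrt(n).
sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

p_a = sampling_dist.cdf(98)                          # P(x_bar < 98)
p_b = 1 - sampling_dist.cdf(99)                      # P(x_bar > 99)
p_c = sampling_dist.cdf(99) - sampling_dist.cdf(98)  # P(98 < x_bar < 99)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Because the table values are rounded to four decimals before subtracting, the result for (c) computed this way can differ from the table-based answer in the last digit.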
The sample mean is the average of the values obtained in the sample.
The sample mean is calculated by adding up all the values obtained from a sample and dividing by the number of values in the sample.
The formula for calculating the sample mean is \(\overline{x}=\frac{x_1+\ldots+x_n}{n}\), where \(x_i\) is each element in the sample and \(n\) is the sample size.
The most obvious benefit of computing the sample mean is that it provides information that can be applied to the bigger group or population. This is significant because it allows for statistical analysis without the impossible task of polling every person involved.
The main disadvantage is that the sample mean hides extreme values, either very high or very low, since averaging pulls them toward the middle of the data. Another disadvantage is that it is sometimes difficult to select good samples, so there is a possibility of getting biased answers.
If you have information about the population, which formula will you use to calculate the standard deviation of a sample mean \(\overline{x}\)?
\(\sigma_\overline{x}=\frac{\sigma}{\sqrt{n}}\).
If you don't have information about the population, which formula will you use to calculate the standard deviation of a sample mean \(\overline{x}\)?
\(\sigma_\overline{x}=\frac{s}{\sqrt{n}}\).
How do you calculate the sample mean?
\(\overline{x}=\frac{x_1+\ldots+x_n}{n}.\)
What is the formula for converting a sample mean into a \(z\)-score?
\(z=\frac{\overline{x}-\mu_\overline{x}}{\sigma_\overline{x}}\).
The distribution of the sample mean can be normal even if the distribution of the population is not normal.
True.
Which condition regarding the sample size must be met for the sampling distribution of the mean to be normal?
\(n\geq 30\).