Skewness

If someone has a very different opinion than yours, you might say they have a skewed perspective. In fact, you might say they lean more toward one direction than another. In statistics, distributions can be described in the same way. Which way is your data skewed, how much skewness is there, and how do you interpret it? Read on to find out!

Definition for skewed

First, let's look at the definition for skewed.

If a distribution is not symmetric, as the normal distribution is, it is said to be skewed.

So how skewed your distribution is tells you how asymmetric it is, and gives you an idea of where the outliers in the data lie.

Are there any distributions with no skew? Sure there are:

  • the normal distribution;

  • the \(t\)-distribution;

  • the continuous uniform (rectangular) distribution; and

  • the Laplace distribution;

all have zero skew.

Look at the graph below. It shows a normal distribution and a data distribution graphed together. The normal distribution is symmetric but the data distribution is not, so the data distribution is said to be skewed.

Fig. 1. The normal distribution is not skewed, but the data distribution is skewed.

Most statistical software packages will calculate the skew for you. A common rule-of-thumb formula for measuring skew, known as Pearson's second skewness coefficient, is

\[ \text{skew} = 3\left( \frac{\text{mean} - \text{median}}{\text{standard deviation}} \right) .\]
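
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular statistics package computes skew: the function name `pearson_median_skew` and the data set are made up for the example, and the choice of the sample standard deviation (`statistics.stdev`) is an assumption, since the formula does not say which standard deviation to use.

```python
import statistics

def pearson_median_skew(data):
    """Rule-of-thumb skew: 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    sd = statistics.stdev(data)  # sample standard deviation (an assumption)
    return 3 * (mean - median) / sd

# Hypothetical data set: one large value pulls the mean above the median.
data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 14]

print(statistics.mean(data), statistics.median(data))
print(pearson_median_skew(data))  # positive, because mean > median
```

Here the mean is 5 and the median is 4, so the result is greater than zero. For a perfectly symmetric data set such as 1, 2, 3, 4, 5 the mean and median agree and the formula gives zero.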

Let's take a look at the different kinds of skew.

Positively skewed distribution

A positively skewed distribution is one where the skew is greater than zero. In other words, the mean of the distribution is larger than the median. You might also see this distribution called right skewed. You can see in the graph below that the normal distribution has the mean, median, and mode in the same spot, but the positively skewed distribution has

\[ \text{mode} < \text{median} < \text{mean}.\]

The positively skewed distribution has a longer tail on the right side of the graph, which is the positive direction on the \(x\)-axis.

Fig. 2. A positively skewed distribution compared with a normal distribution. The positively skewed distribution has a longer tail on the right side of the graph.
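
As a quick check of the ordering mode < median < mean, here is a small sketch using Python's standard `statistics` module. The data set is hypothetical and chosen only so that most values are small and a few are large.

```python
import statistics

# Hypothetical right-skewed data: most values are small, a few are large.
data = [1, 2, 2, 2, 3, 3, 4, 5, 8, 12]

mode = statistics.mode(data)      # most frequent value
median = statistics.median(data)  # middle value
mean = statistics.mean(data)      # arithmetic average

print(mode, median, mean)
print(mode < median < mean)
```

The few large values drag the mean to the right, while the mode stays where the data is most concentrated.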

If you can have positive skew, of course, you can have negative skew too!

Negatively skewed distribution

A negatively skewed distribution is one where the skew is less than zero. In other words, the mean of the distribution is smaller than the median. You might also see this distribution called left skewed. You can see in the graph below that the negatively skewed distribution has

\[ \text{mode} > \text{median} > \text{mean}.\]

The negatively skewed distribution has a longer tail on the left side of the graph, which is the negative direction on the \(x\)-axis.

Fig. 3. A negatively skewed distribution compared with a normal distribution. The negatively skewed distribution has a longer tail on the left side of the graph.
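
One way to see that negative skew is the mirror image of positive skew is to reflect a right-skewed data set and recompute the rule-of-thumb skew, 3(mean − median)/standard deviation. The sketch below assumes hypothetical data and uses the sample standard deviation. The helper name `pearson_median_skew` is made up for the example.

```python
import statistics

def pearson_median_skew(data):
    """Rule-of-thumb skew: 3 * (mean - median) / standard deviation."""
    return 3 * (statistics.mean(data) - statistics.median(data)) / statistics.stdev(data)

# Hypothetical right-skewed data and its mirror image.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 8, 12]
left_skewed = [13 - x for x in right_skewed]  # reflect every value about 6.5

print(pearson_median_skew(right_skewed))  # greater than zero
print(pearson_median_skew(left_skewed))   # less than zero, same size

# In the reflected data the ordering is reversed: mode > median > mean.
print(statistics.mode(left_skewed),
      statistics.median(left_skewed),
      statistics.mean(left_skewed))
```

Reflecting the data moves the long tail from the right to the left, so the skew changes sign while its size stays the same.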

Next up, how do you interpret the skew?

Skewness interpretation

One of the things you can learn from the skew of the distribution is where the outliers in the data set are located. In any skewed distribution, the outliers are data points in the long tail of the distribution. This tells you that:

  • in a positive skew distribution, the outliers are in the long tail to the right of the mean; and

  • in a negative skew distribution, the outliers are in the long tail to the left of the mean.

What the skew does not tell you is how many outliers there are!

Skewness is not always a bad thing. In some data sets, it is to be expected.

Suppose you have collected data on the distance a baseball travels when hit during professional baseball games. Most of the data will show the ball travelling a distance somewhere between the pitcher and the stadium seating. However, on occasion, a player will bunt the ball, and it will only travel a short distance.

The bunts would be outliers at the short-distance end, so the distribution would have a long tail on the left and be negatively skewed. That is expected of this kind of data, and not an indication that something is wrong with the data set.

In addition to skew, you can use kurtosis to get information about a data distribution.

Skewness and kurtosis

Kurtosis is a way to describe the shape of the tails of a data distribution as compared to the centre.

Kurtosis is a measurement of the tails of a data set, not of the peak of the data set!

There are three main kinds of kurtosis.

Mesokurtic distributions have \(\text{kurtosis} = 3\), and they generally have tails similar to a normal distribution.

Leptokurtic distributions have \(\text{kurtosis} > 3\). The prefix is "lepto" which means "thin". Leptokurtic distributions have very long tails on both sides of the distribution, making the centre look very thin and tall. The shape of this kind of distribution indicates there are actually outliers on both sides of the mean!

Remember, the important part of leptokurtic distributions is that they have fat tails, not that they have thin centres. Kurtosis is a description of the tails of a distribution. One example of a leptokurtic distribution is a \(t\)-distribution with few degrees of freedom.

Lastly, there are platykurtic distributions, which have \(\text{kurtosis} < 3\). The prefix is "platy" meaning "broad". Platykurtic distributions have very short tails on both sides of the distribution, making the centre look very short and broad. An example of a platykurtic distribution is the uniform distribution.

In the graph below you can see that each of the distributions is symmetric, meaning they have zero skew. However, the tail of the distributions is different in each case.

Fig. 4. Mesokurtic, leptokurtic, and platykurtic distributions. All of these distributions have zero skew, but different kurtosis.
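
The text does not give a formula for kurtosis, so the sketch below assumes the standard moment-based definition: the average fourth power of the deviations from the mean, divided by the square of the average squared deviation. Under that definition a normal distribution has kurtosis 3. The function name `kurtosis` and both data sets are hypothetical, chosen to be symmetric with very different tails.

```python
def kurtosis(data):
    """Moment-based (population) kurtosis: m4 / m2**2."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment (variance)
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2

# Evenly spread values, like a uniform distribution: short tails.
flat = list(range(1, 11))

# Hypothetical symmetric data: most values near 0, one far out on each side.
heavy = [-10, -1, -1, 0, 0, 0, 0, 1, 1, 10]

print(kurtosis(flat))   # below 3, so platykurtic
print(kurtosis(heavy))  # above 3, so leptokurtic
```

Both data sets are symmetric, so their mean and median agree and their skew is zero, yet their kurtosis falls on opposite sides of 3.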

So one thing you can now see is that skew and kurtosis measure different things: a distribution with zero skew can be mesokurtic, leptokurtic, or platykurtic!

Skewness - Key takeaways

  • If a distribution is not symmetric, as the normal distribution is, it is said to be skewed.
    • A positively skewed distribution has the mean of the distribution larger than the median, and a longer tail on the right side of the graph.
    • A negatively skewed distribution has the mean of the distribution smaller than the median, and a longer tail on the left side of the graph.
  • Kurtosis is a way to describe the shape of the tails of a data distribution as compared to the centre.
    • Mesokurtic distributions have \(\text{kurtosis} = 3\), and are similar to the normal distribution.
    • Leptokurtic distributions have \(\text{kurtosis} > 3\), and very long tails on both sides of the distribution, making the centre look very thin and tall.
    • Platykurtic distributions have \(\text{kurtosis} < 3\), and very short tails on both sides of the distribution, making the centre look very short and broad.

Frequently Asked Questions about Skewness

One way is to take a larger sample so that the effect of outliers is reduced.

Skewness can tell you where most of the outliers in your data are.

Positive skew is the same as right skew, meaning the tail on the distribution is longer on the right.

Think of skewness as an indication of which side of your data the outliers are on. If your data is a measurement of how well 6-year-old kids play football, and you accidentally mix in 10 pro football players, the pro players will introduce skew to your data in the form of outliers.

Skew is interpreted as a measure of how asymmetric the data is and of which tail the outliers are in.
