# Leaning curves: Skewed distribution

In a previous article, we looked at the normal distribution, otherwise known as the “bell” curve. This time we will describe a non-normal distribution pattern known as a skewed distribution. In the normal distribution, the “bell” is in the middle, and the mean, median, and mode of the distribution are all the same. Many everyday phenomena, such as the heights of people within a culture, take the shape of the bell curve, and in statistics it is often assumed that the characteristic of the population being studied follows the bell curve. However, this is not always the case.

When distributions have a “bell” that is not centered, they are called skewed distributions. In a skewed distribution, there are a large number of values at one end of the range and fewer and fewer values toward the other end. The type of skewness is based on which side of the curve the “tail” is on. If the tail goes off to the right, the curve has a positive skew (or a right skew). If the tail goes off to the left, the curve has a negative skew (or a left skew).

Figure: The bell curve shown in comparison to a positively skewed (right-skewed) curve and a negatively skewed (left-skewed) curve.
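One common way to put a number on the direction of the skew is the moment coefficient of skewness, whose sign matches the "positive" and "negative" labels above. The sketch below is only an illustration: the function name `skewness` and the three small data sets are made up for this example and do not come from the article.

```python
def skewness(values):
    """Population moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    # Second and third central moments
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical toy data sets, chosen only to show the sign of the result
symmetric = [1, 2, 3, 4, 5]      # no tail on either side
right_tail = [1, 2, 2, 3, 10]    # one large value stretches the right tail
left_tail = [1, 8, 9, 9, 10]     # one small value stretches the left tail

for name, data in [("symmetric", symmetric),
                   ("right tail", right_tail),
                   ("left tail", left_tail)]:
    print(f"{name}: skewness = {skewness(data):.3f}")
```

A result above zero indicates a tail to the right, a result below zero indicates a tail to the left, and a symmetric data set gives zero.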

## Positive Skew

Unlike in a bell curve, the median and mean of a skewed curve are generally not the same. In a positively skewed curve, the large number of smaller values keeps the median low, while the high values in the tail of the distribution pull the mean up, so the median is smaller than the mean. Let’s take a look at an example, the distribution of income in a community. Consider the table below, which gives the frequency (in percentages) of households in a community in eleven income ranges:

| Midpoint of Range (in thousands of dollars) | Frequency (%) |
|---|---|
| 30 | 3 |
| 40 | 10 |
| 50 | 25 |
| 60 | 19 |
| 70 | 14 |
| 80 | 11 |
| 90 | 7 |
| 100 | 5 |
| 110 | 3 |
| 120 | 2 |
| 130 | 1 |

The curve below shows this distribution of income. As can be seen, the “bell” of the curve is on the left side, where the lower incomes are, and the tail goes off to the right, toward the higher incomes. Indicated on the curve are the measures of central tendency. The mean is larger than the median because the larger values pull the mean up, away from the median, which marks the middle value of the distribution. As with any curve, the value with the greatest frequency is the mode.

Figure: A curve with a positive skew. The median is less than the mean.
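The three measures can be checked directly from the income table. The sketch below makes two simplifying assumptions that are not stated in the article: every household in a range is treated as earning exactly the midpoint, and each percentage point is treated as one household. The variable names are illustrative.

```python
from statistics import median, mode

# Income table from the article:
# midpoint of range (thousands of dollars) -> percent of households
income_freq = {30: 3, 40: 10, 50: 25, 60: 19, 70: 14, 80: 11,
               90: 7, 100: 5, 110: 3, 120: 2, 130: 1}

# Assumption: expand the table into one value per percentage point,
# which gives 100 values in total
incomes = [midpoint for midpoint, pct in income_freq.items() for _ in range(pct)]

mean_income = sum(incomes) / len(incomes)
median_income = median(incomes)
mode_income = mode(incomes)

print(f"mode   = {mode_income}")    # 50
print(f"median = {median_income}")  # 60.0
print(f"mean   = {mean_income}")    # 65.7
```

Under these assumptions the mode (50) is less than the median (60), which is less than the mean (65.7), the ordering expected for a positive skew.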

## Negative Skew

Conversely, a negatively skewed distribution has a greater number of higher values, with the tail heading off to the left. In this type of distribution, the median is greater than the mean. Consider this distribution of 180 scores on a 20-question multiple-choice exam:

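| Test Score | Frequency |
|---|---|
| 30 | 1 |
| 35 | 2 |
| 40 | 4 |
| 45 | 5 |
| 50 | 5 |
| 55 | 5 |
| 60 | 9 |
| 65 | 10 |
| 70 | 22 |
| 75 | 23 |
| 80 | 26 |
| 85 | 24 |
| 90 | 20 |
| 95 | 14 |
| 100 | 10 |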

The graph of this set of data is shown below. Though it does not have the “classic” negatively skewed shape, the skew can clearly be seen. Marked on the graph is the mean, which is less than the median for a curve with a negative skew because the lower values pull the mean down. The mode is also shown. (Note that the median and mode can be the same value in a skewed curve, regardless of the direction of the skew.)

Figure: A curve with a negative skew. The median is greater than the mean.
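The same check can be made for the test scores, this time working from the frequency table itself rather than expanding it into individual scores. This is a minimal sketch; the helper names `weighted_mean`, `value_at`, and `median_from_table` are made up for this example.

```python
# Test score table from the article: score -> number of students
score_freq = {30: 1, 35: 2, 40: 4, 45: 5, 50: 5, 55: 5, 60: 9, 65: 10,
              70: 22, 75: 23, 80: 26, 85: 24, 90: 20, 95: 14, 100: 10}

def weighted_mean(freq):
    """Mean of a frequency table: sum(value * count) / sum(count)."""
    return sum(value * count for value, count in freq.items()) / sum(freq.values())

def value_at(freq, position):
    """Value found at a 1-indexed position when all entries are sorted."""
    running = 0
    for value in sorted(freq):
        running += freq[value]
        if running >= position:
            return value
    raise ValueError("position exceeds total frequency")

def median_from_table(freq):
    """Median via cumulative frequencies; averages the two middle values if needed."""
    total = sum(freq.values())
    lower, upper = (total + 1) // 2, total // 2 + 1
    return (value_at(freq, lower) + value_at(freq, upper)) / 2

total_scores = sum(score_freq.values())
mean_score = weighted_mean(score_freq)
median_score = median_from_table(score_freq)
mode_score = max(score_freq, key=score_freq.get)  # score with the highest count

print(f"scores = {total_scores}")     # 180
print(f"mean   = {mean_score:.2f}")   # 76.19
print(f"median = {median_score}")     # 80.0
print(f"mode   = {mode_score}")       # 80
```

Here the mean (about 76.2) falls below the median (80), as expected for a negative skew, and the median and mode happen to coincide at 80.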

If you are interested in learning more about sampling techniques, you may want to consider this career path and pursue a degree in data science.