Standard Deviation: What Is It?

Standard deviation is a measure that quantifies a dataset’s variation or dispersion. It indicates how data points deviate from the mean and is especially useful when the data fits a normal distribution.

You may have encountered the term standard deviation before. It is typical in reports, theses, dissertations, and various articles. Despite its popularity, it often needs to be understood. Can you explain what standard deviation is and when it should be applied?

The Problem

Descriptive statistics primarily consist of measures of central tendency and measures of variability.

When combined, they form the foundation for many statistical analyses and charts.

However, using only one of these measures to describe a population or sample could lead to incomplete or misleading information.

In the example of the river’s depth provided by Woody Hayes, relying solely on the mean as an indicator was insufficient to prevent a person from drowning.

This highlights the importance of using a measure of variability with a measure of central tendency.

Let’s consider another example.

The following three datasets have the same mean (20) but exhibit vastly different variability:

A = 20,20,20,20,20
B = 18,19,20,21,22
C = 00,10,20,30,40

With a measure of variability, these crucial differences are noticed.

Central tendency measures, such as the mean, median, and mode, are relatively straightforward.

However, measures of variability can be less intuitive and, depending on the type, harder to comprehend.

These measures include range, mean absolute deviation, variability, standard deviation, and coefficient of variation.

Standard deviation is widely used, but often without fully comprehending what it represents.

So, what does standard deviation mean, and when should it be employed?

The Solution

Some concepts used here are based on the book Statistics Without Math, which offers an in-depth discussion.

The standard deviation measures data variability, indicating how data points deviate from the mean.

Other measures of variability include the range and the mean absolute deviation, which are more straightforward and intuitive.

The range represents the difference between a dataset’s highest and lowest values.

While simple, it only uses two values from the entire dataset.

Conversely, the mean absolute deviation, or “the average of the absolute distances from each data point to the mean,” is slightly more complex but still intuitive.

Despite its intuitiveness, the mean absolute deviation is not the most commonly used statistic to describe data variability.

That title goes to the somewhat counterintuitive standard deviation.

Similar to the mean absolute deviation, standard deviation is based on the differences between each observation and the mean.

However, these differences are squared in standard deviation, and the square root is extracted at the end.

The standard deviation is a more complex measure of variability.

Nevertheless, it proves valuable when analyzing data that fits a normal distribution.

In such instances, roughly 68% of the values fall within one standard deviation, about 95% within two standard deviations, and nearly all within three standard deviations.

This is the 68-95-99 rule, also known as the tolerance limit! The Empirical Rule!

Ronald Fisher advocated using the standard deviation under “ideal circumstances,” with data fitting the normal distribution.

However, if the data does not fit the normal distribution, the standard deviation may not help describe variability.

In practice, the mean absolute deviation has been shown to perform better with empirical data than the standard deviation.

Concluding Remarks

The standard deviation is a variability measure best used when data perfectly fits a normal distribution.

However, the mean absolute deviation has proven more effective in estimating variability, especially when data does not fit the normal distribution.

Due to its easier comprehension, intuitive nature, and superior performance with realistic data, the mean absolute deviation is an excellent alternative for representing data variability.

Can Standard Deviations Be Negative?

Connect With Us on Our Social Networks!

DAILY POSTS ON INSTAGRAM!

What Is The Standard Deviation?

The Problem

The Solution

Concluding Remarks

Can Standard Deviations Be Negative?

Connect With Us on Our Social Networks!

Understanding Random Sampling: Essential Techniques in Data Analysis

APA Style T-Test Reporting Guide

What is the Difference Between T-test and Mann-Whitney Test?

Can Correlation Coefficient Be Negative?

The Role of Link Functions in Generalized Linear Models

Decision Trees: From Theory to Practice in Python for Aspiring Data Scientists

Leave a Reply Cancel reply

The Problem

The Solution

Concluding Remarks

Can Standard Deviations Be Negative?

Connect With Us on Our Social Networks!

Similar Posts

Leave a Reply Cancel reply