Welcome back! Our journey into Descriptive Statistics continues with Measures of Dispersion. These measures, which include the range, variance, and standard deviation, inform us about the extent to which our data is spread. R's built-in statistical functions offer all we need to thoroughly understand dispersion in our data. Let's dive right in!
Measures of Dispersion capture the spread within a dataset. For example, knowing the average test score (a Measure of Centrality) isn't enough on its own: two classes can share the same average while one has scores clustered tightly around it and the other has scores scattered widely. Understanding how scores vary around the average provides a fuller picture, which is vital for everyday data analysis.
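To make this concrete, here is a minimal sketch using two made-up sets of test scores. The names class_a and class_b and their values are hypothetical and chosen only for illustration. Both sets have the same mean, yet their scores sit at very different distances from it:

```r
# Hypothetical test scores for two classes (illustrative values only)
class_a <- c(78, 79, 80, 81, 82)   # scores clustered near the average
class_b <- c(60, 70, 80, 90, 100)  # scores scattered widely

# Both classes share the same mean
mean_a <- mean(class_a)
mean_b <- mean(class_b)

# How far each score lies from its class mean
dev_a <- class_a - mean_a
dev_b <- class_b - mean_b

print(c(mean_a, mean_b))
print(dev_a)
print(dev_b)
```

Both means come out to 80, but the deviations in class_a stay within 2 points of the mean while those in class_b reach 20 points. The mean alone cannot reveal that difference.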
The graph below illustrates two normal distributions with different standard deviations. The standard deviation measures how far data points typically lie from the mean. Compare the widths of the two curves: the narrower blue curve corresponds to a smaller standard deviation, meaning most data points lie close to the mean. In contrast, the wider green curve reflects a larger standard deviation, meaning data points are spread more widely around the mean.
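The same contrast can be checked numerically. The sketch below draws samples from two normal distributions with the same mean but different standard deviations, then measures them with R's sd() function. The seed, sample size, and parameter values are arbitrary assumptions for illustration and are not taken from the graph:

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Same mean, different standard deviations (illustrative parameters)
narrow <- rnorm(10000, mean = 50, sd = 5)
wide   <- rnorm(10000, mean = 50, sd = 15)

# Sample standard deviation of each set
sd_narrow <- sd(narrow)
sd_wide   <- sd(wide)

# Fraction of points lying within 10 units of the mean in each sample
share_narrow <- mean(abs(narrow - 50) <= 10)
share_wide   <- mean(abs(wide - 50) <= 10)

print(c(sd_narrow, sd_wide))
print(c(share_narrow, share_wide))
```

The sample standard deviations should land close to 5 and 15, and a noticeably larger share of the narrow sample should fall within 10 units of the mean. That matches the tighter curve described above.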
The range, simply the difference between the highest and lowest values, illustrates the spread between the extremes of our dataset. Note that R's built-in function range() returns the minimum and maximum values themselves rather than their difference, so for a numeric vector x we obtain the range as max(x) - min(x) or, equivalently, diff(range(x)). Here, we calculate the range of test scores for five students:
