2.5 Measures of Variability
LEARNING OBJECTIVES
- Recognize, describe, calculate, and analyze the measures of the spread of data: variance, standard deviation, and range.
It can be misleading to only use the measures of central tendency (mean, median, mode) to describe a data set. Measures of central tendency describe the center of a distribution. Measures of dispersion or variability are used to describe the spread or dispersion of the data. So far in this chapter, we have already seen a measure of variability—the interquartile range. The interquartile range describes the spread of the middle [latex]50\%[/latex] of the data. But there are other measures of variability, including range, variance, and standard deviation.
Range
The range is the difference between the largest and smallest value in a set of data:
[latex]\displaystyle{\text{Range}=\text{Maximum Value}-\text{Minimum Value}}[/latex]
Range is a poor measure of variability because it is based on only two values in the data set (the largest and smallest values) and is highly influenced by outliers. Also, the range does not help us distinguish between two data sets with the same largest and smallest values because the two data sets will have the same range.
EXAMPLE
AIDS data indicating the number of months a patient with AIDS lives after taking a new antibody drug are as follows:
3 | 4 | 8 | 8 | 10 | 11 | 12 | 13 | 14 | 15 |
15 | 16 | 16 | 17 | 17 | 18 | 21 | 22 | 22 | 24 |
24 | 25 | 26 | 26 | 27 | 27 | 29 | 29 | 31 | 32 |
33 | 33 | 34 | 34 | 35 | 37 | 40 | 44 | 44 | 47 |
Calculate the range.
Solution
The largest value is [latex]47[/latex] and the smallest value is [latex]3[/latex], so
[latex]\displaystyle{\text{Range}=47-3=44 \text{ months}}[/latex]
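The same calculation can be sketched in Python. This is only an illustration: the helper name `data_range` is an assumption made for this sketch, and the list simply copies the survival times from the example.

```python
# Survival times in months, copied from the example above.
months = [
    3, 4, 8, 8, 10, 11, 12, 13, 14, 15,
    15, 16, 16, 17, 17, 18, 21, 22, 22, 24,
    24, 25, 26, 26, 27, 27, 29, 29, 31, 32,
    33, 33, 34, 34, 35, 37, 40, 44, 44, 47,
]


def data_range(values):
    """Return the largest value minus the smallest value (hypothetical helper)."""
    return max(values) - min(values)


print(data_range(months))  # 44
```

Only the two extreme values enter the result, which is why a single outlier can change the range so much.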
Variance and Standard Deviation
An important characteristic of any set of data is the variation in the data from the mean. In some data sets, the data values are concentrated close to the mean, but in other data sets, the data values are more widely spread out from the mean. The most common measure of variation, or spread, is the standard deviation. The standard deviation is a number that measures, on average, how far data values are from their mean. The standard deviation provides a numerical measure of the overall amount of variation in a data set and can be used to determine whether a particular data value is close to or far away from the mean.
The standard deviation is always a non-negative number. The standard deviation is small when the data are all concentrated close to the mean because there is little variation or spread in the data. The standard deviation is larger when the data values are more spread out from the mean because there is a lot of variation in the data. The lowercase letter [latex]s[/latex] represents the sample standard deviation, and the Greek letter [latex]\sigma[/latex] represents the population standard deviation.
Suppose that we are studying the amount of time customers wait in line at the checkout at Supermarket A and Supermarket B. The mean wait time at both supermarkets is five minutes. At supermarket A, the standard deviation for the wait time is two minutes and at supermarket B, the standard deviation for the wait time is four minutes. Because supermarket B has a higher standard deviation, we know that there is more variation in the wait times at supermarket B. Overall, wait times at supermarket B are more spread out from the mean, and wait times at supermarket A are more concentrated near the mean.
The standard deviation can also be used to determine whether a data value is close to or far from the mean. For example, suppose that Rosa and Binh both shop at Supermarket A, where the mean wait time at the checkout is five minutes, and the standard deviation is two minutes. Suppose Rosa’s wait time is seven minutes and Binh’s wait time is one minute.
- Rosa’s wait time of seven minutes is two minutes longer than the mean of five minutes. Because two minutes is equal to one standard deviation, Rosa’s wait time of seven minutes is one standard deviation above the mean of five minutes.
- Binh’s wait time of one minute is four minutes less than the mean of five minutes. Because four minutes is equal to two standard deviations, Binh’s wait time of one minute is two standard deviations below the mean of five minutes.
A data value that is two standard deviations from the mean is just on the borderline for what many statisticians would consider to be far from the mean. Considering data to be far from the mean if it is more than two standard deviations away is more of an approximate “rule of thumb” than a rigid rule. In general, the shape of the distribution of the data affects how much of the data is further away than two standard deviations.
Calculating the Standard Deviation
If [latex]x[/latex] is a number, then the difference “[latex]x[/latex] – mean” is called its deviation from the mean. In a data set, there are as many deviations as there are items in the data set. The deviations are used to calculate the standard deviation. If the numbers belong to a population, in symbols, a deviation is [latex]x-\mu[/latex]. For sample data, in symbols, a deviation is [latex]\displaystyle{x-\overline{x}}[/latex].
The procedure to calculate the standard deviation depends on whether the numbers are the entire population or are data from a sample. The calculations are similar but not identical. Therefore, the symbol used to represent the standard deviation depends on whether it is calculated from a population or a sample. The lowercase letter [latex]s[/latex] represents the sample standard deviation, and the Greek letter [latex]\sigma[/latex] represents the population standard deviation. If the sample has the same characteristics as the population, then [latex]s[/latex] should be a good estimate of [latex]\sigma[/latex].
To calculate the standard deviation, we need to calculate the variance first. The variance is the average of the squares of the deviations (the [latex]x-\overline{x}[/latex] values for a sample or the [latex]x-\mu[/latex] values for a population). The symbol [latex]\sigma^2[/latex] represents the population variance, and the population standard deviation [latex]\sigma[/latex] is the square root of the population variance. The symbol [latex]s^2[/latex] represents the sample variance, and the sample standard deviation [latex]s[/latex] is the square root of the sample variance. The standard deviation can be thought of as a special average of the deviations.
The formula for the population standard deviation is: [latex]\displaystyle{\sigma=\sqrt{\frac{\sum(x-\mu)^2}{N}}}[/latex]. To calculate a population standard deviation [latex]\sigma[/latex]:
1. Calculate the deviation from the mean for each data value [latex]x[/latex]: [latex]x-\mu[/latex].
2. Square each of the deviations: [latex](x-\mu)^2[/latex].
3. Add up the squares of the deviations from the mean calculated in step 2.
4. Divide the sum in step 3 by the population size [latex]N[/latex].
5. The population standard deviation is the square root of the value from step 4.
The formula for the population variance is [latex]\displaystyle{\sigma^2=\frac{\sum(x-\mu)^2}{N}}[/latex]. The population variance is the value found in step 4 in the above population standard deviation calculation.
The formula for the sample standard deviation is: [latex]\displaystyle{s=\sqrt{\frac{\sum(x-\overline{x})^2}{n-1}}}[/latex]. To calculate a sample standard deviation [latex]s[/latex]:
1. Calculate the deviation from the mean for each data value [latex]x[/latex]: [latex]x-\overline{x}[/latex].
2. Square each of the deviations: [latex](x-\overline{x})^2[/latex].
3. Add up the squares of the deviations from the mean calculated in step 2.
4. Divide the sum in step 3 by the sample size minus 1: [latex]n-1[/latex].
5. The sample standard deviation is the square root of the value from step 4.
The formula for the sample variance is [latex]\displaystyle{s^2=\frac{\sum(x-\overline{x})^2}{n-1}}[/latex]. The sample variance is the value found in step 4 in the above sample standard deviation calculation.
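The five steps for each procedure can be sketched in Python as follows. This is a minimal illustration, not a prescribed method: the function names, the `sample` flag, and the small data set are all assumptions made up for this sketch. The only difference between the two procedures is the divisor in step 4.

```python
import math


def variance(values, sample=True):
    """Variance by the five-step procedure (hypothetical helper)."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]   # step 1: deviation of each value
    squares = [d ** 2 for d in deviations]    # step 2: square each deviation
    total = sum(squares)                      # step 3: add up the squares
    divisor = n - 1 if sample else n          # step 4: n - 1 (sample) or N (population)
    return total / divisor


def standard_deviation(values, sample=True):
    """Step 5: the standard deviation is the square root of the variance."""
    return math.sqrt(variance(values, sample))


# Made-up data chosen so the population results are whole numbers.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance(data, sample=False))            # 4.0
print(standard_deviation(data, sample=False))  # 2.0
print(round(variance(data, sample=True), 4))   # 4.5714
```

For the same numbers, dividing by [latex]n-1[/latex] instead of [latex]N[/latex] always gives a slightly larger result, so the sample standard deviation is larger than the population standard deviation computed from the same values.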
Video: “How to calculate Standard Deviation and Variance” by statisticsfun [5:05] is licensed under the Standard YouTube License. Transcript and closed captions available on YouTube.
CALCULATING VARIANCE IN EXCEL
To find the variance in Excel:
- If the data is population data, use the var.p(array) function, where array is the array or cell range containing the data. The output from the var.p function is the population variance.
- Visit the Microsoft page for more information about the var.p function.
- If the data is sample data, use the var.s(array) function where array is the array or cell range containing the data. The output from the var.s function is the sample variance.
- Visit the Microsoft page for more information about the var.s function.
NOTE
There are two different functions to calculate variance in Excel because variance is calculated differently depending on whether the data is from a sample or from a population. When calculating variance, make sure to use the correct function based on the type of data (sample or population).
CALCULATING STANDARD DEVIATION IN EXCEL
To find the standard deviation in Excel:
- If the data is population data, use the stdev.p(array) function where array is the array or cell range containing the data. The output from the stdev.p function is the population standard deviation.
- Visit the Microsoft page for more information about the stdev.p function.
- If the data is sample data, use the stdev.s(array) function where array is the array or cell range containing the data. The output from the stdev.s function is the sample standard deviation.
- Visit the Microsoft page for more information about the stdev.s function.
NOTE
There are two different functions to calculate standard deviation in Excel because standard deviation is calculated differently depending on whether the data is from a sample or from a population. When calculating standard deviation, make sure to use the correct function based on the type of data (sample or population).
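For readers working outside Excel, Python's standard `statistics` module makes the same sample/population distinction with four separate functions. The pairing with the Excel functions in the comments is offered as a rough guide, and the data values are made up for illustration.

```python
import statistics

# Made-up data values, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_variance = statistics.pvariance(data)  # plays the role of var.p
sample_variance = statistics.variance(data)       # plays the role of var.s
population_sd = statistics.pstdev(data)           # plays the role of stdev.p
sample_sd = statistics.stdev(data)                # plays the role of stdev.s

print(population_variance, round(sample_variance, 4))
print(population_sd, round(sample_sd, 4))
```

As in Excel, choosing the wrong function does not raise an error. It quietly returns a different number, so the choice has to come from knowing whether the data is a sample or a population.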
Video: “Range, Variance, Standard Deviation in Excel” by Joshua Emmanuel [1:11] is licensed under the Standard YouTube License. Transcript and closed captions available on YouTube.
EXAMPLE
In a fifth-grade class, the teacher was interested in the standard deviation of the ages of her students. The following data are the ages, in years, for a sample of [latex]20[/latex] fifth-grade students. The ages are rounded to the nearest half year:
9 | 9.5 | 9.5 | 10 | 10 | 10 | 10 | 10.5 | 10.5 | 10.5 |
10.5 | 11 | 11 | 11 | 11 | 11 | 11 | 11.5 | 11.5 | 11.5 |
Calculate the mean, the variance, and the standard deviation of the ages of the students. Interpret the standard deviation.
Solution
Enter the data into an Excel spreadsheet. For this example, suppose we entered the data in column A from cell A1 to A20.
For the mean:
Function | average |
---|---|
Field 1 | A1:A20 |
Answer | 10.525 years |
For the variance:
Function | var.s |
---|---|
Field 1 | A1:A20 |
Answer | 0.5125 years² |
For the standard deviation:
Function | stdev.s |
---|---|
Field 1 | A1:A20 |
Answer | 0.7159 years |
Interpreting the standard deviation:
On average, the age of any fifth grader is [latex]0.7159[/latex] years away from the mean of [latex]10.525[/latex] years.
NOTES
- We are using the var.s (not var.p) and stdev.s (not stdev.p) functions to calculate the variance and the standard deviation because the data is from a sample.
- Standard deviation has the same units as the data. In this case, the data is measured in years, so the standard deviation is also in years.
- Because the values being added up in the variance calculation are squared, the units of variance are squared units. In particular, the units of variance are the squared units of the data. In this example, the data is measured in years, so the units of the variance are (years)². Because the units of variance are squared units, it can be difficult to intuitively interpret the meaning of the variance.
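As a cross-check on the spreadsheet results, the same three values can be reproduced with Python's standard `statistics` module. This is a sketch rather than part of the original solution. The sample versions of the functions are used because the 20 ages are a sample.

```python
import statistics

# Ages in years of the 20 sampled fifth-grade students, from the example above.
ages = [
    9, 9.5, 9.5, 10, 10, 10, 10, 10.5, 10.5, 10.5,
    10.5, 11, 11, 11, 11, 11, 11, 11.5, 11.5, 11.5,
]

mean_age = statistics.mean(ages)          # sample mean
variance_age = statistics.variance(ages)  # sample variance (divides by n - 1)
sd_age = statistics.stdev(ages)           # sample standard deviation

print(round(mean_age, 3))      # 10.525
print(round(variance_age, 4))  # 0.5125
print(round(sd_age, 4))        # 0.7159
```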
TRY IT
On a baseball team, the ages, in years, of each of the players are as follows:
21 | 21 | 22 | 23 | 24 |
24 | 25 | 25 | 28 | 29 |
28 | 31 | 32 | 33 | 33 |
34 | 35 | 36 | 36 | 36 |
36 | 38 | 38 | 38 | 40 |
Find the mean and standard deviation.
Solution
Enter the data into an Excel spreadsheet. For this example, suppose we entered the data in column A from cell A1 to A25.
For the mean:
Function | average |
---|---|
Field 1 | A1:A25 |
Answer | 30.64 years |
For the standard deviation:
Function | stdev.p |
---|---|
Field 1 | A1:A25 |
Answer | 5.99 years |
NOTE
We are using the stdev.p (not stdev.s) function to calculate the standard deviation here because the baseball team is a population.
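The effect of choosing the population function can be seen in a short Python sketch, which is an illustration added here rather than part of the original solution. It computes the population standard deviation used in the solution and, for contrast only, the sample standard deviation that the wrong function would have returned.

```python
import statistics

# Ages in years of all 25 players on the team, from the exercise above.
ages = [
    21, 21, 22, 23, 24,
    24, 25, 25, 28, 29,
    28, 31, 32, 33, 33,
    34, 35, 36, 36, 36,
    36, 38, 38, 38, 40,
]

mean_age = statistics.mean(ages)
population_sd = statistics.pstdev(ages)  # divides by N: the team is the whole population
sample_sd = statistics.stdev(ages)       # divides by n - 1: shown only for contrast

print(round(mean_age, 2))       # 30.64
print(round(population_sd, 2))  # 5.99
print(round(sample_sd, 2))      # 6.11
```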
NOTE
Concentrate on what the standard deviation tells us about the data. The standard deviation is a number that measures how far the data is spread from the mean. Let a calculator or computer do the arithmetic.
The standard deviation, [latex]s[/latex] or [latex]\sigma[/latex], is a non-negative number. When the standard deviation is zero, there is no dispersion about the mean—that is, all the data values are equal to each other. The standard deviation is small when the data are all concentrated close to the mean and is larger when the data values show more variation from the mean. When the standard deviation is significantly larger than zero, the data values are very spread out about the mean. Outliers in the data can make the standard deviation very large.
The standard deviation, when first presented, can seem unclear. By graphing the data, we can get a better “feel” for the deviations and the standard deviation. In symmetrical distributions, the standard deviation can be very helpful but in skewed distributions, the standard deviation may not be much help. The reason is that the two sides of a skewed distribution have different spreads. In a skewed distribution, it is better to look at the first quartile, the median, the third quartile, the smallest value, and the largest value. Because numbers can be confusing, always graph the data.
EXAMPLE
Use the following sample of exam scores from Susan Dean’s spring pre-calculus class:
33 | 42 | 49 | 49 | 53 | 55 | 55 | 61 |
63 | 67 | 68 | 68 | 69 | 69 | 72 | 73 |
74 | 78 | 80 | 83 | 88 | 88 | 88 | 90 |
92 | 94 | 94 | 94 | 94 | 96 | 100 |
Calculate the following:
- The mean.
- The standard deviation.
- The median.
- The first quartile.
- The third quartile.
- [latex]IQR[/latex].
Solution
Enter the data into an Excel spreadsheet. For this example, suppose we entered the data in column A from cell A1 to A31.
For the mean:
Function | average |
---|---|
Field 1 | A1:A31 |
Answer | 73.5 |
For the median:
Function | median |
---|---|
Field 1 | A1:A31 |
Answer | 73 |
For the standard deviation:
Function | stdev.s |
---|---|
Field 1 | A1:A31 |
Answer | 17.92 |
For the first quartile:
Function | quartile.exc |
---|---|
Field 1 | A1:A31 |
Field 2 | 1 |
Answer | 61 |
For the third quartile:
Function | quartile.exc |
---|---|
Field 1 | A1:A31 |
Field 2 | 3 |
Answer | 90 |
For the [latex]IQR[/latex]: [latex]\displaystyle{IQR=90-61=29}[/latex]
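The six values can also be checked with Python's standard `statistics` module. This sketch assumes that `statistics.quantiles` with `method="exclusive"` follows the same convention as quartile.exc. For this data set it returns the same quartiles as the solution above.

```python
import statistics

# Exam scores from the example above (a sample of 31 scores).
scores = [
    33, 42, 49, 49, 53, 55, 55, 61,
    63, 67, 68, 68, 69, 69, 72, 73,
    74, 78, 80, 83, 88, 88, 88, 90,
    92, 94, 94, 94, 94, 96, 100,
]

mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
sd_score = statistics.stdev(scores)  # sample standard deviation

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="exclusive")
iqr = q3 - q1

print(round(mean_score, 1), median_score, round(sd_score, 2))  # 73.5 73 17.92
print(q1, q3, iqr)                                             # 61.0 90.0 29.0
```

Different software can use different quartile conventions, so small differences in quartile values between tools are not necessarily errors.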
Comparing Values from Different Data Sets
The standard deviation is useful when comparing data values that come from different data sets. If the data sets have different means and different standard deviations, then comparing the data values directly can be misleading. In order to directly compare values in different data sets, we compare how many standard deviations away the value is from the mean of its data set. This is done by calculating the value’s [latex]z[/latex]-score:
Sample | [latex]\displaystyle{z = \frac{x - \overline{x}}{s}}[/latex] |
---|---|
Population | [latex]\displaystyle{z = \frac{x - \mu}{\sigma}}[/latex] |
The value [latex]x[/latex] is [latex]z[/latex] standard deviations away from the mean.
EXAMPLE
Two students, John and Ali, are from different high schools and wanted to find out who had the higher GPA when compared to their school. Which student had the higher GPA when compared to their school?
Student | GPA | School Mean GPA | School Standard Deviation |
---|---|---|---|
John | 2.85 | 3.0 | 0.7 |
Ali | 77 | 80 | 10 |
Solution
For each student, calculate the [latex]z[/latex]-score to determine how many standard deviations their GPA is away from the mean of their school.
John: [latex]\displaystyle{z=\frac{2.85 - 3.00}{0.7}=-0.21}[/latex]
Ali: [latex]\displaystyle{z=\frac{77-80}{10}=-0.3}[/latex]
John has a better GPA when compared to his school because his GPA is [latex]0.21[/latex] standard deviations below his school’s mean, while Ali’s GPA is [latex]0.3[/latex] standard deviations below her school’s mean. This means that John’s GPA is closer to his school’s mean than Ali’s GPA is to hers.
NOTE
The sign of a [latex]z[/latex]-score is important. A negative [latex]z[/latex]-score tells us that [latex]x[/latex] is below the mean. A positive [latex]z[/latex]-score tells us that [latex]x[/latex] is above the mean. The absolute value of the [latex]z[/latex]-score tells us how many standard deviations the value of [latex]x[/latex] is from the mean.
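The [latex]z[/latex]-score formula translates directly into a small Python function. The function name `z_score` and the check on the standard deviation are assumptions made for this sketch. The inputs are John's and Ali's values from the example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / sd


john = z_score(2.85, 3.0, 0.7)
ali = z_score(77, 80, 10)

print(round(john, 2))  # -0.21
print(round(ali, 2))   # -0.3

# Both scores are negative, so both GPAs are below their school means.
# John's z-score is the larger of the two, so his GPA is higher relative to his school.
print(john > ali)  # True
```

Because each value is divided by its own school's standard deviation, the two results are on a common scale even though the schools use different grading systems.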
TRY IT
Two swimmers, Angie and Beth, are from different teams and wanted to find out who had the faster time for the 50-meter freestyle when compared to her team’s mean time. Which swimmer had the faster time when compared to her team?
Swimmer | Time (seconds) | Team Mean Time | Team Standard Deviation |
---|---|---|---|
Angie | 26.2 | 27.2 | 0.8 |
Beth | 27.3 | 30.1 | 1.4 |
Solution
Angie: [latex]\displaystyle{z=\frac{26.2 - 27.2}{0.8}=-1.25}[/latex]
Beth: [latex]\displaystyle{z=\frac{27.3-30.1}{1.4}=-2}[/latex]
Angie’s time is [latex]1.25[/latex] standard deviations below her team’s mean time, and Beth’s time is [latex]2[/latex] standard deviations below her team’s mean time. Because a lower time is faster, Beth had the faster time when compared to her team.
The following lists give a few facts that provide a little more insight into what the standard deviation tells us about the distribution of the data.
Chebyshev’s Rule: For ANY data set, no matter what the distribution of the data is:
- At least [latex]75\%[/latex] of the data is within two standard deviations of the mean.
- At least [latex]89\%[/latex] of the data is within three standard deviations of the mean.
- At least [latex]95\%[/latex] of the data is within [latex]4.5[/latex] standard deviations of the mean.
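The three percentages above follow from the standard statement of Chebyshev's inequality, which is not written out in this section: at least [latex]1-\frac{1}{k^2}[/latex] of the data lies within [latex]k[/latex] standard deviations of the mean, for any [latex]k \gt 1[/latex]. A minimal Python sketch, with a function name chosen for illustration:

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k greater than 1")
    return 1 - 1 / k ** 2


for k in (2, 3, 4.5):
    print(k, round(chebyshev_lower_bound(k), 4))
# 2 0.75
# 3 0.8889
# 4.5 0.9506
```

These are guaranteed minimums. For most data sets the actual proportion within a given number of standard deviations is higher.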
The Empirical Rule: For data having a distribution that is BELL-SHAPED and SYMMETRIC:
- Approximately [latex]68\%[/latex] of the data is within one standard deviation of the mean.
- Approximately [latex]95\%[/latex] of the data is within two standard deviations of the mean.
- More than [latex]99\%[/latex] of the data is within three standard deviations of the mean.
- It is important to note that this rule only applies when the shape of the distribution of the data is bell-shaped and symmetric.
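The percentages in the Empirical Rule can be checked against the normal distribution, which is the usual model for bell-shaped, symmetric data. Using that model is an assumption of this sketch, as the section itself does not name it. The sketch uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later), and the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0 and standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The rounded figures of [latex]68\%[/latex], [latex]95\%[/latex], and more than [latex]99\%[/latex] are approximations of these values.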
Exercises
- How much time does it take to travel to work in a particular region? The table below shows the commute time for a sample of workers in the region who are at least [latex]16[/latex] years old and do not work at home.
24.0 24.3 25.9 18.9 27.5 17.9 21.8 20.9 16.7 27.3 18.2 24.7 20.0 22.6 23.9 18.0 31.4 22.3 24.0 25.5 24.7 24.6 28.1 24.9 22.6 23.6 23.4 25.7 24.8 25.5 21.2 25.7 23.1 23.0 23.9 26.0 16.3 23.1 21.4 21.5 27.0 27.0 18.6 31.7 23.3 30.1 22.9 23.3 21.7 18.6
- Find the range.
- Find the standard deviation.
- Interpret the standard deviation.
- What travel time is one standard deviation above the mean?
- What travel time is three standard deviations below the mean?
Answer
- [latex]15.4[/latex] minutes
- [latex]3.464[/latex] minutes
- On average, the travel time of an arbitrary worker is [latex]3.464[/latex] minutes away from the mean [latex]23.462[/latex] minutes.
- [latex]26.926[/latex] minutes
- [latex]13.07[/latex] minutes
- The following data shows the lengths, in feet, of a sample of boats moored in a marina.
19 35 29 26 21 40 33 33 34 25 20 37 30 26 23 24 29 16 28 25 20 39 32 27 27 27 17
- Find the range.
- Find the standard deviation.
- Interpret the standard deviation.
- What boat length is two standard deviations below the mean?
Answer
- [latex]24[/latex] feet
- [latex]6.48[/latex] feet
- On average, an arbitrary boat’s length is [latex]6.48[/latex] feet away from the mean of [latex]27.33[/latex] feet.
- [latex]14.37[/latex] feet
- The data below is the weight, in pounds, of all members of a particular NFL team.
177 210 270 275 212 185 200 241 250 220 259 185 210 272 285 212 250 302 205 232 280 285 184 265 215 223 265 260 278 185 228 273 242 185 241 290 210 276 290 206 174 286 247 190 215 245 205 178 290 280 188 230 260
- Find the range.
- Find the standard deviation.
- Interpret the standard deviation.
- The team’s quarterback weighs [latex]205[/latex] pounds. How many standard deviations above or below the mean is the quarterback?
- What weight is three standard deviations above the mean?
Answer
- [latex]128[/latex] pounds
- [latex]37.35[/latex] pounds
- On average, an arbitrary football player’s weight is [latex]37.35[/latex] pounds away from the mean weight of [latex]236.25[/latex] pounds.
- [latex]0.837[/latex] standard deviations below the mean.
- [latex]348.3[/latex] pounds
- A sample of [latex]35[/latex] post-secondary institutions was taken from across the U.S. The data below shows the number of students enrolled at each institution.
6,414 1,550 2,109 9,350 21,828 4,300 5,944 5,722 2,825 2,044 5,481 5,200 5,853 10,012 6,357 27,000 9,414 7,681 3,200 17,500 9,200 7,380 18,314 6,557 13,713 17,768 7,493 2,771 2,861 1,263 7,285 28,165 5,080 11,622 2,750
- Find the range.
- Find the standard deviation.
- Interpret the standard deviation.
- A school with an enrollment of [latex]8,000[/latex] students would be how many standard deviations above or below the mean?
Answer
- [latex]26,902[/latex] students
- [latex]6,943.89[/latex] students
- On average, the number of students enrolled at an arbitrary college is [latex]6,943.89[/latex] students away from the mean of [latex]8,628.74[/latex] students.
- [latex]0.09[/latex] standard deviations below the mean.
- Forty randomly selected students were asked the number of pairs of sneakers they owned. The data is recorded below.
1 1 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 7
- Find the range.
- Find the standard deviation.
- Interpret the standard deviation.
Answer
- [latex]6[/latex] pairs of sneakers
- [latex]1.29[/latex] pairs of sneakers
- On average, the number of pairs of sneakers owned by an arbitrary student is [latex]1.29[/latex] pairs of sneakers away from the mean of [latex]3.775[/latex] pairs of sneakers.
- Two baseball players, Fredo and Karl, on different teams wanted to find out who had the higher batting average when compared to his team. Which baseball player had the higher batting average when compared to his team?
Baseball Player | Batting Average | Team Batting Average | Team Standard Deviation |
---|---|---|---|
Fredo | [latex]0.158[/latex] | [latex]0.166[/latex] | [latex]0.012[/latex] |
Karl | [latex]0.177[/latex] | [latex]0.189[/latex] | [latex]0.015[/latex] |
Answer
Fredo, because his batting average is [latex]0.67[/latex] standard deviations below the mean of his team, and Karl’s batting average is [latex]0.8[/latex] standard deviations below the mean of his team.
- Three students were applying to the same graduate school. They came from schools with different grading systems. Which student had the best GPA when compared to other students at their school? Explain how you determined your answer.
Student | GPA | School Average GPA | School Standard Deviation |
---|---|---|---|
Thuy | [latex]2.7[/latex] | [latex]3.2[/latex] | [latex]0.8[/latex] |
Vichet | [latex]87[/latex] | [latex]75[/latex] | [latex]20[/latex] |
Kamala | [latex]8.6[/latex] | [latex]8[/latex] | [latex]0.4[/latex] |
Answer
Kamala. Thuy’s GPA is [latex]0.625[/latex] standard deviations below the mean. Vichet’s GPA is [latex]0.6[/latex] standard deviations above the mean. Kamala’s GPA is [latex]1.5[/latex] standard deviations above the mean. So Kamala’s GPA is the furthest above the mean.
- A music school has budgeted to purchase three musical instruments. They plan to purchase a piano costing [latex]\$3,000[/latex], a guitar costing [latex]\$550[/latex], and a drum set costing [latex]\$600[/latex]. The mean cost for a piano is [latex]\$4,000[/latex] with a standard deviation of [latex]\$2,500[/latex]. The mean cost for a guitar is [latex]\$500[/latex] with a standard deviation of [latex]\$200[/latex]. The mean cost for drums is [latex]\$700[/latex] with a standard deviation of [latex]\$100[/latex]. Which cost is the lowest when compared to other instruments of the same type? Which cost is the highest when compared to other instruments of the same type? Justify your answer.
Answer
The drums have the lowest cost, and the guitar has the highest cost, when compared to other instruments of the same type. The cost of the drums is [latex]1[/latex] standard deviation below the mean. The cost of the piano is [latex]0.4[/latex] standard deviations below the mean. The cost of the guitar is [latex]0.25[/latex] standard deviations above the mean. So the cost of the drums is the furthest below the mean, and the cost of the guitar is the only one above the mean.
- An elementary school class ran one mile with a mean of eleven minutes and a standard deviation of three minutes. Rachel, a student in the class, ran one mile in eight minutes. A junior high school class ran one mile with a mean of nine minutes and a standard deviation of two minutes. Kenji, a student in the class, ran one mile in eight and a half minutes. A high school class ran one mile with a mean of seven minutes and a standard deviation of four minutes. Nedda, a student in the class, ran one mile in eight minutes.
- Why is Kenji considered a better runner than Nedda, even though Nedda ran faster than he did?
- Who is the fastest runner with respect to his or her class? Explain why.
Answer
- Kenji’s time was [latex]0.25[/latex] standard deviations below the mean of their class. Nedda’s time was [latex]0.25[/latex] standard deviations above the mean of their class. Because a lower time is faster and Kenji’s time was below the mean of their class, Kenji is considered a better runner relative to the mean of their class.
- Rachel is the fastest runner because Rachel’s time was [latex]1[/latex] standard deviation below the mean of their class. This means Rachel’s time is the fastest relative to the mean of the class.
- Using the number of full-time equivalent students (FTES) each year at a local college for the past 40 years, the mean is [latex]1,000[/latex] FTES, the median is [latex]1,014[/latex] FTES, and the standard deviation is [latex]474[/latex] FTES. How many standard deviations above or below the mean is the median?
Answer
[latex]0.0295[/latex] standard deviations above the mean.
“2.6 Measures of Dispersion” and “2.7 Exercices” from Introduction to Statistics by Valerie Watts is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.