2.5 Measures of Location

LEARNING OBJECTIVES

  • Recognize, describe, calculate, and interpret the measures of location of data: quartiles and percentiles.

The common measures of location are quartiles and percentiles.  Previously, we learned that the median is a number that measures the “center” of the data.  But the median can also be thought of as a measure of location because the median is the “middle value” of a set of data.  The median is a number that separates ordered data into halves.  Half of the values in the data are less than or equal to the median and half of the values are greater than or equal to the median.

For example, consider the following data, already ordered from smallest to largest:

1 1 2 2 4 6 6.8
7.2 8 8.3 9 10 10 11.5

Because there are 14 observations, the median is between the seventh value, [latex]6.8[/latex], and the eighth value, [latex]7.2[/latex].  To find the median, add the two values together and divide by two:

[latex]\displaystyle{\frac{6.8+7.2}{2}=7}[/latex]

The median is seven.  We can see that half (or 50%) of the values are less than seven and half (or 50%) of the values are larger than seven.
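
The same calculation can be checked outside of a spreadsheet. The following is a minimal sketch in Python, assuming the standard library `statistics` module is available; the variable names are illustrative only and are not part of the text.

```python
from statistics import median

# The 14 ordered observations from the text.
data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]

# With an even number of values, median() averages the two middle values
# (here the 7th and 8th values, 6.8 and 7.2).
med = median(data)

below = sum(1 for x in data if x < med)  # count of values less than the median
above = sum(1 for x in data if x > med)  # count of values greater than the median

print(med, below, above)
```

Seven of the 14 values fall on each side of the median, which matches the "half below, half above" description.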

The median is an example of both a quartile and a percentile.  The median is also the second quartile, [latex]Q_2[/latex], and the 50th percentile, [latex]P_{50}[/latex].

Quartiles

Quartiles are numbers that separate the data into quarters (four parts).  Like the median, quartiles may or may not be an actual value in the set of data. To find the quartiles, order the data (from smallest to largest) and then find the median or second quartile.  The first quartile, [latex]Q_1[/latex], is the middle value of the lower half of the data and the third quartile, [latex]Q_3[/latex], is the middle value of the upper half of the data. To get the idea, consider the same (ordered) data set used above:

1 1 2 2 4 6 6.8
7.2 8 8.3 9 10 10 11.5

The median or second quartile is seven.  The lower half of the data are:

1 1 2 2 4 6 6.8

The middle value of the lower half of the data is 2.  The number 2, which is part of the data, is the first quartile, [latex]Q_1[/latex].  One-fourth (or 25%) of the entire set of values are the same as or less than 2 and three-fourths (or 75%) of the values are more than 2.

The upper half of the data are:

7.2 8 8.3 9 10 10 11.5

The middle value of the upper half of the data is 9.  The third quartile, [latex]Q_3[/latex], is 9.  Three-fourths (or 75%) of the values are less than 9.  One-fourth (or 25%) of the values in the data set are greater than or equal to 9.
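
The by-hand method above (find the median, then take the middle value of each half) can be sketched in Python as follows. The function name `quartiles_by_halves` is hypothetical, and the handling of an odd number of values (leaving the median out of both halves) is an assumption, since the text only works through an even-sized data set.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: when the number of values is odd, the median itself is
    left out of both halves. Other conventions exist.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # lower half of the ordered data
    upper = ordered[-half:]   # upper half of the ordered data
    return median(lower), median(ordered), median(upper)


data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]
q1, q2, q3 = quartiles_by_halves(data)
print(q1, q2, q3)
```

For the data set above this gives a first quartile of 2 and a third quartile of 9, in agreement with the values found by hand.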

The interquartile range is a number that indicates the spread of the middle half or the middle 50% of the data.  It is the difference between the third quartile ([latex]Q_3[/latex]) and the first quartile ([latex]Q_1[/latex]).

[latex]\displaystyle{IQR = Q_3 - Q_1}[/latex]

The [latex]IQR[/latex] can help to identify potential outliers.  A value is a potential outlier if it lies more than [latex]1.5 \times IQR[/latex] below the first quartile or more than [latex]1.5 \times IQR[/latex] above the third quartile, that is, if it is less than [latex]Q_1 - 1.5 \times IQR[/latex] or greater than [latex]Q_3 + 1.5 \times IQR[/latex].  Potential outliers always require further investigation.
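
As a rough illustration of the outlier rule, the sketch below computes the two cutoff values and lists any data values outside them. The function names `outlier_fences` and `potential_outliers` are hypothetical, and the quartiles 2 and 9 are the ones found by hand for the 14-value data set earlier in this section.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cutoffs Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def potential_outliers(values, q1, q3, k=1.5):
    """List the values that fall outside the two cutoffs."""
    lower, upper = outlier_fences(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]
lower, upper = outlier_fences(q1=2, q3=9)       # IQR = 9 - 2 = 7
flagged = potential_outliers(data, q1=2, q3=9)
print(lower, upper, flagged)
```

Here the cutoffs are -8.5 and 19.5, and no value in the data set falls outside them, so nothing is flagged.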

NOTE

A potential outlier is a data point that is significantly different from the other data points. These special data points may be errors, some kind of abnormality, or they may be a key to understanding the data.


Watch this video: Median, Quartiles and Interquartile Range by ExamSolutions [12:35] (transcript available).


CALCULATING QUARTILES IN EXCEL

To find quartiles in Excel, use the quartile.exc(array, quartile number) function. 

  • For array, enter the array or cell range containing the data. 
  • For quartile number, enter the quartile (1, 2 or 3) being calculated.

The output from the quartile.exc function is the value of the corresponding quartile.  For example, quartile.exc(array,1) returns the value of the first quartile where 25% of the observations in the data are (strictly) less than the value of the first quartile.

Visit the Microsoft page for more information about the quartile.exc function.

NOTE

We are using the quartile.exc function, and not the quartile.inc function, because we want the percent of the observations in the data to be strictly less than the value of the quartile.
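
For readers working outside Excel, Python's standard library offers a comparable choice. The sketch below assumes Python 3.8 or later, where `statistics.quantiles` accepts `method="exclusive"` or `method="inclusive"`; to the best of our understanding these use the same interpolation rules as quartile.exc and quartile.inc respectively, but treat that correspondence as an assumption and check it against your own spreadsheet.

```python
from statistics import quantiles

# The 14 ordered observations used earlier in this section.
data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]

# n=4 asks for the three cut points that split the data into quarters.
q_exc = quantiles(data, n=4, method="exclusive")
q_inc = quantiles(data, n=4, method="inclusive")

print("exclusive:", q_exc)
print("inclusive:", q_inc)
```

The two methods agree on the median but give different first and third quartiles for this data. Note also that the exclusive third quartile here (9.25) is not the same as the value 9 found by the median-of-halves method, because software interpolates between neighbouring values rather than splitting the data into halves.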


Watch this video: How To Find Quartiles and Construct a Box Plot in Excel by Joshua Emmanuel [4:12] (transcript available).


EXAMPLE

For the following 13 real estate prices, calculate the three quartiles and the [latex]IQR[/latex].  Determine if any prices are potential outliers. The prices are in dollars.

389,950 230,500 158,000 479,000 639,000 114,950 5,500,000
387,000 659,000 529,000 575,000 488,800 1,095,000

Solution:

Enter the data into an Excel spreadsheet.  For this example, suppose we entered the data in column A from cell A1 to A13.

For the first quartile [latex]Q_1[/latex]:

Function quartile.exc Answer
Field 1 A1:A13 $308,750
Field 2 1

For the second quartile [latex]Q_2[/latex]:

Function quartile.exc Answer
Field 1 A1:A13 $488,800
Field 2 2

For the third quartile [latex]Q_3[/latex]:

Function quartile.exc Answer
Field 1 A1:A13 $649,000
Field 2 3

For the IQR:  [latex]\displaystyle{IQR = 649,000 - 308,750 = \$340,250}[/latex]

To determine if there are any outliers:

[latex]\begin{eqnarray*} 1.5 \times IQR &  =  & 1.5 \times 340,250 = 510,375 \\  \\ Q_1 - 1.5 \times IQR & =  & 308,750 - 510,375 = -201,625\\  \\ Q_3 + 1.5 \times IQR & = & 649,000 + 510,375 = 1,159,375 \end{eqnarray*}[/latex]

No house price is less than [latex]-\$201,625[/latex]. However, [latex]\$5,500,000[/latex] is more than [latex]\$1,159,375[/latex]. Therefore, [latex]\$5,500,000[/latex] is a potential outlier.
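
The whole example can also be reproduced in a few lines of Python. This is a sketch only, assuming Python 3.8 or later and assuming that `statistics.quantiles` with `method="exclusive"` follows the same rule as quartile.exc; the variable names are illustrative.

```python
from statistics import quantiles

# The 13 real estate prices from the example, in dollars.
prices = [389_950, 230_500, 158_000, 479_000, 639_000, 114_950, 5_500_000,
          387_000, 659_000, 529_000, 575_000, 488_800, 1_095_000]

# quantiles() sorts the data itself, so the prices can be entered in any order.
q1, q2, q3 = quantiles(prices, n=4, method="exclusive")

iqr = q3 - q1
lower_cutoff = q1 - 1.5 * iqr
upper_cutoff = q3 + 1.5 * iqr

# Any price outside the two cutoffs is a potential outlier.
flagged = [p for p in prices if p < lower_cutoff or p > upper_cutoff]

print(q1, q2, q3, iqr)
print(lower_cutoff, upper_cutoff, flagged)
```

The only price flagged is 5,500,000, the same potential outlier identified in the solution above.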

NOTE

Quartiles have the same units as the data.  In this case, the data is measured in dollars, so the quartiles are also in dollars.

TRY IT

For the following 11 salaries, calculate the three quartiles and the [latex]IQR[/latex].  Are any of the salaries outliers?  The salaries are in dollars.

33,000 72,000 54,000
64,500 68,500 120,000
28,000 69,000 40,500
54,000 42,000

 

Solution:

Enter the data into an Excel spreadsheet.  For this example, suppose we entered the data in column A from cell A1 to A11.

For the first quartile [latex]Q_1[/latex]:

Function quartile.exc Answer
Field 1 A1:A11 $40,500
Field 2 1

For the second quartile [latex]Q_2[/latex]:

Function quartile.exc Answer
Field 1 A1:A11 $54,000
Field 2 2

For the third quartile [latex]Q_3[/latex]:

Function quartile.exc Answer
Field 1 A1:A11 $69,000
Field 2 3

For the IQR:  [latex]\displaystyle{IQR = 69,000 - 40,500 = \$28,500}[/latex]

To determine if there are any outliers:

[latex]\begin{eqnarray*} 1.5 \times IQR &  =  & 1.5 \times 28,500 = 42,750 \\  \\ Q_1 - 1.5 \times IQR & =  & 40,500 - 42,750 = -2,250\\  \\ Q_3 + 1.5 \times IQR & = & 69,000 + 42,750 = 111,750 \end{eqnarray*}[/latex]

No salary is less than [latex]-\$2,250[/latex]. However, [latex]\$120,000[/latex] is more than [latex]\$111,750[/latex], so [latex]\$120,000[/latex] is a potential outlier.

TRY IT

Find the interquartile range for the following two data sets and compare them.

Test Scores for Class A

69 96 81 79 65 76 83 99 89 67
90 77 85 98 66 91 77 69 80 94

Test Scores for Class B

90 72 80 92 90 97 92 75 79 68
70 80 99 95 78 73 71 68 95 100

Solution:

Enter the data into an Excel spreadsheet.  For this example, suppose we entered the data for Class A into column A from cell A1 to A20 and the data for Class B into column B from cell B1 to B20.

Class A

For the first quartile [latex]Q_1[/latex]:

Function quartile.exc Answer
Field 1 A1:A20 70.75
Field 2 1

For the third quartile [latex]Q_3[/latex]:

Function quartile.exc Answer
Field 1 A1:A20 90.75
Field 2 3

For the IQR:  [latex]\begin{eqnarray*} IQR & = & 90.75-70.75=20 \end{eqnarray*}[/latex]

Class B

For the first quartile [latex]Q_1[/latex]:

Function quartile.exc Answer
Field 1 B1:B20 72.25
Field 2 1

For the third quartile [latex]Q_3[/latex]:

Function quartile.exc Answer
Field 1 B1:B20 94.25
Field 2 3

For the IQR:  [latex]\begin{eqnarray*} IQR & = & 94.25-72.25=22 \end{eqnarray*}[/latex]

Class B has the larger [latex]IQR[/latex] (22 compared with 20 for Class A), so the middle 50% of the scores for Class B, the scores between [latex]Q_1[/latex] and [latex]Q_3[/latex], are more spread out and less tightly clustered about the median than the middle 50% of the scores for Class A.
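
A comparison like this is easy to script. The sketch below assumes Python 3.8 or later and assumes that `statistics.quantiles` with `method="exclusive"` follows the same rule as quartile.exc; the helper name `iqr_exclusive` is hypothetical.

```python
from statistics import quantiles


def iqr_exclusive(scores):
    """Interquartile range using the exclusive quartile method."""
    q1, _, q3 = quantiles(scores, n=4, method="exclusive")
    return q3 - q1


class_a = [69, 96, 81, 79, 65, 76, 83, 99, 89, 67,
           90, 77, 85, 98, 66, 91, 77, 69, 80, 94]
class_b = [90, 72, 80, 92, 90, 97, 92, 75, 79, 68,
           70, 80, 99, 95, 78, 73, 71, 68, 95, 100]

iqr_a = iqr_exclusive(class_a)
iqr_b = iqr_exclusive(class_b)

# The class with the larger IQR has the more spread-out middle 50%.
wider = "Class B" if iqr_b > iqr_a else "Class A"
print(iqr_a, iqr_b, wider)
```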

Percentiles

Percentiles are numbers that separate the (ordered) data into hundredths (100 parts).  Like quartiles, percentiles may or may not be part of the data. The [latex]n[/latex]th percentile, [latex]P_n[/latex], is the value where [latex]n\%[/latex] of the observations in the data are less than the value of the [latex]n[/latex]th percentile.  To score in the 90th percentile of an exam does not mean, necessarily, that you received [latex]90\%[/latex] on a test. The 90th percentile means that [latex]90\%[/latex] of test scores are less than your score and [latex]10\%[/latex] of the test scores are the same or greater than your test score.
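
To make the distinction between a percent score and a percentile concrete, the sketch below counts what percent of observations fall strictly below a given value. The function name `percent_below` and the ten test scores are made up for illustration; they do not come from the text.

```python
def percent_below(values, x):
    """Percent of observations that are strictly less than x."""
    count = sum(1 for v in values if v < x)
    return 100 * count / len(values)


# Ten made-up test scores (percent correct), for illustration only.
scores = [40, 45, 50, 52, 55, 58, 60, 65, 70, 72]

# A student who scored 72% on this test has 90% of the scores below theirs.
standing = percent_below(scores, 72)
print(standing)
```

In this made-up class, a score of 72% sits at roughly the 90th percentile, which shows that scoring in the 90th percentile is not the same as scoring 90% on the test.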

Quartiles are special percentiles.  The first quartile, [latex]Q_1[/latex], is the same as the 25th percentile and the third quartile, [latex]Q_3[/latex], is the same as the 75th percentile.  The median is the 50th percentile.

Percentiles are useful for comparing values.  For this reason, universities and colleges use percentiles extensively.  One instance in which colleges and universities use percentiles is when SAT results are used to determine a minimum testing score that will be used as an acceptance factor.  For example, suppose Duke accepts SAT scores at or above the 75th percentile.  That translates into an SAT score of at least 1220.

Percentiles are mostly used with very large data sets. Therefore, if you were to say that [latex]90\%[/latex] of the test scores are less (and not the same or less) than your score, it would be acceptable because removing one particular data value is not significant.

CALCULATING PERCENTILES IN EXCEL

To find the [latex]k[/latex]th percentile in Excel, use the percentile.exc(array, percent) function. 

  • For array, enter the array or cell range containing the data. 
  • For percent, enter the percentile (as a decimal) being calculated.  For example, if we are calculating the 60th percentile, we would enter 0.6 for the percent in the percentile.exc function.

The output from the percentile.exc function is the value of the corresponding percentile.  For example, percentile.exc(array,0.6) returns the value of the 60th percentile where 60% of the observations in the data are (strictly) less than the value of the 60th percentile.

Visit the Microsoft page for more information about the percentile.exc function.

NOTE

We are using the percentile.exc function, and not the percentile.inc function, because we want the percent of the observations in the data to be strictly less than the value of the percentile.
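
For readers curious about what happens behind the function, the following sketch implements the interpolation rule that percentile.exc is generally understood to follow: locate position [latex]p(n+1)[/latex] in the ordered data and interpolate between the neighbouring values. The function name `percentile_exc` is our own, and the details (including the error raised when the position falls outside the data) are assumptions rather than a statement of Excel's exact implementation.

```python
import math


def percentile_exc(values, p):
    """Sketch of the exclusive percentile rule for 0 < p < 1.

    The 1-based position p * (n + 1) is located in the ordered data and
    the result is interpolated linearly between its two neighbours.
    """
    ordered = sorted(values)
    n = len(ordered)
    rank = p * (n + 1)              # 1-based position, may be fractional
    if rank < 1 or rank > n:
        raise ValueError("p is too close to 0 or 1 for this many values")
    lower = math.floor(rank)        # position of the value just below
    frac = rank - lower             # how far to move toward the next value
    if lower == n:
        return ordered[-1]
    below = ordered[lower - 1]
    above = ordered[lower]
    return below + frac * (above - below)


data = [1, 1, 2, 2, 4, 6, 6.8, 7.2, 8, 8.3, 9, 10, 10, 11.5]
p25 = percentile_exc(data, 0.25)
p50 = percentile_exc(data, 0.50)
p75 = percentile_exc(data, 0.75)
print(p25, p50, p75)
```

With 14 values, the 75th percentile sits at position 11.25, one quarter of the way from the 11th value (9) to the 12th value (10).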


Watch this video: Percentiles – How to calculate Percentiles, Quartiles, … by Joshua Emmanuel [3:43] (transcript available).


EXAMPLE

Listed are twenty-nine ages (in years) for trees found in the Saint Louis Botanical Garden.

18 21 22 25 26 27 29 30 31 33
36 37 41 42 47 52 55 57 58 62
64 67 69 71 72 73 74 76 77
  1. Find the 70th percentile.
  2. Find the 83rd percentile.

Solution:

Enter the data into an Excel spreadsheet.  For this example, suppose we entered the data in column A from cell A1 to A29.

For the 70th percentile [latex]P_{70}[/latex]:

Function percentile.exc Answer
Field 1 A1:A29 64 years
Field 2 0.7

For the 83rd percentile [latex]P_{83}[/latex]:

Function percentile.exc Answer
Field 1 A1:A29 71.9 years
Field 2 0.83
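
The same two percentiles can be checked in Python. This is a sketch, assuming Python 3.8 or later and assuming that `statistics.quantiles` with `method="exclusive"` follows the same rule as percentile.exc; the variable names are illustrative.

```python
from statistics import quantiles

# The 29 tree ages from the example, in years.
ages = [18, 21, 22, 25, 26, 27, 29, 30, 31, 33,
        36, 37, 41, 42, 47, 52, 55, 57, 58, 62,
        64, 67, 69, 71, 72, 73, 74, 76, 77]

# n=100 returns 99 cut points; the k-th percentile is at index k - 1.
cuts = quantiles(ages, n=100, method="exclusive")

p70 = cuts[70 - 1]
p83 = cuts[83 - 1]
print(p70, p83)
```

The results, 64 years and 71.9 years, match the spreadsheet answers above.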

NOTE

Percentiles have the same units as the data.  In this case, the data is measured in years, so the percentiles are also in years.

TRY IT

Listed are 29 ages (in years) for Academy Award-winning best actors.

18 21 22 25 26 27 29 30 31 33
36 37 41 42 47 52 55 57 58 62
64 67 69 71 72 73 74 76 77

Calculate the 20th percentile and the 55th percentile.

 

Solution:

Enter the data into an Excel spreadsheet.  For this example, suppose we entered the data in column A from cell A1 to A29.

For the 20th percentile [latex]P_{20}[/latex]:

Function percentile.exc Answer
Field 1 A1:A29 27 years
Field 2 0.2

For the 55th percentile [latex]P_{55}[/latex]:

Function percentile.exc Answer
Field 1 A1:A29 53.5 years
Field 2 0.55

Interpreting Percentiles and Quartiles

A percentile indicates the relative standing of a data value when data are sorted into numerical order from smallest to largest.  The nth percentile is the value below which n% of the data values fall.  For example, 15% of the data values are less than the value of the 15th percentile.  Note that low percentiles always correspond to lower data values and high percentiles always correspond to higher data values.

A percentile may or may not correspond to a value judgment about whether it is “good” or “bad.”  The interpretation of whether a certain percentile is “good” or “bad” depends on the context of the situation to which the data applies.  In some situations, a low percentile would be considered “good,” but in other contexts a high percentile might be considered “good.”  In many situations, there is no value judgment that applies.

Understanding how to interpret percentiles or quartiles properly is important not only when describing data, but also when calculating probabilities in later chapters of this text.  When writing the interpretation of a percentile or quartile in the context of the given data, the sentence should contain the following information:

  • Information about the context of the situation being considered.
  • The data value (value of the variable) that represents the percentile/quartile.
  • The percent of individuals or items with data values below the percentile/quartile.

EXAMPLE

On a timed math test, the first quartile for the time it took to finish the exam was 35 minutes. Interpret the first quartile in the context of this situation.

Solution:

  • Interpretation:  25% of students finished the exam in less than 35 minutes.
  • In this context, a low percentile could be considered good, as finishing more quickly on a timed exam is desirable. (If you take too long, you might not be able to finish.)

TRY IT

For the 100-meter dash, the third quartile for times for finishing the race was 11.5 seconds.  Interpret the third quartile in the context of the situation.

 

Solution:

  • Interpretation:  75% of runners finished the race in less than 11.5 seconds.
  • In this context, a lower percentile is good because finishing a race more quickly is desirable.

EXAMPLE

On a 20-question math test, the 70th percentile for the number of correct answers was 16.  Interpret the 70th percentile in the context of this situation.

Solution:

  • Interpretation:  70% of students answered fewer than 16 questions correctly.

TRY IT

On a 60-point written assignment, the 80th percentile for the number of points earned was 49.  Interpret the 80th percentile in the context of this situation.

 

Solution:

  • Interpretation:  80% of students earned less than 49 points.

EXAMPLE

At a community college, it was found that the 30th percentile of credit units that students are enrolled for is 7 units.  Interpret the 30th percentile in the context of this situation.

Solution:

  • Interpretation:  30% of students are enrolled in fewer than 7 credit units.
  • In this context, there is no “good” or “bad” value judgment associated with a higher or lower percentile.  Students attend community college for varied reasons and needs, and their course load varies according to their needs.

TRY IT

During a season, the 40th percentile for points scored per player in a game is 8.  Interpret the 40th percentile in the context of this situation.

 

Solution:

  • Interpretation:  40% of players scored fewer than 8 points.

Concept Review

The values that divide an ordered set of data into 100 equal parts are called percentiles.  Percentiles are used to compare and interpret data.  For example, an observation at the 50th percentile would be greater than 50% of the other observations in the set.

Quartiles divide data into quarters.  The first quartile, [latex]Q_1[/latex], is the 25th percentile, the second quartile, [latex]Q_2[/latex], is the 50th percentile, and the third quartile, [latex]Q_3[/latex], is the 75th percentile.  The interquartile range, [latex]IQR[/latex], is the range of the middle 50% of the data values.  The [latex]IQR[/latex] is found by subtracting [latex]Q_1[/latex] from [latex]Q_3[/latex], and can help identify potential outliers: a value is a potential outlier if it is less than [latex]Q_1 - 1.5 \times IQR[/latex] or greater than [latex]Q_3 + 1.5 \times IQR[/latex].


Attribution

2.3 Measures of the Location of the Data in Introductory Statistics by OpenStax is licensed under a Creative Commons Attribution 4.0 International License.

License

Introduction to Statistics Copyright © 2022 by Valerie Watts is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.