1.4 Frequency, Frequency Tables, and Levels of Measurement

LEARNING OBJECTIVES

  • Classify data by level of measurement.
  • Create and interpret frequency tables.

Once we have a set of data, we need to organize it so that we can analyze how frequently each datum occurs in the set.  When calculating frequencies, we may need to round our answers, so we need a rounding rule that keeps them as precise as is reasonable.

A simple way to round off answers is to carry the final answer to one more decimal place than was present in the original data.  Round off only the final answer.  Do not round off any intermediate results, if possible.  If it becomes necessary to round off intermediate results, carry them to at least twice as many decimal places as the final answer.  For example, the average of the three quiz scores four, six, and nine is 6.3, rounded off to the nearest tenth because the data are whole numbers.  Most answers will be rounded off in this manner.
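The rounding rule can be checked with a few lines of Python.  This is only a minimal sketch of the quiz score example; the variable names are illustrative and do not come from the text.

```python
from statistics import mean

# Quiz scores from the example (whole numbers, so zero decimal places).
quiz_scores = [4, 6, 9]

# Keep full precision in the intermediate result.
average = mean(quiz_scores)  # 19 / 3 = 6.333...

# Round only the final answer, to one more decimal place than the data.
decimals_in_data = 0
final_answer = round(average, decimals_in_data + 1)

print(final_answer)  # 6.3
```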

Levels of Measurement

The way a set of data is measured is called its level of measurement.  Correct statistical procedures depend on a researcher being familiar with levels of measurement.  Not every statistical operation can be applied to every set of data.  In addition to being classified as quantitative or qualitative, data is classified into four levels of measurement.  They are (from lowest to highest level):

  • Nominal scale level (qualitative data)
  • Ordinal scale level (qualitative data)
  • Interval scale level (quantitative data)
  • Ratio scale level (quantitative data)

Data that is measured using a nominal scale is data that can be placed into categories.  Colors, names, labels, favorite foods, and yes/no survey responses are examples of nominal level data.  Nominal scale data are not ordered, which means the categories of the data are not ordered.  For example, trying to “order” people according to their favorite food does not make any sense.  Putting pizza first and sushi second is not meaningful. Smartphone companies are another example of nominal scale data.  Some examples are Sony, Motorola, Nokia, Samsung, and Apple.  This is just a list of different brand names, and there is no agreed upon order for the categories.  Some people may prefer Apple but that is a matter of opinion.  Because nominal data consists of categories, nominal scale data cannot be used in calculations.

Data that is measured using an ordinal scale is similar to nominal scale data in that the data can be placed into categories, but there is a big difference.  The categories of ordinal scale data can be ordered or ranked.  An example of ordinal scale data is a list of the top five national parks in the United States because the parks can be ranked from one to five.  Another example of using the ordinal scale is a cruise survey where the responses to questions about the cruise are “excellent,” “good,” “satisfactory,” and “unsatisfactory.”  These responses are ordered from the most desired response to the least desired.  In ordinal scale data, the differences between two pieces of data cannot be measured or calculated.  Similar to nominal scale data, ordinal scale data cannot be used in calculations.

Data that is measured using an interval scale is similar to ordinal level data because it has a definite ordering.  However, the differences between interval scale data can be measured or calculated, but the data does not have a starting point.  Temperature scales like Celsius (C) and Fahrenheit (F) are measured by using the interval scale.  In both temperature measurements (Celsius and Fahrenheit), 40° is equal to 100° minus 60°.  The differences in temperature can be measured and make sense.  But there is no starting point to the temperature scales because 0° is not the absolute lowest temperature.  Temperatures like -10°F and -15°C exist, and are colder than 0°.  Interval level data can be used in calculations, but ratios do not make sense and cannot be done.  For example, 80°C is not four times as hot as 20°C (nor is 80°F four times as hot as 20°F).  So there is no meaning to the ratio of 80 to 20 (or four to one) in either temperature scale.  In general, ratios have no meaning in interval scale data.
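One way to see why ratios are not meaningful on an interval scale is to express the same two temperatures in both units.  The sketch below uses the standard Celsius-to-Fahrenheit conversion; the function and variable names are illustrative assumptions.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32

hot_c, cool_c = 80, 20
hot_f = celsius_to_fahrenheit(hot_c)    # 176.0
cool_f = celsius_to_fahrenheit(cool_c)  # 68.0

# The same pair of temperatures gives a different "ratio" on each scale,
# so the ratio says nothing about how hot the objects really are.
ratio_c = hot_c / cool_c  # 4.0
ratio_f = hot_f / cool_f  # 176 / 68, about 2.59

# Differences convert consistently: a 60-degree Celsius gap
# always corresponds to a 108-degree Fahrenheit gap.
diff_c = hot_c - cool_c  # 60
diff_f = hot_f - cool_f  # 108.0

print(ratio_c, round(ratio_f, 2), diff_c, diff_f)
```

Because the ratio changes with the choice of unit while the difference converts predictably, only the differences carry meaning.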

Data that is measured using the ratio scale takes care of the ratio problem, and gives us the most information.  Ratio scale data is like interval scale data, but it has a starting point to the scale (a 0 point) and ratios can be calculated.  For example, four multiple choice statistics final exam scores are 80, 68, 20 and 92 (out of a possible 100 points).  The data can be put in order from lowest to highest: 20, 68, 80, 92.  The differences between the data have meaning:  92 minus 68 is 24.  Ratios can be calculated:  80 is four times 20.  The smallest possible score is 0.


Watch this video: Nominal, ordinal, interval and ratio data: How to Remember the differences by NurseKillam [11:03] (transcript available)


Frequency

Twenty students were asked how many hours they worked per day. Their responses, in hours, are recorded in the table below:

5 6 3 3 2 4 7 5 2 3
5 6 5 4 4 3 5 2 5 3

The following table lists the different data values in ascending order and their frequencies.

Frequency Table of Student Work Hours
DATA VALUE | FREQUENCY
[latex]2[/latex] | [latex]3[/latex]
[latex]3[/latex] | [latex]5[/latex]
[latex]4[/latex] | [latex]3[/latex]
[latex]5[/latex] | [latex]6[/latex]
[latex]6[/latex] | [latex]2[/latex]
[latex]7[/latex] | [latex]1[/latex]

A frequency is the number of times a value of the data occurs. According to the table, there are three students who work two hours, five students who work three hours, and so on.  The sum of the values in the frequency column is 20, which is the total number of students included in the sample.
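The frequency table can be reproduced by counting each value in the raw data.  The following is a minimal sketch using Python's standard library; the variable names are illustrative.

```python
from collections import Counter

# The twenty responses (hours worked per day) from the example.
hours_worked = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3,
                5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

# Count how many times each value occurs, then sort by data value.
counts = Counter(hours_worked)
frequency_table = sorted(counts.items())

for value, frequency in frequency_table:
    print(value, frequency)

# The frequencies should add up to the sample size.
total = sum(frequency for _, frequency in frequency_table)
print("Total:", total)
```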

A relative frequency is the ratio (fraction or proportion) of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes.  To find the relative frequencies, divide each frequency by the total number of students in the sample (in this case, 20).  Relative frequencies can be written as fractions, percents, or decimals.  The sum of the values in the relative frequency column is 1 or 100%.

Frequency Table of Student Work Hours with Relative Frequencies
DATA VALUE | FREQUENCY | RELATIVE FREQUENCY
[latex]2[/latex] | [latex]3[/latex] | [latex]\displaystyle\frac{3}{20}=0.15[/latex]
[latex]3[/latex] | [latex]5[/latex] | [latex]\displaystyle\frac{5}{20}=0.25[/latex]
[latex]4[/latex] | [latex]3[/latex] | [latex]\displaystyle\frac{3}{20}=0.15[/latex]
[latex]5[/latex] | [latex]6[/latex] | [latex]\displaystyle\frac{6}{20}=0.30[/latex]
[latex]6[/latex] | [latex]2[/latex] | [latex]\displaystyle\frac{2}{20}=0.10[/latex]
[latex]7[/latex] | [latex]1[/latex] | [latex]\displaystyle\frac{1}{20}=0.05[/latex]
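The relative frequency column can be computed by dividing each frequency by the sample size.  The sketch below uses exact fractions so that the column sums to exactly 1; the variable names are illustrative.

```python
from fractions import Fraction

# Frequencies from the table: data value -> number of students.
frequencies = {2: 3, 3: 5, 4: 3, 5: 6, 6: 2, 7: 1}
n = sum(frequencies.values())  # 20 students in the sample

# Relative frequency = frequency / total number of observations.
relative = {value: Fraction(f, n) for value, f in frequencies.items()}

for value, rel_freq in relative.items():
    # Show each relative frequency as a decimal with two places.
    print(value, f"{float(rel_freq):.2f}")

# Exact fractions sum to exactly 1.
relative_total = sum(relative.values())
print("Sum:", relative_total)
```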

Cumulative frequency is the accumulation of the previous frequencies. To find the cumulative frequencies, add all of the previous frequencies to the frequency for the current row, as shown in the table below.  The last entry of the cumulative frequency column is the number of observations in the data.

Frequency Table of Student Work Hours with Relative and Cumulative Frequencies
DATA VALUE | FREQUENCY | RELATIVE FREQUENCY | CUMULATIVE FREQUENCY
[latex]2[/latex] | [latex]3[/latex] | [latex]0.15[/latex] | [latex]3[/latex]
[latex]3[/latex] | [latex]5[/latex] | [latex]0.25[/latex] | [latex]3 + 5 = 8[/latex]
[latex]4[/latex] | [latex]3[/latex] | [latex]0.15[/latex] | [latex]8 + 3 = 11[/latex]
[latex]5[/latex] | [latex]6[/latex] | [latex]0.30[/latex] | [latex]11 + 6 = 17[/latex]
[latex]6[/latex] | [latex]2[/latex] | [latex]0.10[/latex] | [latex]17 + 2 = 19[/latex]
[latex]7[/latex] | [latex]1[/latex] | [latex]0.05[/latex] | [latex]19 + 1 = 20[/latex]
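The running totals in the cumulative frequency column can be sketched with a running sum.  This minimal example assumes the frequencies are already listed in ascending order of data value; the variable names are illustrative.

```python
from itertools import accumulate

# Data values and their frequencies, in ascending order of data value.
data_values = [2, 3, 4, 5, 6, 7]
frequencies = [3, 5, 3, 6, 2, 1]

# Each cumulative frequency is the current frequency plus all previous ones.
cumulative = list(accumulate(frequencies))

for value, cum_freq in zip(data_values, cumulative):
    print(value, cum_freq)

# The last entry equals the number of observations.
print("Observations:", cumulative[-1])
```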

Cumulative relative frequency is the accumulation of the previous relative frequencies. To find the cumulative relative frequencies, add all the previous relative frequencies to the relative frequency for the current row, as shown in the table below.  The last entry of the cumulative relative frequency column is 1 or 100%, indicating that 100% of the data has been accumulated.

Frequency Table of Student Work Hours with Relative, Cumulative and Cumulative Relative Frequencies
DATA VALUE | FREQUENCY | RELATIVE FREQUENCY | CUMULATIVE FREQUENCY | CUMULATIVE RELATIVE FREQUENCY
[latex]2[/latex] | [latex]3[/latex] | [latex]0.15[/latex] | [latex]3[/latex] | [latex]0.15[/latex]
[latex]3[/latex] | [latex]5[/latex] | [latex]0.25[/latex] | [latex]8[/latex] | [latex]0.15 + 0.25 = 0.40[/latex]
[latex]4[/latex] | [latex]3[/latex] | [latex]0.15[/latex] | [latex]11[/latex] | [latex]0.40 + 0.15 = 0.55[/latex]
[latex]5[/latex] | [latex]6[/latex] | [latex]0.30[/latex] | [latex]17[/latex] | [latex]0.55 + 0.30 = 0.85[/latex]
[latex]6[/latex] | [latex]2[/latex] | [latex]0.10[/latex] | [latex]19[/latex] | [latex]0.85 + 0.10 = 0.95[/latex]
[latex]7[/latex] | [latex]1[/latex] | [latex]0.05[/latex] | [latex]20[/latex] | [latex]0.95 + 0.05 = 1.00[/latex]
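The cumulative relative frequency column can also be obtained by a slightly different route than the running sum shown in the table: divide each cumulative frequency by the total number of observations.  The two approaches give the same values; the sketch below uses the division route as an assumption of convenience, because it avoids adding rounded decimals.  The variable names are illustrative.

```python
from itertools import accumulate

data_values = [2, 3, 4, 5, 6, 7]
frequencies = [3, 5, 3, 6, 2, 1]
n = sum(frequencies)  # 20 observations

# Running total of the frequencies: 3, 8, 11, 17, 19, 20.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency = cumulative frequency / n.
cumulative_relative = [c / n for c in cumulative]

for value, cum_rel in zip(data_values, cumulative_relative):
    print(value, f"{cum_rel:.2f}")
```

For example, the entry for five hours means that 85% of the students in the sample work five hours or fewer per day.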

NOTE

Because of rounding of the relative frequencies, the relative frequency column may not always sum to 1 or 100%, and the last entry in the cumulative relative frequency column may not be 1 or 100%. However, they each should be close to 1 or 100%.  If all of the decimals are kept in the calculations, the relative frequency column will sum to 1 or 100% and the last cumulative relative frequency will be 1 or 100%.
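A small illustration of this note, using hypothetical frequencies that are not from the example above: three values that each occur once have a relative frequency of one third, which rounds to 0.33.

```python
from fractions import Fraction

# Hypothetical frequencies (not from the work-hours example).
hypothetical_frequencies = [1, 1, 1]
n = sum(hypothetical_frequencies)

# Rounding each relative frequency to two decimal places loses a little.
rounded = [round(f / n, 2) for f in hypothetical_frequencies]
rounded_total = round(sum(rounded), 2)

# Keeping every value exact (as fractions) recovers a total of exactly 1.
exact_total = sum(Fraction(f, n) for f in hypothetical_frequencies)

print(rounded, rounded_total)  # rounded column sums to 0.99, not 1
print(exact_total)             # 1
```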

CREATING A FREQUENCY DISTRIBUTION IN EXCEL

In order to create a frequency distribution and its corresponding histogram in Excel, we need to use the Analysis ToolPak add-in.  If the add-in is not yet installed, install it first.  Then follow these steps:

  1. Enter the data into an Excel worksheet.
  2. Determine the classes for the frequency distribution.  Using these classes, create a Bin column that contains the upper limit for each class.
  3. Go to the Data tab and click on Data Analysis.  If you do not see Data Analysis in the Data tab, you will need to install the Analysis ToolPak.
  4. In the Data Analysis window, select Histogram.  Click OK.
  5. In the Input range, enter the cell range for the data.
  6. In the Bin range, enter the cell range for the Bin column.
  7. Select the location where you want the output to appear.
  8. Select Chart Output to produce the corresponding histogram for the frequency distribution.
  9. Click OK.
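The counting that the Bin column drives can be sketched outside of Excel.  The Python example below is not Excel's implementation; it assumes that each value is counted in the first bin whose upper limit is greater than or equal to it, with one extra category for values above the highest limit.  The bin limits are hypothetical classes chosen for the work-hours data, so compare the result against your own Excel output.

```python
from bisect import bisect_left

# Work-hours data from the Frequency section.
data = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3,
        5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

# Hypothetical classes 2-3, 4-5, 6-7, given by their upper limits.
bin_upper_limits = [3, 5, 7]

# One count per bin, plus a final slot for values above the last limit.
counts = [0] * (len(bin_upper_limits) + 1)
for x in data:
    # Index of the first upper limit that is >= x.
    counts[bisect_left(bin_upper_limits, x)] += 1

labels = [f"<= {limit}" for limit in bin_upper_limits] + ["More"]
for label, count in zip(labels, counts):
    print(label, count)
```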

This website provides additional information on using Excel to create a frequency distribution.


Watch this video: Frequency Distributions by Joshua Emmanuel [8:40] (transcript available).


Watch this video: How to Construct a Histogram in Excel using built-in Data Analysis by Joshua Emmanuel [1:58] (transcript available).


Concept Review

Some calculations generate numbers that are artificially precise.  It is not necessary to report a value to eight decimal places when the measures that generated that value were only accurate to the nearest tenth.  Round off final answers to one more decimal place than was present in the original data.  This means that if you have data measured to the nearest tenth of a unit, report the final statistic to the nearest hundredth.

There are four levels of measurement for data:

  • Nominal scale level: the data are categories, but the data cannot be ordered or used in calculations.
  • Ordinal scale level: the data are categories and the data can be ordered, but the differences cannot be measured.
  • Interval scale level: the data have definite order or rank, but no starting point.  The differences can be measured, but there is no such thing as a ratio.
  • Ratio scale level: the data have a definite order or rank with a starting point.  The differences have meaning and ratios can be calculated.

When organizing data, it is important to know how many times a value appears.  How many statistics students study five hours or more for an exam?  What percent of families on your block own two pets?  Frequency, relative frequency, cumulative frequency, and cumulative relative frequency are measures that answer questions like these.


Attribution

“1.3 Frequency, Frequency Tables, and Levels of Measurement” in Introductory Statistics by OpenStax is licensed under a Creative Commons Attribution 4.0 International License.

License

Introduction to Statistics Copyright © 2022 by Valerie Watts is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.