"

3.2 Categorizing Data

Data can come in a variety of forms. It can fit into specific categories, or it can be more general in nature. We can collect data in different ways, such as through instruments like a thermometer or through an open-ended survey. Regardless of the data type and how it is collected, the information is useless without analysis. Data analysis is a powerful tool that can guide key strategic decisions. Businesses rarely make important decisions without some data analysis to support them. A key step in the analysis process is ensuring the reliability and validity of data. A business can then use various techniques to understand the information better. Microsoft Excel provides several tools to make that process easier for the business decision-maker.

Data analysis provides insights and recommendations to a team on ways to monitor and improve performance. The skills needed in this field include critical thinking, analytical thinking, problem solving, attention to detail, and communication, all of which you will apply as a health care professional. You may need to dig deeper into data by asking the “why” questions and then clearly communicate your findings to colleagues.

What is Data?

Data are units of information that are collected through observation.

Data can be collected in many ways, such as surveys, interviews, focus groups, measurements, and controlled experiments. As a result, they can be numeric or descriptive. For example, data could record the number of people accessing a website alongside each visitor’s age, or it could capture feedback, such as a comment that a seminar was “wonderful” or a rating on a survey scale where five means wonderful. A dental hygienist may evaluate how many people are accessing a community service based on many factors, including accessibility, as seen in Chapter 1, and correlate the documented dental ailments with income, age, accessibility, and other factors.

Once data is collected, it can be grouped into two broad categories: qualitative and quantitative.

Qualitative Data

Qualitative data is categorical information that does not include numbers, or if it does include numbers, those numbers do not have a true mathematical meaning. For example, a question on a class evaluation survey could ask about the mode of delivery for the course, which could be as follows: 1 – in-person, 2 – online, or 3 – hybrid. The numbers, in this case, do not have a meaning; instead, they are placeholders to indicate the category you select.

  • Nominal data: Qualitative data that is categorized by naming or labelling and has no quantitative value or meaningful order associated with it. Examples include gender, blood type, and marital status, all of which are grouped into distinct categories.
  • Ordinal data: Qualitative data that can be ranked or ordered. For example, a survey might rank the level of satisfaction on a scale of 1 to 5. The numbers assigned have a rank or order but no numeric measurement: on a scale of 1 to 5, a rating of 4 is not necessarily twice as satisfied as a rating of 2 (Beacom, 2018).
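The difference between nominal and ordinal data can be sketched in a few lines of Python. This is a minimal illustration only: the survey responses, the variable names, and the choice of the median as a summary for ordinal ratings are assumptions made for the example, not data or methods from the text.

```python
from collections import Counter
from statistics import median

# Hypothetical survey responses (invented values for illustration).
# Nominal: the codes are labels for the delivery mode, not quantities.
DELIVERY_LABELS = {1: "in-person", 2: "online", 3: "hybrid"}
delivery_codes = [1, 2, 2, 3, 1, 2]

# Counting how often each category occurs is meaningful for nominal data;
# averaging the codes would not be.
delivery_counts = Counter(DELIVERY_LABELS[code] for code in delivery_codes)

# Ordinal: satisfaction ratings from 1 (low) to 5 (high) have an order,
# so the middle rating (the median) is a reasonable summary.
satisfaction = [2, 4, 5, 3, 4]
median_satisfaction = median(satisfaction)

print(delivery_counts)
print(median_satisfaction)
```

With these made-up responses, "online" is selected three times and the median satisfaction rating is 4. The code never adds or averages the delivery codes, because those numbers are only placeholders for categories.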

Information gathered through interviews and focus groups, or from sources such as photos, field notes, and interview transcripts, can also be included as qualitative data.

Quantitative Data

Quantitative data involves numerical evaluations. Think “quantity”. This type of data is represented as numbers that have a true mathematical meaning, and it is usually collected through surveys, experiments, and measurement. There are two main types of quantitative data: discrete and continuous.

  • Discrete data: Discrete data refers to numerical values that can only take on specific, distinct values. This type of data is typically represented as whole numbers and cannot be broken down into smaller units. Examples of discrete data include the number of students in a class, the number of cars in a parking lot, and the number of children in a family (Hassan, 2024).
  • Continuous data: Continuous data refers to numerical values that can take on any value within a certain range or interval. This type of data is typically represented as decimal or fractional values and can be broken down into smaller units. Examples of continuous data include measurements of height, weight, temperature, and time (Hassan, 2024).

Descriptive Statistics

Descriptive statistics summarize and describe the basic features of the data, such as the mean, median, mode, standard deviation, and range. There are three main types of Descriptive Statistics:

  1. Frequency distribution: records how often each value occurs in the data.
  2. Central tendency: describes the centre point of the distribution and can be measured using the mean, median, and mode.
  3. Dispersion (spread): describes how widely the data are spread out and can be measured using the standard deviation and range.
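As a rough illustration of the first type, a frequency distribution can be built by counting how often each value appears. The sketch below uses Python's standard `collections.Counter`; the list of blood types is invented for the example, and the relative frequency (each count as a share of the total) is an added convenience rather than something the text defines.

```python
from collections import Counter

# Hypothetical blood types recorded for ten patients (invented values).
blood_types = ["O", "A", "A", "B", "O", "O", "AB", "A", "O", "B"]

# A frequency distribution records how often each value occurs.
frequency = Counter(blood_types)

# Relative frequency: each count as a share of all observations.
relative_frequency = {
    blood_type: count / len(blood_types)
    for blood_type, count in frequency.items()
}

# List the categories from most to least common.
for blood_type, count in frequency.most_common():
    print(f"{blood_type}: {count} ({relative_frequency[blood_type]:.0%})")
```

In this made-up sample, type O occurs four times out of ten, so it accounts for 40% of the observations.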

Mean: The arithmetic mean of a variable, often called the average, is computed by adding up all the values and dividing by the total number of values.

Median: The median of a variable is the middle value of the data set when the data are sorted in order from least to greatest. It splits the data into two equal halves, with 50% of the data below the median and 50% above the median. The median is resistant to the influence of outliers and may be a better measure of the centre with strongly skewed data.
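To see why the median resists outliers, consider the short Python sketch below. It uses the standard `statistics` module; the clinic wait times are hypothetical values chosen only to make the arithmetic easy to follow.

```python
from statistics import mean, median

# Hypothetical clinic wait times in minutes (invented values).
wait_times = [10, 12, 15, 18, 20]

# The same data with one unusually long wait added as an outlier.
wait_times_with_outlier = wait_times + [120]

# Mean and median of the original data.
print(mean(wait_times), median(wait_times))

# Mean and median after the outlier is added.
print(mean(wait_times_with_outlier), median(wait_times_with_outlier))
```

For the original five values, the mean and the median are both 15 minutes. Adding the single 120-minute wait pulls the mean up to 32.5 minutes, while the median only moves to 16.5 minutes, so the median still describes a typical wait.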

Standard deviation: The standard deviation is a statistic that measures the dispersion or variation in a distribution, that is, how much the data points differ from the mean.
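The following sketch compares two data sets that share the same mean but differ in spread. The blood pressure readings are hypothetical. The example uses `stdev` from Python's `statistics` module, which treats the data as a sample; the text does not say whether a sample or population formula is intended, so that choice is an assumption.

```python
from statistics import mean, stdev

# Two hypothetical sets of systolic blood pressure readings in mmHg.
# Both sets have a mean of 120 (invented values).
tight_readings = [118, 119, 120, 121, 122]
spread_readings = [100, 110, 120, 130, 140]

# stdev() computes the sample standard deviation.
tight_sd = stdev(tight_readings)
spread_sd = stdev(spread_readings)

print(mean(tight_readings), round(tight_sd, 2))
print(mean(spread_readings), round(spread_sd, 2))
```

Although both sets have the same mean, the second set has a standard deviation ten times larger (about 15.81 versus about 1.58), which reflects how much further its readings sit from the mean.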

Variables

In data analysis and statistics, variables are characteristics or attributes that are observed, measured, and recorded. Here are some types of variables:

  • Independent Variable: The condition that you change or manipulate in an experiment. An example is the temperature in a room.
  • Dependent Variable: The variable that you measure or observe. It changes as a result of manipulating the independent variable. It’s the outcome you’re interested in measuring, and it “depends” on your independent variable.
  • Controlled Variable: A variable that is held constant during an experiment.

Example

In the following example, the independent variable is the medication treatment, which you vary, and the dependent variable is the patient’s blood pressure after receiving the treatment.

Independent variable levels: You are studying the impact of a new medication on the blood pressure of patients with hypertension. Your independent variable is the treatment that you directly vary between groups.

You have three independent variable levels, and each group gets a different level of treatment.

You randomly assign your patients to one of the three groups:

  • A low-dose experimental group
  • A high-dose experimental group
  • A placebo group (to research a possible placebo effect)

| Independent Variable: Type of Treatment (level) | Dependent Variable: Blood Pressure |
| --- | --- |
| Level 1: Low dose of the new medication | No change in blood pressure |
| Level 2: High dose of the new medication | Blood pressure is lowered |
| Level 3: Placebo | No change in blood pressure |

Apply different levels of the independent variable, then measure the effect on the dependent variable.

(Bhandari, 2022).
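The example above can be sketched in Python by grouping measurements of the dependent variable under each level of the independent variable. All numbers below are invented for illustration and are not results from any study. The dictionary name and the use of the mean change per group are likewise assumptions.

```python
from statistics import mean

# Hypothetical change in systolic blood pressure (mmHg) for each patient,
# keyed by the level of the independent variable (type of treatment).
# Negative numbers mean blood pressure went down. Values are invented.
bp_change_by_group = {
    "low dose": [0, -1, 1, 0],
    "high dose": [-12, -9, -15, -10],
    "placebo": [1, 0, -1, 0],
}

# Summarize the dependent variable for each group with the mean change.
mean_change = {
    group: mean(changes) for group, changes in bp_change_by_group.items()
}

for group, change in mean_change.items():
    print(f"{group}: mean change {change} mmHg")
```

With these made-up values, the low-dose and placebo groups show a mean change of 0 mmHg, while the high-dose group shows a mean drop of 11.5 mmHg. This matches the pattern described in the example.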


“11.1: Understanding Data, Data Validation, and Data Tables” in Workplace Software and Skills by LibreTexts is licensed under a Creative Commons Attribution 4.0 International Licence. Modifications: Used paragraph one; Used paragraphs six and seven of section What is data, edited, added additional content examples.

“Chapter 1: Descriptive Statistics and the Normal Distribution” from Natural Resources Biometrics by Diane Kiernan is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted. Modifications: Used Mean & Median, edited, removed mathematical examples.