Analyzing the Data

Rajiv S. Jhangiani; I-Chant A. Chiang; Carrie Cuttler; Dana C. Leighton; Molly A. Metz

12 Analyzing the Data

Learning Objectives

Distinguish between descriptive and inferential statistics.
Identify the different kinds of descriptive statistics researchers use to summarize their data.
Describe the purpose of inferential statistics.
Distinguish between Type I and Type II errors.

Once the study is complete and the observations have been made and recorded, the researchers need to analyze the data and draw their conclusions. Typically, data are analyzed using both descriptive and inferential statistics. Descriptive statistics are used to summarize the data, and inferential statistics are used to generalize the results from the sample to the population. These inferences, in turn, are used to draw conclusions about whether a theory has been supported, has been refuted, or requires modification.

Descriptive Statistics

Descriptive statistics are used to organize or summarize a set of data. Examples include percentages, measures of central tendency (mean, median, mode), measures of dispersion (range, standard deviation, variance), and correlation coefficients.

Measures of central tendency are used to describe the typical, average, or central score in a distribution. The mode is the most frequently occurring score in a distribution. The median is the midpoint of a distribution: half of the scores fall above it and half fall below it. The mean is the arithmetic average of a distribution of scores, that is, the sum of the scores divided by the number of scores.
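As a minimal sketch, the three measures of central tendency can be computed with Python's standard `statistics` module. The scores below are hypothetical values invented for illustration; they do not come from any study discussed in this chapter.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [1, 2, 2, 3, 4, 7, 9]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # middle score once the scores are sorted
mean = statistics.mean(scores)      # sum of the scores divided by their count

print("mode:", mode)
print("median:", median)
print("mean:", mean)
```

In this made-up data set the three measures differ (mode 2, median 3, mean 4) because the two high scores pull the mean upward while leaving the median and mode unchanged.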

Measures of dispersion are also descriptive statistics. They describe the degree of spread in a set of scores: are the scores similar and clustered around the mean, or is there a lot of variability? The range is the distance between the highest and lowest scores in a distribution. The standard deviation is a more sophisticated measure of dispersion that reflects, roughly, the average distance of scores from the mean. The variance is the standard deviation squared, so it also measures the distance of scores from the mean, but in squared units.
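The measures of dispersion can be sketched the same way. The scores are again hypothetical. One assumption worth flagging: the functions used here are the sample versions, which divide by the number of scores minus one; the `statistics` module also provides `pstdev` and `pvariance`, which divide by the number of scores and treat the data as an entire population.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [1, 2, 2, 3, 4, 7, 9]

score_range = max(scores) - min(scores)  # distance between highest and lowest
variance = statistics.variance(scores)   # sample variance (divides by n - 1)
sd = statistics.stdev(scores)            # sample standard deviation

print("range:", score_range)
print("variance:", round(variance, 2))
print("standard deviation:", round(sd, 2))

# The variance is the standard deviation squared (up to rounding error).
print("sd squared:", round(sd ** 2, 2))
```

Because the variance is expressed in squared units, the standard deviation is usually the easier of the two to interpret: it is on the same scale as the original scores.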

Typically, means and standard deviations are computed for experimental research studies in which an independent variable was manipulated to produce two or more groups and a dependent variable was measured quantitatively. The mean for each experimental group or condition is calculated separately, and the means are then compared to see if they differ.

For non-experimental research, simple percentages may be computed to describe the percentage of people who engaged in some behavior or held some belief. More commonly, though, non-experimental research involves computing the correlation between two variables. A correlation coefficient describes the strength and direction of the relationship between two variables. The values of a correlation coefficient can range from −1.00 (the strongest possible negative relationship) to +1.00 (the strongest possible positive relationship). A value of 0 means there is no relationship between the two variables. Positive correlation coefficients indicate that as the values of one variable increase, so do the values of the other variable. A good example of a positive correlation is the correlation between height and weight, because as height increases, weight also tends to increase. Negative correlation coefficients indicate that as the values of one variable increase, the values of the other variable decrease. An example of a negative correlation is the correlation between stressful life events and happiness, because as stress increases, happiness is likely to decrease.
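To make the idea of a correlation coefficient concrete, here is a small sketch that computes one from scratch. It assumes the Pearson correlation coefficient, which is the most common kind but is not named in the text above. The function name `pearson_r` and all of the data are hypothetical and chosen only to mirror the height/weight and stress/happiness examples.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # How much the two variables vary together around their means.
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # How much each variable varies on its own.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_deviation / math.sqrt(ss_x * ss_y)

# Hypothetical data, invented for illustration only.
heights_cm = [150, 160, 165, 172, 180, 188]
weights_kg = [52, 58, 63, 70, 77, 85]
stress_events = [1, 2, 3, 4, 5, 6]
happiness = [9, 8, 8, 6, 5, 3]

r_positive = pearson_r(heights_cm, weights_kg)    # expected to be above 0
r_negative = pearson_r(stress_events, happiness)  # expected to be below 0

print("height and weight:", round(r_positive, 2))
print("stress and happiness:", round(r_negative, 2))
```

The sign of the result gives the direction of the relationship, and its distance from 0 gives the strength.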

Inferential Statistics

As you learned in the section of this chapter on sampling, researchers typically sample from a population, but ultimately they want to generalize their results from the sample to that broader population. In other words, they want to infer what the population is like based on the sample they studied. Inferential statistics are used for that purpose: they allow researchers to draw conclusions about a population based on data from a sample. Inferential statistics are crucial because the effects (i.e., the differences in the means or the correlation coefficient) that researchers find in a study may be due simply to random chance variability, or they may be due to a real effect (i.e., they may reflect a real relationship between variables or a real effect of an independent variable on a dependent variable).

Researchers use inferential statistics to determine whether their effects are statistically significant. A statistically significant effect is one that is unlikely to be due to random chance and therefore likely represents a real effect in the population. More specifically, results that would occur less than 5% of the time by chance alone, if there were no real effect in the population, are typically considered statistically significant. When an effect is statistically significant, it is appropriate to generalize the results from the sample to the population. In contrast, if inferential statistics reveal that results like these would occur more than 5% of the time by chance alone, then the researcher must conclude that their result is not statistically significant.
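The logic of statistical significance can be illustrated with a permutation test, which is only one of several inferential techniques and is not necessarily the one a given study would use. The sketch below asks how often a difference between group means at least as large as the observed one would arise if group membership were assigned purely at random. The function name `permutation_p_value`, its parameters, and the scores are all hypothetical.

```python
import random

def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Share of random regroupings whose mean difference is at least as
    large as the difference actually observed between the two groups."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign scores to groups at random
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_large += 1
    return at_least_as_large / n_shuffles

# Hypothetical scores for two experimental conditions.
treatment = [12, 14, 15, 16, 18, 19, 20, 21]
control = [8, 9, 10, 11, 12, 13, 14, 15]

p = permutation_p_value(treatment, control)
print("p value:", p)
print("statistically significant at the 5% level:", p < 0.05)
```

A small p value here means that random regrouping rarely produces a difference as large as the observed one, which is the sense in which the effect is unlikely to be due to chance alone.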

It is important to keep in mind that statistics are probabilistic in nature. They allow researchers to determine whether the chances are low that their results are due to random error, but they don’t provide absolute certainty. Ideally, when we conclude that an effect is statistically significant, it is a real effect that we would find if we tested the entire population. And ideally, when we conclude that an effect is not statistically significant, there really is no effect, and we would find none if we tested the entire population. The threshold is set at 5% to ensure that there is a high probability that we make a correct decision and that our determination of statistical significance is an accurate reflection of reality.

But mistakes can always be made, and they come in two kinds. First, researchers can make a Type I error, which is a false positive. This occurs when a researcher concludes that their results are statistically significant (so they say there is an effect in the population) when in reality there is no effect in the population and the results are just due to chance (they are a fluke). When the threshold is set to 5%, which is the convention, the researcher has a 5% chance or less of making a Type I error when there is no real effect. You might wonder why researchers don’t set the threshold even lower to reduce the chances of making a Type I error. The reason is that when the chances of making a Type I error are reduced, the chances of making a Type II error are increased. A Type II error is a missed opportunity, or false negative. It occurs when a researcher concludes that their results are not statistically significant when in reality there is a real effect in the population and they just missed detecting it. Type II errors are more likely to occur when the threshold is set too low (e.g., at 1% instead of 5%) and/or when the sample is too small.
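The trade-off between the two kinds of error can be seen in a small simulation. This is a sketch under strong simplifying assumptions that are not part of the chapter: scores are normally distributed with a known standard deviation of 1, each simulated study compares two equal-sized groups with a two-tailed z test, and the true effect is either 0 or 0.5. The function name `rejection_rate` and every parameter value are hypothetical.

```python
import random
from statistics import NormalDist

def rejection_rate(true_effect, n_per_group, alpha, n_studies=2000, seed=1):
    """Fraction of simulated studies declared statistically significant."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    std_error = (2 / n_per_group) ** 0.5  # assumes SD = 1 in both groups
    standard_normal = NormalDist()
    rejections = 0
    for _ in range(n_studies):
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        treatment = [rng.gauss(true_effect, 1) for _ in range(n_per_group)]
        diff = sum(treatment) / n_per_group - sum(control) / n_per_group
        z = diff / std_error
        p_value = 2 * (1 - standard_normal.cdf(abs(z)))  # two-tailed
        if p_value < alpha:
            rejections += 1
    return rejections / n_studies

# No real effect: every "significant" result is a Type I error.
type1_at_05 = rejection_rate(true_effect=0.0, n_per_group=20, alpha=0.05)
type1_at_01 = rejection_rate(true_effect=0.0, n_per_group=20, alpha=0.01)

# Real effect: every non-significant result is a Type II error.
type2_at_05 = 1 - rejection_rate(true_effect=0.5, n_per_group=20, alpha=0.05)
type2_at_01 = 1 - rejection_rate(true_effect=0.5, n_per_group=20, alpha=0.01)
type2_large_sample = 1 - rejection_rate(true_effect=0.5, n_per_group=80, alpha=0.05)

print("Type I rate, 5% threshold:", type1_at_05)
print("Type I rate, 1% threshold:", type1_at_01)
print("Type II rate, 5% threshold:", type2_at_05)
print("Type II rate, 1% threshold:", type2_at_01)
print("Type II rate, 5% threshold, larger sample:", type2_large_sample)
```

In runs of this sketch, lowering the threshold from 5% to 1% reduces the Type I error rate but raises the Type II error rate, while enlarging the sample lowers the Type II error rate without changing the threshold.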

License

Research Methods in Psychology Copyright © 2020 by Rajiv S. Jhangiani, I-Chant A. Chiang, Carrie Cuttler, Dana C. Leighton & Molly A. Metz is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.

Report section	Questions worth asking
Abstract	What are the key findings? How were those findings reached? What framework does the researcher employ?
Acknowledgments	Who are this study’s major stakeholders? Who provided feedback? Who provided support in the form of funding or other resources?
Problem statement (introduction)	How does the author frame their research focus? What other possible ways of framing the problem exist? Why might the author have chosen this particular way of framing the problem?
Literature review (introduction)	How selective does the researcher appear to have been in identifying relevant literature to discuss? Does the review of literature appear appropriately extensive? Does the researcher provide a critical review?
Sample (methods)	Where were the data collected? Did the researcher collect their own data or use someone else’s data? What population is the study trying to make claims about, and does the sample represent that population well? What are the sample’s major strengths and major weaknesses?
Data collection (methods)	How were the data collected? What do you know about the relative strengths and weaknesses of the method employed? What other methods of data collection might have been employed, and why was this particular method employed? What do you know about the data collection strategy and instruments (e.g., questions asked, locations observed)? What don’t you know about the data collection strategy and instruments?
Data analysis (methods)	How were the data analyzed? Is there enough information provided for you to feel confident that the proper analytic procedures were employed accurately?
Results	What are the study’s major findings? Are findings linked back to previously described research questions, objectives, hypotheses, and literature? Are sufficient amounts of data (e.g., quotes and observations in qualitative work, statistics in quantitative work) provided in order to support conclusions drawn? Are tables readable?
Discussion/conclusion	Does the author generalize to some population beyond her/his/their sample? How are these claims presented? Are claims made supported by data provided in the results section (e.g., supporting quotes, statistical significance)? Have limitations of the study been fully disclosed and adequately addressed? Are implications sufficiently explored?

Behavior experienced at work	Women	Men	p value
Subtle or obvious threats to your safety	2.9%	4.7%	0.623
Being hit, pushed, or grabbed	2.2%	4.7%	0.480
Comments or behaviors that demean your gender	6.5%	2.3%	0.184
Comments or behaviors that demean your age	13.8%	9.3%	0.407
Staring or invasion of your personal space	9.4%	2.3%	0.039
Note: Sample size was 138 for women and 43 for men.
