12.6 Coefficient of Determination
LEARNING OBJECTIVES
- Calculate and interpret the coefficient of determination.
Previously, we saw how to use the correlation coefficient to measure the strength and direction of the linear relationship between the independent and dependent variables. The correlation coefficient gives us one way to measure how well a linear regression model fits the data. The coefficient of determination is another. Denoted [latex]r^2[/latex], the coefficient of determination is the proportion of the variation in the dependent variable that can be explained by the regression equation based on the independent variable. It is the square of the correlation coefficient.
The coefficient of determination is a number between 0 and 1, and is the decimal form of a percent. The closer the coefficient of determination is to 1, the better the independent variable is at predicting the dependent variable. When we interpret the coefficient of determination, we use the percent form. When expressed as a percent, [latex]r^2[/latex] represents the percent of variation in the dependent variable [latex]y[/latex] that can be explained by the variation in the independent variable [latex]x[/latex] using the regression line. When interpreting the coefficient of determination, remember to be specific to the context of the question.
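The idea of a "proportion of variation explained" can be computed directly. The Python sketch below is an illustration only and is not part of the original text. It fits the least-squares line, measures the total variation in [latex]y[/latex] around its mean (called SST here) and the variation left over around the line (called SSE here), and returns 1 - SSE/SST. For a least-squares line this value equals the square of the correlation coefficient. The function name `r_squared` and the hours/score data are invented for this example.

```python
def r_squared(x, y):
    """Proportion of the variation in y explained by the least-squares line on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Least-squares slope and intercept
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar

    # Total variation in y, and the variation left unexplained by the line
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - sse / sst


# Hypothetical data, invented for illustration only
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 77]

r2 = r_squared(hours, score)
print(f"r^2 = {r2:.4f}, or {r2:.2%} of the variation in score")
```

A value near 1 means the points lie close to the line, so little variation is left unexplained. A value near 0 means the line explains almost none of the variation.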
EXAMPLE
A statistics professor wants to study the relationship between a student’s score on the third exam in the course and their final exam score. The professor took a random sample of 11 students and recorded their third exam score (out of 80) and their final exam score (out of 200). The results are recorded in the table below. The professor wants to develop a linear regression model to predict a student’s final exam score from the third exam score.
Student | Third Exam Score | Final Exam Score |
1 | 65 | 175 |
2 | 67 | 133 |
3 | 71 | 185 |
4 | 71 | 163 |
5 | 66 | 126 |
6 | 75 | 198 |
7 | 67 | 153 |
8 | 70 | 163 |
9 | 71 | 159 |
10 | 69 | 151 |
11 | 69 | 159 |
Previously, we found the correlation coefficient [latex]r=0.6631[/latex] and the line of best fit [latex]\hat{y}=-173.51+4.83x[/latex], where [latex]x[/latex] is the third exam score and [latex]\hat{y}[/latex] is the predicted final exam score.
- Find the coefficient of determination.
- Interpret the coefficient of determination found in part 1.
Solution:
- [latex]\displaystyle{r^2=(0.6631)^2=0.4397}[/latex].
- [latex]43.97\%[/latex] of the variation in the final exam score can be explained by the regression line based on the third exam score.
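The result above can be checked from the raw data. The following Python sketch is a minimal check, not the textbook's own method: it computes the correlation coefficient from the table with the usual sums-of-deviations formula and then squares it. The helper name `correlation` and the variable names are assumptions made for this example.

```python
import math

# Data from the table above
third_exam = [65, 67, 71, 71, 66, 75, 67, 70, 71, 69, 69]
final_exam = [175, 133, 185, 163, 126, 198, 153, 163, 159, 151, 159]


def correlation(x, y):
    """Pearson correlation coefficient of the paired lists x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = correlation(third_exam, final_exam)
r2 = r ** 2  # coefficient of determination

print(f"r   = {r:.4f}")
print(f"r^2 = {r2:.4f} ({r2:.2%} of the variation in final exam score)")
```

Rounded to four decimal places, the script should reproduce [latex]r=0.6631[/latex] and [latex]r^2=0.4397[/latex].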
TRY IT
SCUBA divers have maximum dive times they cannot exceed when going to different depths. The table below shows the maximum dive time, in minutes, at several depths. Previously, we found the correlation coefficient and the regression line to predict the maximum dive time from depth.
Depth (in feet) | Maximum Dive Time (in minutes) |
50 | 80 |
60 | 55 |
70 | 45 |
80 | 35 |
90 | 25 |
100 | 22 |
- Find the coefficient of determination.
- Interpret the coefficient of determination found in part 1.
Solution:
- [latex]\displaystyle{r^2=(-0.9629)^2=0.9272}[/latex].
- [latex]92.72\%[/latex] of the variation in the maximum dive time can be explained by the regression line based on depth.
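This example has a negative correlation coefficient, yet the coefficient of determination is still positive because squaring removes the sign. The Python sketch below is a check under stated assumptions, not part of the original solution. It recomputes [latex]r[/latex] from the table, rounds it to four decimal places as the solution does, and squares the rounded value. All variable names are chosen for this example.

```python
import math

# Data from the table above
depth = [50, 60, 70, 80, 90, 100]      # feet
max_time = [80, 55, 45, 35, 25, 22]    # minutes

n = len(depth)
depth_bar = sum(depth) / n
time_bar = sum(max_time) / n

# Sums of deviations used in the correlation coefficient
s_xy = sum((d - depth_bar) * (t - time_bar) for d, t in zip(depth, max_time))
s_xx = sum((d - depth_bar) ** 2 for d in depth)
s_yy = sum((t - time_bar) ** 2 for t in max_time)

r = s_xy / math.sqrt(s_xx * s_yy)   # negative: deeper dives allow less time
r_rounded = round(r, 4)             # round first, as in the solution above
r2 = r_rounded ** 2                 # squaring removes the sign

print(f"r   = {r_rounded}")
print(f"r^2 = {r2:.4f}")
```

Squaring the unrounded [latex]r[/latex] instead gives roughly 0.92725, so the last reported digit can depend on when the rounding is done.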
Concept Review
The coefficient of determination, [latex]r^2[/latex], is equal to the square of the correlation coefficient. When expressed as a percent, the coefficient of determination represents the percent of variation in the dependent variable [latex]y[/latex] that can be explained by the variation in the independent variable [latex]x[/latex] using the regression line.
Attribution
“12.3 The Regression Equation” in Introductory Statistics by OpenStax is licensed under a Creative Commons Attribution 4.0 International License.