13.7 Multicollinearity
LEARNING OBJECTIVES
- Define multicollinearity and understand its impact on multiple regression.
The term independent variable applies to any variable that is used to predict or explain the value of the dependent variable. This does not mean that the independent variables themselves are unrelated to each other. In fact, most independent variables in multiple regression models share some degree of relatedness. For example, if “distance travelled” and “litres of gas consumed” are the independent variables in a regression model to predict the dependent variable “travel time,” then these two independent variables are highly correlated with each other, because longer trips consume more gas.
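One way to see how strongly two independent variables are related is to calculate the correlation coefficient between them. The following is a minimal sketch in Python, not a prescribed method. The function name `pearson_r` and the trip data are hypothetical, made up only to mirror the example above.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical trips (made-up numbers, for illustration only).
distance_km = [12, 25, 40, 58, 75, 90]
litres_used = [1.1, 2.0, 3.4, 4.6, 6.2, 7.1]

r = pearson_r(distance_km, litres_used)
print(f"Correlation between the two independent variables: {r:.3f}")
```

A correlation coefficient close to 1 or to -1 between two independent variables suggests that the two variables carry much of the same information about the dependent variable.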
When two or more independent variables in a regression model are highly correlated with each other, multicollinearity exists between the independent variables. Because correlated independent variables carry overlapping information, it becomes difficult to separate the effect of each one on the dependent variable, so conclusions about the relationship between the dependent variable and the individual independent variables may be affected. In particular, multicollinearity may affect the outcome of the tests on the individual regression coefficients. However, multicollinearity does not affect the outcome of the overall test on the regression model.
Even though the overall model test may conclude that there is a relationship between the dependent variable and the set of independent variables, multicollinearity among the independent variables may cause the tests on the individual regression coefficients to conclude that none of the individual independent variables is related to the dependent variable. One way to address the problem of multicollinearity is to avoid including independent variables that are highly correlated with each other, or to remove one of two highly correlated independent variables from the model.
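The contrast between the overall model test and the tests on the individual coefficients can be explored with a small simulation. The sketch below is illustrative only and rests on several assumptions: the data are randomly generated, the second independent variable is constructed to be almost a copy of the first, and the helper `fit_two_predictors` is a hypothetical function that applies the standard least-squares formulas for a model with two independent variables.

```python
import math
import random

def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y on x1 and x2 (with an intercept).

    Returns the overall F statistic and the t statistics for the two slopes.
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    syy = sum((c - my) ** 2 for c in y)

    # The determinant shrinks toward 0 as x1 and x2 become collinear.
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det

    ssr = b1 * s1y + b2 * s2y      # regression sum of squares
    sse = syy - ssr                # error sum of squares
    mse = sse / (n - 3)            # three coefficients are estimated
    f_stat = (ssr / 2) / mse       # overall test, 2 and n - 3 degrees of freedom
    t1 = b1 / math.sqrt(mse * s22 / det)
    t2 = b2 / math.sqrt(mse * s11 / det)
    return f_stat, t1, t2

# Simulated data (an assumption for illustration, not real observations).
random.seed(1)
n = 50
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [a + random.gauss(0, 0.01) for a in x1]   # almost a copy of x1
y = [a + b + random.gauss(0, 1) for a, b in zip(x1, x2)]

f_stat, t1, t2 = fit_two_predictors(x1, x2, y)
print(f"overall F = {f_stat:.1f}, t for x1 = {t1:.2f}, t for x2 = {t2:.2f}")
```

With 50 observations, the 5% critical value for the overall F statistic is roughly 3.2, and an individual t statistic is significant at the 5% level when its absolute value exceeds roughly 2. In a setup like this one, the F statistic tends to be far above its critical value, while the t statistics are often small because the standard errors of the individual coefficients are inflated. The exact numbers depend on the random seed.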
Concept Review
Multicollinearity refers to a high correlation between two or more independent variables in a regression model. Although multicollinearity may affect conclusions drawn about the individual regression coefficients, it does not affect conclusions about the overall model.