Student grades data (2013/2014-2016/2017)
As part of our exploration of the academic impact of the eText/IPM initiative on students at Algonquin College, four years of student grades data from across the College were requested from the Registrar’s Office. The resulting dataset contained over 1.7 million rows (n = 1 757 142), with each row representing one grade earned by a student between the start of academic year 2013/2014 and the end of academic year 2016/2017. The Registrar’s Office assigned each student in the dataset a new unique identifier, different from his or her student number. This identifier preserves the link between a given student’s grades (permitting, for example, the exploration of both within- and between-subject effects) while anonymizing the data and protecting the student’s identity. The dataset was then treated in a number of ways to facilitate its analysis. Each treatment step has been documented to give a sense of the kinds of considerations necessary when working with raw grades data. Unless otherwise noted, the treatments were applied sequentially, with each treatment applied to the dataset resulting from the previous one.
Step | Treatment | Justification | Resultant dataset (rows) |
---|---|---|---|
0 | Initial dataset | – | 1 757 142 |
1 | Removal of duplicate entries arising from multiple professors | The same grade was recorded more than once | 1 645 955 |
2 | Removal of empty (null) grades | Null grades cannot be used for analysis | 1 309 178 |
3 | Restriction to valid grades | Need valid values for grades to allow analysis | 1 221 631 |
4 | Removal of grades from continuing education and distance education (Centre for Continuing and Online Learning) and from Jazan and Kuwait | Limit as much as possible to grades earned in a similar context | 1 143 890 |
5 | Limitation to full-time programs using section numbering | Limit as much as possible to grades earned in a similar context | 923 651 |
6 | Addition of a Pass/Fail course flag and an eText/IPM flag | – | – |
See Appendix A for details of each treatment step.
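The sequential nature of steps 1 to 5 can be sketched on a handful of toy records. This is only an illustration of the idea that each treatment receives the output of the previous one. Every field name, grade value, and exclusion label below is hypothetical and does not reflect the Registrar's actual schema or the rules documented in Appendix A. Step 6 (adding flags) is omitted because it does not remove rows.

```python
# Toy records only: all field names and values are hypothetical.
def make(student, prof="P1", grade="A", context="campus", full_time=True):
    return {
        "student": student, "course": "C101", "term": "2013F",
        "prof": prof, "grade": grade, "context": context,
        "full_time": full_time,
    }

rows = [
    make("S1"),
    make("S1", prof="P2"),           # same grade listed under a second professor
    make("S2", grade=None),          # null grade
    make("S3", grade="??"),          # not a recognised grade value
    make("S4", context="distance"),  # excluded delivery context
    make("S5", full_time=False),     # not a full-time program section
]

VALID_GRADES = {"A", "B", "C", "D", "F"}                    # illustrative only
EXCLUDED_CONTEXTS = {"continuing", "distance", "offshore"}  # illustrative only

def drop_professor_duplicates(data):
    """Keep one row per (student, course, term, grade), ignoring the professor."""
    seen, kept = set(), []
    for r in data:
        key = (r["student"], r["course"], r["term"], r["grade"])
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept

steps = [
    ("1 duplicates", drop_professor_duplicates),
    ("2 null grades", lambda d: [r for r in d if r["grade"] is not None]),
    ("3 valid grades", lambda d: [r for r in d if r["grade"] in VALID_GRADES]),
    ("4 context", lambda d: [r for r in d if r["context"] not in EXCLUDED_CONTEXTS]),
    ("5 full-time", lambda d: [r for r in d if r["full_time"]]),
]

# Apply each treatment to the result of the previous one, recording row counts.
counts = [len(rows)]
current = rows
for label, step in steps:
    current = step(current)
    counts.append(len(current))
    print(f"step {label}: {counts[-1]} rows remain")
```

Recording the row count after every step, as the table above does, makes it easy to see which treatment is responsible for each reduction.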
Characterizing the resultant dataset
The resultant dataset, which will be used for analysis, retained 53% of the initial grades (923 651 out of 1 757 142 grades). A total of 55 353 unique students¹ are represented across four reference periods and nine categories of faculty or school.
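The retention figure can be reproduced from the row counts reported in the treatment table. The short sketch below uses only those published counts; the variable names are ours.

```python
# Row counts after each treatment step, taken from the treatment table.
counts = {
    0: 1_757_142,  # initial dataset
    1: 1_645_955,  # duplicates removed
    2: 1_309_178,  # null grades removed
    3: 1_221_631,  # restricted to valid grades
    4: 1_143_890,  # continuing/distance education, Jazan, Kuwait removed
    5: 923_651,    # limited to full-time programs
}

initial = counts[0]
previous = initial
removed_by_step = {}
for step in range(1, 6):
    removed_by_step[step] = previous - counts[step]
    previous = counts[step]
    share_left = 100 * counts[step] / initial
    print(f"step {step}: removed {removed_by_step[step]:,} rows, "
          f"{share_left:.1f}% of initial rows remain")

retained_pct = 100 * counts[5] / initial
total_removed = initial - counts[5]
print(f"retained {retained_pct:.1f}% overall, {total_removed:,} rows removed")
```

The final share is about 52.6%, which rounds to the 53% quoted above. The largest single reduction comes from step 2, the removal of null grades (336 777 rows).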
Grade counts and unique students by reference period¹
Reference period | Grades (eText/IPM = ‘N’) | Grades (eText/IPM = ‘Y’) | % eText/IPM | Student count¹ |
---|---|---|---|---|
All periods | 725 247 | 198 404 | 21.5 | 55 353 |
2013/2014 | 199 231 | 31 448 | 13.6 | 22 306 |
2014/2015 | 177 981 | 50 224 | 22.0 | 22 396 |
2015/2016 | 172 677 | 56 918 | 24.8 | 22 453 |
2016/2017 | 175 358 | 59 814 | 25.4 | 22 900 |
¹ Students may be represented in more than one level of this category (for example, in more than one reference period).
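As a consistency check, the percentage column and the "All periods" row can be recomputed from the per-period counts in the table. The sketch below assumes that "% eText/IPM" is the share of eText/IPM grades among all grades in the period, which matches the published figures. The variable names are ours.

```python
# (grades with eText/IPM = 'N', grades with eText/IPM = 'Y', student count)
table = {
    "2013/2014": (199_231, 31_448, 22_306),
    "2014/2015": (177_981, 50_224, 22_396),
    "2015/2016": (172_677, 56_918, 22_453),
    "2016/2017": (175_358, 59_814, 22_900),
}

# Share of eText/IPM grades per reference period, to one decimal place.
shares = {
    period: round(100 * yes / (no + yes), 1)
    for period, (no, yes, _) in table.items()
}

total_no = sum(no for no, _, _ in table.values())
total_yes = sum(yes for _, yes, _ in table.values())
overall_share = round(100 * total_yes / (total_no + total_yes), 1)

# Per-period student counts overlap, so their sum exceeds the unique total.
summed_students = sum(students for _, _, students in table.values())
unique_students = 55_353

for period, share in shares.items():
    print(f"{period}: {share}% eText/IPM")
print(f"All periods: {overall_share}% eText/IPM")
print(f"Summed per-period students: {summed_students:,} "
      f"(unique students: {unique_students:,})")
```

The per-period student counts sum to 90 055, well above the 55 353 unique students, which is what the footnote implies: a student enrolled in several academic years is counted once in each period.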