Educators have always been idealists. We choose to believe what we hope is possible, and that belief often keeps us going when things aren’t going our way. It’s probably what drove many of us to finish a graduate degree and what drives us to put our hearts into our work despite all the discouraging news about higher ed these days.
But an abundance of unchecked idealism can also be a dangerous thing, because the very same passion that drives one to achieve can also lead one to believe in something simply because it seems like it ought to be so. Caught up in a belief that feels so right, we become less likely to scrutinize the metrics we use to measure ourselves or compare ourselves with others. Worse still, our repeated use of these unexamined metrics can become etched into institutional decision-making. Ultimately, the power of belief that once drove us to overcome imposing challenges can become our Achilles heel, leaving us absolutely certain of things that may, in fact, not be so.
For decades, colleges have tracked the distribution of their class sizes (i.e., the number of classes enrolling 2-9, 10-19, 20-29, 30-39, 40-49, 50-99, and 100 or more students, respectively) as a part of something called the Common Data Set. The implication behind tracking this data point is that a higher proportion of smaller classes ought to correlate with a better learning environment. Since the mid-1980s, the U.S. News and World Report rankings of colleges and universities have included this metric in their formula, distilling it down to two numbers: the proportion of classes at an institution with 19 or fewer students (more is better) and the proportion of classes with 50 or more students (less is better). Two years ago, U.S. News added a twist by creating a sliding scale: classes of 19 or fewer students receive the most credit; classes of 20-29, 30-39, and 40-49 receive proportionally less credit; and classes of 50 or more receive no credit. Over time, these formulations have produced a powerful mythology across many postsecondary institutions: classes with 19 or fewer students are better than classes with 20 or more.
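To make the scoring mechanics concrete, here is a minimal sketch of how a sliding scale like that might be computed. U.S. News does not publish its exact weights, so the intermediate values below (0.75, 0.50, 0.25) are purely illustrative assumptions, not the real formula.

```python
# Hypothetical sketch of a U.S. News-style sliding scale for class size.
# The intermediate weights are invented for illustration only.
def class_size_credit(enrollment: int) -> float:
    """Return the assumed ranking credit for a single class."""
    if enrollment <= 19:
        return 1.0   # full credit
    elif enrollment <= 29:
        return 0.75  # proportionally less credit (assumed weight)
    elif enrollment <= 39:
        return 0.50
    elif enrollment <= 49:
        return 0.25
    else:
        return 0.0   # 50 or more: no credit

# An institution's class-size score would then be something like the
# average credit across all of its classes:
enrollments = [12, 18, 24, 35, 52, 110]
score = sum(class_size_credit(n) for n in enrollments) / len(enrollments)
print(round(score, 3))
```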
This raises a pretty important question: are those cut points (19/20, 29/30, etc.) grounded in anything more than an arbitrary fondness for round numbers in our base-10 numbering system?
Our own fall-term IDEA course feedback data provides an opportunity to test the validity of this metric. The overall distribution of class sizes is almost perfectly normal (a nicely shaped bell curve), and almost 80% of courses received a robust response rate. Moreover, IDEA’s aggregate dataset allows us to compare three useful measures of the student learning experience across all courses: a student-reported proxy of learning gains called the “progress on relevant objectives” (PRO) score (for a short explanation of the PRO score with additional links for further information, click here), the student perception of the instructor, and the student perception of the course. The table below spells out the average response scores for each measure across eight categories of class size. Each average comes from a 5-point response scale (converted to a range of 1-5): the PRO score options run from “no progress” to “exceptional progress,” and the perception-of-instructor and perception-of-course options run from “definitely false” to “definitely true” (to see the actual items on the survey, click here). For this analysis, I’ve included only courses that exceeded a two-thirds (66.67%) response rate.
| Class Size | PRO Score | Excellent Teacher | Excellent Course |
|---|---|---|---|
| 6-10 students (35 classes) | 4.24 | 4.56 | 4.38 |
| 11-15 students (85 classes) | 4.12 | 4.38 | 4.13 |
| 16-20 students (125 classes) | 4.08 | 4.29 | 4.01 |
| 21-25 students (71 classes) | 4.18 | 4.40 | 4.27 |
| 26-30 students (37 classes) | 4.09 | 4.31 | 4.18 |
| 31-35 students (9 classes) | 3.90 | 4.13 | 3.81 |
| 36-40 students (11 classes) | 3.64 | 3.84 | 3.77 |
| 41 or more students (8 classes) | 3.90 | 4.04 | 3.89 |
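For those curious how a table like this gets built, here is a minimal sketch of the filtering and averaging, assuming a flat export with one row per course. The file name and column names (`enrollment`, `responses`, `pro_score`, and so on) are hypothetical placeholders, not the actual IDEA export format.

```python
import pandas as pd

# Hypothetical file and column names; the real IDEA export will differ.
df = pd.read_csv("idea_fall_course_feedback.csv")

# Keep only courses that exceed a two-thirds response rate.
df = df[df["responses"] / df["enrollment"] > 2 / 3]

# Bin classes into the same size categories used in the table above.
bins = [5, 10, 15, 20, 25, 30, 35, 40, float("inf")]
labels = ["6-10", "11-15", "16-20", "21-25", "26-30", "31-35", "36-40", "41+"]
df["size_category"] = pd.cut(df["enrollment"], bins=bins, labels=labels)

# Average each of the three measures within each size category.
summary = df.groupby("size_category", observed=True)[
    ["pro_score", "excellent_teacher", "excellent_course"]
].mean().round(2)
print(summary)
```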
First, classes enrolling 6-10 students appear to produce notably higher scores on all three measures than any other category. Second, there doesn’t appear to be much difference between the subsequent categories until we get to classes enrolling 31 or more students (further statistical testing supports this observation). Based on our own data, and assuming that the fall 2017 term does not differ significantly from other academic terms, if we wanted to replicate the notion that class size distribution correlates with the quality of the overall learning environment, we might be inclined to choose only two cut points, creating three categories of class size: classes with 10 or fewer students, classes with 11-30 students, and classes with more than 30 students.
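As a sketch of what that “further statistical testing” might look like, a one-way ANOVA across the three consolidated bands is one simple option. This continues the hypothetical `df` from the earlier snippet; it illustrates the kind of test one might run, not the actual analysis performed.

```python
from scipy import stats

# Continuing the hypothetical `df` from the earlier sketch: compare mean
# PRO scores across the three consolidated size bands.
small = df.loc[df["enrollment"] <= 10, "pro_score"]
medium = df.loc[(df["enrollment"] > 10) & (df["enrollment"] <= 30), "pro_score"]
large = df.loc[df["enrollment"] > 30, "pro_score"]

# One-way ANOVA: a significant F statistic suggests that at least one
# band's mean differs from the others.
f_stat, p_value = stats.f_oneway(small, medium, large)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```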
However, further examination of the smallest category indicates that these courses are almost entirely upper-level major courses. Since we know that all three metrics tend to run higher in upper-level major courses, because the students in them are more intrinsically interested in the subject matter than students in lower-level courses (which often also meet general education requirements), we can’t attribute this group’s higher scores to class size per se. This leaves us with two general categories: classes with 30 or fewer students, and classes with more than 30 students.
How does this comport with the existing research on class size? Although there isn’t much out there, two brief overviews (here and here) don’t find much of a consensus. Some studies suggest that class size is not relevant, others find a positive effect on the learning experience as classes get smaller, and a few others indicate a slight positive effect as classes get larger(!). Moreover, a 2013 essay that spells out some findings from IDEA’s extensive dataset suggests that other factors, especially developments in pedagogy and technology over the past two decades, almost certainly complicate the relationship between class size and student learning.
So what do we do with all this? Certainly, mandating that all class enrollments sit just below 30 would be, um, stupid. There is a lot more to examine before anyone should march out onto the quad and declare a “class size” policy. One finding from researchers at IDEA that might be worth exploring on our own campus is the variation in learning objectives selected and achieved across class sizes. IDEA found that smaller classes may be more conducive to more complex (sometimes called “deeper”) learning objectives, while larger classes may be better suited to learning factual knowledge, general principles, or theories. If class size does, in fact, set the stage for different learning objectives, it would be worth assessing the relationship between learning objectives and class size at Augustana to see whether we are taking full advantage of the learning environment that a smaller class size provides.
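If we ever pulled that data, a first pass could be as simple as a cross-tabulation of objective type by size band. Again, this continues the hypothetical `df` from the earlier sketches, and the `objective_type` column (with values like “deeper” vs. “factual”) is an invented placeholder for however IDEA codes its learning objectives.

```python
import pandas as pd

# Hypothetical: share of each objective type selected within each size band.
# `df` and `size_category` come from the earlier sketch; `objective_type`
# is an invented column standing in for IDEA's objective coding.
objective_mix = pd.crosstab(
    df["size_category"], df["objective_type"], normalize="index"
).round(2)
print(objective_mix)
```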
And what should we do about the categories of class sizes that U.S. News uses in its college rankings formula? As family incomes remain stagnant, tuition revenue continues to lag behind institutional budget projections, and additional resources seem harder to come by, that becomes an increasingly pressing question. Indeed, there may be circumstances under which an institution ought to continue using the Common Data Set class size index to guide the way it fosters an ideal classroom learning environment. And it is certainly reasonable to take other considerations (e.g., faculty workload, available classroom space, the intended learning outcomes of a course) into account when determining an institution’s ideal distribution of class enrollments. But if institutional data suggests that there is little difference in the student learning experience between classes with 16-20 students and classes with 21-25 students, it might be worth revisiting the rationale that an institution uses to determine its class size distribution. No matter what an institution chooses to do, we ought to be able to justify our choices based on the most effective learning environment we can construct rather than on an arbitrarily defined and externally imposed metric.
Make it a good day,
Mark