Increased Structure and Active Learning Reduce the Achievement Gap in Introductory Biology


Science  03 Jun 2011:
Vol. 332, Issue 6034, pp. 1213-1216
DOI: 10.1126/science.1204820


Science, technology, engineering, and mathematics instructors have been charged with improving the performance and retention of students from diverse backgrounds. To date, programs that close the achievement gap between students from disadvantaged versus nondisadvantaged educational backgrounds have required extensive extramural funding. We show that a highly structured course design, based on daily and weekly practice with problem-solving, data analysis, and other higher-order cognitive skills, improved the performance of all students in a college-level introductory biology class and reduced the achievement gap between disadvantaged and nondisadvantaged students—without increased expenditures. These results support the Carnegie Hall hypothesis: Intensive practice, via active-learning exercises, has a disproportionate benefit for capable but poorly prepared students.

Since the 1970s, policy-makers have been calling for an increase in the number of underrepresented minority (URM) students who complete science-related degrees at the undergraduate, graduate, and professional levels (1–3). In response, educators and administrators have created programs focused on recruiting and retaining minorities in the STEM (science, technology, engineering, and mathematics) disciplines. In some cases these programs also target students from socioeconomically disadvantaged backgrounds, irrespective of ethnicity. At the college level, most of these efforts fall into two broad categories: (i) comprehensive programs that recruit promising students and provide financial aid, supplementary instruction, mentoring, social support, and research opportunities (2, 4–6) or (ii) less-intensive programs that offer supplementary instruction or peer-led workshops associated with introductory courses that have high failure rates (7–9). Many of the latter programs have increased the success of the target populations in the STEM disciplines; some of the former have succeeded in reducing or eliminating the achievement gap that exists between URM and non-URM students, a gap that starts in K-12 and continues through undergraduate education (10, 11). Unfortunately, both approaches are expensive and have therefore rarely been permanently incorporated into the traditional funding structure of host institutions (2). When external funding has run out, participation and success rates have dropped dramatically (7).

Changing introductory STEM courses for undergraduates from traditional lecturing to active learning designs has been advocated as an alternative solution to the achievement gap problem (3, 12). This call has trickled up from research on K-12 programs, where the implementation of active learning (13) and culturally responsive teaching (14) has had a profound impact on the achievement gap. Some reformed introductory courses at the college level have also reported success in boosting achievement by disadvantaged students, although these course designs required increased investment by external funders and host institutions (15, 16).

We asked the following: Can an existing STEM course be modified to improve performance by students from disadvantaged educational and socioeconomic backgrounds who are at high risk of failing, without requiring increased resources in the way of staffing or external funding? In essence, our work addresses what Benjamin Bloom called the “2 Sigma Problem”: the need to create teaching-learning conditions under large-group instruction that allow students to achieve at the level they would under individual instruction by a skilled tutor (10, 17). The question has taken on added urgency as faculty-to-student ratios worsen in response to the global economic crisis.

We worked with a large-enrollment introductory biology course for undergraduate majors called Biology 180 and studied changes in the performance of students in the University of Washington’s Educational Opportunity Program (EOP) (18). Individuals in the EOP are from educationally or economically disadvantaged backgrounds; most are first in their family to attend college. Although EOP students are not identified on the basis of ethnicity, most URM students at the University of Washington (UW) (76.5%, in the present study) are also in the EOP category. Analyzing individuals in the EOP captures most URM students while broadening the analysis to include all students from disadvantaged backgrounds.

Previous work showed that we can predict student performance in this course a priori, on the basis of college grade point average at the time of entering the course and SAT verbal score (18–20). Failure in this course is defined as a final grade under 1.5, the threshold required to continue in the UW introductory biology series; students who are at high risk of failing include a disproportionate number of students in the EOP (Fig. 1A). Actual failure rates are consistent with the predicted-grade analysis: From 2003 to 2008, the mean failure rate for EOP students was 21.9%, whereas the failure rate for non-EOP students was 10.1%.
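As an illustration of the failure criterion above, the following minimal sketch computes failure rates by group from a list of final grades. The records and the function name `failure_rates` are hypothetical and are not study data; only the 1.5 threshold comes from the text.

```python
from collections import defaultdict

PASS_THRESHOLD = 1.5  # from the text: a final grade under 1.5 counts as failure


def failure_rates(records):
    """Return {group: fraction of students whose final grade is below the threshold}."""
    totals = defaultdict(int)
    fails = defaultdict(int)
    for group, grade in records:
        totals[group] += 1
        if grade < PASS_THRESHOLD:
            fails[group] += 1
    return {group: fails[group] / totals[group] for group in totals}


# Hypothetical records (group label, final grade on the 4.0 scale); not study data.
example_records = [
    ("EOP", 1.2), ("EOP", 2.8), ("EOP", 3.1), ("EOP", 1.5),
    ("non-EOP", 3.4), ("non-EOP", 1.4), ("non-EOP", 2.9), ("non-EOP", 3.7),
    ("non-EOP", 2.2),
]
rates = failure_rates(example_records)
```

Note that a grade of exactly 1.5 is treated as passing here, which follows the wording "under 1.5".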

Fig. 1

Students from disadvantaged backgrounds are at high risk for failure in introductory STEM courses. (A) Frequency distribution of students in Biology 180, binned by predicted grade. The horizontal line indicates the overall proportion of EOP students in this course (16). EOP students are overrepresented at the low end of the distribution of predicted grades relative to non-EOP students (generalized linear model with binomial error distribution, P << 0.001). (B) Achievement gaps between EOP and non-EOP students in introductory STEM courses at the University of Washington. Means were calculated for gaps in each quarter, academic years 2003 to 2008; error bars represent bootstrapped 95% confidence intervals.

As a comparison, we also analyzed 111,227 records on students taking introductory STEM courses at the University of Washington from 2003 to 2008. We calculated means and 95% confidence intervals for the achievement gaps between EOP and non-EOP students, calculated as the difference in final grade on a 4.0 point scale, for the first course in these multicourse sequences designed for prospective majors in the STEM disciplines. The achievement gaps in Biology 180 are among the highest on our campus (Fig. 1B). We hypothesized that the large gaps occur because Biology 180 final grades depend largely on short-answer midterm and final exams that test higher-order cognitive skills (19) and because EOP students are underprepared for this type of assessment relative to non-EOP students.
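The gap calculation described above can be sketched as follows. This is a minimal illustration that assumes a percentile bootstrap over per-quarter gaps; the text does not state which bootstrap variant was used, and the gap values and the function name `bootstrap_mean_ci` are hypothetical.

```python
import random
import statistics


def bootstrap_mean_ci(values, iterations=10_000, level=0.95, seed=0):
    """Percentile bootstrap: return (mean, lower, upper) for the mean of values."""
    rng = random.Random(seed)
    n = len(values)
    # Resample the observations with replacement and record each resample's mean.
    boot_means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(iterations)
    )
    tail = (1.0 - level) / 2.0
    lower = boot_means[int(tail * iterations)]
    upper = boot_means[int((1.0 - tail) * iterations) - 1]
    return statistics.fmean(values), lower, upper


# Hypothetical per-quarter gaps (non-EOP mean grade minus EOP mean grade, 4.0 scale).
quarterly_gaps = [0.71, 0.85, 0.62, 0.90, 0.78, 0.66, 0.81, 0.74]
mean_gap, ci_low, ci_high = bootstrap_mean_ci(quarterly_gaps)
```

The fixed seed only makes the sketch reproducible; with real data the interval would be reported alongside the mean gap for each course.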

Previous work on four quarters of Biology 180, taught by the same instructor, showed that incorporating a moderate level of required (graded) active-learning exercises increased performance by all students compared with performance in lecture-intensive courses where active-learning exercises were absent or optional (19). The active-learning exercises were daily, multiple-choice “clicker questions” implemented in a peer instruction format (21) and a weekly practice exam comprising five short-answer questions that were peer-graded (18, 19). Recent analyses combined these data with data from two additional quarters taught by the same instructor in a third course design: a lecture-free format that added preclass reading quizzes (18, 20–22) and extensive informal group work in class (18, 20, 23, 24). This course design was considered highly structured because it required students to (i) prepare for class sessions, (ii) use clickers or random-call responses to participate in class sessions that were focused entirely on active-learning exercises, and (iii) complete a weekly low-risk assessment in the form of a practice exam. The highly structured approach resulted in another increase in overall performance by all students, compared with the low-structure, lecture-intensive course with no required active learning and the moderate-structure design based on clickers and a weekly practice exam (20).

To reduce the achievement gap, however, interventions must have a disproportionate benefit for disadvantaged students. To test whether EOP students disproportionately benefit from high-structure versus low-structure courses, we analyzed grades achieved by students in 29 quarters of Biology 180 by using their predicted grades to control for among-quarter variation in student ability and preparation (18). Comparing student performance in the two highly structured quarters versus 27 quarters with little or no active learning, we find that, although all students benefit from structure, EOP students experience a disproportionate benefit. The generalized linear mixed model with the most power to explain the actual grades received by students included active learning, predicted grade, EOP status, and their interaction as explanatory variables, with quarter as a random effect to control for instructor differences (likelihood ratio test, P = 0.0023, Table 1). Our highly structured course significantly improved student performance in this broad-based comparison—but did so disproportionately for EOP students (Fig. 2).

Table 1

Ability of alternative models, containing different combinations of explanatory variables (such as predicted grades, structure, and EOP status), to explain observed grades students receive in introductory biology. Only a subset of all possible models is shown here for clarity; for a full list see table S2, and see (18) for a description of the models. k is the number of parameters in the model; AIC, Akaike’s information criterion; ω, the weighted AICs. P = 0.0023 for a likelihood ratio test between Structure*Predicted*EOP and Predicted*EOP models. P = 2.2 × 10⁻¹⁶ for a likelihood ratio test between Predicted*EOP and Structure+Predicted+EOP. P = 0.074 for likelihood ratio test between Structure and Null models.
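For readers unfamiliar with the quantities named in this caption, the sketch below shows the standard definitions of AIC, Akaike weights, and the likelihood ratio statistic for nested models. The log-likelihoods and parameter counts are hypothetical placeholders, not the values in Table 1, and the helper names are illustrative only.

```python
import math


def aic(log_likelihood, k):
    """Akaike's information criterion: AIC = 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood


def akaike_weights(aic_values):
    """Convert AIC values to weights that sum to 1 (lower AIC gets more weight)."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


def lrt_statistic(loglik_full, loglik_reduced):
    """Likelihood ratio statistic for nested models: 2 * (lnL_full - lnL_reduced)."""
    return 2.0 * (loglik_full - loglik_reduced)


# Hypothetical (name, log-likelihood, k) triples; NOT the values in Table 1.
models = [
    ("Null", -1250.0, 3),
    ("Predicted*EOP", -1100.0, 6),
    ("Structure*Predicted*EOP", -1092.0, 10),
]
aics = [aic(loglik, k) for _, loglik, k in models]
weights = akaike_weights(aics)
statistic = lrt_statistic(-1092.0, -1100.0)
```

The statistic would then be compared with a chi-square distribution whose degrees of freedom equal the difference in parameter counts between the two nested models.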

Fig. 2

Highly structured course designs benefit all students, but especially disadvantaged students. The difference between predicted performance and actual performance is significantly decreased for all students, and EOP students in particular, in two highly structured courses relative to 27 low-structure versions of the same course with little to no active learning. The best-fit generalized linear mixed-effects models of performance include EOP as a fixed effect, likelihood ratio test, df = 13, χ² = 10.997, P = 0.0027; see Table 1. Error bars indicate ±1 SE.

To verify that these results are robust to instructor differences, we reexamined the achievement gap between EOP and non-EOP students but restricted the analysis to the six quarters taught by the same instructor: two in the lecture-intensive, low-structure format; two in the moderately structured format that included in-class clicker questions and weekly practice exams; and two in the highly structured format that added daily reading quizzes and in-class group exercises and nearly eliminated lecturing (18, 20). The highly structured format reduced the raw achievement gap for EOP versus non-EOP students from 0.80 to 0.44 grade points—a 45% drop from both the moderate-structure and the low-structure formats. This decline in achievement gap exceeded the 95% confidence intervals from the low- and moderate-structure quarters (Fig. 3). We found a similar decrease in achievement gap over these six quarters when accounting for incoming achievement (fig. S2). Overall, course structure has a significant impact on the achievement gap, controlling for instructor identity and student ability (table S2).
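The 45% figure follows directly from the two gap values quoted above, as this short check shows. The function name `relative_drop` is illustrative.

```python
def relative_drop(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


gap_before = 0.80  # raw achievement gap in grade points, from the text
gap_after = 0.44   # raw gap under the highly structured format, from the text
drop = relative_drop(gap_before, gap_after)
print(f"{drop:.0%}")  # prints 45%
```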

Fig. 3

For quarters with a common instructor, the achievement gap is halved with increased structure. Across six quarters of low, medium, and high structure (two quarters each), the achievement gap is the smallest under high structure. Controlling for differences in predicted student ability, we find a drop under the medium-structure course design as well (fig. S2). Means were calculated from 10,000 bootstrap iterations; error bars represent 95% bootstrapped confidence intervals. The asterisk near the vertical axis represents the average achievement gap across all instructors, 2003 to 2008 (Fig. 1B).

The change from low to moderate to high structure did not require additional financial resources, smaller class sizes, or more class time. In fact, the second iteration of the highly structured course (in autumn 2009) took place when class size had increased from a maximum of 345 in the earlier five quarters analyzed to 700, labs had been cut from 3 hours per week to 2, and the number of graduate teaching assistants had been reduced from one for every 49 students to one for every 87.5 students.
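A rough back-of-the-envelope check of the staffing figures is sketched below. It assumes that the stated ratios applied to the full enrollment figures quoted above (345 is given as a maximum, so the earlier count is an upper bound), and the function name is illustrative.

```python
def teaching_assistants(enrollment, students_per_ta):
    """Implied number of graduate teaching assistants for a given enrollment."""
    return enrollment / students_per_ta


earlier_tas = teaching_assistants(345, 49)   # at most about 7 assistants
later_tas = teaching_assistants(700, 87.5)   # 8 assistants
enrollment_growth = 700 / 345                # enrollment roughly doubled
```

Under these assumptions the number of teaching assistants stayed nearly constant while enrollment roughly doubled, which is consistent with the statement that the change required no additional resources.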

Our results raise two related questions: Did EOP students actually learn more or simply perform better on assessments? And were their performance gains due to benefits derived from active learning or frequent testing (25)? Ranking exam questions on Bloom’s taxonomy of educational objectives (26, 27) provides a standardized framework for assessing the level of learning (28). A recent analysis of the exams given in the six quarters taught by the same instructor shows that (i) an average exam question, weighted by points available, is at the application level of Bloom’s taxonomy and (ii) the Bloom’s level and difficulty of exams actually increased during the transition from low- to medium- to highly structured course designs (20). The application level subsumes mastery of vocabulary and strong conceptual understanding and demands that students apply concepts in new situations. The high level of exam questions in this course suggests that the performance gains we document here reflect actual learning gains. Further, active-learning exercises have been shown to increase performance on exam questions that demand higher-order cognitive skills, while having no effect on exam questions focused on lower-order cognitive skills [levels 3 to 6 versus 1 and 2 on Bloom’s taxonomy; e.g., (29)]. These results suggest that active learning does not help with information transfer, only with problem-solving and other types of higher-order learning. Thus, student gains in performance in our highly structured design should reflect a deeper understanding of the content and not result simply from increased frequency of assessment.
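The point-weighted average Bloom's level mentioned above can be computed as in the following sketch. The exam questions are hypothetical, and the sketch assumes the conventional numbering in which level 3 is the application level.

```python
def weighted_bloom_level(questions):
    """Mean Bloom's level of an exam, weighting each question by its points."""
    total_points = sum(points for _, points in questions)
    return sum(level * points for level, points in questions) / total_points


# Hypothetical exam: (Bloom's level from 1 to 6, points available); not study data.
exam = [(2, 10), (3, 20), (4, 15), (3, 5)]
average_level = weighted_bloom_level(exam)
```

An average near 3 for such an exam would correspond to the application level described in the text.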

An intriguing next step is to explore the mechanisms responsible for the disproportionate change in EOP student performance in a large-enrollment, highly structured course design. The success of other interventions at the K-12 (13) and college (7) levels suggests that many students from educationally disadvantaged backgrounds are capable of succeeding in the STEM disciplines but simply lack exposure to the challenge of integrating concepts to solve new problems. We propose that almost all students arrive on college campuses with 12 years of practice at Bloom’s levels 1 and 2, but that most students from deprived educational backgrounds have had minimal exposure to higher-order thinking [Bloom’s levels 3 and above (30)]. Highly structured course designs provide practice with problem-solving and reasoning skills that may be new to high-risk students in introductory college STEM courses. Specifically, active learning that promotes peer interaction makes students articulate their logic and consider other points of view when solving problems, leading to learning gains [e.g., (31)].

We call this proposal the Carnegie Hall hypothesis on the basis of the story of a tourist who asks a New Yorker how to get to Carnegie Hall. The answer? “Practice.” The hypothesis is a direct extension of constructivist theory (32, 33), a cornerstone of classical explanations for why active learning works (29, 34). Constructivism maintains that individuals incorporate new information into their previous conception and that they only change ideas when they realize that the new information conflicts with their previous understanding, creating cognitive dissonance. Constructivism defines the types of practice that are valuable for underprepared students in introductory courses: not drilling, but exercises that challenge previous conceptions and require students to explain their thinking.

Traditional approaches to reducing the achievement gap have produced disproportionate benefits for disadvantaged students by focusing an influx of resources on them. If the Carnegie Hall hypothesis is correct, highly structured courses may be able to reduce the achievement gap while raising the performance of all students without requiring additional resources because of a disproportionate benefit of increased practice for capable students who have grown up lacking exposure to higher-order cognitive skills.

Supporting Online Material

Materials and Methods

Figs. S1 and S2

Tables S1 to S3

References (35–47)

References and Notes

  1. Materials and methods are available as supporting material on Science Online.
  2. Acknowledgments: We thank the University of Washington Biology Education Research Group, Bio-Grads Organizing Opportunities for Diversity and Development, and P. Haak for helpful discussion and A. Arquiza, R. Moreno, and S. Fernandes for assembling the achievement gap data across introductory STEM courses from the Office of Minority Affairs and Diversity. Data are archived with the study authors and may be obtained by contacting S.F. The work was funded by grants from the University of Washington’s College of Arts and Sciences as part of the Foundations Course program, the University of Washington–Howard Hughes Medical Institute Undergraduate Science Education Programs (grant 52003841), and an NSF predoctoral fellowship to D.C.H. The research was conducted under Human Subjects Review no. 07-8483-J01.