Education Forum: Science Education

Anatomy of STEM teaching in North American universities

Science  30 Mar 2018:
Vol. 359, Issue 6383, pp. 1468-1470
DOI: 10.1126/science.aap8892

Despite numerous calls to improve student engagement, supported by a large body of evidence, STEM classes are often still dominated by lectures.


A large body of evidence demonstrates that strategies that promote student interactions and cognitively engage students with content (1) lead to gains in learning and attitudinal outcomes for students in science, technology, engineering, and mathematics (STEM) courses (1, 2). Many educational and governmental bodies have called for and supported adoption of these student-centered strategies throughout the undergraduate STEM curriculum. But to the extent that we have pictures of the STEM undergraduate instructional landscape, they have mostly been provided through self-report surveys of faculty members within a particular STEM discipline [e.g., (3–6)]. Such surveys are prone to reliability threats and can underestimate the complexity of classroom environments, and few are implemented nationally to provide valid and reliable data (7). Reflecting the limited state of these data, a report from the U.S. National Academies of Sciences, Engineering, and Medicine called for improved data collection to understand the use of evidence-based instructional practices (8). We report here a major step toward a characterization of STEM teaching practices in North American universities based on classroom observations from over 2000 classes taught by more than 500 STEM faculty members across 25 institutions.

Our study used the Classroom Observation Protocol for Undergraduate STEM (COPUS) (9), which can provide consistent assessment of instructional practices and document impacts of educational initiatives. COPUS requires documenting the co-occurrence of 13 student behaviors (e.g., listening, answering questions) and 12 instructor behaviors (e.g., lecturing, posing questions) during each 2-min interval of a class. Our large-scale COPUS data allow generalizations beyond institution-level descriptions and suggest an opportunity to resolve inconsistent findings from recent discipline-based education research (DBER) studies. For example, STEM faculty report that it is more difficult to use student-centered techniques in large classrooms or in rooms with less amenable physical layouts (10), but this has not been borne out in practice (11). Previous studies also disagree on the relationship between course level (introductory or upper division) and instructional practices (11–13). Also, although classroom observations are often used for evaluative (e.g., promotion and tenure) purposes, as well as to document the impact of educational initiatives, more data are needed to guide such use of observational protocols to collect data in a valid way (11).

Didactic, Interactive, and More

We observed 2008 STEM classes from 709 courses taught by 548 individual faculty members across 24 doctorate-granting universities and one primarily undergraduate institution (table S3). Faculty members were observed teaching on average 1.3 courses and 3.2 times. Observations covered seven STEM disciplines: 71.4% from lower-level courses, 19.8% from upper-level courses, 4.7% from graduate courses, 0.3% from cross-listed courses, and 3.7% from courses with unspecified levels (table S4). COPUS, which was adapted from the Teaching Dimensions Observation Protocol (14), was selected for this study as it is broadly used and has been demonstrated to provide valid characterization of instructional practices in STEM classrooms (see supplementary materials). The high level of interrater reliability consistently achieved across studies employing COPUS ensures that it can provide a reliable and valid characterization of STEM instruction on a large scale.

The most common instructor behaviors were lecture (an average of 74.9 ± 27.8% of the total 2-min intervals of a given class), writing in real time (35.0 ± 35.2%), posing nonrhetorical questions (25.0 ± 21.4%), following up on questions (14.3 ± 18.9%), answering student questions (11.5 ± 12.8%), and administering clicker questions (10.0 ± 16.5%). Students primarily listened to the instructor (87.1 ± 20.8%), answered instructor questions (21.6 ± 19.8%), and asked questions (10.4 ± 12.1%). Because several behaviors can be recorded in the same interval, these percentages do not sum to 100.
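To make the unit of measurement concrete, the following minimal sketch shows one way a per-class percentage of 2-min intervals could be computed for each behavior. It is an illustration only, not the authors' analysis code: the function name `behavior_percentages`, the short code labels, and the example class are all hypothetical.

```python
from typing import Dict, List, Set


def behavior_percentages(intervals: List[Set[str]]) -> Dict[str, float]:
    """Percent of 2-min intervals in which each behavior code was marked.

    `intervals` holds one set of codes per 2-min interval of a single class.
    """
    if not intervals:
        return {}
    counts: Dict[str, int] = {}
    for codes in intervals:
        for code in codes:
            counts[code] = counts.get(code, 0) + 1
    total = len(intervals)
    return {code: 100.0 * n / total for code, n in counts.items()}


# Hypothetical 10-minute stretch of class (five 2-min intervals).
# The labels are illustrative shorthand, assumed for this sketch:
# "Lec" lecturing, "L" listening, "PQ" posing a question,
# "AnQ" answering a question, "CQ" clicker question, "SQ" student question.
example_class = [
    {"Lec", "L"},
    {"Lec", "L", "PQ", "AnQ"},
    {"Lec", "L"},
    {"CQ"},
    {"Lec", "L", "SQ"},
]

percentages = behavior_percentages(example_class)
for code in sorted(percentages):
    print(f"{code}: {percentages[code]:.1f}% of intervals")
```

Averaging such per-class percentages over all observed classes would give figures of the kind reported above, each with its own standard deviation.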

Simply documenting the prevalence of instructor and student behaviors does not accurately reflect what strategies are being implemented alongside or instead of one another. To address this issue, we conducted latent profile analysis, creating clusters based on four instructor behaviors (lecture, posing questions, clicker questions, and one-on-one work with students) and four student behaviors (group work on clicker questions, group work on worksheets, other group work, and asking questions). We chose these eight behaviors because they were observed with adequate heterogeneity, were not highly correlated with each other, and were likely to be key strategies in active or nonactive learning environments. The solution consisted of seven clusters, each representing a unique instructional profile (fig. S4).

The first group of instructional profiles, which we labeled “Didactic” (clusters 1 and 2), depicts classrooms in which 80% or more of class time consists of lecturing. Fifty-five percent of the observations belonged to this broad instructional style. Cluster 1 has no observed student involvement except sporadic questions from and to the students, whereas cluster 2 has clicker questions that are sometimes associated with group work.

The second group of profiles, which we named “Interactive Lecture” (clusters 3 and 4), represents instructors who supplement lecture with more student-centered strategies such as “Other group activities” (cluster 3) and “Clicker questions with group work” (cluster 4). Twenty-seven percent of the observations were classified in this instructional style.

Finally, clusters 5, 6, and 7 depict instructors who incorporate student-centered strategies into large portions of their classes. Eighteen percent of observations were in this “Student-Centered” style. Cluster 5 represents a variety of group work strategies consistently used, whereas cluster 7 represents a similar variety but with less consistent usage. Some in cluster 6 may resemble a popular style of instruction, Process Oriented Guided Inquiry Learning (15), but others (due to a higher proportion of lecture) likely represent strategies that incorporate group worksheets and one-on-one assistance from the instructor. Although we are unable to claim that our data are entirely representative, the sample size and diversity of courses and disciplines represented in our data suggest that these profiles and broad instructional styles provide a reliable snapshot of the current instructional landscape in undergraduate STEM courses taught at North American institutions.

We leveraged the identification of the three broad instructional styles to address discrepancies among prior DBER studies (see the graphic). Observations in large courses were classified in the didactic instructional style more than expected by random chance and in the student-centered instructional style less than expected by chance, whereas the opposite occurred for small courses [χ²(4, N = 1753) = 56.5, P < 0.001, V = 0.13]. Classrooms with flexible seating were more likely to be classified in the student-centered instructional style [χ²(2, N = 1137) = 55.9, P < 0.001, V = 0.22]. But simply providing infrastructure or small class size does not necessarily change instructional practices, as about half of the classes with flexible seating and about half of the small- and medium-size courses were classified as didactic. We found no significant relationships between instructional style and course level, suggesting that instructional style is similar throughout the curriculum [χ²(8, N = 1927) = 11.0, P = 0.20].

We were interested in differences by discipline because content, disciplinary teaching conventions, and educational research traditions are different for each. Relative to chance, mathematics and geology have more student-centered styles than expected, biology has more interactive styles than expected, and chemistry has more didactic styles than expected [χ²(12, N = 1994) = 101.3, P < 0.001, V = 0.16].
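The effect sizes reported with these tests (V) can be checked from the chi-square statistic and the sample size, assuming V is Cramér's V, V = sqrt(χ² / (N · min(r − 1, c − 1))). The sketch below is a reader's consistency check, not the authors' code. The table dimensions are assumptions inferred from the reported degrees of freedom: three class sizes by three styles, two seating layouts by three styles, and seven disciplines by three styles.

```python
import math


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramér's V for an n_rows x n_cols contingency table."""
    k = min(n_rows - 1, n_cols - 1)
    if n <= 0 or k < 1:
        raise ValueError("need n > 0 and at least a 2 x 2 table")
    return math.sqrt(chi2 / (n * k))


# (label, chi-square, N, rows, columns); rows and columns are assumed
# from the degrees of freedom, df = (rows - 1) * (columns - 1).
reported_tests = [
    ("class size x style", 56.5, 1753, 3, 3),   # df = 4
    ("seating x style", 55.9, 1137, 2, 3),      # df = 2
    ("discipline x style", 101.3, 1994, 7, 3),  # df = 12
]

effect_sizes = {
    label: cramers_v(chi2, n, rows, cols)
    for label, chi2, n, rows, cols in reported_tests
}

for label, v in effect_sizes.items():
    print(f"{label}: V = {v:.2f}")
```

Under these assumptions the computed values round to the reported 0.13, 0.22, and 0.16. By common rules of thumb these are small to moderate associations, which fits the observation that many small or flexibly furnished classes were still classified as didactic.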

As in previous research (11), we found that individual instructors vary their teaching from day to day. Only about half of the courses (53.7%) from which two or more observations were collected had their observations classified into only one of the three broad instructional styles; 41.9% of these courses had their observations classified in two styles, and 9.1% of the courses that were observed three or more times had observations classified in all three styles. The more frequently an instructor was observed within the same course, the greater the number of instructional styles under which her or his teaching was classified. Our data thus suggest that at least four observations are necessary for reliable characterization of teaching (see the graphic, bottom).
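As a rough sketch of how such a tally could be made, the code below maps each observation's cluster to its broad style (clusters 1 and 2 are Didactic, 3 and 4 Interactive Lecture, 5 to 7 Student-Centered, as described earlier) and counts the distinct styles seen per course. The course identifiers and observations are hypothetical, and the function name `styles_per_course` is an assumption made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Cluster-to-style grouping as described in the text.
CLUSTER_TO_STYLE = {
    1: "Didactic",
    2: "Didactic",
    3: "Interactive Lecture",
    4: "Interactive Lecture",
    5: "Student-Centered",
    6: "Student-Centered",
    7: "Student-Centered",
}


def styles_per_course(observations: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Number of distinct broad styles observed in each course.

    `observations` yields (course_id, cluster) pairs, one per class visit.
    """
    seen: Dict[str, Set[str]] = defaultdict(set)
    for course_id, cluster in observations:
        seen[course_id].add(CLUSTER_TO_STYLE[cluster])
    return {course_id: len(styles) for course_id, styles in seen.items()}


# Hypothetical visits: course "A" twice, courses "B" and "C" three times each.
visits = [
    ("A", 1), ("A", 2),
    ("B", 1), ("B", 3), ("B", 6),
    ("C", 4), ("C", 4), ("C", 5),
]

style_counts = styles_per_course(visits)
for course_id in sorted(style_counts):
    print(f"course {course_id}: {style_counts[course_id]} style(s)")
```

In this toy data, course A stays within one style even though its two visits fall in different clusters, which is why the tally is made at the level of broad styles and not individual clusters.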

Data, Incentives, Training

Three main findings emerge from this report: (i) Didactic practices are prevalent throughout the undergraduate STEM curriculum despite ample evidence for the limited impact of these practices and substantial interest on the part of institutions and national organizations in education reform. (ii) Although faculty survey-based studies have suggested classroom layouts and course size as barriers to instructional innovation, flexible classroom layouts and small course sizes do not necessarily lead to an increase in student-centered practices. (iii) Reliable characterization of instructional practices requires at least four visits.

These findings challenge institutions and STEM disciplines to reflect on practices and policies that sustain the status quo. Specifically, institutions should revise their tenure, promotion, and merit-recognition policies to incentivize and reward implementation of evidence-based instructional practices for all academic ranks. Ideally, implementation of these practices would be an expectation for obtaining promotion and tenure and would be factored into annual merit decisions. These policy changes would require institutions and STEM professional organizations to provide effective pedagogical training for the current and future professoriate, similar to the level provided for research. Further, these policy changes cannot be meaningfully implemented without research-based guidelines for measuring effective teaching practices. Funding agencies should prioritize the development of such guidelines.

Distributions of instructional styles

Distributions of the three broad instructional styles across class size (small, 0 to 50 students; medium, 51 to 100; large, more than 100), classroom physical layout, course level, STEM discipline, and number of observations per course. The lower-right panel represents the relationship between the number of observations per course and the classification of observations in one, two, and all three broad instructional styles. The percentages appearing to the left of each bar represent the proportion of the observations in a particular graph that are reflected in a given bar.


This report provides specific baseline data for comparison for determining the impact of educational interventions, for professional development facilitators to inform the design of their programs, and for faculty when they receive COPUS data. The seven instructional profiles allow these comparisons to move beyond the binary teacher- or student-centered teaching classification and to inform incremental and diverse paths toward student-centered teaching. However, this baseline is limited because the sample is focused on doctorate-granting universities in North America and only seven STEM disciplines. Moreover, the analytical tool used (i.e., COPUS) focuses on frequencies and not quality of behaviors, does not capture the quality of the content being conveyed, and only focuses on the classroom portion of STEM courses, not other components such as laboratory, field work, or online experiences. To fully characterize the STEM instructional landscape, funding agencies should support large-scale studies that include a representative sample of institutions and/or STEM disciplines, as well as multiple sources of data that characterize type and quality of instructional practices experienced by students in all components of a course.

References and Notes

Acknowledgments: The authors acknowledge support from the U.S. National Science Foundation under grant nos. DUE 1347243 (J.M.E., K.M.P., P.J.W., A.M.Y.), DUE 1432804 (E.R.S., M.K.E., F.A.L., M.L.F., B.V.V.), DUE 1525331 (S.V.C.), DUE 1323022 (J.K.K.), DUE 1432728 (R.C.), DUE 1347577 (M.K.S., M.R.S., E.L.V.), DUE 1347578 (M.K.S.), DUE 1322851 (M.K.S.), DRL 0962805 (M.K.S., M.R.S., E.L.V.), DUE 1347697 (T.M., N.M.), DUE 1256003 (M.S.), DUE 1347814 (M.S.), CAREER 1552448 (M.S., J.H.); the NIH under award nos. 1R25GM114822-01 (C.J.L.) and 5R25GM114822 (C.J.L.); the John Templeton Foundation grant FP053369-G via a subaward from the University of Chicago Knowledge Lab (C.J.L.); the Carl Wieman Science Education Initiative, University of British Columbia (M.K.B., L.M.M., T.M.R., N.G.S., P.M.S., L.K.W.); the University of Northern Colorado Faculty Research and Publications Board (S.E.D.P.); the Flexible Learning Initiative, University of British Columbia (A.M.); Howard Hughes Medical Institute award no. 52006934 (S.M.L.) and no. 52008380 (C.J.L.). All COPUS coders are acknowledged in the supplementary materials.