EDITORIAL

Next-Generation Assessments

Science 12 Nov 2010:
Vol. 330, Issue 6006, p. 890
DOI: 10.1126/science.1199897

Spurred by its poor performance on international tests, the United States is now embarked on a new round of education standards development. Common Core State Standards have been released in English language arts and mathematics, and a draft framework for science education standards from the U.S. National Research Council provides a complex conception of scientific reasoning practices that are integrated with principled understanding of science content. These ambitious standards aim at problem-solving and knowledge application and are urgently needed, but will fail if lessons from the past go unheeded. In the United States, tests—not standards—determine what gets taught, and these tests must change dramatically.

The U.S. has a 30-year history of accountability testing in grades 3 to 10. An extensive research literature has documented the negative effects of such test-driven instruction, the most obvious being the reduction or elimination of less-tested subjects, including science and social studies. Less obvious have been the negative effects on learning in tested subjects. When students are drilled on materials that closely resemble accountability tests, test scores can rise dramatically without a commensurate gain in learning. This “test score inflation” is revealed when the apparent gains fail to replicate on independent measures that cover the same content, such as the U.S. National Assessment of Educational Progress. More serious are the distorting effects on students' conceptions of what it means to know and do science and mathematics.

Previous reforms have tried to avoid such shortfalls. When the U.S. standards-based educational reform movement began in the early 1990s, the goal was to replace low-level, basic-skills tests with much more challenging, authentic assessments focused on problem-solving and higher-order reasoning. A start was made, but the sheer quantity of tests required by the No Child Left Behind (NCLB) Act of 2001 forced a retreat to inexpensive multiple-choice tests.

In September 2010, the U.S. Department of Education awarded Race-to-the-Top grants to two state consortia that have promised to prepare mathematics assessments that include “challenging performance tasks and innovative, computer-enhanced items that elicit complex demonstrations of learning.”* These consortia were intended to permit economies of scale and enable the development of quality assessments tied to a shared curriculum, something missing in the United States but found in top-performing countries.** Unfortunately, negotiating deeper curriculum development has become unwieldy, and there is an emphasis instead on the timeliness and frequency of score reporting. Enormous care is now needed to prevent this opportunity from being undone by the efficiencies of multiple-choice testing, which currently dominates the new interim assessment systems sold to school districts to help raise scores on NCLB tests.

Fundamentally different forms of assessment are needed in the United States to directly capture more complex levels of achievement: the ability to solve nonroutine problems, to analyze data and reason from evidence, to communicate effectively both orally and in writing, and to frame and conduct scientific investigations. “Performance assessments” that require students to demonstrate their learning and reasoning are far better measuring devices and predominate in high-performing countries that the United States seeks to emulate. To explain successes on international assessments, researchers point to the importance of ongoing instruction that calls on students to apply their knowledge and the use of curriculum-embedded assessments that help students and teachers develop common understandings of scoring criteria. These ideas are already being implemented in the redesign of the U.S. Advanced Placement precollege curricula and examinations, but they are just as important in the early grades. Real demonstrations of mathematics and science mastery, not ease of scoring, must determine assessment policy.
