Policy Forum: Genetics

Toward a New Vocabulary of Human Genetic Variation


Science  15 Nov 2002:
Vol. 298, Issue 5597, pp. 1337-1338
DOI: 10.1126/science.1074447

Recent genetic variation research has reinvigorated the dispute over the validity of race as a research variable [HN1] (1-8). Proponents of using race assert that genetic differences and racial classification are strongly associated, and so support the use of race in the design of research and the application of its findings (3, 6). Critics cast race as a social construct and counter that putting subjects into racial groups fundamentally misrepresents human genetic variation and hinders research (2, 8, 9). Several solutions have been offered, such as replacing race with ethnicity (10, 11) or with genetic markers (12). However, although these suggestions might apply to certain kinds of research, none provides an overall solution.

This is because the word race has many distinct meanings [HN2], and there are different ways of using it as a research variable. For example, a popular definition today describes race as a social construct that incorporates beliefs about language, history, and culture (13). Here, race forms the basis on which social identity, traditions, and politics are built. This concept has been promoted as an alternative to an earlier genetic theory of race, since scientifically repudiated, that divided the human species into subspecies ranked on the basis of skill, intelligence, and morality (14). However, rejecting race as genetic hierarchy is not tantamount to rejecting the idea that human populations differ genetically [HN3].

Conceptualizing race as a social construct has helped to undermine racism by eliminating its alleged natural basis. However, it has also had the unintended consequence of eliminating a legitimate basis for discussion of population-based genetic differences. Insistence that there is no such thing as race in a genetic or biological sense, together with the lack of an alternative term, leaves those who wish to discuss genetic diversity without a functional vocabulary (15-18).

Two points should help to clarify these issues. First, the debate in genetic variation research over race as a research variable is not a debate over whether human populations differ genetically. Rather, the debate is over the scientific, clinical, and social significance of labeling genetic differences as race, as something else, or not at all (19).


Second, many of the disagreements about race and genetics are fostered by confusion over the relationship between these two concepts. For example, recent studies have examined DNA samples from various populations and clustered them into groups based on identity of DNA sequences at several loci. In some studies (20), but not others (8), genomes examined by this method do sort out in a way that reflects race as a social construct, depending on how many or which genetic loci are compared [HN4]. It is not that race exists in one population and not in another. Rather, it may be that the appearance of clustering is a function of how populations are sampled (21), of how criteria for boundaries between clusters are set, and of the level of resolution used. In the same way that the earth can be described by many different kinds of maps, from topological to economic, so, too, can the naturally occurring genetic variation among populations be divided in numerous ways and be made to highlight any chosen similarity or difference.

Proposed Alternatives to Race

This framework helps to explain why replacing race with ethnicity or genetic markers fails to solve the problem of race in genetic research. Some studies use race as a way to enroll a sample of people with genotypes more likely to be similar or diverse in a particular way. For such a study, genetic markers [HN5] might effectively replace race, but only if the markers chosen happen to be distributed among the selected populations in the same way as the variable of interest in that particular study. This reasoning suggests that it may only be a matter of time before sufficient genetic markers are identified. However, results of such research still need to be socially or geographically grounded for clinical application (22).

Another popular alternative to race is ethnicity [HN6], a term with multiple, conflicting meanings (23). Anthropologists initially proposed ethnicity to direct attention away from genetics and toward social and historical factors as explanations of population variation (24). The term has gained users simply because to some researchers the word “ethnicity” seems more acceptable. However, its actual application is often identical to that of race (10, 23). Nevertheless, if applied in its original sense to define a population socially or culturally, ethnicity could replace race in research when a researcher seeks a variable that corresponds to the behavioral aspects implied by the term, such as diet, occupation, social status, or health beliefs (10, 11).

For some studies, a combination of genetic and social markers may be appropriate. However, examining more closely the range of uses to which genetic researchers put race and the problems they encounter when they do so might provide some guidance on how research should proceed.

Race as a Variable in Genetic Research

Researchers interested in genetics incorporate race into research designs in several ways. For example, pharmacogenetic studies [HN7] may inventory the genotypes occurring in human populations and ideally sample from a diverse set of populations in an attempt to represent a broad range of genetic difference. Association studies [HN8] seek genetic differences between those with and without a particular phenotype. The more genetically similar the two groups are, the easier it is to find the specific genetic differences that account for differences in phenotype. Race is used as a proxy for genetic relatedness to control for the confounding that can occur if the study populations differ genetically in ways not related to the phenotype in question. In contrast, epidemiological studies [HN9] seeking to determine risk factors for disease may want to use race to control for population stratification, but also as a proxy for environmental exposures, including social interactions (e.g., people of certain races may be more or less likely to be referred for further treatment).

What appears as straightforward use of a variable, however, becomes complicated when researchers decide how to put it into operation. In all but the final example, the researcher is using race as a way of grouping subjects by similarity or difference of genetic sequence, which reflects population history. In the absence of known genetic markers, researchers need to access this variation through race or through ethnicity, which is the way certain aspects of genetic variation have been socially represented. Several conventions exist, such as asking subjects how they self-identify ethnically or racially, or where their grandparents were born.

Complicating Factors

Some of the complexity of race comes from its multiple, overlapping meanings that span popular and scientific use. Further confusion is generated, however, by the tendency to leave race undefined. This leads to three sorts of problems in the conduct and reporting of genetic research: (i) nonequivalent uses of race within one research report; (ii) inverting the relationship between genetics and race, or studying race as an end in itself; and (iii) an overemphasis on race.

Accessing a particular set of conditions with a variable requires choosing the right variable and using it consistently. While this may seem obvious, race-related genetic research does not always observe this rule. For example, the initial reference to race in an article is often to the racial identity of individual subjects, sometimes described as “self-assigned” by subjects. A subsequent reference to race might appear in the classification of genotypes associated with groups of the self-identified subjects. The final one might appear in the discussion section that generalizes the findings to different racial groups, i.e., massive world populations, such as Euro-Americans or Asians.

Nonequivalent use of labels is illustrated by the common juxtaposition of terms such as “white” with “African-American,” where skin color and geographical location are treated as equivalent. Another example is the juxtaposition of “Asian-American” with “Mexican-American,” which implies that people of Asian ancestry now living in the United States represent a level of genetic diversity that is equivalent to that of people of Mexican ancestry now living in the United States. Such examples indicate a need for more consistent attention to definition of groups and to the need to explain the rationale for their equivalence.

Another set of problems in race research results when researchers attempt to map genetics to race (or to other characteristics that are, in part, socially determined) as an end in itself. One such study set out as its goal, “to identify a set of genetic markers that would allow the confident determination of ethnicity, for use in a forensic setting” (25). A similar problem is caused by the use of language such as “white chromosome,” “mutant African alleles,” or “Asian gene gap” as scientific shorthand, implying that some genetic variants “belong” exclusively to some races (26-28). Even if, in rare circumstances, certain alleles have been found exclusively in one population, to call a chromosome white or Asian makes an inappropriate link between a rapidly shifting social term and a fixed biological entity.


Race has been retained in genetic research on the basis of the belief that it is a social or geographical unit that approximates a genetic grouping. We do not argue for the imposition of any one particular set of terms to describe genetic groupings, or for the wholesale elimination of race from genetic research. Rather, we stress that this type of boundary is not likely to be equally useful in all kinds of genetic research. Furthermore, researchers need to be clear about the choice and definition of terms, as well as to be careful about making appropriate generalizations. Funders and publishers of biomedical research should follow the suggestions of editorials in Nature Genetics, Archives of Pediatrics and Adolescent Medicine, and the British Medical Journal and ask researchers to define race when they use it (29-31) [HN10]. Editors should take care that these rules are followed. Our own preliminary analysis of articles published after these editorials reveals few, if any, changes in the explanations provided concerning race as a research variable.

In designing genetic studies, researchers should first consider whether they want to use race as a proxy for genetic similarity or diversity, or as a proxy for nongenetic factors such as socioeconomic status, or both. Are there other, more direct, measures available that should be used instead? If not, it is important to consider the level of resolution necessary to describe populations, to use groupings that are comparable in resolution, and to describe them precisely. Sometimes, nationality might suffice, whereas other investigations might require a smaller geographical region or allow a larger one. Thus, it is important to collect data with as much precision as possible and to note always how subjects were assigned to groups, such as on the basis of records or self-assignment. It is imperative for the research community to acknowledge that the maps used in research are not the only maps used to describe the terrain they study and that careful use of language is necessary to avoid misunderstanding.

HyperNotes: Related Resources on the World Wide Web

General Hypernotes

Dictionaries and Glossaries

The xrefer Web site provides scientific dictionaries and other reference works.

The Talking Glossary of Genetic Terms is provided by the National Human Genome Research Institute (NHGRI).

A genome glossary is provided by the Human Genome Project Information Web site.

An illustrated glossary of genetics is provided by the GeneTests·GeneClinics Web site.

The Cambridge Healthtech Institute provides a collection of genomics glossaries and taxonomies. A genetic variations glossary is included.

Web Collections, References, and Resource Lists

P. Gannon's Cell & Molecular Biology Online is a collection of annotated links to Internet resources.

The library of the Karolinska Institutet, Stockholm, Sweden, provides collections of Internet resources on molecular biology and genetics and bioethics.

The Genetics Education Center, a collection of links to Internet resources, is maintained by D. Collins, University of Kansas Medical Center. Also provided is the Information for Genetic Professionals Web site with a section on ethics resources.

The National Information Resource on Ethics & Human Genetics is provided by the National Reference Center for Bioethics Literature, Kennedy Institute of Ethics, Georgetown University.

The Center for Biomedical Ethics, Stanford University, provides links to Internet ethics resources.

The Biological Anthropology Web is provided by K. Kelly, Department of Anthropology, University of Iowa.

NHGRI provides a list of online bioethics resources.

The Human Genome Project Information Web site provides a resource page on minorities, race, and genomics.

Online Texts and Lecture Notes

The National Center for Biotechnology Information (NCBI) provides a Science Primer on genetics and genomics.

Exploring Our Molecular Selves is an online multimedia educational kit provided by NHGRI.

Genomics and Its Impact on Medicine and Society: A 2001 Primer is available on the Human Genome Project Information Web site.

The History of Race in Science Web site is sponsored by the MIT Center for the Study of Diversity in Science, Technology, and Medicine, the Program in Science, Technology, and Society, Massachusetts Institute of Technology, and the Department of History, University of Toronto.

D. O'Neil, Behavioral Sciences Department, Palomar College, San Marcos, CA, provides a tutorial on human variation as part of a collection of physical anthropology tutorials. Cultural anthropology tutorials are also provided.

P. Willoughby, Department of Anthropology, University of Alberta, offers lecture notes and other resources for an anthropology course on race and racism in the modern world.

J. Bindon, Department of Anthropology, University of Alabama, offers lecture notes for a course on physical anthropology.

M. K. Raghuraman, Department of Genome Sciences, University of Washington, provides lecture notes for a genetics course.

General Reports and Articles

The April 2001 issue of the Atlantic Monthly had an article by S. Olson titled “The genetic archaeology of race.”

The 18 February 2002 issue of The Scientist had an article by R. Lewis titled “Race and the clinic: Good science?” (free registration required).

The 14 June 1997 issue of BMJ had an article by R. Bhopal titled “Is research into ethnicity and health racist, unsound, or important science?”

J. Marks, Department of Sociology and Anthropology, University of North Carolina, Charlotte, makes available (in PDF format) an article titled “Contemporary bio-anthropology: Where the trailing edge of anthropology meets the leading edge of bioethics” that was published in the August 2002 issue of Anthropology Today and a 1997 conference paper titled “The spectrum of human variation.”

The 23 October 1998 issue of Science had a News article by E. Marshall titled “DNA studies challenge the meaning of race.” The 15 October 1999 issue had a Viewpoint article by K. Owens and M.-C. King titled “Genomic views of human history” (17).

The Fall 2001 issue of Perspectives had an article by C. Wienker titled “The anthropological perspective on race: An historical overview.”

Evaluating Human Genetic Diversity is a 1998 report available on the Web from the National Academies Press. The 2002 report Speaking of Health: Assessing Health Communication Strategies for Diverse Populations includes a chapter titled “Toward a new definition of diversity.”

Numbered Hypernotes

1. Validity of race in research. Does Race Exist? on the companion Web site for the “Mystery of the First Americans” from NOVA Online makes available the essays “An antagonist's perspective” by C. L. Brace and “A proponent's perspective” by G. W. Gill. The History of Race in Science Web site makes available a 30 July 2002 New York Times article by N. Wade titled “Race is seen as real guide to track roots of disease.” Bio-IT World had a 9 September 2002 article by K. Davies titled “The debate over race relations.” V. Randall, Institute on Race, Health Care and the Law, University of Dayton School of Law, makes available an article by S. S.-J. Lee, J. Mountain, and B. Koenig titled “The reification of race in health research,” which was adapted from a Spring 2001 article (available in PDF format) in the Yale Journal of Health Policy, Law, and Ethics. The December 2001 issue of Policy Review had an article by S. Satel titled “Medicine's race problem.” J. Marks makes available (in PDF format) an article titled “Science and race” that was published in the November-December 1996 issue of American Behavioral Scientist. The 15 October 1996 issue of the Annals of Internal Medicine had a Perspectives article by R. Witzig titled “The medicalization of race: Scientific legitimization of a flawed social construct” (13). Africana.com makes available a 27 September 2001 article by J. Entine and S. Satel titled “The science and politics of genetic diversity.” C. D. Kreger's modern human origins Web site makes available a paper titled “The concept of human races: Uses and problems.” The proceedings of a February 1999 workshop titled “Anthropology, Genetic Diversity, and Ethics,” made available by the Center for 21st Century Studies, University of Wisconsin-Milwaukee, include transcripts of a session on issues relating to population identification. CNN.com offers a 2 May 2001 article by E. Cohen titled “Experts debate role of race in medical research.”

2. Definitions and concepts of race. Entries about race are included in the online Wikipedia encyclopedia and in the Columbia Encyclopedia. A FAQ on human diversity and race is available on the online learning center for C. Kottak's Cultural Anthropology. D. O'Neil offers a presentation on the models of classification of human variation. J. Bindon provides lecture notes on the history of the concepts of race and definitions of race for a course on physical anthropology. K. Boden, Anthropology Department, Shippensburg University, PA, provides an introduction to race concepts for a cultural anthropology course. J. Scarry, Department of Anthropology, University of North Carolina, offers lecture notes on classification, racial groups, and racism for an anthropology course. The Department of Human Biology, University of Leeds, makes available (in PDF format) lecture notes titled “A cautionary tale: The races of man” for a course on human evolution. A Statement on “Race” is provided by the American Anthropological Association; a Response to OMB Directive 15 (Race and Ethnic Standards for Federal Statistics and Administrative Reporting) is also provided. The American Association of Physical Anthropologists provides a Statement on Biological Aspects of Race.

3. Human genetic variation. The NIH Office of Science Education offers a teacher's guide to understanding human variation. The companion Web site for How Humans Evolved by R. Boyd and J. Silk provides an introduction to human genetic diversity. The Genome News Network, an online publication of The Center for the Advancement of Genomics, offers a presentation on human genome variations. M. Murphy, Department of Anthropology, University of Alabama, offers lecture notes on human variation for an anthropology course. J. Bindon provides lecture notes on human genetic diversity for a course on physical anthropology. A. Martindale, Department of Anthropology, McMaster University, Hamilton, Canada, provides lecture notes (in PDF format) on human biological diversity for a biological anthropology course.

4. Genetic studies of population groups. The 1 April 1997 issue of the Proceedings of the National Academy of Sciences had an article by L. B. Jorde et al. titled “Microsatellite diversity and the demographic history of modern humans” (20). The 29 April 1997 issue had an article by G. Barbujani, A. Magagni, E. Minch, and L. Cavalli-Sforza titled “An apportionment of human DNA diversity.” The March 2000 issue of the American Journal of Human Genetics had an article by L. B. Jorde et al. titled “The distribution of human genetic diversity: A comparison of mitochondrial, autosomal, and Y-chromosome data.” The March 2001 issue had an article by W. S. Watkins et al. titled “Patterns of ancestral human diversity: An analysis of Alu-insertion and restriction-site polymorphisms.” The October 2000 issue of Genetics had an article by D. Labuda, E. Zietkiewicz, and V. Yotova titled “Archaic lineages in the history of modern humans.” The 21 June 2002 issue of Science had a report by S. Gabriel et al. titled “The structure of haplotype blocks in the human genome” and a News Focus article by J. Couzin titled “New mapping project splits the community.” The 2 March 2001 issue had a review article by R. Cann titled “Genetic clues to dispersal in human populations: Retracing the past from the present.” NHGRI makes available a 29 October 2002 NIH news release titled “International consortium launches genetic variation mapping project” with links to related resources. The 1 November 2002 issue of Science had a News of the Week article by J. Couzin titled “HapMap launched with pledges of $100 million.” MEDLINEplus makes available a 29 October 2002 Reuters news article by A. Ault titled “Global team will hunt for disease hotspots on genes.”

5. Human genetic markers. Marker is defined in the NHGRI glossary and in the GeneTests·GeneClinics glossary. U. Melcher's molecular genetics tutorial includes a section on genetic markers. The Wellcome Trust's Human Genome Web site offers an introduction to variation in the human genome and genetic markers. H. E. Sutton, School of Biological Sciences, University of Texas, provides lecture notes on genetic markers for a course on heredity, evolution and society. The SNP Consortium provides an introduction to single nucleotide polymorphism (SNP) markers. NCBI's Science Primer offers a presentation on SNPs. BioTechniques had a June 2002 supplement titled “SNPs: Discovery of Markers for Disease.”

6. Ethnicity. Entries for ethnicity are included in the online Wikipedia encyclopedia and in xrefer's Dictionary of Geography. D. O'Neil's cultural anthropology tutorials include a presentation on ethnicity and race. Lecture notes on ethnicity by R. Norton for an anthropology course are made available by the Department of Anthropology, Macquarie University, Australia. M. Patterson, Department of Sociology, Geography and Anthropology, Kennesaw State University, GA, offers a PowerPoint presentation and lecture notes on ethnicity and race for a geography course on social issues.

7. Pharmacogenetic studies. Pharmacogenomics is defined in the Human Genome Project genome glossary. The Human Genome Project Information Web site offers a presentation on pharmacogenomics. NCBI's Science Primer includes a section on pharmacogenomics. The Pharmacogenetics Research Network Web site is provided by the National Institute of General Medical Sciences. The 13 November 1999 issue of BMJ had an article by W. Sadée on pharmacogenomics. The 15 October 1999 issue of Science had a review article by W. Evans and M. Relling titled “Pharmacogenomics: Translating functional genomics into rational therapeutics.” The Cambridge Healthtech Institute offers a pharmacogenomics glossary.

8. Association studies. D. Curtis, London Statistical Genetics Group, St. Bartholomew's and the Royal London School of Medicine and Dentistry, provides an introduction to association studies. The molecular genetics review Web site of the Clinical Molecular Genetics Society offers lecture notes by C. Fratter on association studies. G. Carey, Department of Psychology and Institute for Behavioral Genetics, University of Colorado, makes available a draft chapter (in PDF format) on association studies prepared for a book on human genetics for the social sciences.

9. Epidemiological studies. D. O'Neil offers a presentation on epidemiology in his medical anthropology tutorial. FACSNET provides an epidemiology tutorial. Epidemiology for the Uninitiated by D. Coggon, G. Rose, and D. Barker is made available by BMJ. The 30 July 1994 issue of BMJ had an article by P. Senior and R. Bhopal titled “Ethnicity as a variable in epidemiological research.” The CDC Office of Genomics and Disease Prevention makes available an article by M. Khoury and Q. Yang titled “The future of genetic studies of complex human diseases: An epidemiologic perspective.” Epidemiology, the Internet and Global Health, a Web course from the Global Health Network, includes presentations on the basic concepts and principles of epidemiology, methods of epidemiology, and molecular epidemiology. HuGENet (human genome epidemiology network) is a global collaboration of individuals and organizations who develop and communicate epidemiologic information on the human genome.

10. Journal guidelines. The February 2000 issue of Nature Genetics had an editorial titled “Census, race and science” (30). The February 2001 issue of the Archives of Pediatrics and Adolescent Medicine had an editorial by F. Rivara and L. Finberg titled “Use of the terms race and ethnicity” (31). The 27 April 1996 issue of BMJ had an article titled “Style Matters: Ethnicity, race, and culture: Guidelines for research, audit, and publication” (29) and an editorial by K. McKenzie and N. Crowcroft titled “Describing race, ethnicity, and culture in medical research.” The 22 May 1999 issue of the British Dental Journal had a commentary by R. Bhopal and J. Rankin titled “Concept and terminology in ethnicity, race and health: Be aware of the ongoing debate.”

11. P. Sankar is at the Center for Bioethics, University of Pennsylvania School of Medicine.

12. M. K. Cho is at the Center for Biomedical Ethics, Stanford University.

References and Notes
