Review

# Ethnicity and Conflict: Theory and Facts


Science  18 May 2012:
Vol. 336, Issue 6083, pp. 858-865
DOI: 10.1126/science.1222240

## Abstract

Over the second half of the 20th century, conflicts within national boundaries became increasingly dominant. One-third of all countries experienced civil conflict. Many (if not most) such conflicts involved violence along ethnic lines. On the basis of recent theoretical and empirical research, we provide evidence that preexisting ethnic divisions do influence social conflict. Our analysis also points to particular channels of influence. Specifically, we show that two different measures of ethnic division—polarization and fractionalization—jointly influence conflict, the former more so when the winners enjoy a “public” prize (such as political power or religious hegemony), the latter more so when the prize is “private” (such as looted resources, government subsidies, or infrastructures). The available data appear to strongly support existing theories of intergroup conflict. Our argument also provides indirect evidence that ethnic conflicts are likely to be instrumental, rather than driven by primordial hatreds.

There are two remarkable facts about social conflict that deserve notice. First, within-country conflicts account for an enormous share of deaths and hardship in the world today. Figure 1 depicts global trends in inter- and intrastate conflict. Since the Second World War, there have been 22 interstate conflicts with more than 25 battle-related deaths per year, and 9 of them have killed at least 1000 over the entire history of conflict (1). The total number of attendant battle deaths in these conflicts is estimated to be around 3 to 8 million (2). The same period witnessed 240 civil conflicts with more than 25 battle-related deaths per year, and almost half of them killed more than 1000 (1). Estimates of the total number of battle deaths are in the range of 5 to 10 million (2). Added to the direct count of battle deaths are the 25 million noncombatant civilian deaths (3) and the indirect deaths due to disease and malnutrition, which have been estimated to be at least four times as high as violent deaths (4), as well as the forced displacement of more than 40 million individuals by 2010 (5). In 2010 there were 30 ongoing civil conflicts (6).

Second, internal conflicts often appear to be ethnic in nature. More than half of the civil conflicts recorded since the end of the Second World War have been classified as ethnic or religious (3, 7). One criterion for a conflict to be classified as ethnic is that it involves a rebellion against the state on behalf of some ethnic group (8). Such conflicts involved 14% of the 709 ethnic groups categorized worldwide (9). Brubaker and Laitin, examining the history of internal conflicts in the second half of the 20th century, are led to remark on “the eclipse of the left-right ideological axis” and the “marked ethnicization of violent challenger-incumbent contests” (10). Horowitz, author of a monumental treatise on the subject of ethnic conflict, observes that “[t]he Marxian concept of class as an inherited and determinative affiliation finds no support in [the] data. Marx’s conception applies with far less distortion to ethnic groups.… In much of Asia and Africa, it is only modest hyperbole to assert that the Marxian prophecy has had an ethnic fulfillment” (11).

The frightening ubiquity of within-country conflicts, as well as their widespread ethnic nature, provokes several questions. Do “ethnic divisions” predict conflict within countries? How do we conceptualize those divisions? If it is indeed true that ethnic cleavages and conflicts are related, how do we interpret such a result? Do “primordial,” ancestral ethnic hatreds trump “more rational” forms of antagonism, such as the instrumental use of ethnicity to achieve political power or economic gain? To discuss and possibly answer some of these questions is the goal of this review.

## Class and Ethnicity as Drivers of Conflict

The study of human conflict is (and has been) a central topic in political science and sociology. Economics, with relatively few and largely recent exceptions, has paid little attention to the issue. [For three recent overviews, see (12–14).] Perhaps textbook economics, with its traditional respect for property rights, often presumes that the economic agents it analyzes share that respect and do not violently challenge allocations perceived to be unfair. Yet one of the notable exceptions in economics (Marx) directly or indirectly dominates the analytical landscape on conflict in the rest of the social sciences. Class struggle, or more generally, economic inequality, has been viewed as the main driver of social conflict in industrial or semi-industrial society (15). In Sen’s words, “the relationship between inequality and rebellion is indeed a close one” (16).

Yet, intuitive as it might seem, this relationship doesn’t receive emphatic empirical endorsement. In a detailed survey paper on the many attempts to link income inequality and social conflict empirically, Lichbach mentions 43 papers on the subject, some “best forgotten” (17). The evidence is thoroughly mixed, concludes Lichbach, as he cites a variety of studies to support each possible relationship between the two, and others that show no relationship at all. Midlarsky remarks on the “fairly typical finding of a weak, barely significant relationship between inequality and political violence … rarely is there a robust relationship between the two variables” (18).

The emphasis on economic inequality as a causal correlate of conflict seems natural, and there is little doubt that carefully implemented theory will teach us how to better read the data (see below). Yet it is worth speculating on why there is no clear-cut correlation. Certainly, economic demarcation across classes is a two-edged sword: While it breeds resentment, the very poverty of the have-nots separates them from the means for a successful insurrection. In addition, redistribution across classes is invariably an indirect and complex process.

The use of noneconomic “markers” such as ethnicity or religion addresses both these issues. Individuals on either side of the ethnic divide will be economically similar, so that the gains from such conflict are immediate: The losing group can be excluded from the sector in which it directly competes with the winners [e.g., (11, 19, 20)]. In addition, each group will have both poor and rich members, with the former supplying conflict labor and the latter supplying conflict finances (21). This suggests an interesting interaction between inequality and ethnicity, by which ethnic groups with a higher degree of within-group inequality will be more effective in conflict (22). Moreover, it has been suggested that “horizontal” inequality (i.e., inequality across ethnic groups) is an important correlate of conflict (23–26).

There are two broad views on the ethnicity-conflict nexus [e.g., (10, 27)]. The “primordialist” view (28, 29) takes the position that ethnic differences are ancestral, deep, and irreconcilable and therefore invariably salient. In contrast, the “instrumental” approach pioneered by (19) and discussed in (10) sees ethnicity as a strategic basis for coalitions that seek a larger share of economic or political power. Under this view, ethnicity is a device for restricting the spoils to a smaller set of individuals. Certainly, the two views interact. Exclusion is easier if ethnic groups are geographically concentrated (30, 31). Strategic ethnic conflict could be exacerbated by hatreds and resentments—perhaps ancestral, perhaps owing to a recent clash of interests—that are attached to the markers themselves. Finally, under both these views, in ethnically divided societies democratic agreements are hard to reach and once reached, fragile (32); the government will supply fewer goods and services and redistribute less (33, 34); and society will face recurrent violent conflict (11).

Either approach raises the fundamental question of whether there is an empirical, potentially predictive connection between ethnic divisions and conflict. To address that question, we must first define what an “ethnic division” is. Various measures of ethnic division or dominance (35–37) have been proposed. The best-known off-the-shelf measure of ethnic division is the fractionalization index, first introduced in the 1964 edition of the Soviet Atlas Narodov Mira, to measure ethnolinguistic fragmentation. It equals the probability that two individuals drawn at random from the society will belong to two different groups (see Box 1 for a precise definition). Ethnic fractionalization has indeed been usefully connected to per capita gross domestic product (GDP) (38), economic growth (39), or governance (40). But (7, 35, 41, 42) do not succeed in finding a connection between ethnic or religious fractionalization and conflict, though it has been suggested that fractionalization appears to work better for smaller-scale conflicts, such as ethnic riots (43). By contrast, variables such as low GDP per capita, natural resources, environmental conditions favoring insurgency, or weak government are often statistically significant correlates of conflict (12, 44). Fearon and Laitin conclude that the observed “pattern is thus inconsistent with … the common expectation that ethnic diversity is a major and direct cause of civil violence” (7).

### Box 1

A model of conflict and distribution.

The two measures of ethnic divisions discussed in this article are both based on the same underlying parameters: the number of groups m and total population N, the population Ni of each group, and the intergroup distances dij. Polarization and fractionalization are given by

$$P = \sum_{i=1}^{m}\sum_{j=1}^{m} n_i^{2}\, n_j\, d_{ij} \qquad \text{and} \qquad F = \sum_{i=1}^{m} n_i\,(1 - n_i),$$

where ni = Ni/N is the population share of group i. The distinction between P and F is superficial at first sight but is of great conceptual importance. The squaring of population shares in P means that group size matters over and above the mere counting of individual heads implicit in F. In addition, fractionalization F discards intergroup distances and replaces them with a 0-1 variable.

The theory developed in (48) and summarized below links these measures to conflict incidence. There are m groups engaged in conflict. The winner enjoys two sorts of prizes: One is “private” and the other is “public.” Let μ be the per capita value of the private prize at stake. Let uij be the utility to an individual member of group i from the policy implemented by group j. For any i the utility from the ideal policy is strictly higher than any other policy; that is, uii > uij. Then, the “distance” between i and j is dij ≡ uii − uij, so that the loss to i from j’s ideal policy is dij. Let π be the amount of money an individual is willing to give up in order to bring the implemented policy one unit toward her ideal policy. Then, we can say that the monetary value to a member of group i of policy j is πuij and the loss relative to the ideal policy is πdij. Individuals in each group expend resources r to influence the probability of success of their own group. Write the income equivalent cost to such expenditure as c(r) and assume that c is increasing, smooth, and strictly convex, with c'(0) = 0. Add individual contributions in group i to obtain group contribution Ri. Assume that the probability of success for group i is given by pi = Ri/RN, where RN ≡ ∑iRi. Measure conflict intensity in population-normalized form by ρ = RN/N.

The direct payoff to a person in group i who expends resources r is given by

$$\sum_{j=1}^{m} p_j\,\pi\,u_{ij} + p_i\,\frac{\mu}{n_i} - c(r).$$

Individuals also care about the payoff to the other group members. When deciding on how much r to contribute, individuals seek to maximize the sum of their direct payoff and the total payoff of the other group members, weighted by a group commitment factor α. Note that the optimal contribution ri by a member of group i depends on the contributions made by all other individuals. We focus on the Nash equilibrium of this strategic game: the vector of actions with the property that all are the best response to each other. We prove that such an equilibrium always exists and that it is unique.

Note now that c'(r) is the implicit “price” in sacrificed income that an individual is willing to pay for an extra unit of effort contributed to conflict. We then define the per capita normalized intensity of conflict C as the value of the resources expended, C = c'(ρ)ρ. Hence, [C/(π + μ)] is the ratio of the resources wasted in conflict relative to the stakes, all expressed in monetary terms. Proposition 2 in (48) shows that the equilibrium intensity of conflict C is approximately determined as follows:

$$\frac{C}{\pi + \mu} \simeq \alpha\,[\lambda P + (1-\lambda)F] + (1-\alpha)\left[\lambda\,\frac{G}{N} + (1-\lambda)\,\frac{m-1}{N}\right],$$

where λ ≡ π/(π + μ) is the relative publicness of the prize, and where G is a third measure of ethnic distribution, the Greenberg-Gini index: $G = \sum_{i}\sum_{j} n_i n_j d_{ij}$. Its influence wanes with population size, and we’ve ignored it in this essay, though (48, 51) contain a detailed discussion of all three measures.

For large populations, the expression above reduces to the one in the main text.

But the notion of “ethnic division” is complex and not so easily reduced to a measure of diversity. The discussion that follows will introduce a different measure—polarization—that better captures intergroup antagonism. As we shall see, polarization will be closely connected to the incidence of conflict; moreover, with a measure of polarization in place and controlled for, fractionalization, too, will matter for conflict.

## Fractionalization and Polarization

As already discussed, the index of fractionalization is commonly used to describe the ethnic structure of a society (see Box 1). This index essentially reflects the degree of ethnic diversity. When groups are of equal size, the index increases with the number of groups. It reaches a maximum when everyone belongs to a different group.

When one is interested in social conflict, this measure does not seem appropriate on at least two counts. First, as social diversity increases beyond a point, intuition suggests that the likelihood of conflict would decrease rather than increase. After all, group size matters. The fact that “many are in this together” provides a sense of group identity in times of conflict. Moreover, groups need a minimum size to be credible aggressors or opponents. Second, not all groups are symmetrically positioned with respect to other groups, though the measure implicitly assumes they are. A Pushtun saying is illustrative: “Me against my brothers, me and my brothers against my cousins, me and my cousins against the world.” The fractionalization measure can be interpreted as saying that every pair of groups is “equally different.” Often, they are not.

Consider now the notion of polarization as introduced in (45–47). Polarization is designed to measure social “antagonism,” which is assumed to be fueled by two factors: the “alienation” felt between members of different groups and the sense of “identification” with one’s own group. This index is defined as the aggregation of all interpersonal antagonisms. Its key ingredients are intergroup distances (how alien groups are from each other) and group size (an indicator of the level of the group identification). Using an axiomatic approach (45, 48), we obtain the specific form used in this article; see Box 1 for the precise formula.

In any society with three or more ethnic groups, the polarization measure behaves very differently from fractionalization. Unlike fractionalization, polarization declines with the continued splintering of groups and is globally maximized for a bimodal distribution of population. This is shown in Fig. 2, where groups are always of equal size and intergroup distances are equal to 1. Rather than being two different (but broadly related) ways of measuring the same thing, the two measures emphasize different aspects of a fundamentally multidimensional phenomenon. As we shall see, the differences have both conceptual and empirical bite. For instance, Montalvo and Reynal-Querol (49), using a simplified version of the index of polarization, show that ethnic polarization is a significant correlate of civil conflict, whereas fractionalization is not. Their contribution provides the first piece of serious econometric support for the proposition that “ethnic divisions” might affect conflict.
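The contrast between the two indices can be checked numerically. The sketch below is only illustrative: it assumes the forms F = Σ n_i(1 − n_i) and P = Σ Σ n_i² n_j d_ij, which match the verbal description in Box 1 (squared own-group shares in P, a 0-1 notion of distance in F), and it uses the same setup as Fig. 2, with equal-sized groups and all intergroup distances equal to 1. The function names are hypothetical.

```python
def fractionalization(shares):
    """F = sum_i n_i * (1 - n_i): chance that two random individuals differ in group."""
    return sum(n * (1.0 - n) for n in shares)


def polarization(shares, dist):
    """Assumed form P = sum_i sum_j n_i^2 * n_j * d_ij, with dist[i][i] == 0."""
    m = len(shares)
    return sum(
        shares[i] ** 2 * shares[j] * dist[i][j]
        for i in range(m)
        for j in range(m)
    )


def equal_groups(m):
    """m equal-sized groups; every pair of distinct groups is at distance 1."""
    shares = [1.0 / m] * m
    dist = [[0.0 if i == j else 1.0 for j in range(m)] for i in range(m)]
    return shares, dist


# Tabulate both indices as the population splinters into more groups.
for m in range(2, 7):
    shares, dist = equal_groups(m)
    print(m, round(fractionalization(shares), 4), round(polarization(shares, dist), 4))
```

Under these assumptions F rises steadily with the number of groups (F = 1 − 1/m), whereas P equals (m − 1)/m², which is largest at m = 2 and falls thereafter.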

Despite their divergent performance in empirical work, the two measures are linked. Indeed, they are even identical if (i) group identity does not play a role and (ii) individuals feel equally alienated from members of all other groups. Which index is best to use is therefore determined by the nature of the problem at hand: on whether the sense of identity, of intergroup differentiation, or both are relevant. Group identification matters when we face problems of public import, in which the payoffs to the entire community jointly matter. Intergroup differentiation is relevant whenever the specific cultural characteristics of the other groups affect the policies that they choose, and therefore create implications for any one group. In contrast, if social groups compete for narrow economic gains that accrue to the winners and are excludable from the losers, no opponent’s victory means more or less than any other. In the theory that we outline below, these are precisely the factors that receive greatest emphasis.
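To see why the two measures coincide under conditions (i) and (ii), it helps to write polarization with an explicit identification exponent. The symbol σ below is introduced here purely for illustration and is not part of the article's notation; σ = 1 is assumed to correspond to the squared population shares described in Box 1, and σ = 0 switches identification off.

$$
\begin{aligned}
P_\sigma &= \sum_{i}\sum_{j} n_i^{1+\sigma}\, n_j\, d_{ij},\\
P_0 &= \sum_{i} n_i \sum_{j \neq i} n_j = \sum_{i} n_i\,(1 - n_i) = F
\quad \text{when } \sigma = 0 \text{ and } d_{ij} = 1 \text{ for all } i \neq j .
\end{aligned}
$$

The second line uses the fact that the shares of all groups other than i sum to 1 − n_i.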

## Marrying Theory and Facts

A systematic econometric exploration of the links between ethnic divisions and conflict will generally take the form of a multivariate regression. The “dependent variable” we seek to explain is some measure of conflict. On the other side of the regression is our main “independent variable,” which is a particular measure of “ethnic divisions,” as well as a host of “control variables” that are included to capture other influences on conflict that we seek to filter out. This much is evident. The problem is (and this is true of empirical research more generally) that little discipline is often imposed on the specification of that regression. Much of that research involves the kitchen-sink approach of including all variables—usually linearly—that could possibly play a role in ethnic conflict. Such an approach is problematic on at least three counts. First, the number of plausible variables is unbounded, not just in principle but apparently also in practice: 85 different variables have been used in the literature (50). Trying them out in various hopeful combinations smacks uncomfortably of data-mining. Second, even if we could narrow down the set of contenders, there are many ways to specify the empirical equation that links those variables to conflict. Finally, the absence of a theory hinders the interpretation of the results.

From a statistical perspective, fractionalization and polarization are just two, seemingly equally reasonable, ways of measuring ethnic divisions. Yet they yield very different results in connecting ethnicity to conflict. Do we accept this inconsistency as yet another illustration of “measurement error”? Or is there something deeply conceptual buried here?

The results we are going to present are obtained from an explicit game-theoretic model of conflict. We then bring the predicted equilibrium of this model to data. This allows us both to test the theory and to suitably interpret the results. Perhaps the most important contribution of the theory is that it permits both polarization and fractionalization as joint drivers of conflict and explains precisely when one measure acquires more explanatory salience than the other.

We begin by presenting the recent analysis that links polarization and fractionalization to equilibrium conflict (48). We then describe some of the empirical findings obtained in (51) when confronting the predictions of the model with data.

## Polarization, Fractionalization, and Conflict: Theory

A situation of open civil conflict arises when an existing social, political, or economic arrangement is challenged by an ethnic group. Whether the ethnic marker is focal for instrumental or primordial reasons is an issue that we’ve remarked on earlier, but at this stage it is irrelevant for our purpose. [For more on ethnic salience, see (52–54).] In such a situation, the groups involved will undertake costly actions (demonstrations, provocations, bombs, guerrilla or open warfare) to increase their probability of success. We view the aggregate of all such actions as the extent of conflict.

More precisely, suppose that there are m groups engaged in conflict. Think of two types of stakes or prizes in case of victory. One kind of prize is “public,” the individual payoff from which is undiluted by one’s own group size. For instance, the winning group might impose its preferred norms or culture: a religious state, the abolition of certain rights or privileges, the repression of a language, the banning of political parties, and so on. Or it might enjoy political power or the satisfaction of seeing one’s own group vindicated or previous defeats avenged. Let uij be the payoff experienced by an individual member of group i in the case in which group j wins and imposes its preferred policy; we presume that uii > uij, which is true almost by definition. This induces a notion of “distance” across groups i and j: dij ≡ uii − uij, which can be interpreted as the loss to i of living under the policy implemented by j. Note that a member of group i might prefer j rather than k to be in power, and that will happen precisely when dij < dik.

The money-equivalent value of the public payoffs—call it π—tells us how much money individuals are ready to give up to bring the implemented policy “one unit” closer to one’s own ideal policy. Its value depends in part on the extent to which the group in power can impose policies or values on the rest of society. Thus, a member of group i assigns a money value of uijπ to the ideal policy of group j.

The other type of prize is “private.” Examples include the material benefits obtained from administrative or political positions, specific tax breaks, directed subsidies, bias in the allocation of public expenditure and infrastructures, access to rents from natural resources, or just plain loot. Private payoffs have two essential properties. First, group size dilutes individual benefits: The larger the group, the smaller is the return from a private prize for any one group member. Second, the identity of the winner is irrelevant to the loser since, in contrast to the “public” case, the loser is not going to extract any payoff from that fact. (If there are differential degrees of resentment over the identity of the winner, simply include this component under the public prize.) Let μ be the per capita money value of the private prize at stake.

Individuals in each group expend costly resources (time, effort, risk) to influence the probability of success. Conflict is defined to be the sum of all these resources over all individuals and all groups. The winners share the private prize and get to implement their favorite policies (the public prize). The losers have to live with the policies chosen by the winners. A conflict equilibrium describes the resulting outcome. (“Conflict equilibrium” perhaps abuses semantics to an unacceptable degree, our excuse being that we observe the game-theoretic tradition of describing the noncooperative solution to a game as a Nash “equilibrium.”) It is a vector of individual actions such that each agent’s behavior maximizes expected payoffs in the conflict, given the choices made by all other individuals. Note that by the word “payoff” we don’t mean only some narrow monetary amount, but also noneconomic returns, such as political power or religious hegemony.

But what does the maximization of payoffs entail? Individuals are individuals, but they also have a group identity. To some extent an individual will act selfishly, and to some extent he or she will act in the interest of the ethnic group. The weight placed on the group versus the individual will depend on several factors (some idiosyncratic to the individual), but a large component will depend on the degree of group-based cohesion in the society; we return to this below. Formally, we presume that an individual places a weight of α on the total payoff of his or her group, in addition to their own payoff.

Let us measure the intensity of conflict (call it C) by the money value of the average, per capita level of resources expended in conflict. In (48) we argue that in equilibrium, the eventual across-group variation in the per capita resources expended has a minor effect on the aggregate level of conflict. Thus, in practice the population-normalized intensity of conflict C can be approximated well by ignoring this variation, and this simplification yields the approximate formula

$$\frac{C}{\pi + \mu} \simeq \alpha\,[\lambda P + (1-\lambda)F] \tag{1}$$

for large populations, where λ ≡ π/(π + μ) is the relative publicness of the prize, F is the fractionalization index, and P is a particular member of the family of polarization measures described earlier, constructed using intergroup distances dij derived from “public” payoff losses. (Box 1 describes these measures more formally and also provides a more general version of Eq. 1.) Thus, the theory tells us precisely which notions of ethnic division need to be considered. Moreover, the relationship has a particular form, which informs the empirical analysis.

This result highlights the essential role of theory for meaningful empirical work. The exogenous data of the model—individual preferences, group size, the nature and the size of the prize, and the level of group cohesion—all interact in a special way to determine equilibrium conflict intensity. The theory shows, first, that it suffices to aggregate all the information on preferences and group sizes into just two indices—F and P—capturing different aspects of the ethnic composition of a country. Second, the weights on the two distributional measures depend on the composition of the prize and on the level of group commitment. In particular, the publicness of the prize (reflected in a high value of λ) reinforces the effect of polarization, whereas high privateness of the prize (low λ) reinforces the effect of fractionalization. Not surprisingly, high group cohesion α enhances the effect of both measures on conflict.
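The comparative statics in this paragraph can be illustrated with a few lines of arithmetic. The sketch assumes that the right-hand side of Eq. 1 has the form α[λP + (1 − λ)F], which is what the description of the weights here implies. The two example societies (two equal groups versus ten equal groups, all intergroup distances equal to 1), the parameter values, and the function name are hypothetical.

```python
def conflict_index(P, F, lam, alpha):
    """Assumed right-hand side of Eq. 1: alpha * (lam * P + (1 - lam) * F)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam (relative publicness) must lie in [0, 1]")
    return alpha * (lam * P + (1.0 - lam) * F)


# Hypothetical societies with m equal-sized groups and unit distances,
# for which F = 1 - 1/m and P = (m - 1) / m**2.
two_groups = {"P": 0.25, "F": 0.5}
ten_groups = {"P": 0.09, "F": 0.9}

# Move the prize from fully private (lam = 0) to fully public (lam = 1).
for lam in (0.0, 0.5, 1.0):
    a = conflict_index(two_groups["P"], two_groups["F"], lam, alpha=0.5)
    b = conflict_index(ten_groups["P"], ten_groups["F"], lam, alpha=0.5)
    print(f"lambda={lam}: two groups {a:.3f}, ten groups {b:.3f}")
```

With a purely private prize the fragmented society has the higher index, and with a purely public prize the ranking reverses in favor of the polarized two-group society. Raising α scales both values proportionally.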

The publicness of the prize is naturally connected to both identification and alienation, and therefore to polarization. With public payoffs, group size counts twice: once, because the payoffs accrue to a larger number, and again, because a larger number of individuals internalize that accrual and therefore contribute more to the conflict. Intergroup distances matter, too: The precise policies implemented by the eventual winner continue to be a cause of concern for the loser. Both these features, the “double emphasis” on group size and the use of distances, are captured by the polarization measure P; see Box 1 for more details. By contrast, when groups fight for a private payoff (say, money) one winner is as bad as another as long as my group doesn’t win, and measures based on differences in intergroup alienation become useless. Moreover, with private payoffs, group identification counts for less than it does with public payoffs, as group size erodes the per capita gain from the prize. The resulting index that is connected to this scenario is one of fractionalization (see Box 1).

In short, the theory tells us to obtain data on P and F and combine them in a particular way. It tells us that when available, we should attempt to obtain society-level data for group cohesion α and relative publicness λ and enter them in the way prescribed by Eq. 1. With this in mind, we now bring the theory to the data.

## Taking the Theory to Data

We study 138 countries over 1960 to 2008, with the time period divided into 5-year intervals. That yields a total of 1125 observations (in most cases). Some of the variables in the theory are not directly observable, and so we will use proxies. For a complete set of results, see (51) and the accompanying Web Appendix.

We measure conflict intensity in two ways. The first is the death toll. Using data from the jointly maintained database under the Uppsala Conflict Data Program and the Peace Research Institute of Oslo (UCDP/PRIO) (1), we construct a discrete measure of conflict—PRIO-C—for every 5-year period and every country as follows: PRIO-C is equal to 0 if the country is at peace in those 5 years; to 1 if it has experienced low-intensity conflict (more than 25 battle-related deaths but less than 1000) in any of these years; or to 2 if the country has been in high-level conflict (more than 1000 casualties) in any of the 5 years. Despite the overall popularity of UCDP/PRIO, this is an admittedly coarse measure of deaths, based on only three categories (peace, low conflict, and high conflict) defined according to ad hoc thresholds, and it reports conflicts only when one of the involved parties is the state. To overcome these two problems, we use a second measure of intensity: the Index of Social Conflict (ISC) computed by the Cross-National Time-Series Data Archive (55). It provides a continuous measure of several manifestations of social unrest, with no threshold dividing “peace” from “war.” The index ISC is formed by taking a weighted average over eight different manifestations of internal conflict, such as politically motivated assassinations, riots, guerrilla warfare, etc.
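The PRIO-C coding rule can be written down compactly. The sketch below follows the thresholds stated above; the function name and input format (a list of annual battle-related death counts for one country and one 5-year period) are hypothetical, and the treatment of a year with exactly 1000 deaths, which the verbal rule leaves open, is an assumption.

```python
LOW_THRESHOLD = 25     # more than 25 battle-related deaths in a year: conflict
HIGH_THRESHOLD = 1000  # more than 1000 in a year: high-intensity conflict


def prio_c(annual_battle_deaths):
    """Code one country-period as 0 (peace), 1 (low conflict), or 2 (high conflict).

    The coding is driven by the worst single year in the period.
    Assumption: a year with exactly 1000 deaths is coded as low intensity.
    """
    worst_year = max(annual_battle_deaths)
    if worst_year > HIGH_THRESHOLD:
        return 2
    if worst_year > LOW_THRESHOLD:
        return 1
    return 0


print(prio_c([0, 0, 0, 0, 0]))
print(prio_c([0, 30, 0, 0, 0]))
print(prio_c([10, 1500, 40, 0, 0]))
```

Because the rule looks only at the worst year, one violent year is enough to classify the entire 5-year period, which is part of what makes the measure coarse.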

Our core independent variables are the indices F and P. To compute these indices, we need the population size of different ethnic groups for every country and a proxy for intergroup distances. For demographic information on groups, we use the data set provided by (9), which identifies over 800 “ethnic and ethno-religious” groups in 160 countries. For intergroup distances, we follow (9, 56, 57) and use the linguistic distance between two groups as a proxy for group “cultural” distances in the space of public policy.

Linguistic distance is defined on a universal language tree that captures the genealogy of all languages (58). All Indo-European languages, for instance, will belong to a common subtree. Subsequent splits create further “sub-subtrees,” down to the current language map. For instance, Spanish and Basque diverge at the first branch, since they come from structurally unrelated language families. By contrast, the Spanish and Catalan branches share their first seven nodes: Indo-European, Italic, Romance, Italo-Western, Western, Gallo-Iberian, and Ibero-Romance languages. We measure the distance between two languages as a function of the number of steps we must retrace to find a common node. The results are robust to alternative ways of mapping linguistic differences into distances.
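A tree-based distance of this kind can be sketched as follows. The functional form, 1 − (shared/depth)^δ with depth taken as the longer of the two paths, is an assumption chosen for illustration, as is the value of δ; the text above specifies only that distance depends on how far back one must go to find a common node. The seven shared Spanish-Catalan nodes are those listed above, while the nodes below Ibero-Romance and the single-node Basque path are illustrative placeholders.

```python
def shared_nodes(path_a, path_b):
    """Number of leading tree nodes that two language paths have in common."""
    count = 0
    for node_a, node_b in zip(path_a, path_b):
        if node_a != node_b:
            break
        count += 1
    return count


def linguistic_distance(path_a, path_b, delta=0.5):
    """Assumed form: 1 - (shared / depth) ** delta, with depth the longer path.

    Returns 0 for identical paths and 1 for paths that split at the root.
    """
    depth = max(len(path_a), len(path_b))
    return 1.0 - (shared_nodes(path_a, path_b) / depth) ** delta


COMMON = ["Indo-European", "Italic", "Romance", "Italo-Western",
          "Western", "Gallo-Iberian", "Ibero-Romance"]
SPANISH = COMMON + ["West Iberian", "Castilian"]  # lower nodes are illustrative
CATALAN = COMMON + ["East Iberian"]               # lower node is illustrative
BASQUE = ["Basque"]                               # unrelated family (placeholder path)

print(shared_nodes(SPANISH, CATALAN))
print(linguistic_distance(SPANISH, CATALAN))
print(linguistic_distance(SPANISH, BASQUE))
```

Any decreasing function of the number of shared nodes would serve the same purpose; a small δ compresses differences among closely related languages, and a large δ stretches them.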

Linguistic divisions arise because of population splits. Languages with very different origins reveal a history of separation of populations going back several thousand years. For instance, the separation between Indo-European languages and all others occurred around 9000 years ago (59). In contrast, finer divisions, such as those between Spanish and Catalan, tend to be the result of more recent splits, implying a longer history of common evolution. Consistent with this view, there is evidence showing a link between the major language families and the main human genetic clusters (60, 61).

The implicit theory behind our formulation is that linguistic distance is associated with cultural distance, stemming from the chronological relation of language trees to group splittings and, therefore, to independent cultural (and even genetic) evolution. That argument, while obviously not self-evident, reflects a common trade-off. The disadvantage is obvious: Linguistic distances are at best an imperfect proxy for the unobserved “true distances.” But something closer to the unobserved truth—say, answers to survey questions about the degree of intergroup antagonism, or perhaps a history of conflict—have the profound drawback of being themselves affected by the very outcomes they seek to explain, or being commonly driven (along with the outcome of interest) by some other omitted variable. That is, such variables are endogenous to the problem at hand. The great advantage of linguistic distances is that a similar charge cannot be easily leveled against them. Whether the trade-off is made well here is something that a mixture of good intuition and final results must judge.

In our specifications, we also control for other variables that have been shown to be relevant in explaining civil conflict (12): population size (POP), because conflict is population-normalized in the theory; gross domestic product per capita (GDPPC), which raises the opportunity cost of supplying conflict resources; natural resources (NR), measured by the presence of oil or diamonds, which affects the total prize; the percentage of mountainous terrain (MOUNT), which facilitates guerrilla warfare; noncontiguity (NCONT), referring to countries with territory separated from the land area containing the capital city either by another territory or by 100 km of water; measures of the extent of democracy (DEMOC); the degree of power (PUB) afforded to those who run the country, which is a proxy for the size of the public prize (more on this below); time dummies to capture possible global trends; and regional dummies to capture patterns affecting entire world regions. Finally, because current conflict is deeply affected by past conflict, we use lagged conflict as an additional control in all our specifications.

Our exercise implements Eq. 1 in three ways. First, we run a cross-sectional regression of conflict on the two measures of ethnic division. Second, we independently compute a degree of relative publicness of payoffs for each country and include this in the regression. Third, we add separate proxies of group cohesion for all the countries. Each of these steps takes us progressively closer to the full power of Eq. 1, but with the potential drawback that we need proxies for an increasing number of variables.

To form a relative publicness index by country, we proxy π and μ for every country. Begin with a proxy for the private payoff μ. It seems natural to associate μ with rents that are easily appropriable. Because appropriability is closely connected to the presence of resources, we approximate the degree of “privateness” in the prize by asking if the country is rich in natural resources. Typically, oil and diamonds are the two commodities most frequently associated with the “resource curse” (62, 63). Data on the quantity of diamonds produced are available (64), but information on quality (and associated price) is scarce, making it very difficult to estimate the monetary value of diamond production. Diamond prices per carat can vary by a factor of 8 or more, from industrial diamonds ($25 per carat in 2001) to high-quality gemstones ($215 per carat in 2001) (63). Hence, we focus exclusively on oil in this exercise. We use the value of oil reserves per capita, OILRSVPC, as a proxy for μ.

Next, we create an index of “publicness,” PUB, by measuring the degree of power afforded to those who run the country, “more democratic” being regarded as correlated with “less power” and consequently a lower valuation of the public payoff to conflict. We use four different proxies to construct the index: (i) the lack of executive constraints, (ii) the level of autocracy, (iii) the degree to which political rights are flouted, and (iv) the extent of suppression of civil liberties. We use time-invariant dummies of these variables based on averages over the sample, because short-run changes are likely to be correlated with the incidence of conflict.

Our proxy for the relative publicness of the prize is given by

$$\Lambda \equiv \frac{\gamma \cdot \mathrm{PUB} \cdot \mathrm{GDPPC}}{\gamma \cdot \mathrm{PUB} \cdot \mathrm{GDPPC} + \mathrm{OILRSVPC}} \qquad (2)$$

where we multiply the PUB indicator by per capita GDP to convert the “poor governance” variables into monetary equivalents. The “conversion factor” γ makes the privateness and publicness variables comparable and allows us to combine them to arrive at the ratio Λ. In the empirical exercise we present here, we set γ equal to 1. But the results are robust to the precise choice of this parameter; see the Web Appendix to (51).
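A minimal sketch of how such a ratio could be computed for one country follows. It assumes that Λ is the public component, γ times PUB times GDPPC, divided by the sum of that component and the private component OILRSVPC. The numerator follows the text, whereas the form of the denominator and the function and argument names are assumptions made for illustration.

```python
def relative_publicness(pub, gdppc, oil_reserves_pc, gamma=1.0):
    """Share of the public payoff proxy in the total (public + private) prize.

    pub             -- publicness index PUB (higher = more power at stake)
    gdppc           -- per capita GDP, used to express PUB in monetary terms
    oil_reserves_pc -- value of oil reserves per capita (private payoff proxy)
    gamma           -- conversion factor; the text sets it to 1
    """
    public = gamma * pub * gdppc
    total = public + oil_reserves_pc
    if total == 0:
        raise ValueError("both the public and the private proxies are zero")
    return public / total


# Hypothetical inputs, chosen only to show the mechanics.
print(relative_publicness(pub=0.5, gdppc=2000.0, oil_reserves_pc=1000.0))
```

With no oil reserves the ratio equals 1 (a purely public prize), and with PUB equal to 0 it equals 0 (a purely private prize).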

Finally, we proxy the level of group cohesion α by exploiting the answers to a set of questions in the 2005 wave of the World Values Survey (65). We use the latest wave available because it covers the largest number of countries. One could argue that the answers might be conditioned by the existence of previous or contemporary conflict. Hence, the questions we have selected do not ask about commitment to specific groups but address issues like adherence to social norms, identification with the local community, the importance of helping others, and so on. We compute the country average of individual scores on this set of questions and denote this by A; see (51) for a list of the questions.

## What the Data Say

As already mentioned, we proceed in three steps. First, we examine the strength of the cross-country relationship between conflict intensity and the two indices of ethnic division, with all controls in place, including time and regional dummies. The estimated coefficients will address the importance of the two independent variables as determinants of conflict intensity. In the second stage, we step closer to the full model and interact the distributional indices with country-specific measures of the relative publicness λ of payoffs, just as in Eq. 1. Finally, we test the full model by adding to the previous specification the extent of group cohesion α independently computed for each country. In both the second and third stages, we also retain the two distributional indices without interaction to verify whether the significance comes purely from the ethnic structure of the different countries or because this structure interacts with λ and α in the way predicted by the theory.

In stage 1, then, we regress conflict linearly on the two distributional indices and all other controls. Columns 1 and 2 in Table 1 record the results for each specification of the conflict intensity variable—PRIO-C and ISC. Ethnicity turns out to be a significant correlate of conflict, in sharp contrast to the findings of the previous studies mentioned above. Throughout, P is highly significant and positively related to conflict. F also has a positive and significant coefficient.

Table 1

Ethnicity and conflict. All specifications use region and time dummies, not shown explicitly. P values are reported in parentheses. Robust standard errors adjusted for clustering have been used to compute z statistics. Columns 1, 3, and 5 are estimated by maximum likelihood in an ordered logit specification, and columns 2, 4, and 6 by OLS. GDPPC: log of gross domestic product per capita; POP: log of population; NR: a dummy for oil and/or diamonds in columns 1 and 2 and oil reserves per capita (OILRSVPC) in columns 3 to 6; MOUNT: percentage of mountainous territory; NCONT: noncontiguous territory (see text); POLITICS is DEMOC in columns 1 and 2, and the index PUB times GDPPC (the numerator of Λ) for the remaining columns; LAG, lagged conflict in previous 5-year interval; CONST, constant term.

Apart from statistical significance, the effect of these variables is quantitatively important. Taking column 1 as reference, if we move from the median polarized country (Germany) to the country in the 90th percentile of polarization (Niger), while changing no other institutional or economic variable in the process and evaluating those variables at their means, the predicted probability of experiencing conflict (i.e., the probability of observing strictly positive values of PRIO-C) rises from ~16 to 27%, which implies an increase of 69%. Performing the same exercise for F (countries at the median and at the 90th percentile of F are Morocco and Cameroon, respectively) takes us from 19 to 25% (an increase of 32%). These are remarkably strong effects, not least because in the thought experiment we change only the level of polarization or fractionalization, keeping all other variables the same.
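The percentage changes quoted above are relative increases in the predicted probability of conflict, not differences in percentage points. A short check of the arithmetic, using the rounded probabilities from the text (the function name is ours):

```python
def relative_increase(before, after):
    """Proportional change from `before` to `after`."""
    return (after - before) / before


# Polarization: median country to 90th percentile.
print(round(100 * relative_increase(0.16, 0.27)))  # 69
# Fractionalization: median country to 90th percentile.
print(round(100 * relative_increase(0.19, 0.25)))  # 32
```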

Figure 3 depicts two world maps. The dots in each map show the maximum yearly conflict intensity experienced by each country; smaller dots meet the 25-death PRIO criterion, whereas larger dots satisfy the 1000-death criterion. Although these maps cannot replicate the deeper findings of the statistical analysis, they clearly show the positive relationship between conflict and ethnic divisions.

In stage 2, we consider the cross-country variation in relative publicness; recall our proxy index Λ from (2). In columns 3 and 4 in Table 1, the main independent variables are P*Λ and F*(1 − Λ), just as specified by the theory; see Eq. 1. This allows us to test whether the interacted indices of ethnic fractionalization and polarization are significant. We also include the noninteracted indices to examine whether their significance truly comes from the interaction term. Indeed, polarization interacted with Λ is positive and highly significant, and the same is true of fractionalization interacted with 1 − Λ. These results confirm the relevance of both polarization and fractionalization in predicting conflict once the variables are interacted with relative publicness in the way suggested by the theory.

It is of interest that the level terms P and F are now no longer significant. Indeed, assuming that our proxy for relative publicness accurately captures all the issues at stake, this is precisely what the model would predict. For instance, polarization should have no further effect over and beyond the “λ-channel”: Its influence should dip to zero when there are no public goods at stake. That our estimate Λ happens to generate exactly this outcome is of interest. But the public component of that estimate is built solely on the basis of governance variables. If this eliminates all extraneous effects of polarization (as it indeed appears to do), it could suggest that primordial factors such as pure ethnic differences per se have little to do with ethnic conflict.

Finally, in our third stage, we allow group cohesion to vary across countries. Unfortunately, we are able to proxy A for just 53 countries, and this restricts the number of our observations to 447. Columns 5 and 6 of Table 1 examine this variant. In this specification, the independent variables are exactly in line with those described by the model, though we’ve had to sacrifice data. We use precisely the combinations asked for by the theory: polarization is weighted both by Λ and by A, and fractionalization by (1 − Λ) and by A again. We continue to use the direct terms P and F, as well as the controls. The results continue to be striking. The composite terms for polarization are significant, whereas the levels are not. The composite term for fractionalization is highly significant when we focus on smaller-scale social unrest, as measured by ISC, but it is marginally nonsignificant in column 5. The level terms of F continue to be insignificant. This behavior of fractionalization mirrors previous results that showed the nonrobust association of F and different manifestations of conflict (7, 35).
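The composite regressors used in stages 2 and 3 can be sketched as below. The function and argument names are ours, and the inputs are hypothetical; the weighting itself follows the description in the text, with polarization weighted by Λ and A, and fractionalization by (1 − Λ) and A. Setting the cohesion argument to 1 gives the stage 2 terms.

```python
def composite_terms(p, f, lam, cohesion=1.0):
    """Return the interacted polarization and fractionalization regressors.

    p        -- polarization index P
    f        -- fractionalization index F
    lam      -- relative publicness proxy, between 0 and 1
    cohesion -- group cohesion proxy A (1.0 reproduces the stage 2 terms)
    """
    polarization_term = p * lam * cohesion
    fractionalization_term = f * (1.0 - lam) * cohesion
    return polarization_term, fractionalization_term


# Stage 2: interaction with relative publicness only.
print(composite_terms(p=0.5, f=0.5, lam=0.25))
# Stage 3: the same terms, additionally weighted by cohesion.
print(composite_terms(p=0.5, f=0.5, lam=0.25, cohesion=2.0))
```

In the regressions these composite terms enter alongside the level terms P and F, so that the significance of the interactions can be separated from that of the raw indices.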

## What Have We Learned?

Existing ethnographic literature makes it clear that most within-country social conflicts have a strong ethnic or religious component. But the ubiquity of ethnic conflict is a different proposition from the assertion of an empirical link between existing ethnic divisions and conflict intensity. We’ve argued in this article that such a link can indeed be unearthed, provided that we’re willing to write down a theory that tells us what the appropriate notion of an “ethnic division” is. The theory we discuss points to one particular measure—polarization—when the conflict is over public payoffs such as political power. It also points to a different measure—fractionalization—when the conflict is over private payoffs such as access to resource rents. Indeed, the theory also tells us how to combine the measures when there are elements of both publicness and privateness in the prize. With these considerations in mind, the empirical links between ethnicity and conflict are significant and strong.

The theory and empirical strategy together allow us to draw additional interesting inferences. First, we find conclusive evidence that civil conflict is associated with (and possibly driven by) public payoffs, such as political power, and not just by the quest for private payoffs or monetary gain. Otherwise only fractionalization would matter, and not polarization. Second, the disappearance of the level effects of P and F once interactions with relative publicness are introduced (as specified by the theory) strongly suggests that ethnicity matters, not intrinsically as the primordialists would claim, but rather instrumentally, when ethnic markers are used as a means of restricting political power or economic benefits to a subset of the population.

One might object that the results are driven by the peculiarities of some regions that exhibit both highly polarized ethnicities and frequent and intense conflicts. Africa is a natural candidate that comes to mind. However, if we use regional controls or repeat the exercise by removing one continent at a time from the data set, we obtain exactly the same results (51).

It is too much to assert that every conflict in our data set is ethnic in nature and that our ethnic variables describe them fully. Consider, for instance, China or Haiti or undivided Korea, which have experienced conflict and yet have low polarization and fractionalization. Not all conflict is ethnic, but what is remarkable is that so many conflicts are, and that the ethnic characteristics of countries are so strongly connected with the likelihood of conflict. Yet we must end by calling for a deeper exploration of the links between economics, ethnicity, and conflict.

This paper takes a step toward the establishment of a strong empirical relationship between conflict and certain indicators of ethnic group distribution, one that is firmly grounded in theory. In no case did we use income-based groups or income-based measures, and in this sense our study is perfectly orthogonal to those that attempt to find a relationship between economic inequality and conflict, such as those surveyed in (17). Might that elusive empirical project benefit from theoretical discipline as well, just as the ethnicity exercise here appears to? It well might, and such an endeavor should be part of the research agenda. But with ethnicity and economics jointly in the picture, it is no longer a question of one or the other as far as empirical analysis is concerned. The interaction between these two themes now takes center stage. As we have already argued, there is a real possibility that the economics of conflict finds expression across groups that are demarcated on other grounds: religion, caste, geography, or language. Such markers can profitably be exploited for economic and political ends, even when the markers themselves have nothing to do with economics. A study of this requires an extension of the theory to include the economic characteristics of ethnic groups and how such characteristics influence the supply of resources to conflict. It also requires the gathering of group data at a finer level that we do not currently possess. In short, a more nuanced study of the relative importance of economic versus primordial antagonisms must await future research.

## References and Notes

1. Acknowledgments: We gratefully acknowledge financial support from Ministerio de Economía y Competitividad project ECO2011-25293 and from Recercaixa. J.E. and L.M. acknowledge financial support from the AXA Research Fund. The research of D.R. was funded by NSF grant SES-0962124. We thank two referees for valuable comments and are grateful to R. Jayaraman for suggestions that improved the manuscript.