Research Article

Antisocial Punishment Across Societies


Science  07 Mar 2008:
Vol. 319, Issue 5868, pp. 1362-1367
DOI: 10.1126/science.1153808


We document the widespread existence of antisocial punishment, that is, the sanctioning of people who behave prosocially. Our evidence comes from public goods experiments that we conducted in 16 comparable participant pools around the world. However, there is a huge cross-societal variation. Some participant pools punished the high contributors as much as they punished the low contributors, whereas in others people only punished low contributors. In some participant pools, antisocial punishment was strong enough to remove the cooperation-enhancing effect of punishment. We also show that weak norms of civic cooperation and the weakness of the rule of law in a country are significant predictors of antisocial punishment. Our results show that punishment opportunities are socially beneficial only if complemented by strong social norms of cooperation.

Recent research has shown that altruistic punishment, that is, a person's propensity to incur a cost in order to punish freeloaders who fail to pull their weight in cooperative endeavors, can explain why genetically unrelated individuals are often able to maintain high levels of socially beneficial cooperation (1–4). This holds even when direct and indirect reciprocity (5, 6) or laws and regulations provide no incentives to behave cooperatively (7).

In this paper, we direct attention to a phenomenon that [with a few exceptions (8–10)] has been largely neglected: People might punish not only freeloaders, but cooperators too. For example, participants who had been punished in the past for contributing too little might retaliate against the cooperators because the cooperators are precisely those individuals most likely to punish the free-riding low contributors. Our experimental evidence from 16 participant pools with various cultural and economic backgrounds shows that antisocial punishment of prosocial cooperators is indeed widespread in many participant pools; interestingly, the participant pools in which most of the previous research on altruistic punishment has been conducted form the main exception.

Our observation of antisocial punishment grew out of our research goal to understand whether there are cross-societal differences in people's punishment and cooperation behavior. Previous large-scale cross-cultural evidence comes mainly from one-shot bargaining games conducted in small-scale societies around the world (11, 12). However, there is no systematic large-scale evidence on cooperation games. We therefore conducted cooperation experiments with and without punishment opportunities. Moreover, we ran our experiments as repeated games to see whether different cooperation levels emerge and remain stable across groups. Such a possibility is precluded in one-shot experiments.

Our research strategy was to conduct the experiments with comparable social groups from complex developed societies with the widest possible range of cultural and economic backgrounds (13) to maximize chances of observing cross-societal differences in punishment and cooperation. The societies represented in our participant pools diverge strongly according to several widely used criteria developed by social scientists in order to characterize societies (14–16). This variation, covering a large range of the worldwide available values of the respective criteria, provides us with a novel test for seeing whether societal differences between complex societies have any impact on experimentally observable disparities in cooperation and punishment behavior.

Experiments. The workhorse for our cross-societal analysis is the public goods game with and without punishment (1). The public goods game is a stylized model of situations that require cooperation to achieve socially beneficial outcomes in the presence of free-rider incentives. Examples abound: warfare, cooperative hunting, voting, paying taxes, fighting corruption, contributing to public goods, teamwork, work morale, neighborhood watch, common pool resource management, recycling, tackling climate change, and so on. These are frequent situations with the common feature that cooperation leads to a group-beneficial outcome but is jeopardized by selfish incentives to ride free on others' contributions.

To implement a cooperation game with and without punishment opportunities, we adapted a design developed by (1). In each participant pool, we conducted the exact same public goods experiment with real monetary stakes and two treatment conditions: a no-punishment condition (the N experiment) and a punishment condition (the P experiment). Groups of four members played the following public goods game in both conditions: Each member received an endowment of 20 tokens. Participants had to decide how many tokens to keep for themselves and how many to contribute to a group project. Each of the four group members earned 0.4 tokens for each token invested in the project, regardless of whether he or she contributed any. Because the cost of contributing one token in the project was exactly one token whereas the return on that token was only 0.4 tokens, keeping all one's own tokens was always in any participant's material self-interest, irrespective of how much the other three group members contributed. Yet, if each group member retained all of his or her tokens, there were no earnings to be shared; on the other hand, each member would earn 0.4 × 80 = 32 tokens if each of them invested their entire 20-token endowment.
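The payoff arithmetic of the N experiment can be sketched in a few lines. This is a minimal illustration under the parameters stated above (groups of four, 20-token endowment, 0.4 tokens returned to each member per token invested); the function and constant names are our own and not part of the experimental software.

```python
ENDOWMENT = 20      # tokens given to each group member
RETURN_RATE = 0.4   # tokens each member earns per token invested in the project
GROUP_SIZE = 4


def n_payoffs(contributions):
    """Return each member's earnings for one period of the no-punishment game."""
    if len(contributions) != GROUP_SIZE:
        raise ValueError("the game is played in groups of four")
    if any(c < 0 or c > ENDOWMENT for c in contributions):
        raise ValueError("contributions must lie between 0 and the endowment")
    project_total = sum(contributions)
    # Each member keeps the tokens not contributed and receives the same
    # share of the project, regardless of his or her own contribution.
    return [ENDOWMENT - c + RETURN_RATE * project_total for c in contributions]


if __name__ == "__main__":
    print(n_payoffs([20, 20, 20, 20]))  # full cooperation
    print(n_payoffs([0, 0, 0, 0]))      # nobody contributes
    print(n_payoffs([0, 20, 20, 20]))   # one member rides free
```

In the last call the free rider earns more than the three contributors, which is the selfish incentive the design is built around.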

All the interactions in the experiment were computer-mediated (17) and took place anonymously. Participants were not informed about the identity of others in the group; they made their contribution decisions simultaneously, and, once the decisions were made, they were informed about the other group members' contributions.

The only and crucial difference between the P experiment and the N experiment was that participants in the P experiment could punish each of the other group members after they were informed about the others' investments, whereas the N experiment ended after participants were informed about the other group members' contributions. A punishment decision was implemented by assigning the punished member between zero and 10 deduction points. Each deduction point assigned reduced the punished member's earnings by three tokens and cost the punishing member one token. All punishment decisions were made simultaneously. Participants were not informed about who punished them.
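The punishment stage can be sketched in the same spirit. The snippet below takes the earnings from the contribution stage as given and applies the stated rule (each deduction point costs the punisher one token and reduces the target's earnings by three). The names are illustrative, and the sketch deliberately does not model any floor on earnings, since none is specified in the text above.

```python
POINT_COST = 1     # tokens paid by the punisher per deduction point assigned
POINT_IMPACT = 3   # tokens lost by the punished member per point received
MAX_POINTS = 10    # upper limit of points one member can assign to another


def p_payoffs(stage_one_earnings, points):
    """Apply the punishment stage.

    points[i][j] is the number of deduction points member i assigns to
    member j; entries on the diagonal are ignored.
    """
    n = len(stage_one_earnings)
    for i in range(n):
        for j in range(n):
            if i != j and not 0 <= points[i][j] <= MAX_POINTS:
                raise ValueError("deduction points must lie between 0 and 10")
    final = []
    for i in range(n):
        assigned = sum(points[i][j] for j in range(n) if j != i)
        received = sum(points[j][i] for j in range(n) if j != i)
        final.append(stage_one_earnings[i]
                     - POINT_COST * assigned
                     - POINT_IMPACT * received)
    return final


if __name__ == "__main__":
    # Hypothetical period: member 0 rode free; member 1 assigns 2 points to him.
    earnings = [44.0, 24.0, 24.0, 24.0]
    points = [[0, 0, 0, 0],
              [2, 0, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 0]]
    print(p_payoffs(earnings, points))
```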

One of the goals of our experiment was to see whether and at what level punishment stabilized cooperation in the P experiment compared to the N experiment. To allow for the emergence of different cooperation levels, we therefore repeated the experiment 10 times under both conditions, keeping the group composition constant.

Because we were interested in whether people behave differently under the exact same circumstances, some methodological challenges arose. First, with regard to procedures, we followed the rules established in experimental economics (13). A second challenge was maximizing participant pool comparability to avoid confounds of participant pool differences with variations in sociodemographic composition. To minimize sociodemographic variability, we conducted all experiments with university undergraduates (n = 1120) who were similar in age, shared an (upper) middle class background, and usually did not know each other. We administered a postexperimental questionnaire to be able to control for further sociodemographic background characteristics (see table S2 for details).

Results. We first analyze people's punishment behavior across participant pools. Our perspective is how an individual who has contributed a certain amount to the public good punishes other group members who contributed either less, the same amount, or more than them. Figure 1 therefore displays punishment expenditures as a function of how much the punished individual's contribution deviated from the contribution of the punisher. We label the punishment of negative deviations punishment of free riding because the punished group member rode free on the punisher's contribution. Put differently, from the perspective of the punisher the target member behaved less prosocially than the punisher. In case the target member contributed the same amount or more, he or she behaved at least as prosocially as the punisher. We therefore call the punishment in these cases antisocial punishment.
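The classification rule just described reduces to the sign of the deviation between the two contributions. A minimal sketch, with function and label names chosen here purely for illustration:

```python
def classify_punishment(punisher_contribution, target_contribution):
    """Label a punishment act from the punisher's perspective."""
    deviation = target_contribution - punisher_contribution
    if deviation < 0:
        # The target contributed less than the punisher.
        return "punishment_of_free_riding"
    # The target contributed the same amount or more.
    return "antisocial_punishment"


if __name__ == "__main__":
    print(classify_punishment(20, 5))   # target contributed less
    print(classify_punishment(10, 10))  # equal contributions
    print(classify_punishment(5, 20))   # target contributed more
```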

Fig. 1.

Mean punishment expenditures for a given deviation from the punisher's contribution. The deviations of the punished participant's contribution from the punisher's contribution are grouped into five intervals, where [–20, –11] indicates that the punished participant contributed between 11 and 20 tokens less than the punishing participant, [–10, –1] indicates that the punished participant contributed between 1 and 10 tokens less than the punishing participant, [0] indicates that the punished participant contributed exactly the same amount as the punishing participant, [1, 10] indicates that the punished participant contributed between 1 and 10 tokens more than the punishing participant, and [11, 20] indicates that the punished participant contributed between 11 and 20 tokens more than the punishing participant. In Boston, for example, participants (including nonpunishers) expended 0.96 money units on average for all cases of negative deviations between [–10, –1] and 2.74 money units on average in cases of deviations between [–20, –11]. Participant pools are sorted according to their mean antisocial punishment. Fig. S2 and tables S3 and S4 provide complementary analyses.
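The grouping behind Fig. 1 can be illustrated as follows. This is a sketch only: the record layout and names are assumptions, the interval labels use ASCII minus signs, and the sample records in the usage lines are invented, not data from the experiments. Records with zero expenditure are kept so that nonpunishers enter the mean, as in the caption.

```python
from collections import defaultdict

# (label, lowest deviation, highest deviation), following the Fig. 1 caption
INTERVALS = [
    ("[-20, -11]", -20, -11),
    ("[-10, -1]", -10, -1),
    ("[0]", 0, 0),
    ("[1, 10]", 1, 10),
    ("[11, 20]", 11, 20),
]


def interval_label(punisher_contribution, target_contribution):
    """Return the deviation interval of the target relative to the punisher."""
    deviation = target_contribution - punisher_contribution
    for label, low, high in INTERVALS:
        if low <= deviation <= high:
            return label
    raise ValueError("deviation outside the range -20 to 20")


def mean_punishment_by_interval(records):
    """records: (punisher contribution, target contribution, expenditure)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for punisher, target, expenditure in records:
        label = interval_label(punisher, target)
        totals[label] += expenditure
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


if __name__ == "__main__":
    made_up = [(20, 5, 3), (20, 8, 0), (10, 10, 0), (5, 20, 2)]
    print(mean_punishment_by_interval(made_up))
```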

Punishment behavior differed strongly across participant pools (Fig. 1). This holds in particular for antisocial punishment. A regression analysis of punishment behavior, which controls for the deviation, period effects, and sociodemographic composition, shows that antisocial punishment differed highly and significantly across participant pools [χ²(14) = 64.9, P = 0.000; tables S3 and S4]. Although there was very little antisocial punishment in some participant pools, in others people punished those who contributed the same or more than them as harshly as those who rode free on them. By contrast, punishment of free riding was only weakly significantly different across participant pools [χ²(14) = 23.1, P = 0.059; tables S3 and S4].

The punishment of free riding is likely triggered by negative emotions that arise from a violation of fairness norms and from feeling exploited (1, 2, 18). But what explains antisocial punishment? One plausible reason is that people might not accept punishment and therefore seek revenge (8–10). Revenge is a “human universal” (19) and part of a culture of honor in many societies. Our measure for vengeful punishment is the punishment people mete out as a function of received punishment in the previous period. Controlling for contributions of the punisher and the punished participant, we find a highly significant increase in antisocial punishment across all participant pools as a function of the amount of punishment received in the previous period. Broken down by participant pools, the effect is highly significant (at P < 0.01) in seven participant pools, weakly significantly positive in two participant pools, insignificantly positive in six participant pools, and insignificantly negative in one (tables S3 and S4).

The presence of a punishment opportunity had dramatic consequences on the achieved cooperation levels (Fig. 2). Contributions were highly significantly different across participant pools [Kruskal-Wallis test with group averages over all 10 periods as independent observations, χ²(15) = 113.1; P = 0.000]. Cooperation was stabilized in all participant pools but at vastly different levels (Fig. 2A). Cooperation in about half of the participant pools remained at the initial level (period 1 of the P experiment), whereas contributions increased over time in the others (table S5). The most-cooperative participant pool (in which people contributed 90% of their endowment, on average) contributed 3.1 times as much as the least-cooperative participant pool (with an average contribution of 29% of the endowment). The differences in cooperation across participant pools are significantly negatively related to antisocial punishment: The higher antisocial punishment is in a participant pool, the lower is the average cooperation level in that participant pool (Fig. 2B).

Fig. 2.

(A) Mean contributions to the public good over the 10 periods of the P experiment. Each line corresponds to the average contribution of a particular participant pool. The numbers in parentheses indicate the mean contribution (out of 20) in a particular participant pool. (B) Mean antisocial punishment and mean contribution (across all periods) per participant pool. Rho indicates Spearman rank order correlation between participant pool averages.

As a consequence of the different patterns of punishment and cooperation, there were also substantial participant-pool differences in earnings in the P experiment. The average per-period earnings differed by more than 250 percentage points between the participant pool with the highest average earnings and that with the lowest average earnings (fig. S3 and table S6).

An important reason for the large participant pool differences in cooperation rates is the fact that participant pools reacted very differently to punishment received. Regression analyses (table S7) show that, in all but one participant pool, people who contributed less than the group average in period t and who were subsequently punished increased their contribution in period t + 1. The increase is only significant (at P < 0.05) in 11 participant pools, however, and the extent of the mean estimated increase per punishment point received varies considerably between participant pools. Thus, punishment did not have an equally strong disciplinary effect on free riders in all participant pools in the sense of steering low contributors toward higher contributions; in some participant pools, punishment had no cooperation-enhancing effect at all.

The disciplinary effectiveness of punishment for below-average contributions is associated with the extent of antisocial punishment in a participant pool. There is a strong negative correlation between the mean antisocial punishment in a participant pool and the regression coefficient that measures the mean increase per punishment point received for a below-average contribution (Spearman's ρ = –0.87, P = 0.000, n = 16). One explanation is that the prospect of getting punished for at- or above-average contributions in some participant pools limits the low contributors' incentives to increase their contributions. Another explanation has to do with how people perceive the moral message behind punishment because there is evidence that even nonmonetary sanctions (which signal social disapproval) can induce low contributors to increase their contributions (20). Participant pools might have differed in the extent to which people feel ashamed when being punished for low contributions.
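For readers unfamiliar with the statistic, Spearman's ρ is the Pearson correlation computed on ranks. The following is a self-contained sketch with ties receiving average ranks; the helper names are our own, and the numbers in the usage lines are invented for illustration and are not the participant-pool values.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented example: more antisocial punishment, weaker disciplinary effect.
    antisocial = [0.2, 0.5, 0.9, 1.4]
    effect = [1.1, 0.8, 0.9, 0.1]
    print(spearman_rho(antisocial, effect))
```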

A regression analysis (Table 1) summarizes our findings on the impact of punishment on cooperation. To also account for variations of punishment in different groups within participant pools, we use the group average contributions as independent observations.

Table 1.

Punishment and cooperation levels. Ordinary least squares regressions with the group average contributions of all groups that show any variation in contributions as independent observations (n = 273). The group average contributions over periods 2 to 10 are the dependent variables. The independent variables are the group average contributions in period 1, the group averages of punishment points assigned to group members who contributed less than the punishing participant (group average punishment of free riding) and to group members who are equally or more cooperative than the punishing participant (group average antisocial punishment). Model 1 does not control for the mean cooperation level in a participant pool, whereas model 2 controls for it by adding participant pool dummies. The adjusted r² increases by only 7% and the results remain robust, although the coefficient for antisocial punishment is reduced in size. *P < 0.1, **P < 0.05, ***P < 0.01. Numbers in parentheses indicate robust standard errors.


The results show that groups that started at high levels in period 1 of the P experiment also had high group average contributions over the remaining periods 2 to 10; groups that started at low levels in period 1 of the P experiment had low group average contributions over the remaining periods. Group average punishment of free riding relative to the punishers' own contributions is positively correlated with this group's average contribution, ceteris paribus. The opposite conclusion holds for antisocial punishment.

We also found significant participant pool differences in the N experiment, which serves as a benchmark for the P experiment [Kruskal-Wallis test with group averages over all 10 periods as independent observations, χ²(15) = 46.5, P = 0.000]. Mean contributions varied between 4.9 tokens in the least-cooperative participant pool and 11.5 tokens in the most-cooperative one (Fig. 3). The span of 6.6 tokens between the least- and most-cooperative participant pools was thus substantially lower than the span of 12.3 tokens in the P experiment (Fig. 2A). Moreover, in contrast to the P experiment, where contributions were stabilized at vastly different levels, contributions in the N experiment dwindled to lower levels almost everywhere (table S8 and fig. S4).

Fig. 3.

Mean contributions to the public good over the 10 periods of the N experiment. Each line corresponds to the average contribution of a particular participant pool. The numbers in parentheses indicate the mean contribution (out of 20) in a particular participant pool.

Compared with the N experiment, the presence of a punishment option had at least a weakly significant cooperation-enhancing effect in 11 participant pools (Wilcoxon signed ranks test with independent group average contribution rates across all periods as observations, fig. S4 and table S9); the change in cooperation between the N and the P experiment was not significant in the other five participant pools. Thus, the cooperation-enhancing effect of a punishment opportunity cannot be taken for granted. This finding stands in contrast to previous results from experiments conducted in the United States and Western Europe, where punishment always increased cooperation in experiments with comparable fixed-group designs and parameters (8, 10, 20–22).

The reason for this result is related to antisocial punishment: the higher antisocial punishment was in a participant pool, the lower was the rate of increase in cooperation in the P experiment relative to the N experiment (Spearman's ρ = –0.76, P = 0.001, n = 16). Furthermore, participant pools' average cooperation levels in period 1 of the P experiment (where participants had not yet acquired any experience with punishment) were significantly negatively correlated with their subsequent mean expenditures on antisocial punishment: The more a participant pool expended on antisocial punishment in the later stages of the experiment, the lower was its initial cooperation level (Spearman's ρ = –0.78, P = 0.000, n = 16).

What explains the large participant pool differences in antisocial punishment and hence cooperation levels? Punishment may be related to social norms of cooperation. Social norms exist at a macrosocial level and refer to widely shared views about acceptable behaviors and the deviations subject to possible punishment (23, 24). Thus, if participant pools held different social norms with regard to cooperation and free riding, they actually might have punished differently. An interesting set of relevant social norms are norms of civic cooperation (14) as they are expressed in people's attitudes to tax evasion, abuse of the welfare state, or dodging fares on public transport. These are all situations that can be modeled as public goods problems. The stronger norms of civic cooperation are in a society, the more free riding might be viewed as unacceptable and the more it might be punished in consequence. The flip side of the argument is that cooperators, who behave in the normatively desirable way, should not get punished; strong norms of civic cooperation might act as a constraint on antisocial punishment.

The strength of the rule of law in a society might also have an impact on antisocial punishment. If the rule of law is strong, people trust the law enforcement institutions, which are perceived as being effective, fair, impartial, and bound by the law (25). Revenge is shunned. If the rule of law is weak, the opposite holds. Thus, the rule of law reflects how norms are commonly enforced in a society.

We construct the variable norms of civic cooperation from data taken from the World Values Survey (13) (fig. S1A). The variable is derived from answers of a large number of selected representative residents of a country to questions on how justified (on a 10-point scale; 1 is fully justified; 10 is never justified) people think tax evasion, benefit fraud, or dodging fares on public transport are. The more reprehensible these behaviors are in the eyes of the average citizen, the stronger are a society's norms of civic cooperation (14). The country scores of our 16 participant pools vary between 6.91 and 9.79 (the mean is 8.53); the available world sample range (n = 81 countries; mean = 8.64) lies between 6.75 and 9.81. Thus, the societies of our participant pools cover almost the whole available worldwide range of the distribution of norms of civic cooperation.

The rule of law indicator (13) (fig. S1B) is based on a host of different variables that measure “the extent to which agents have confidence in and abide by the rules of society, and in particular the quality of contract enforcement, the police, and the courts, as well as the likelihood of crime and violence” (25). The theoretical range is –2.5 (very weak rule of law) to 2.5 (very strong rule of law). The empirically observed range of the 211 countries for which this indicator is available is –2.20 to 1.99. The rule of law indicator varies between –1.23 and 1.96 in the countries of our participant pools.

Because both indicators reflect the views of the average citizen in a given society, it is likely that our participants, through various forms of cultural transmission (26), have been exposed to the prevalent social norms and have perceptions of the quality of the rule of law in their respective societies. Moreover, previous research, conducted in small-scale societies, suggests that experimentally observed behavior reflects socioeconomic conditions and daily experiences (11). Thus, there are good reasons to expect that the experimentally observable punishment behavior might be correlated with our indicators.

We investigated the link between punishment and the two indicators econometrically by running regressions of punishment expenditures on the variables norms of civic cooperation and rule of law. We distinguished between punishment of free riding and antisocial punishment, and we also controlled for the punisher's contribution, the contribution of the punished participant, the contribution of other group members, period effects, and the individual socioeconomic characteristics (to control for differences in participant-pool composition). The estimation method is Tobit (with robust standard errors clustered on the independent group).

The estimation results (Table 2) show that the stronger norms of civic cooperation are in the society, the harder people in the respective participant pool punish those who contributed less than them (P < 0.01). Rule of law has an insignificantly positive impact on the punishment of free riding. With respect to antisocial punishment, we found that both norms of civic cooperation and rule of law are significantly negatively correlated with punishment (at P < 0.05). In other words, antisocial punishment is harsher in participant pools from societies with weak norms of civic cooperation and a weak rule of law. Additional analyses (table S10) show that antisocial punishment also varies highly significantly with a variety of indicators developed by social scientists in order to characterize societies (table S1). Thus, the extent of antisocial punishment is most likely affected by the wider societal background.

Table 2.

Punishment, norms of civic cooperation, and the rule of law. The dependent variable is assigned punishment points to participants who contributed less than the punishing participants (models 1 to 3) or to participants who contributed the same or more than the punishing participant (models 4 to 6). The independent variables are the country scores of norms of civic cooperation and rule of law. Controls include the participants' own contribution, the punished participant's contribution, the average contribution of the remaining two participants, the period, a dummy for the final period, and individual socioeconomic characteristics. We show the coefficients of Tobit estimates (43). Robust standard errors are calculated by using the group as the independent cluster. Table S10 contains further analyses.


Discussion. Evidence from economics, sociology, political science, and anthropology suggests that human social groups differ strongly in how successfully they solve cooperation problems (14, 27–29). In reality, many exogenous factors, institutional and environmental conditions as well as population characteristics, can explain varying degrees of cooperative success. Our contribution is to show experimentally that (antisocial) punishment can lead to very strong differences in cooperation levels among comparable social groups acting in identical environments.

Antisocial punishment of cooperators existed in all our participant pools, but its importance and detrimental consequences varied strongly across them. Revenge is a likely explanation for antisocial punishment in most participant pools, but other (population-specific) motives might be relevant as well. Some antisocial punishment may be efficiency-enhancing in intent to induce the punished individual to increase his or her contributions. The fact that in most participant pools antisocial punishment was lower the higher the punished participant's contribution was is consistent with this explanation (table S7A). Because punishment in our experiment was cheaper for the punisher than for the punished participant, people with a strong taste for dominance (30), a competitive personality (31), or a desire to maximize relative payoffs (32) might not only punish freeloaders but also cooperators, even including those who contributed the same amount as the punisher. Low contributors might also view high contributors as do-gooders who have shown them up. Punishment may therefore be an act of “do-gooder derogation” (33). Similarly, as observed in some bargaining experiments (12, 34, 35) in which people reject hyperfair offers, people for various reasons might be suspicious of others who appear too generous. Normative conformity, a desire and expectation to behave as all others do, is part of human psychology (36) and may lead to the punishment of all deviators, cooperators and free riders alike. Punishment may also be related to in-group–out-group distinctions (37) because people might retaliate if punished by an out-group member (38). Societies also differ in the extent to which their social structures are governed by in-group–out-group distinctions.
For instance, according to some cross-cultural psychologists (15, 39) in “collectivist” societies many interactions are confined to close-knit social networks, whereas in “individualistic” societies interactions are more permeable across social groups. Because in our experiment all participants were strangers to one another, people in collectivist societies might be more inclined than people in individualistic societies to perceive other participants as out-group members. Therefore, antisocial punishment might be stronger in collectivist than in individualistic societies. Our evidence is consistent with this possibility because in regressions similar to those of Table 2 antisocial punishment is highly significantly correlated with a widely used societal-level measure of individualism-collectivism (15) (table S10).

Our finding that social norms of cooperation and punishment are linked is of relevance for the debate about social capital (14) and in particular a literature that argues that informal sanctions often substitute for formal enforcement mechanisms if these are lacking or not working well (7, 27, 40–42). The fact that antisocial punishment is negatively correlated with the strength of the rule of law and also with cooperation levels suggests that the quality of the formal law enforcement institutions and informal sanctions are complements (rather than substitutes). Informal sanctions might be more effective in sustaining voluntary cooperation when the formal law enforcement institutions operate more effectively because antisocial punishment is lower in these societies. The detrimental effects of antisocial punishment on cooperation (and efficiency) also provide a further rationale for why modern societies shun revenge and centralize punishment in the hands of the state.

Supporting Online Material

Materials and Methods

SOM Text

Figs. S1 to S4

Tables S1 to S10

References and Notes
