## Abstract

In the Ultimatum Game, two players are offered a chance to win a certain sum of money. All they must do is divide it. The proposer suggests how to split the sum. The responder can accept or reject the deal. If the deal is rejected, neither player gets anything. The rational solution, suggested by game theory, is for the proposer to offer the smallest possible share and for the responder to accept it. If humans play the game, however, the most frequent outcome is a fair share. In this paper, we develop an evolutionary approach to the Ultimatum Game. We show that fairness will evolve if the proposer can obtain some information on what deals the responder has accepted in the past. Hence, the evolution of fairness, similarly to the evolution of cooperation, is linked to reputation.

The Ultimatum Game is quickly catching up with the Prisoner's Dilemma as a prime showpiece of apparently irrational behavior. In the past two decades, it has inspired dozens of theoretical and experimental investigations. The rules of the game are surprisingly simple. Two players have to agree on how to split a sum of money. The proposer makes an offer. If the responder accepts, the deal goes ahead. If the responder rejects, neither player gets anything. In both cases, the game is over. Obviously, rational responders should accept even the smallest positive offer, since the alternative is getting nothing. Proposers, therefore, should be able to claim almost the entire sum. In a large number of human studies, however, conducted with different incentives in different countries, the majority of proposers offer 40 to 50% of the total sum, and about half of all responders reject offers below 30% (1–6).

The irrational human emphasis on a fair division suggests that players have preferences which do not depend solely on their own payoff, and that responders are ready to punish proposers offering only a small share by rejecting the deal (which costs less to themselves than to the proposers). But how do these preferences come about? One possible explanation is that the players do not grasp that they interact only once. Humans are accustomed to repeated interactions. Repeating the Ultimatum Game is like haggling over a price, and fair splits are more likely (6–8). Another argument is based on the view that allowing a co-player to get a large share is conceding a relative advantage to a direct rival. This argument holds only for very small groups, however: a simple calculation shows that responders should only reject offers that are less than 1/*n*th of the total sum, where *n* is the number of individuals in the group (9). A third explanation is based on the idea that a substantial proportion of humans maximize a subjective utility function different from the payoff (10–12).

Here, we studied the Ultimatum Game from the perspective of evolutionary game theory (13). To discuss this model, both analytically and by means of computer simulations, we set the sum which is to be divided equal to 1 and assumed that players are equally likely to be in either of the two roles. Their strategies are given by two parameters *p* and *q* ∈ [0,1]. When acting as proposer, the player offers the amount *p*. When acting as responder, the player rejects any offer smaller than *q*. The parameter *q* can be seen as an aspiration level. It is reasonable to assume that the share kept by the player acting as proposer, 1 − *p*, should not be smaller than the aspiration level, *q*. Therefore, only strategies with *p*+ *q* ≤ 1 were considered (14).

The expected payoff for a player using strategy *S*₁ = (*p*₁, *q*₁) against a player using *S*₂ = (*p*₂, *q*₂) is given (up to the factor 1/2, which we henceforth omit) by (i) 1 − *p*₁ + *p*₂, if *p*₁ ≥ *q*₂ and *p*₂ ≥ *q*₁; (ii) 1 − *p*₁, if *p*₁ ≥ *q*₂ and *p*₂ < *q*₁; (iii) *p*₂, if *p*₁ < *q*₂ and *p*₂ ≥ *q*₁; and (iv) 0, if *p*₁ < *q*₂ and *p*₂ < *q*₁.
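The four cases reduce to two independent checks, one per role: the player keeps 1 − *p*₁ if the own offer is accepted, and receives *p*₂ if the co-player's offer meets the own threshold. A minimal sketch (the function name `payoff` is ours, not from the paper):

```python
def payoff(s1, s2):
    """Expected payoff (factor 1/2 omitted) of strategy s1 = (p1, q1)
    against s2 = (p2, q2): as proposer, keep 1 - p1 if the offer p1
    meets the co-player's threshold q2; as responder, receive p2 if
    it meets one's own threshold q1."""
    p1, q1 = s1
    p2, q2 = s2
    total = 0.0
    if p1 >= q2:       # own offer accepted
        total += 1 - p1
    if p2 >= q1:       # co-player's offer accepted
        total += p2
    return total
```

For example, two fair players each obtain `payoff((0.5, 0.5), (0.5, 0.5)) == 1.0`, whereas a near-rational player facing a fair one gets only the fair player's offer, since its own low offer is rejected.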

Before studying the full game, with its continuum of strategies, let us first consider a so-called minigame with only two possible offers *h* and *l* (high and low), with 0 < *l* < *h* < 1/2 (9, 15). There are four different strategies (*l*,*l*), (*h*,*l*), (*h*,*h*), and (*l*,*h*), which we enumerate, in this order, by *G*₁ to *G*₄. *G*₁ is the “reasonable” strategy of offering little and rejecting nothing [for the cognoscenti: it is the only subgame perfect Nash equilibrium of the minigame (16)]. *G*₂ makes a high offer but is willing to accept a low offer. *G*₃ is the “fair” strategy, offering and demanding a high share. For the sake of exposition, we omit *G*₄, which gets eliminated anyway. To describe the change in the frequencies *x*₁, *x*₂, and *x*₃ of the strategies *G*₁, *G*₂, and *G*₃, respectively, we use the replicator equation. It describes a population dynamics where successful strategies spread, either by cultural imitation or biological reproduction (17). Under these dynamics, the reasonable strategy *G*₁ will eventually reach fixation. Populations that consist only of *G*₁ and *G*₃ players will converge to pure *G*₁ or *G*₃ populations, depending on the initial frequencies of the two strategies. Mixtures of *G*₁ and *G*₂ players will always tend to *G*₁, but mixtures of *G*₂ and *G*₃ players are neutrally stable and subject to random drift. Hence, starting with any mixture of *G*₁, *G*₂, and *G*₃ players, evolution will always lead to a population that consists entirely of *G*₁ players (18). Reason dominates fairness.
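This fixation can be checked numerically. The sketch below assumes the illustrative values *h* = 0.4 and *l* = 0.1 (any 0 < *l* < *h* < 1/2 behaves the same way) and a simple Euler discretization of the replicator equation; the payoff matrix follows directly from the rules of the game.

```python
# Minigame replicator dynamics; h = 0.4, l = 0.1 are illustrative choices.
h, l = 0.4, 0.1

# A[i][j]: payoff of strategy G_(i+1) against G_(j+1), with
# G1 = (l, l), G2 = (h, l), G3 = (h, h); factor 1/2 omitted.
A = [
    [1.0,       1 - l + h, h  ],  # G1 exploits G2 but is rejected by G3
    [1 - h + l, 1.0,       1.0],  # G2 offers high, accepts anything
    [1 - h,     1.0,       1.0],  # G3 offers high, rejects low offers
]

def replicate(x, steps=50_000, dt=0.01):
    """Euler-discretized replicator equation: x_i grows at rate f_i - <f>."""
    for _ in range(steps):
        f = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]
        avg = sum(xi * fi for xi, fi in zip(x, f))
        x = [max(xi + dt * xi * (fi - avg), 0.0) for xi, fi in zip(x, f)]
        total = sum(x)
        x = [xi / total for xi in x]
    return x

# Even a population in which 80% of players make high offers
# converges to an all-G1 state.
x = replicate([0.2, 0.5, 0.3])
```

The run ends with *x*₁ ≈ 1: the reasonable strategy fixates, because *G*₁ outgrows *G*₂, and *G*₂ in turn erodes the *G*₃ share that would otherwise hold *G*₁ at bay.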

Let us now introduce the possibility that players can obtain information about previous encounters. In this case, individuals have to be careful about their reputation: if they accept low offers, this may become known, and the next proposer may think twice about making a high offer. Assume, therefore, that the average offer of an *h*-proposer to an *l*-responder is lowered by an amount *a*. Even if this amount is very small (possibly because obtaining information on the co-player is difficult, or because the information may be considered unreliable by *h*-proposers), the effect is drastic (19). In a mixture of *h*-proposers only, the fair strategy *G*₃ dominates. The whole system is now bistable: depending on the initial condition, either the reasonable strategy *G*₁ or the fair strategy *G*₃ reaches fixation (Fig. 1). In the extreme case, where *h*-proposers have full information on the responder's type and offer only *l* when they can get away with it, we observe a reversal of the game: *G*₃ reaches fixation, whereas mixtures of *G*₁ and *G*₂ are neutrally stable. Intuitively, this reversal occurs because it is now the responder who has the initiative: it is up to the proposer to react.

For 0 < *a* < *h* − *l*, *G*₃ risk-dominates *G*₁ (20): this implies that whenever stochastic fluctuations are added to the population (by allowing mutation, for instance, or spatial diffusion), the fair strategy will supersede the reasonable one in the long run (Fig. 1).
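These claims admit a back-of-the-envelope check. With the reputation discount *a*, the minigame payoff matrix changes as below (the values *h* = 0.4, *l* = 0.1, *a* = 0.2 are illustrative and satisfy 0 < *a* < *h* − *l*); the assertions spell out dominance among *h*-proposers, bistability, and risk dominance.

```python
h, l, a = 0.4, 0.1, 0.2   # illustrative values with 0 < a < h - l

# Payoff matrix with reputation: an h-proposer (G2, G3) offers only
# h - a, on average, to an l-responder (G1, G2).
A = [
    [1.0,             1 - l + (h - a), h - a],  # G1 = (l, l)
    [1 - (h - a) + l, 1.0,             1 - a],  # G2 = (h, l)
    [1 - h + a,       1 + a,           1.0  ],  # G3 = (h, h)
]

# Among h-proposers only (no G1), G3 strictly dominates G2.
assert A[2][1] > A[1][1] and A[2][2] > A[1][2]

# G1 and G3 are bistable: each resists invasion by the other.
assert A[0][0] > A[2][0] and A[2][2] > A[0][2]

# G3 risk-dominates G1: it has the larger basin of attraction.
assert A[2][2] + A[2][0] > A[0][0] + A[0][2]
```

The last inequality reduces to 1 + 2*a* > 2*h*, which holds for every admissible *a* because *h* < 1/2; this is why arbitrarily unreliable reputation information already tilts the long-run outcome toward fairness.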

Let us now study the evolutionary dynamics on the continuum of all strategies, *S*(*p*, *q*). Consider a population of *n* players. In every generation, several random pairs are formed. Suppose each player is proposer on average *r* times and responder the same number of times. The payoffs of all individuals are then summed up. For the next generation, individuals leave a number of offspring proportional to their total payoff. Offspring adopt the strategy of their parents, plus or minus some small random value. Thus, this system includes both selection and mutation. As before, we can interpret these dynamics as biological or cultural reproduction. We observe that the evolutionary dynamics lead to a state where all players adopt strategies close to the rational strategy, *S*(0,0).
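A minimal sketch of these dynamics follows. The function names and parameter values (population size 100, *r* = 5, mutation width 0.005, 1000 generations) are our assumptions, chosen for illustration, not the paper's exact settings.

```python
import random

def payoff(s1, s2):
    """One game in each role (factor 1/2 omitted): keep 1 - p when the own
    offer meets the co-player's threshold; receive the co-player's offer
    when it meets one's own threshold."""
    (p1, q1), (p2, q2) = s1, s2
    return (1 - p1 if p1 >= q2 else 0.0) + (p2 if p2 >= q1 else 0.0)

def random_population(n, rng):
    pop = []
    for _ in range(n):
        p = rng.uniform(0, 1)
        pop.append((p, rng.uniform(0, 1 - p)))  # only p + q <= 1 is allowed
    return pop

def evolve(pop, rng, r=5, generations=1000, sigma=0.005):
    """Offspring proportional to total payoff, plus small mutations."""
    n = len(pop)
    for _ in range(generations):
        score = [1e-9] * n                # tiny floor avoids all-zero weights
        for i in range(n):
            for _ in range(r):            # r pairings with random co-players
                j = rng.randrange(n)
                score[i] += payoff(pop[i], pop[j])
        offspring = []
        for i in rng.choices(range(n), weights=score, k=n):
            p = min(max(pop[i][0] + rng.uniform(-sigma, sigma), 0.0), 1.0)
            q = min(max(pop[i][1] + rng.uniform(-sigma, sigma), 0.0), 1.0 - p)
            offspring.append((p, q))
        pop = offspring
    return pop

rng = random.Random(1)
start = random_population(100, rng)
end = evolve(start, rng)
mean_p = sum(p for p, _ in end) / len(end)   # declines toward 0 over time
```

Without any memory of past encounters, both the mean offer and the mean acceptance threshold fall well below their initial values, in line with the *S*(0,0) outcome described above.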

Let us now add the possibility that a proposer can sometimes obtain information on what offers have been accepted by the responder in the past. We stress that the same players need not meet twice. We assume that a proposer will offer either his own *p*-value or the minimum offer that he knows the responder has accepted in previous encounters, whichever is smaller. In addition, we include a small probability that proposers will make offers that are reduced by a small, randomly chosen amount. This allows a proposer to test for responders who are willing to accept low offers. Hence, *p* can be seen as a proposer's maximum offer, whereas *q* represents a responder's minimum acceptance level. Each accepted deal is made known to a fraction *w* of all players. Thus, individuals who accept low offers run the risk of receiving reduced offers in the future. In contrast, the costly act of rejecting a low offer buys the reputation that one accepts only fair offers. Figure 2 shows that this process can readily lead to the evolution of fairness. The average *p* and *q* values depend on the number of games per individual, *r*, and the fraction *w* of individuals who find out about any given interaction. Larger *r* and *w* values lead to fairer solutions.
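The reputation variant can be sketched as follows. We simplify in several places: each accepted deal here becomes known to *all* players with probability *w* (rather than to a fraction *w* of players), reputations are reset each generation, and the probing probability and reduction amount are arbitrary choices; all names and parameter values are ours.

```python
import random

def evolve_rep(n=100, r=5, w=0.5, generations=1000, sigma=0.005,
               probe=0.1, seed=2):
    """Simplified reputation variant: accepted deals become publicly known
    with probability w; proposers exploit known acceptances and sometimes
    probe with a slightly reduced offer. Parameter values are illustrative."""
    rng = random.Random(seed)
    pop = []
    for _ in range(n):
        p = rng.uniform(0, 1)
        pop.append((p, rng.uniform(0, 1 - p)))  # enforce p + q <= 1
    for _ in range(generations):
        known = [1.0] * n      # lowest offer each responder is known to accept
        score = [1e-9] * n
        for _ in range(r * n):
            i = rng.randrange(n)
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1                             # i proposes to j, i != j
            offer = min(pop[i][0], known[j])       # exploit known acceptances
            if rng.random() < probe:               # test for a soft responder
                offer = max(offer - rng.uniform(0, 0.1), 0.0)
            if offer >= pop[j][1]:                 # j accepts the deal
                score[i] += 1 - offer
                score[j] += offer
                if rng.random() < w:               # deal becomes public
                    known[j] = min(known[j], offer)
        offspring = []
        for i in rng.choices(range(n), weights=score, k=n):
            p = min(max(pop[i][0] + rng.uniform(-sigma, sigma), 0.0), 1.0)
            q = min(max(pop[i][1] + rng.uniform(-sigma, sigma), 0.0), 1.0 - p)
            offspring.append((p, q))
        pop = offspring
    return pop

population = evolve_rep()
```

Larger *r* and *w* make the known-acceptance channel bite more often, which is the mechanism behind the trend reported in Fig. 2; the averages that emerge depend on the parameter choices above.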

Hence, evolutionary dynamics, in accordance with the predictions of economic game theory, lead to rational solutions in the basic Ultimatum Game. Thus, one need not assume that the players are rational utility-maximizers to predict the prevalence of low offers and low aspiration levels. Whatever the evolutionary mechanism—learning by trial and error, imitation, inheritance—it always promotes the same reasonable outcome: low offers and low demands.

If, however, we include the possibility that individuals can obtain some information on which offers have been accepted by others in previous encounters, the outcome is dramatically different. Under these circumstances, evolutionary dynamics tend to favor strategies that demand and offer a fair share of the prize. This effect, which does not require the same players to interact twice, suffices to keep the aspiration levels high. Accepting low offers damages the individual's reputation within the group and increases the chance of receiving reduced offers in subsequent encounters. Rejecting low offers is costly, but the cost is offset by gaining the reputation of somebody who insists on a fair offer. When reputation is included in the Ultimatum Game, adaptation favors fairness over reason. In this most elementary game, information on the co-player fosters the emergence of strategies that are nonrational, but promote economic exchange. This agrees well with findings on the emergence of cooperation (21) or of bargaining behavior (22). Reputation based on commitment and communication plays an essential role in the natural history of economic life (23).

\* To whom correspondence should be addressed. E-mail: nowak{at}ias.edu