Cooperation

Closely related to the concept of altruism is the concept of co-operation. The chief difference between the two concepts is that, whereas altruism is often one-sided, co-operation is a two-way street. Two (or more) co-operating individuals mutually aid each other. Conspicuous examples of co-operation (although almost never of ultimate self-sacrifice) also occur where relatedness is either absent or low. In fact, there are numerous examples of co-operation between organisms belonging to different species. When it is highly developed and characteristic of most members of two species, such interspecific co-operation is called a mutualistic symbiosis. Examples include:

a) the species of fungus and the species of alga which together compose a lichen.
b) certain species of ants and certain species of acacias, where the trees house and provide food for the ants which, in turn, protect the trees.
c) fig wasps and fig trees, where the wasps (whose larvae eat fig flowers) serve as the tree's only means of pollination and seed set.

We are now going to consider a theoretical model of co-operation proposed by Robert Axelrod and W. D. Hamilton (Science, 1981, 211, 1390-1396). The basic set of problems with which the theory deals is the following. Many benefits of life seem to be disproportionately available to co-operating groups of animals; and the individual animals living in such groups benefit from mutual co-operation. Nevertheless, it is almost always the case that any single individual could do even better by exploiting the co-operative behaviour of other members of the group without being willing to reciprocate. Thus, cheating should usually pay off by enhancing the personal fitness of the cheater. In other words, natural selection operating at the level of the individual should generally favour cheating (and thus disfavour co-operation) in situations where the co-operators are not relatives.

However, an important exception to this general rule exists if individual animals are likely to interact with one another repeatedly. Whenever repeated interactions occur, complex strategies may evolve. Mathematical game theory in general, and the Prisoner's Dilemma game in particular, allow us to describe and make inferences about these strategic interactions. The Prisoner's Dilemma game deals specifically with problems of co-operation; and we will be following Axelrod and Hamilton's application of the Prisoner's Dilemma to the evolution of mutual co-operation.

To keep their analysis fairly simple, Axelrod and Hamilton focused on the two-player version of the Prisoner's Dilemma, the version which describes interactions between pairs of individuals. In the Prisoner's Dilemma game, the two individuals can either "co-operate" or "defect" (not co-operate); and we assume that the players receive payoffs in personal fitness (probability of survival and/or reproduction).

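The following diagram depicts the possible outcomes of the game:

                        Player B co-operates    Player B defects
Player A co-operates    R = 3                   S = 0
Player A defects        T = 5                   P = 1

This diagram shows the payoff to Player A with illustrative numbers. The game is defined by two inequalities:
$T > R > P > S$ and $R > (S + T)/2$.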

No matter what the other individual does, the selfish choice of defection yields a higher payoff than co-operation. But, if both individuals defect, both do worse than if both had co-operated. Given that the other player co-operates, there is a choice between co-operation which yields R (the reward for mutual co-operation) or defection which yields T (the temptation to defect). By assumption T > R, so that it pays to defect if the other player co-operates. On the other hand, if the other player defects, there is a choice between co-operation which yields S (the Sucker's payoff) or defection which yields P (the punishment for mutual defection). By assumption P > S, so that it pays to defect if the other player defects. Thus, no matter what the other player does, it pays to defect. Yet, if both players defect, both players get P rather than the larger R which they could have gotten if both had co-operated. Hence the dilemma.
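
The dominance argument above can be checked mechanically. The following is a minimal sketch, assuming the illustrative payoffs T=5, R=3, P=1, S=0 that the text uses later, and taking the defining conditions to be T > R > P > S together with R > (S + T)/2; the function names are hypothetical.

```python
# Illustrative payoffs (assumed): temptation, reward, punishment, sucker's payoff.
T, R, P, S = 5, 3, 1, 0

# payoff[(my_move, other_move)] is the payoff to the focal player.
payoff = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def is_prisoners_dilemma(T, R, P, S):
    """Check the two defining inequalities of the game."""
    return T > R > P > S and R > (S + T) / 2


def best_reply(other_move):
    """Return the move that maximises the one-shot payoff against other_move."""
    return max("CD", key=lambda my_move: payoff[(my_move, other_move)])


print(is_prisoners_dilemma(T, R, P, S))          # the payoffs define a Prisoner's Dilemma
print(best_reply("C"), best_reply("D"))          # best reply to each possible move
print(payoff[("D", "D")] < payoff[("C", "C")])   # mutual defection is worse than mutual co-operation
```

Defection is the best reply to either move, yet mutual defection pays less than mutual co-operation, which is the dilemma in miniature.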

With two individuals destined never to meet again, the only strategy that can be called a solution to the game is to defect always despite the seemingly paradoxical outcome that both players do worse than they could have done had they co-operated. In addition to being the solution in game theory, defection is also the solution in terms of biological evolution. It is the outcome of inevitable evolutionary trends through mutation and natural selection. That is, if the payoffs are in terms of individual fitness, and if interactions between pairs of individuals are random and not repeated, then any population containing a mixture of heritable strategies will evolve to a state where all the individuals are defectors. Moreover, no single differing mutant strategy can do better than defection when the population consists of defectors. Thus, under these circumstances, defection is an evolutionarily stable strategy, i.e. it will evolve and cannot be successfully invaded by mutant individuals following any other strategy.

However, in many biological settings, the same two individuals may meet more than once. If an individual can recognize another individual with whom it has interacted previously, and if the individual can remember some aspects of the outcomes of prior interactions, then the strategic situation becomes what is called an iterated Prisoner's Dilemma. In an iterated Prisoner's Dilemma, the possibilities for successful strategies are much broader. In such a game, a strategy consists of a decision rule for determining the probability of co-operation or defection as a function of the history of the interaction to date.

However, if there is a known number of interactions between two individuals, then defection is still the only evolutionarily stable strategy. The reason for this is that, given a known number of interactions, it is advantageous for both parties to defect on the last interaction. This being the case, it would be advantageous for both players to try to "beat the other to the punch" by defecting on the next to last interaction, and so on back to the first interaction. Thus, for defection not to be an evolutionarily stable strategy, we must assume that individuals do not know how often they will interact.

In their model, Axelrod and Hamilton state that there will always be some probability, $w$, that any given interaction is not the last interaction between a particular pair of individuals. Factors which could be expected to affect the magnitude of this probability of meeting again include average lifespan, relative mobility, and the health of the interacting individuals. Moreover, for any value of $w$, unconditional defection (ALL D) is evolutionarily stable. That is, if all individuals use ALL D, no mutant strategy can invade the population. However, if $w > 0$, other strategies may be evolutionarily stable as well. In fact, according to Axelrod and Hamilton, if $w$ is large enough, there is no single best strategy which is independent of the behaviour of other members of the population.

Before we consider the further development of the theory, let's consider the range of biological phenomena covered by this game approach. To start with, an animal doesn't need a brain to employ a strategy. For example, bacteria can play strategic games because:

i.) they are highly responsive to selected aspects of their environment, especially their chemical environment.
ii.) they can respond differently depending upon what other organisms in their vicinity are doing.
iii.) these conditional strategies of behaviour can be inherited.
iv.) the behaviour of a bacterium can affect the fitness of other organisms around it, just as the behaviour of other organisms can affect the fitness of the bacterium.

While bacterial strategies covered by the theory can easily include differential responsiveness to recent changes in the environment or to cumulative averages over time, in other ways the range of responsiveness is limited. Bacteria cannot remember or interpret a complex past sequence of changes; and they probably can't distinguish alternative origins of adverse or favourable changes. Some bacteria, for example, produce their own antibiotics, called bacteriocins. These chemicals are harmless to the bacteria producing them, but they destroy other kinds of bacteria. A bacterium might thus make production of its own bacteriocin dependent on the presence of hostile bacteriocins in its environment. However, it could not aim its toxin at an offending initiator. In fact, existing evidence suggests that discrimination is at the level of the species rather than the individual or group.

However, game-playing behaviour increases in complexity in animals with increasingly complex nervous systems. At the opposite extreme from bacteria, the intelligence of primates permits a number of important improvements in game-playing behaviour: more complex memory, more complex processing of information to determine the next action as a function of the interaction so far, a better estimate of the probability of future interaction with the same individual, and a better ability to distinguish between different individuals. The capacity to discriminate among others is probably one of the most important abilities because it allows an animal to handle interactions with many individuals without having to treat them all the same. Animals who can discriminate among individuals can reward co-operation in one animal and punish defection in another.

The model of the iterated Prisoner's Dilemma is much less restricted than it might at first appear. Not only can it handle interactions among bacteria or among primates, it can also handle interactions between a colony of bacteria and a primate. This is so because the model makes no assumption about the commensurability of payoffs between the two sides. Provided that the payoffs to each side satisfy the inequalities which define the Prisoner's Dilemma ($T > R > P > S$ and $R > (S + T)/2$), the results of the analysis will be applicable.

The model does assume that the choices are made simultaneously and with discrete time intervals. For most analytic purposes, this is equivalent to a continuous interaction over time, with the time period of the model corresponding to the minimum time between a change in behaviour by one side and a response by the other. However, while the model treats the choices as simultaneous, it would make little difference if they were treated as sequential.

Axelrod and Hamilton used their model to answer three kinds of questions about the evolution of co-operative behaviour:

1) Robustness. What type of co-operative strategy can thrive in a variegated environment composed of other individuals using a wide variety of other more or less sophisticated strategies?

2) Stability. Under what conditions can a strategy that has become established in a population resist invasion by mutant strategies?

3) Initial viability. Even if a strategy is robust and stable, how can it ever get a foothold in an environment which is predominantly non-co-operative?

Robustness

To see what type of strategy could thrive in a variegated environment of more or less sophisticated strategies, Axelrod conducted a computer tournament for the Prisoner's Dilemma. Fourteen different kinds of strategies submitted by economists, sociologists, political scientists, and mathematicians were tested against one another in games lasting for 200 moves. Although some of the strategies tested were quite intricate, the result of the tournament was that the highest average score was obtained by the simplest of all the strategies tested: TIT FOR TAT. This strategy consists of co-operating on the first move and then doing whatever the other player did last on each succeeding move. Thus, TIT FOR TAT is a strategy of co-operation based on reciprocity.
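
To make the decision rule concrete, here is a minimal sketch of TIT FOR TAT in a single 200-move game. The payoffs T=5, R=3, P=1, S=0 are the illustrative values used later in the text and are assumed here; the function names and the history-based interface are hypothetical, not the tournament's actual code.

```python
# Illustrative payoffs (assumed): PAYOFF[(a, b)] = (payoff to A, payoff to B).
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): (R, R), ("C", "D"): (S, T), ("D", "C"): (T, S), ("D", "D"): (P, P)}


def tit_for_tat(own_history, other_history):
    """Co-operate on the first move, then copy the other player's last move."""
    return "C" if not other_history else other_history[-1]


def all_d(own_history, other_history):
    """Unconditional defection."""
    return "D"


def play(strategy_a, strategy_b, moves=200):
    """Play one game and return the total scores of both players."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(moves):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))  # (600, 600)
print(play(tit_for_tat, all_d))        # (199, 204)
```

Note that TIT FOR TAT never outscores its partner within a single game; in this sketch it loses narrowly to ALL D. Its tournament success came from a high average score across many partners.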

The results of this first computer tournament were circulated together with a request for other strategies to test in a second round. This time 62 different strategies were tested. However, TIT FOR TAT won again. Analysis indicated that TIT FOR TAT's robustness depended on three characteristics:

a) TIT FOR TAT never defects first.
b) Defection by the other side provokes immediate retaliation from TIT FOR TAT.
c) TIT FOR TAT can return to co-operation after only one act of retaliation.

The robustness of TIT FOR TAT was also manifested in an "ecological analysis" of strategies. The ecological approach takes as given the varieties of strategies which are present and investigates how they do over time when interacting with each other. The analysis was based on what would happen if each of the strategies in the second round were submitted to a hypothetical next round in proportion to its success in the previous round. This process was then repeated to generate the time path of the distribution of strategies. The results showed that, as the less successful rules were displaced, TIT FOR TAT continued to do well interacting with the other rules which had also attained initial success. In other words, TIT FOR TAT's success is not dependent on its interaction with losing strategies. It also succeeds when interacting with other winning strategies.
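
The ecological procedure can be sketched with a toy population. The three strategies below (TIT FOR TAT, ALL D and an unconditional co-operator, ALL C) are a hypothetical pool chosen for illustration, not the tournament entries, and the update rule (each strategy's share multiplied by its average score, then renormalised) is one assumed reading of "in proportion to its success". Payoffs are the illustrative T=5, R=3, P=1, S=0.

```python
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}


def tit_for_tat(other_history):
    return "C" if not other_history else other_history[-1]


def all_d(other_history):
    return "D"


def all_c(other_history):
    return "C"


STRATEGIES = {"TIT FOR TAT": tit_for_tat, "ALL D": all_d, "ALL C": all_c}


def match_score(strategy_a, strategy_b, moves=200):
    """Total payoff to strategy_a over one game against strategy_b."""
    hist_a, hist_b, total = [], [], 0
    for _ in range(moves):
        a, b = strategy_a(hist_b), strategy_b(hist_a)
        total += PAYOFF[(a, b)]
        hist_a.append(a)
        hist_b.append(b)
    return total


# SCORE[(i, j)]: score of strategy i when paired with strategy j.
SCORE = {(i, j): match_score(f, g)
         for i, f in STRATEGIES.items() for j, g in STRATEGIES.items()}


def next_generation(shares):
    """Reweight each strategy's share by its average score against the current mix."""
    fitness = {i: sum(shares[j] * SCORE[(i, j)] for j in shares) for i in shares}
    weighted = {i: shares[i] * fitness[i] for i in shares}
    total = sum(weighted.values())
    return {i: weighted[i] / total for i in shares}


shares = {name: 1 / 3 for name in STRATEGIES}
for _ in range(50):
    shares = next_generation(shares)
print({name: round(share, 3) for name, share in shares.items()})
```

In this toy pool ALL D does well only while there are unconditional co-operators to exploit; as they become rarer its share collapses, while TIT FOR TAT keeps scoring well against the strategies that remain.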

Stability

The question of stability assumes that a strategy has become evolutionarily fixed and asks whether that strategy can resist invasion by a mutant strategy. Axelrod and Hamilton were able to show that, once established, TIT FOR TAT can resist invasion by any other strategy provided that the interacting individuals have a high enough probability of meeting again (i.e. provided $w$ is high enough). The proof runs as follows. Recall that TIT FOR TAT "remembers" back only one step in the interaction. Thus, one C from the other player is sufficient to reset the situation to what it was at the beginning of the game. Similarly, one D sets the situation to what it was at the second round after a D was played in the first. Since there is a fixed chance, $w$, of the interaction not ending on any given move, a strategy cannot be maximal in playing with TIT FOR TAT unless it does the same thing both at the first occurrence of a given state and at each resetting to that state. Thus, if a rule is maximal and begins with C, the second round has the same state as the first, and thus a maximal rule will continue to co-operate with TIT FOR TAT. But such a rule will not do better than TIT FOR TAT does with another TIT FOR TAT, and hence it cannot invade.

On the other hand, if a rule begins with D, then this first D induces a switch in the state of TIT FOR TAT and there are two possibilities for continuation that could be maximal. If D follows the first D, then this being maximal at the start implies that it is everywhere maximal to follow D with D, making the strategy equivalent to ALL D. If C follows the initial D, the game is then reset as for the first move; so it must be maximal to repeat the sequence of DC indefinitely. These points show that the task of searching a seemingly infinite array of rules of behaviour for one potentially capable of invading TIT FOR TAT is really not so difficult as it seemed. If neither ALL D nor DC can invade TIT FOR TAT, then no strategy can.
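
The reset argument amounts to saying that a partner of TIT FOR TAT only ever faces two situations: TIT FOR TAT is about to co-operate, or it is about to defect. One way to check the conclusion numerically is to treat this as a two-state decision problem and solve it by value iteration. This is a sketch of that reformulation, not the authors' own procedure; the payoffs are the illustrative T=5, R=3, P=1, S=0, $w$ is the continuation probability, and the function name is hypothetical.

```python
def best_reply_to_tit_for_tat(w, T=5, R=3, P=1, S=0, sweeps=2000):
    """Find the payoff-maximising way to play against TIT FOR TAT.

    value[x] is the best expected future payoff when TIT FOR TAT is about
    to play x. Whatever we play now is what TIT FOR TAT plays next.
    """
    value = {"C": 0.0, "D": 0.0}
    for _ in range(sweeps):
        value = {
            "C": max(R + w * value["C"], T + w * value["D"]),
            "D": max(S + w * value["C"], P + w * value["D"]),
        }
    # Read off the best move in each of the two states.
    reply_to_c = "C" if R + w * value["C"] >= T + w * value["D"] else "D"
    reply_to_d = "C" if S + w * value["C"] >= P + w * value["D"] else "D"
    if reply_to_c == "C":
        label = "co-operate throughout"
    elif reply_to_d == "D":
        label = "ALL D"
    else:
        label = "DC alternation"
    # The game starts with TIT FOR TAT co-operating, so value["C"] is the total.
    return label, value["C"]


for w in (0.1, 0.3, 0.9):
    label, total = best_reply_to_tit_for_tat(w)
    print(f"w = {w}: {label}, expected payoff {total:.2f}")
```

Only three kinds of answer are possible, matching the three cases in the proof, and with these payoffs the best reply moves from ALL D to DC alternation to sustained co-operation as $w$ rises.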

To see when these strategies can invade, we note that the probability that the nth interaction actually occurs is $w^{n-1}$. Therefore, the expression for the total payoff is easily found by applying weights of $1, w, w^2, \ldots$ to the payoff sequence and summing the resultant series. When TIT FOR TAT plays another TIT FOR TAT, it gets a payoff of R each move for a total of $R + wR + w^2 R + \cdots$, the expected value of which is equal to $R/(1 - w)$.
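
The weighting step is an ordinary geometric series. As a worked sketch, writing $w$ for the probability that the interaction continues (with $0 \le w < 1$):

```latex
\begin{align*}
\Pr(\text{the $n$th interaction occurs}) &= w^{\,n-1}, \\
\text{expected number of interactions} &= \sum_{n=1}^{\infty} w^{\,n-1} = \frac{1}{1-w}, \\
\text{expected payoff of TIT FOR TAT with TIT FOR TAT} &= \sum_{n=1}^{\infty} w^{\,n-1} R = \frac{R}{1-w}.
\end{align*}
```

For example, with $R = 3$ and $w = 0.9$ the pair expects 10 interactions and a total payoff of 30 each.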

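When ALL D plays with TIT FOR TAT, ALL D gets T on the first move and P thereafter for an expected payoff of $T + wP/(1 - w)$. Thus ALL D cannot invade TIT FOR TAT if:

$$\frac{R}{1 - w} \geq T + \frac{wP}{1 - w}$$

Similarly, when alternation of D and C plays with TIT FOR TAT, it gets a payoff of:

$$T + wS + w^2 T + w^3 S + \cdots = \frac{T + wS}{1 - w^2}$$

Alternation of D and C thus cannot invade TIT FOR TAT if

$$\frac{R}{1 - w} \geq \frac{T + wS}{1 - w^2}$$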
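
Both non-invasion conditions can be rearranged into thresholds on $w$. The following worked derivation is a sketch that assumes $0 \le w < 1$ and the Prisoner's Dilemma ordering $T > R > P > S$, so that every quantity divided by is positive.

```latex
\begin{align*}
\text{ALL D:}\quad
\frac{R}{1-w} \ge T + \frac{wP}{1-w}
&\iff R \ge T(1-w) + wP
\iff w \ge \frac{T-R}{T-P}, \\[4pt]
\text{DC alternation:}\quad
\frac{R}{1-w} \ge \frac{T + wS}{1-w^{2}}
&\iff R(1+w) \ge T + wS
\iff w \ge \frac{T-R}{R-S}.
\end{align*}
```

With $T=5$, $R=3$, $P=1$, $S=0$ the two thresholds are $1/2$ and $2/3$, so TIT FOR TAT resists both challengers once $w \ge 2/3$.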

This demonstrates that TIT FOR TAT is evolutionarily stable if and only if the interactions between the individuals have a sufficiently large probability of continuing. The following table shows the relationship between the expected payoffs for the three strategies as a function of $w$ when R=3, S=0, T=5, and P=1:

w       TIT FOR TAT     ALL D     DC ALTERNATION

0.00        3.00         5.00          5.00
0.05        3.16         5.05          5.01
0.10        3.33         5.11          5.05
0.15        3.53         5.18          5.12
0.20        3.75         5.25          5.21
0.25        4.00         5.33          5.33
0.30        4.29         5.43          5.49
0.35        4.62         5.54          5.70
0.40        5.00         5.67          5.95
0.45        5.45         5.82          6.27
0.50        6.00         6.00          6.67
0.55        6.67         6.22          7.17
0.60        7.50         6.50          7.81
0.65        8.57         6.86          8.66
0.70       10.00         7.33          9.80
0.75       12.00         8.00         11.43
0.80       15.00         9.00         13.89
0.85       20.00        10.67         18.02
0.90       30.00        14.00         26.32
0.95       60.00        24.00         51.28
0.99      300.00       104.00        251.26

Initial Viability

TIT FOR TAT is not the only strategy that can be evolutionarily stable. In fact, ALL D is evolutionarily stable no matter what the probability of the interaction continuing is. This raises the problem of how an evolutionary trend toward co-operation could ever get started. Genetic kinship theory suggests one plausible escape from ALL D. Close relatedness of interactants permits true altruism - the sacrifice of one individual's fitness for the sake of another. As we have seen, true altruism can evolve when conditions of cost, benefit, and relatedness yield net gains for altruism-causing genes that are resident in related individuals (i.e. if C < Br). In effect, recalculation of the Prisoner's Dilemma payoff matrix so that an individual has a part interest in the other player's gain (i.e. reckoning payoffs in terms of inclusive fitness) often results in eliminating or reversing the inequalities T > R and P > S. Whenever this happens, co-operation is unconditionally favoured. In this way, we can imagine that the benefits of co-operation in Prisoner's Dilemma-like situations might begin to be harvested by groups of relatives.
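
The recalculation can be illustrated with a simple model. The sketch below assumes that an individual's effective payoff is its own payoff plus r times its partner's payoff, where r is the coefficient of relatedness; this additive form is a common simplification and is an assumption here, not something the text specifies. Payoffs are the illustrative T=5, R=3, P=1, S=0.

```python
T, R, P, S = 5, 3, 1, 0


def inclusive_payoffs(r):
    """Effective payoffs when the partner's payoff counts with weight r (assumed model)."""
    return {
        "T": T + r * S,  # I defect, partner co-operates: partner receives S
        "R": R + r * R,  # both co-operate
        "P": P + r * P,  # both defect
        "S": S + r * T,  # I co-operate, partner defects: partner receives T
    }


def cooperation_dominates(r):
    """True when co-operating is the better reply to either move."""
    e = inclusive_payoffs(r)
    return e["R"] > e["T"] and e["S"] > e["P"]


for r in (0.0, 0.25, 0.5, 0.75):
    e = inclusive_payoffs(r)
    print(r, e, "co-operation dominates:", cooperation_dominates(r))
```

Under this assumed model the inequality P > S disappears at r = 0.25 and reverses above it, while T > R reverses only once r exceeds 2/3; at r = 0.75 both are reversed and co-operation is favoured whatever the partner does.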

Once genes for co-operation exist, selection will promote strategies that base co-operative behaviour on cues in the environment. Such factors as promiscuous fatherhood and events at ill-defined margins lead to uncertain degrees of relatedness among socially interacting individuals. The recognition of any improved correlates of relatedness and use of these cues to determine co-operative behaviour will always permit advance in inclusive fitness. When a co-operative choice has been made, one cue to relatedness is reciprocation of the co-operation. Thus, modifiers for more selfish behaviour after a negative response from the other are advantageous whenever the degree of relatedness is low or doubtful. As such, conditionality is acquired, and co-operation can spread into circumstances of less and less relatedness. Finally, when the probability of two individuals meeting each other again is sufficiently high, co-operation based on reciprocity can thrive and be evolutionarily stable in a population with no relatedness at all.

Another mechanism that can get co-operation started when virtually everyone is using ALL D is clustering. Suppose that a small group of individuals is using a strategy such as TIT FOR TAT and that a certain proportion, $p$, of the interactions of members of this cluster are with other members of the cluster. Then the average score attained by members of the cluster playing TIT FOR TAT is:

$$p \, \frac{R}{1 - w} + (1 - p)\left(S + \frac{wP}{1 - w}\right)$$

If members of the cluster provide a negligible proportion of the interactions for the other individuals, then the score obtained by those using ALL D is still $P/(1 - w)$. When $p$ and $w$ are large enough, a cluster of TIT FOR TAT individuals can then become initially viable in an environment composed overwhelmingly of ALL D.
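
How large $p$ must be can be worked out numerically. This is a minimal sketch, assuming the illustrative payoffs T=5, R=3, P=1, S=0, the cluster score $pR/(1-w) + (1-p)(S + wP/(1-w))$, and the ALL D score $P/(1-w)$; the function names are hypothetical.

```python
T, R, P, S = 5, 3, 1, 0


def cluster_score(p, w):
    """Average score of a TIT FOR TAT player meeting its own kind with proportion p."""
    with_own_kind = R / (1 - w)
    with_all_d = S + w * P / (1 - w)
    return p * with_own_kind + (1 - p) * with_all_d


def all_d_score(w):
    """Score of an ALL D resident, who meets almost only other ALL D players."""
    return P / (1 - w)


def min_cluster_share(w):
    """Value of p at which the cluster score equals the ALL D score."""
    return (P - S) / ((R - w * P) / (1 - w) - S)


w = 0.9
print(round(min_cluster_share(w), 4))
print(cluster_score(0.05, w) > all_d_score(w))
print(cluster_score(0.04, w) > all_d_score(w))
```

With w = 0.9 the break-even share is 1/21, so under these assumptions a cluster whose members have only about 5% of their interactions with one another can already outscore the surrounding ALL D population.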

Clustering is often associated with kinship, and the two mechanisms can reinforce each other in promoting the initial viability of reciprocal co-operation. However, it is possible for clustering to be effective without kinship.

We have seen that TIT FOR TAT can intrude in a cluster on a population of ALL D, even though ALL D is evolutionarily stable. This is possible because a cluster of TIT FOR TATs gives each member of the cluster a non-trivial probability of meeting another individual who will reciprocate co-operation. While this suggests a mechanism for the initiation of co-operation, it also raises the question of whether the reverse could happen once a strategy like TIT FOR TAT had become established. However, it turns out that clustering will not lead to ALL D invading TIT FOR TAT.

Let us define a nice strategy as one, such as TIT FOR TAT, which will never be the first to defect. Whenever two nice strategists interact, they both receive R on each move; and by definition R is the highest average score which an individual can get by interacting with another individual using the same strategy. Therefore, if a strategy is nice and evolutionarily stable, it cannot be intruded upon by a cluster. This is because the score achieved by the strategy that comes in a cluster is the weighted average of how it does with others of its own kind and with others of the dominant kind. Each of these components is less than or equal to the score achieved by the predominant, nice, evolutionarily stable strategy. Therefore, a strategy arriving in a cluster cannot intrude upon a nice, evolutionarily stable strategy. Thus, when $w$ is large enough to make TIT FOR TAT an evolutionarily stable strategy, TIT FOR TAT can resist invasion by any cluster of any other strategy. In other words, "the gear wheels of co-operative social evolution have a ratchet."
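
The weighted-average argument can be illustrated for the specific case of an ALL D cluster arriving in a TIT FOR TAT population. This is a sketch under assumed values (T=5, R=3, P=1, S=0 and w = 0.9, which is above the stability threshold for these payoffs); the function names are hypothetical.

```python
T, R, P, S = 5, 3, 1, 0


def tft_resident_score(w):
    """Score of a TIT FOR TAT resident, who meets almost only TIT FOR TAT."""
    return R / (1 - w)


def all_d_cluster_score(p, w):
    """Score of an ALL D invader that meets its own kind with proportion p."""
    with_own_kind = P / (1 - w)
    with_tft = T + w * P / (1 - w)
    return p * with_own_kind + (1 - p) * with_tft


w = 0.9
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, round(all_d_cluster_score(p, w), 2), round(tft_resident_score(w), 2))
```

Both components of the invader's average fall below the resident's score, so no amount of clustering helps; indeed, the more the defectors interact with one another, the worse they do.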

The general story to emerge from this kind of analysis is as follows. ALL D is the most primitive state; and it is evolutionarily stable. This means that it can resist invasion by virtually any strategy so long as the invading strategy must have almost all its interactions with ALL D. However, co-operation based on reciprocity can gain a foothold in either one of two ways. First, there can be kinship between mutant strategists, giving the genes of the mutants some stake in one another's success and thereby altering the effective payoff matrix of the interaction when viewed from the perspective of the gene rather than the individual. A second mechanism to overcome total defection is for the mutant strategists to arrive in a cluster so that they provide a non-trivial proportion of the interactions each has. Then even if there are so few mutants that they provide a negligible proportion of interactions for the ALL Ds, the mutants may gain a foothold. And the tournament approach suggests that, once a mixture of strategies is present, TIT FOR TAT is extremely robust and does so well against so many kinds of other strategies that it is likely to evolve to fixation. Once fixed in a population, its future is particularly secure since it can resist invasion by a cluster.