Evolution of Cooperation @ To Live! And to Write for Life!

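On Sunday I met a British Lufthansa pilot on the train. He had just come off a flight and seemed to be heading home (otherwise he would simply have stayed at a hotel). When I boarded, the train was delayed but there had been no announcement, so I quickly asked him, since he was sitting in front of me, to make sure I had not gotten on the wrong train. Only when it was time to get off did I realize we were getting off at the same station. At that point he suddenly spoke to me in English. I was startled and asked him what his native language was, and that is how we got talking.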

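Like me, he studied engineering at university, and he said he never imagined he would one day become a pilot. His company (Lufthansa) later sent him to do an MBA. He said he studied some things he never uses at all, such as writing business proposals and calculating accounts receivable.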

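I told him I wanted to leave engineering to study social research, because getting people to work together in harmony at an engineering firm is very hard, which defeats the purpose of working in this field. He nodded in heartfelt agreement.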

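He then told me about a very interesting social research book called The Evolution of Cooperation, which discusses game theory (especially the Prisoner's Dilemma). I was delighted, because many books of this kind only discuss animal behavior and cannot explain complex human behavior. He said the author is a political science professor who graduated from the University of Chicago, so he and I have a geographic connection (I have not heard that kind of remark in a long time, haha).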

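The train was 8 minutes late, and he was worried about missing his next train. He hoped to be the first one off, but unfortunately the young woman in front of him was a bit selfish and would not let him pass. By the time I thought of exchanging contact details there was hardly any time left, which is a pity, but I think we will recognize each other if we meet again someday.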

Below I excerpt Wikipedia's summary of the book The Evolution of Cooperation (1984) (currently available only in English, Japanese, German, and Arabic).

Axelrod's Tournaments

Axelrod initially solicited strategies from other game theorists to compete in the first tournament. Each strategy was paired with each other strategy for 200 iterations of a Prisoner's Dilemma game, and scored on the total points accumulated through the tournament. The winner was a very simple strategy submitted by Anatol Rapoport called "TIT FOR TAT" (TFT) that cooperates on the first move, and subsequently echoes (reciprocates) what the other player did on the previous move. The results of the first tournament were analyzed and published, and a second tournament held to see if anyone could find a better strategy. TIT FOR TAT won again. Axelrod analyzed the results, and made some interesting discoveries about the nature of cooperation, which he describes in his book.^[30]

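The author had a number of game strategies play the Prisoner's Dilemma for 200 rounds, and the one with the highest score was the TFT (tit for tat) strategy. This strategy has two steps: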

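Cooperate in the first round.
In each later round, whether to cooperate depends on what the other player did in the previous round: if they defected last round, I defect this round; if they cooperated last round, I keep cooperating.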
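
To make the two steps above concrete, here is a minimal Python sketch of tit for tat playing a 200-round match. The payoff values (5, 3, 1, 0) are the ones commonly used for the Prisoner's Dilemma and are an assumption here, since the excerpt does not list them. The function names and the two rival strategies are also purely illustrative.

```python
# Points (mine, theirs) for one round, keyed by (my move, their move).
# Assumed payoffs: temptation 5, reward 3, punishment 1, sucker 0.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_moves, other_moves):
    """Step 1: cooperate first. Step 2: copy the other player's last move."""
    return other_moves[-1] if other_moves else "C"


def always_defect(own_moves, other_moves):
    """Hypothetical rival that defects every round."""
    return "D"


def always_cooperate(own_moves, other_moves):
    """Hypothetical rival that cooperates every round."""
    return "C"


def play(strategy_a, strategy_b, rounds=200):
    """Play one iterated match and return (total_a, total_b)."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_a, moves_b)
        b = strategy_b(moves_b, moves_a)
        points_a, points_b = PAYOFF[(a, b)]
        score_a += points_a
        score_b += points_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b
```

Under these assumed payoffs, play(tit_for_tat, tit_for_tat) gives 600 points each, while play(tit_for_tat, always_defect) gives 199 to tit for tat and 204 to the defector: tit for tat loses only the first round and never gets ahead.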

In both actual tournaments and various replays the best performing strategies were nice^[31]: that is, they were never the first to defect. Many of the competitors went to great lengths to gain an advantage over the "nice" (and usually simpler) strategies, but to no avail: tricky strategies fighting for a few points generally could not do as well as nice strategies working together. TFT (and other "nice" strategies generally) "won, not by doing better than the other player, but by eliciting cooperation [and] by promoting the mutual interest rather than by exploiting the other's weakness."^[32]

Being "nice" can be beneficial, but it can also lead to being suckered. To obtain the benefit – or avoid exploitation – it is necessary to be provocable to both retaliation and forgiveness. When the other player defects, a nice strategy must immediately be provoked into retaliatory defection.^[33] The same goes for forgiveness: return to cooperation as soon as the other player does. Overdoing the punishment risks escalation, and can lead to an "unending echo of alternating defections" that depresses the scores of both players.^[34]

Most of the games that game theory had heretofore investigated are "zero-sum" – that is, the total rewards are fixed, and a player does well only at the expense of other players. But real life is not zero-sum. Our best prospects are usually in cooperative efforts. In fact, TFT cannot score higher than its partner; at best it can only do "as good as". Yet it won the tournaments by consistently scoring a strong second-place with a variety of partners.^[35] Axelrod summarizes this as don't be envious;^[36] in other words, don't strive for a payoff greater than the other player's.^[37]

In any IPD game there is a certain maximum score each player can get by always cooperating. But some strategies try to find ways of getting a little more with an occasional defection (exploitation). This can work against some strategies that are less provocable or more forgiving than TIT FOR TAT, but generally they do poorly. "A common problem with these rules is that they used complex methods of making inferences about the other player [strategy] – and these inferences were wrong."^[38] Against TFT (and "nice" strategies generally) one can do no better than to simply cooperate.^[39] Axelrod calls this clarity. Or: don't be too clever.^[40]

The success of any strategy depends on the nature of the particular strategies it encounters, which depends on the composition of the overall population. To better model the effects of reproductive success Axelrod also did an "ecological" tournament, where the prevalence of each type of strategy in each round was determined by that strategy's success in the previous round. The competition in each round becomes stronger as weaker performers are reduced and eliminated. The results were amazing: a handful of strategies – all "nice" – came to dominate the field.^[41] In a sea of non-nice strategies the "nice" strategies – provided they were also provokable – did well enough with each other to offset the occasional exploitation. As cooperation became general the non-provocable strategies were exploited and eventually eliminated, whereupon the exploitive (non-cooperating) strategies were out-performed by the cooperative strategies.

In summary, success in an evolutionary "game" correlated with the following characteristics:

Be nice: cooperate, never be the first to defect.
Be provocable: return defection for defection, cooperation for cooperation.
Don't be envious: be fair with your partner.
Don't be too clever: or, don't try to be tricky.

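The tit-for-tat strategy has four characteristics:

Nice: a tit-for-tat player always starts out cooperative and is never the first to defect
Retaliatory: when the other player defects, a tit-for-tat player always strikes back
Not envious: it does not try to get more than the other player
Not too clever: it does not take the initiative to cheat a partner who has been cooperating all along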

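One important point about retaliation: it is only a means, not a mindset. Once the other player stops defecting, the tit-for-tat player forgives and goes back to cooperating. As a result, a tit-for-tat player never finishes a match with more points than its opponent; at best it ties. Yet after 200 rounds against each of the other strategies, its total score was the highest.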

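In a field full of ill-intentioned opponents, tit-for-tat players that retaliate still score higher than the ill-intentioned ones. If the low scorers keep being weeded out, in the end only tit-for-tat players and cooperators remain.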
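
The weeding-out process can be sketched in a few lines of Python. This is only a toy version of the "ecological" tournament, not Axelrod's actual setup: it assumes just three strategies (tit for tat, always defect, always cooperate), the usual payoffs 5/3/1/0, 200-round matches, and a simple rule where each strategy's share of the population grows in proportion to its average score.

```python
# SCORE[a][b] = points strategy a earns in a 200-round match against b,
# assuming payoffs T=5, R=3, P=1, S=0 (not given in the post).
# Example: TFT loses only round one to ALLD, so 0 + 199*1 = 199 vs 5 + 199 = 204.
SCORE = {
    "TFT":  {"TFT": 600, "ALLD": 199, "ALLC": 600},
    "ALLD": {"TFT": 204, "ALLD": 200, "ALLC": 1000},
    "ALLC": {"TFT": 600, "ALLD": 0,   "ALLC": 600},
}


def next_generation(shares):
    """Grow each strategy's share in proportion to its average score."""
    fitness = {
        a: sum(shares[b] * SCORE[a][b] for b in shares) for a in shares
    }
    mean_fitness = sum(shares[a] * fitness[a] for a in shares)
    return {a: shares[a] * fitness[a] / mean_fitness for a in shares}


def run_ecology(generations=50):
    """Start from equal shares and iterate the replacement rule."""
    shares = {name: 1 / 3 for name in SCORE}
    for _ in range(generations):
        shares = next_generation(shares)
    return shares
```

Starting from equal thirds, always-defect gains a little ground at first by feeding on the unconditional cooperators. Once those thin out it collapses, and what remains is mostly tit for tat plus some unconditional cooperators.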

Foundation of reciprocal cooperation

The lessons described above apply in environments that support cooperation, but whether cooperation is supported at all depends crucially on the probability (called ω [omega]) that the players will meet again,^[42] also called the discount parameter or, poetically, the shadow of the future. When ω is low – that is, the players have a negligible chance of meeting again – each interaction is effectively a single-shot Prisoner's Dilemma game, and one might as well defect in all cases (a strategy called "ALL D"), because even if one cooperates there is no way to keep the other player from exploiting that. But in the iterated PD the value of repeated cooperative interactions can become greater than the benefit/risk of a single exploitation (which is all that a strategy like TFT will tolerate).

Curiously, rationality and deliberate choice are not necessary, nor trust nor even consciousness,^[43] as long as there is a pattern that benefits both players (e.g., increases fitness), and some probability of future interaction. Often the initial mutual cooperation is not even intentional, but having "discovered" a beneficial pattern both parties respond to it by continuing the conditions that maintain it.

This implies two requirements for the players, aside from whatever strategy they may adopt. First, they must be able to recognize other players, to avoid exploitation by cheaters. Second, they must be able to track their previous history with any given player, in order to be responsive to that player's strategy.^[44]

Even when the discount parameter ω is high enough to permit reciprocal cooperation there is still a question of whether and how cooperation might start. One of Axelrod's findings is that when the existing population never offers cooperation nor reciprocates it – the case of ALL D – then no nice strategy can get established by isolated individuals; cooperation is strictly a sucker bet. (The "futility of isolated revolt".^[45]) But another finding of great significance is that clusters of nice strategies can get established. Even a small group of individuals with nice strategies with infrequent interactions can yet do so well on those interactions to make up for the low level of exploitation from non-nice strategies.^[46]

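When there is little chance of meeting the same opponent again, the tit-for-tat strategy stops working. The decision to cooperate is often neither rational nor deliberate, and it requires neither mutual trust nor conscious awareness. Often it only takes a mutually beneficial pattern and some prospect of interacting again in the future.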
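
How large the chance of meeting again has to be can be worked out with a little arithmetic. The sketch below assumes the usual payoffs (T=5, R=3, P=1, S=0, not given in the excerpt) and treats w as the probability that another round follows, so a payoff k rounds ahead is weighted by w to the power k. It compares tit for tat playing itself against two challengers that Axelrod's own analysis uses: always defecting, and alternating defect and cooperate. Treat it as an illustration of that comparison rather than a proof.

```python
T, R, P, S = 5, 3, 1, 0  # assumed payoffs: temptation, reward, punishment, sucker


def value_tft_vs_tft(w):
    """Mutual cooperation forever: R + w*R + w^2*R + ..."""
    return R / (1 - w)


def value_all_d_vs_tft(w):
    """Exploit once, then mutual defection: T + w*P + w^2*P + ..."""
    return T + w * P / (1 - w)


def value_alternate_vs_tft(w):
    """Alternate D, C, D, C against TFT: T + w*S + w^2*T + w^3*S + ..."""
    return (T + w * S) / (1 - w * w)


def tft_holds(w):
    """True if neither challenger does better against TFT than TFT itself."""
    best_challenger = max(value_all_d_vs_tft(w), value_alternate_vs_tft(w))
    return value_tft_vs_tft(w) >= best_challenger
```

With these payoffs the check passes at w = 0.7 and fails at w = 0.6; the break-even point is w = 2/3. Below that, a single exploitation is worth more than the stream of future cooperation.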

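Besides an environment/structure in which players can meet their opponents many times, two more environmental/structural conditions are needed for cooperation to emerge. First, a player must be able to tell who the current opponent is (there are several opponents in total). Second, a player must be able to look up the opponent's past record.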

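Even when the environment/structure is unfavorable and there are few cooperative players (cooperators and tit-for-tat players), those few can still earn a decent score from the high payoffs of their few successful cooperations, especially if they have enough social resources to link up and increase their chances of cooperating with one another.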
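
A rough calculation shows how small such a cluster can be. The sketch below assumes the usual payoffs (T=5, R=3, P=1, S=0), a population of always-defect "natives", and a small cluster of tit-for-tat newcomers who have a fraction p of their interactions with each other. It also assumes the newcomers are so few that the natives effectively only meet other natives. The function name and parameters are illustrative.

```python
T, R, P, S = 5, 3, 1, 0  # assumed payoffs: temptation, reward, punishment, sucker


def cluster_beats_natives(p, w):
    """Compare a clustered TFT newcomer with an always-defect native.

    p: fraction of a newcomer's interactions that are with other newcomers
    w: probability that any given match continues for another round
    """
    tft_vs_tft = R / (1 - w)            # cooperate forever
    tft_vs_all_d = S + w * P / (1 - w)  # suckered once, then mutual defection
    all_d_vs_all_d = P / (1 - w)        # mutual defection forever

    newcomer_score = p * tft_vs_tft + (1 - p) * tft_vs_all_d
    return newcomer_score > all_d_vs_all_d
```

At w = 0.9 the cluster needs only about 5% of its interactions to be with fellow tit-for-tat players (the exact break-even is p = 1/21). With no clustering at all (p = 0) a lone newcomer always does worse than the natives.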

Subsequent work

In 1984 Axelrod estimated that there were "hundreds of articles on the Prisoner's Dilemma cited in Psychological Abstracts",^[47] and estimated that citations to The Evolution of Cooperation alone were "growing at the rate of over 300 per year".^[48] To fully review this literature is infeasible. What follows are therefore only a few selected highlights.

Axelrod has a subsequent book, The Complexity of Cooperation,^[49] which he considers a sequel to The Evolution of Cooperation. Other work on the evolution of cooperation has expanded to cover prosocial behavior generally,^[50] and in religion,^[51] other mechanisms for generating cooperation,^[52] the IPD under different conditions and assumptions,^[53] and the use of other games such as the Public Goods and Ultimatum games to explore deep-seated notions of fairness and fair play.^[54] It has also been used to challenge the rational and self-regarding "economic man" model of economics,^[55] and as a basis for replacing Darwinian sexual selection theory with a theory of social selection.^[56]

Nice strategies are better able to invade if they have social structures or other means of increasing their interactions. Axelrod discusses this in chapter 8; in a later paper he and Rick Riolo and Michael Cohen^[57] use computer simulations to show cooperation rising among agents who have negligible chance of future encounters but can recognize similarity of an arbitrary characteristic (such as a green beard).

When an IPD tournament introduces noise (errors or misunderstandings) TFT strategies can get trapped into a long string of retaliatory defections, thereby depressing their score. TFT also tolerates "ALL C" (always cooperate) strategies, which then give an opening to exploiters.^[58]

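When communication breaks down, tit-for-tat players easily get locked into a long string of retaliation with ill-intentioned opponents, which drags their scores down sharply. Tit-for-tat players may also, by tolerating opponents who always cooperate, give exploiters an opening.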
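
The trap does not even need an ill-intentioned opponent. The small sketch below, with illustrative names and a single hand-placed error rather than random noise, lets two tit-for-tat players meet and garbles one move of player A. After that one slip they echo each other's defections indefinitely.

```python
def tit_for_tat(other_moves):
    """Cooperate first, then copy the other player's previous move."""
    return other_moves[-1] if other_moves else "C"


def play_with_one_error(rounds=10, error_round=3):
    """Two TFT players; A's move in `error_round` (0-based) comes out as D."""
    moves_a, moves_b = [], []
    for i in range(rounds):
        a = tit_for_tat(moves_b)
        b = tit_for_tat(moves_a)
        if i == error_round:
            a = "D"  # A meant to cooperate, but the move was misread
        moves_a.append(a)
        moves_b.append(b)
    return "".join(moves_a), "".join(moves_b)
```

Calling play_with_one_error() returns ("CCCDCDCDCD", "CCCCDCDCDC"): from the fourth round on, the two players take turns defecting and never get back to mutual cooperation.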

In a 2006 paper Nowak listed five mechanisms by which natural selection can lead to cooperation.^[60] In addition to kin selection and direct reciprocity, he shows that:

Indirect reciprocity is based on knowing the other player's reputation, which is the player's history with other players. Cooperation depends on a reliable history being projected from past partners to future partners.

Network reciprocity relies on geographical or social factors to increase the interactions with nearer neighbors; it is essentially a virtual group.

Group selection^[61] assumes that groups with cooperators (even altruists) will be more successful as a whole, and this will tend to benefit all members.

And there is the very intriguing paper "The Coevolution of Parochial Altruism and War" by Jung-Kyoo Choi and Samuel Bowles. From their summary:

Altruism—benefiting fellow group members at a cost to oneself —and parochialism—hostility towards individuals not of one's own ethnic, racial, or other group—are common human behaviors. The intersection of the two—which we term "parochial altruism"—is puzzling from an evolutionary perspective because altruistic or parochial behavior reduces one's payoffs by comparison to what one would gain from eschewing these behaviors. But parochial altruism could have evolved if parochialism promoted intergroup hostilities and the combination of altruism and parochialism contributed to success in these conflicts.... [Neither] would have been viable singly, but by promoting group conflict they could have evolved jointly.^[64]

They do not claim that humans have actually evolved in this way, but that computer simulations show how war could be promoted by the interaction of these behaviors.

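Other scholars have proposed a coevolution of parochial altruism and war. Altruists are willing to sacrifice themselves for the benefit of the group, while parochial altruism benefits an exclusionary group (for example through ethnic or political discrimination, including within one's own ranks). Common sense suggests that altruism and exclusionary behavior work strongly against an individual's own interests. But if there is serious friction between groups, and parochial altruism helps one group prevail over the others, then these groups will want to escalate the conflict further.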

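Doesn't this look like the current state of politics in Taiwan? So don't join exclusionary groups, and don't let yourself be brainwashed by an altruism that asks you to sacrifice the individual for the greater good.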

Please respect the author's copyright and let me know before reposting. Thank you.

shiningc
