Section 4.7 Repeated Prisoner's Dilemma
In this section we look at two players playing Prisoner's Dilemma repeatedly. We call this game an iterated Prisoner's Dilemma. Recall the general Prisoner's Dilemma matrix from previous sections, given again in Table 4.7.1.
| |Player 2: Cooperate|Player 2: Defect|
|Player 1: Cooperate|\((3, 3)\)|\((0, 5)\)|
|Player 1: Defect|\((5, 0)\)|\((1, 1)\)|
Before playing the iterated version, think about how you would play the above game if you played it only once with an opponent. Let's also give the game some context, as in the following exercise.
Exercise 4.7.2. A single internet purchase.
Suppose the above matrix represents the situation of purchasing an item (say, a used textbook) on the internet where both parties are untraceable. You agree to send the money at the same time that the seller agrees to send the book. Then we can think of Cooperating as each of you sending the money or the book, and Defecting as not sending the money or the book. Why might a player Cooperate? Why might a player Defect?
Exercise 4.7.3. Repeated internet purchases.
Now suppose you wish to continue to do business with the other party. For example, instead of buying a used textbook, maybe you are buying music downloads. Why might a player cooperate? Why might a player defect? Do these reasons differ from your reasons in Exercise 4.7.2?
It is possible that your answers to the above questions depended on the context, so let's go back to just thinking about the game as a simple matrix game. Think about how you played the Class-wide Prisoner's Dilemma, but now think about repeating the game several times with the same player. Do you think your strategy would change? Remember, if we repeat the game we are not restricted to playing a pure strategy.
Exercise 4.7.4. Strategy for repeated Prisoner's Dilemma.
Suggest a strategy for playing the Prisoner's Dilemma in Table 4.7.1 repeatedly. DON'T SHARE YOUR STRATEGY WITH ANYONE YET!
Now let's see how your strategy works by actually playing the game several times.
Exercise 4.7.5. Play Prisoner's Dilemma repeatedly.
Play 10 rounds of Prisoner's Dilemma with someone in class. Use your suggested strategy. Keep track of the payoffs. Did your strategy seem effective? What should it mean to have an “effective” strategy?
We are now going to play a Prisoner's Dilemma Tournament! Several strategies are suggested below. Choose one of the strategies below (or keep playing with your own strategy). You are to play your strategy against everyone else in the class. Keep a running total of your score. Don't tell your opponents your strategy.
Some possible strategies:
Strategy: Defection. Your strategy is to always choose DEFECT (D).
Strategy: Cooperation. Your strategy is to always choose COOPERATE (C).
Strategy: Tit for Tat. Your strategy is to play whatever your opponent just played. Your first move is to COOPERATE (C), but then you need to repeat your opponent's last move.
Strategy: Tit for Two Tats. Your strategy is to COOPERATE (C) unless your opponent DEFECTS twice in a row. After two D's you respond with D.
Strategy: Random (1/2, 1/2). Your strategy is to COOPERATE (C) randomly 50% of the time and DEFECT (D) 50% of the time. [Note: it will be hard to be truly random, but try to play each option approximately the same amount.]
Strategy: Random (3/4, 1/4). Your strategy is to COOPERATE (C) randomly 75% of the time and DEFECT (D) 25% of the time. [Note: it will be hard to be truly random, but try to play C more often than D.]
Strategy: Random (1/4, 3/4). Your strategy is to COOPERATE (C) randomly 25% of the time and DEFECT (D) 75% of the time. [Note: it will be hard to be truly random, but try to play D more often than C.]
Strategy: Tit for Tat with Occasional Surprise D. Your strategy is to play whatever your opponent just played. Your first move is to COOPERATE (C), but then you need to repeat your opponent's last move. Occasionally, you will deviate from this strategy by playing D.
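If you would like to experiment with these strategies outside of class, here is a minimal Python sketch of them. It is only one possible reading of the descriptions above, not an official implementation: the function names, the helper `play_match`, and the 10% chance of a "surprise" D (`p_surprise`) are all assumptions made for illustration. The payoffs are the ones in Table 4.7.1.

```python
import random

# Payoffs from Table 4.7.1: (my move, opponent's move) -> (my points, opponent's points)
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

# Each strategy receives the list of the opponent's past moves (oldest first)
# and returns "C" or "D".

def always_defect(opp_history):
    return "D"

def always_cooperate(opp_history):
    return "C"

def tit_for_tat(opp_history):
    # Cooperate first, then copy the opponent's last move.
    return opp_history[-1] if opp_history else "C"

def tit_for_two_tats(opp_history):
    # Defect only if the opponent's last two moves were both D.
    return "D" if opp_history[-2:] == ["D", "D"] else "C"

def make_random(p_cooperate, rng=random):
    # Returns a strategy that cooperates with probability p_cooperate.
    def strategy(opp_history):
        return "C" if rng.random() < p_cooperate else "D"
    return strategy

def make_tft_with_surprise(p_surprise=0.1, rng=random):
    # Tit for Tat, but with probability p_surprise play D anyway.
    # The value 0.1 is an assumption; the text only says "occasionally".
    def strategy(opp_history):
        if rng.random() < p_surprise:
            return "D"
        return tit_for_tat(opp_history)
    return strategy

def play_match(strategy_a, strategy_b, rounds=10):
    # Play the given number of rounds and return both players' total scores.
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # A sees only B's past moves
        b = strategy_b(moves_a)  # B sees only A's past moves
        points_a, points_b = PAYOFFS[(a, b)]
        score_a += points_a
        score_b += points_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b
```

For example, `play_match(tit_for_tat, always_defect)` returns `(9, 14)`: Tit for Tat gets 0 in the first round and 1 in each of the remaining nine, while the defector gets 5 and then 1 nine times.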
Exercise 4.7.6. A Prisoner's Dilemma tournament.
WITHOUT SHARING YOUR STRATEGY, play Prisoner's Dilemma 10 times with each of the other members of the class. Keep track of the payoffs for each game and your total score.
After playing Repeated Prisoner's Dilemma as a class, can you determine who used which strategy? At this point you may share your strategy with others. Was your strategy more effective against some strategies than against others? If some of the above strategies were not used, can you guess how your strategy would have done against them?
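To explore match-ups that did not occur in your class, you could simulate a small tournament. The sketch below is an illustration under stated assumptions, not a record of any real tournament: the names `round_robin` and `play_match` are hypothetical helpers, only four of the deterministic strategies are entered, and each pair plays 10 rounds using the payoffs in Table 4.7.1.

```python
from itertools import combinations

# Payoffs from Table 4.7.1: (my move, opponent's move) -> (my points, opponent's points)
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

# Strategies map the opponent's past moves (oldest first) to "C" or "D".
def always_defect(opp_history):
    return "D"

def always_cooperate(opp_history):
    return "C"

def tit_for_tat(opp_history):
    return opp_history[-1] if opp_history else "C"

def tit_for_two_tats(opp_history):
    return "D" if opp_history[-2:] == ["D", "D"] else "C"

def play_match(strategy_a, strategy_b, rounds=10):
    # Return the total scores of both players after the given number of rounds.
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = strategy_a(moves_b), strategy_b(moves_a)
        points_a, points_b = PAYOFFS[(a, b)]
        score_a += points_a
        score_b += points_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b

def round_robin(entrants, rounds=10):
    # Every entrant plays every other entrant once; return running totals by name.
    totals = {name: 0 for name in entrants}
    for name_a, name_b in combinations(entrants, 2):
        score_a, score_b = play_match(entrants[name_a], entrants[name_b], rounds)
        totals[name_a] += score_a
        totals[name_b] += score_b
    return totals

# A hypothetical four-player class, one player per strategy.
entrants = {
    "Defection": always_defect,
    "Cooperation": always_cooperate,
    "Tit for Tat": tit_for_tat,
    "Tit for Two Tats": tit_for_two_tats,
}

totals = round_robin(entrants)
for name, total in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:18s}{total:4d}")
```

Try changing who enters the tournament, for instance by adding several copies of one strategy under different names. The ranking can depend heavily on the mix of opponents, so a single run should not be read as showing which strategy is "best."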
Exercise 4.7.7. Effectiveness of your strategy.
Describe which opponents' strategies seemed to give you more points and which seemed to give you fewer.
Exercise 4.7.8. Winning strategies.
Describe the strategy or strategies that had the highest scores in the tournament. Does this seem surprising? Why or why not? How do the winning strategies compare to the strategy you suggested in Exercise 4.7.4?
What strategies seem the most rational? Are pure strategies the most rational? Does it depend on what sort of strategy your opponent is playing?
Exercise 4.7.9. Rationality in repeated Prisoner's Dilemma.
How does Repeated Prisoner's Dilemma differ from the “one-time” Prisoner's Dilemma? Try to think in terms of rational strategies.
As Exercise 4.7.3 suggests, we can think of real-life interactions that can be modeled as a Prisoner's Dilemma.
Exercise 4.7.10. Example of Repeated Prisoner's Dilemma in real life.
Describe a situation from real life that resembles a Repeated Prisoner's Dilemma.
Repeated or Iterated Prisoner's Dilemma has applications to biology and sociology. If you think of higher point totals as “success as a species” in biology or “success of a society” in sociology, we can try to determine which strategies seem the most effective or successful. Individuals do not need the highest point total to be successful, but individuals with low point totals will not succeed. This is just like grades in a course: you don't need the highest score to pass, but a very bad score will fail. To model a society, think about what happens to defectors in a society of cooperators, or to cooperators in a society of defectors. Who will be able to succeed?
Exercise 4.7.11. Only a few defectors.
How do a few defectors fare in a society of mostly cooperators? How do the cooperators fare? (In other words, who will be more successful?) Keep in mind that everyone is playing with lots of cooperators and only a few defectors. Who will have the most points, cooperators or defectors?
Exercise 4.7.12. Only a few cooperators.
How do a few cooperators fare in a society of mostly defectors? How do the defectors fare? (In other words, who will be more successful?) Keep in mind that everyone is playing with lots of defectors and only a few cooperators. Who will have the most points, cooperators or defectors?
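One way to check your answers to the last two exercises is with a short calculation. This is only a sketch under simplifying assumptions that are not part of the exercises: everyone plays a pure strategy, each opponent is drawn at random from the society, and \(p\) (a symbol introduced here) is the fraction of the society that cooperates. Using the payoffs in Table 4.7.1, the expected points per round are

\[
E_{\text{cooperator}} = 3p + 0(1 - p) = 3p, \qquad E_{\text{defector}} = 5p + 1(1 - p) = 1 + 4p.
\]

For instance, with \(p = 0.9\) a cooperator expects \(2.7\) points per round and a defector \(4.6\); with \(p = 0.1\) the values are \(0.3\) and \(1.4\). Compare these with the two extremes: a society made up entirely of cooperators gives everyone \(3\) points per round, while a society made up entirely of defectors gives everyone \(1\).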
After thinking about individuals in a society playing pure strategies, what happens to individuals who are playing some of the mixed strategies listed above?
Exercise 4.7.13. A society of TIT-FOR-TATers.
Now consider a society of mostly TIT-FOR-TATers. How do a few defectors fare in a society of mostly TIT-FOR-TATers? How do the TIT-FOR-TATers fare? How would a few cooperators fare with the TIT-FOR-TATers? Would the evolution of such a society favor cooperation or defection?
The TIT-FOR-TAT strategy is particularly interesting in an iterated Prisoner's Dilemma. It has a few special characteristics that lead to success. First, it is responsive: it responds to the moves of the other player. If the other player is cooperating, the TIT-FOR-TAT strategy will be able to gain the 3 points on each round. If the other player is defecting, it will defect so as to not keep getting the sucker's payoff of 0. The random strategies and pure strategies, by contrast, are not responsive. They do not respond to how the other player is playing. Chances are when you played in the tournament, you wanted to be able to adapt your strategy to respond to how your opponent was playing.
The TIT-FOR-TAT strategy is also nice in that it starts by cooperating. If it meets another cooperator it will continue to cooperate. If the opponent at some point begins cooperating, it will, too. However, it is also unexploitable. This means that a defector cannot keep taking advantage of the “niceness.” It “punishes” any defection with an in-kind defection on the very next round.
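The properties described above can be seen by tracing TIT-FOR-TAT round by round against a fixed sequence of opponent moves. The following Python sketch is an illustration only: the function name `trace_tit_for_tat` and the two opponent scripts are made up for this example, and the payoffs are those of Table 4.7.1.

```python
# Payoffs from Table 4.7.1: (TIT-FOR-TAT's move, opponent's move) -> (TFT points, opponent points)
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def trace_tit_for_tat(opponent_moves):
    # opponent_moves is a string such as "CCDD", one letter per round.
    # Returns TIT-FOR-TAT's moves as a string, its total score, and the opponent's total.
    tft_moves = ""
    tft_score = opp_score = 0
    previous_opp_move = "C"  # treating the first round as if the opponent had cooperated
    for opp_move in opponent_moves:
        tft_move = previous_opp_move  # copy the opponent's last move
        tft_points, opp_points = PAYOFFS[(tft_move, opp_move)]
        tft_score += tft_points
        opp_score += opp_points
        tft_moves += tft_move
        previous_opp_move = opp_move
    return tft_moves, tft_score, opp_score

# Against a constant defector, TIT-FOR-TAT is caught only in the first round.
print(trace_tit_for_tat("DDDDDDDDDD"))  # ('CDDDDDDDDD', 9, 14)

# Against an opponent who defects twice and then returns to cooperating,
# TIT-FOR-TAT answers each defection once and then resumes cooperating.
print(trace_tit_for_tat("CCDDCCCCCC"))  # ('CCCDDCCCCC', 27, 27)
```

In the first trace the defector finishes ahead by exactly the 5 points gained in the opening round, and both players score well below the 30 points each that mutual cooperation would give over 10 rounds.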
The TIT-FOR-TAT behavior has been observed in some animal populations, but you also might be able to think of situations in your own life where you or your friends have employed such a strategy with each other! Has it been effective for you?