Repeated games

From Simulace.info
Revision as of 23:23, 26 January 2014 by Xdvob13 (talk | contribs) (Strategie in the tournament)
Jump to: navigation, search

Introduction

In previous chapters of game theory we introduced one-shot games like prisoners dilemma. In other words situations when we are allowed to do one action that result in some outcome (payoff). For example a situation when someone wants to enjoy fast car ride. If he drives really fast he enjoys fast ride and he´s happy but he also rist that police will catch him and arrest. So he must choose between driving fast or not fast according to his ulitily from one decision. But in the real world we often meet situations that repeat or their consequences affect other situations. We can say that our live is set of games in some order and we are playing them. And all of us know that this games, this situations repeat very often. We go to work every day or go shopping for example. Sequence of games that affect other games and are played by the same players is called repeated game.


We can say that every human is rational and follows his own profit. That in other words mean that a person should not have problem with betrayal of other person to get higher payoff. But what if game continues and this betrayal can be avenger? Under these circumstances is the first person willing to betray the second in the first round? It depends on the situation. It depends on the payoff. So we can say that in a game of various number of persons we can achieve cooperation for a time that players benefit from cooperation.


Game theory is often quoted by politicians to gave their opinios seriousness and as a way how to look smarter that they are. And it is also a great example of repeated game. Politicians like senators often vote about laws they propose. Options for the vote are support or not to support. If the vote was only a matter of good proposal/bad proposal, politicians would vote exactly the same. But because they follow their own goals the vote depends on many different things. Lets say that senator A propose a law that after the adoption brings a great popularity to him. But he needs senator B to support his proposal. So he asks senator B for support and he agrees only on the condition that he will in return later vote for his proposal. Regardless of what they promised to each other they now have several options. Senator B can vote for proposel and than expect support for his proposal or defect already in round 1, gain some profit from senator A not having that popularity but senator A will be probably angry and not support his proposal too. If he chooses to support him now have senator A some options. He has gained popularity for his law proposal and he can now return the favor or betray. Remember he already gained popularity so he can make additional profit from betraying his opponent. But if he do so he can probably expect revenge every time he wants to cooperate with senator B. There will be surely more voting in the future so what is the best strategy to maximize the payoff? That is the knowledge we gain from studying repeated games.

The Evolution of Cooperation

The evolution of cooperation can refer to:

  • the study of how cooperation can emerge and persist (also known as cooperation theory) as elucidated by application of game theory
  • a 1981 paper by political scientist Robert Axelrod and evolutionary biologist W. D. Hamilton in the scientific literature
  • a 1984 book by Axelrod that expanded on the paper and popularized the study.


Application of game theory

Game theory and mathematical analysis of behaviour have gained great popularity mainly due to a period of the WW2 and the Cold War. In 1944 John von Neumann and Oskar Morgenstern published famous Theory of Games and Economic Behavior. This book explained how to use game theory for analyzing and developing optimal strategies for military, economy and other purposes. The Cold War was also a great period for game theory usage. US government funded teams that should analyze possible strategies and practices against the enemy. There were two hostile countries capable of total destruction of each other. This state is from view of game theory incredibly interesting. Lets assume the situation that one side lauches nuclear rockets. Rockets gonna hit targets in minutes so second side of conflict has the opportunity to launch counterattack. But it is really necessary to launch nuclear attack? In few minutes they are gonna be dead anyway so they don´t have to care what will be after them because even their children will be dead. So retaliatory nuclear strike would only killed milions of people. So what is the point of launching it? Revenge? And if not, what is the point of even having nuclear weapons? The fact is that not having the opportunity to fatal strike is already case of loosing in the Cold War. Many people today agrue if one of side would actually use nuclear weapons in no matter how escalated situation. Reason to have nuclear weapons is most definitely the threat. But what is the point of threat if our enemy know that we are not gonna use them. Well the enemy must not know this. That is the reason why US presindents and high-ranked military officiers were instructed to act on public a little "crazy and insane" to prove that they will launch the attack no matter what.


Robert Axelrod and his tournaments

Robert Axelrod
Axelrod.gif
Born:
  • May 27, 1943

Robert Axelrod is the Walgreen Professor for the Study of Human Understanding at the University of Michigan. He has appointments in the Department of Political Science and the Gerald R. Ford School of Public Policy. Prior to coming to Michigan he taught at the University of California, Berkeley (1968-74). He holds a BA in mathematics from the University of Chicago (1964), and a PhD in political science from Yale (1969). He is best known for his interdisciplinary work on the evolution of cooperation which has been cited more than 30,000 times.


In 1981 Robert Axelrod published article dealing with the fundamental question: When should a person cooperate, and when should be person selfish, in an ongoing interaction with another person? He used a simple way to represent the type of situation that gives rise to these problems. He explained everything on the particular kind of game called the iterated Prisoner´s Dilemma.

In the same year Robert Axelrod also held a computer tournament in which 15 different strategies for repeated prisoner's dilemma, sent by leading game theorists, competed with each in the game of 200 routes (total of 15 × 15 games). Payout was always calculated as the sum of the basis matrix:


Tournament.jpg



Strategie in the tournament

In addition to the strategies Always Cooperates and Always Defects Axelrod divided strategies used into the following groups:


  • Nice strategies - never betray the first (only in retaliation)
  • Dirty strategies - at least sometimes betray first
  • Relief Strategies - can reciprocate, but has a short memory, forgets the old sins
  • Unforgiving strateges - never forget old wrongs, never break from the cycle of mutual retaliation even against conciliatory opponent
  • Strategies without envy - only thing that matters is profit, beating the opponent is not important
  • Envy strategies - the goal is to defeat the opponent




To the great surprise of all participants gained the most points very simple strategy "tit for tat" , which sent Anatol Rapoport a psychologist and expert on the game theory.

tit for tat - In the first round cooperates and in other rounds repeats the move of opponent.

Finitely Repeated Games

The game is played for a finite and known number of rounds, for example, 2 rounds/repetitions.

Lets recall PD game.

Pd1.jpg

What happens if this game was played T < ∞ times? We first need to decide what the equilibrium notion is. Natural choice, subgame perfect Nash equilibrium (SPE). Recall: SPE ⇐⇒ backward induction. Therefore, start in the last period, at time T . What will happen?

Dominant strategy for the last period is "defect" regardless of the history of the game. The subgame that has the start at time point T has dominant strategy equilibrium "defect,defect". We move to stage T - 1. We now by backward induction that at T players play "defect,defect". So subgame T - 1 has also dominant strategy equilibrium. Knowing this we can say that there exists a unique subgame perfect equilibrium "defect,defectL at each date.


Theorem

Consider repeated game G T (δ) for T < ∞. Suppose that the stage game G has a unique pure strategy equilibrium a∗. Then G T has a unique SPE. In this unique SPE, at = a∗ for each t = 0,1,..., T regardless of history.



Infinitely Repeated Games

The game has no predetermined length. Players act as though it will be played indefinitely or it ends only with some probability. This type of game is interesting but unlike finitely repeated they are not rare. How often we find a situation that we know exactly when it gonna ends and what it will be like? On the other hand, we routinely play many games that are infinitely repeated. Our goal is to find subgame perfect equilibrias in these types of games.

Because we eliminated backward inducion from the model, now we can see that under these circumstances (neither player knows when the game ends) cooperation (strategy vector {N,N}) becomes the equilibrium solution for both players. As we know the solution is always not only one. In this case there is more equilibrium solutions and we call this Folk Theorems.

The first folk theorem states that any attainable individual and rational pay-vector can be repeated game Nash equilibrium. Second folk theorem says that achievable payout vector that Pareto dominates any payroff vector of single game equilibrium can be SPE of repeated game.

Strategies

Unlinke finitely repeated there are many possible strategies to play. We discussed some of them above in the article. Lets just sum well-known strategies for repeated PD:

  • Always Cooperates
  • Always Defects
  • Grudger, Spiteful
  • Tit for Tat
  • Mistrust
  • Naive Prober
  • Remorseful Prober
  • Hard Tit for Tat
  • Gradual
  • Gradual Killer
  • Hard Tit for 2 Tats
  • Soft Tit for 2 Tats
  • Slow Tit for Tat
  • Periodically DDC
  • Periodically CCD
  • Soft Majority
  • Hard Majority
  • Pavlov´s
  • Pavlov´s Pn
  • Random
  • Hard Joss
  • Soft Joss
  • Generous Tit for Tat
  • Better and Better
  • Worse and Worse