GitXplorerGitXplorer
g

PrisonnersDilemma

public
0 stars
0 forks
0 issues

Commits

List of commits on branch master.
Unverified
51125ce01955583d1ee8268a735f27b659349c4c

Update README.md

ggiltho committed 9 years ago
Unverified
4079abb7251f24ee58b578632784aa5c05e9a55e

Update README.md

ggiltho committed 9 years ago
Unverified
7d6cee9a6c203e0e1b9227fb2845141be199fbd5

Create README.md

ggiltho committed 9 years ago
Unverified
d51a7455dc21579baea59b4da241faf865440d65

Added images for the README file

ggiltho committed 9 years ago
Unverified
9b0fef682a58098b203e04e2a41a84f762d94d7c

Merge branch 'master' of https://github.com/Giltho/PrisonnersDilemma

ggiltho committed 9 years ago
Unverified
c5a3bc71ba9eab5fcb6315fc3e3a8c49927cd6ad

Only commit

ggiltho committed 9 years ago

README

The README file for this repository.

PrisonnersDilemma

This project is a very simple application of genetic programming to the facinating "iterated prisonner's dilemma" game.

The iterated prisonner's dilemma

The prisonner's dilemma

According to Wikipedia : > Albert W. Tucker formalized the game with prison sentence rewards and named it, "prisoner's dilemma" (Poundstone, 1992), presenting it as follows: Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of communicating with the other. The prosecutors lack sufficient evidence to convict the pair on the principal charge. They hope to get both sentenced to a year in prison on a lesser charge. Simultaneously, the prosecutors offer each prisoner a bargain. Each prisoner is given the opportunity either to: betray the other by testifying that the other committed the crime, or to cooperate with the other by remaining silent. The offer is: * If A and B each betray the other, each of them serves 2 years in prison * If A betrays B but B remains silent, A will be set free and B will serve 3 years in prison (and vice versa) * If A and B both remain silent, both of them will only serve 1 year in prison (on the lesser charge)

I chose, to be clearer, instead of the years in prison system, a point system presented in an article from Pour la Science, written by the french searcher J.-P. DELAHAYE

/ P1 cooperates P1 betrays
P2 cooperates P2 : 3 / P1 : 3 P2 : 0 / P1 : 5
P2 betrays P2 : 5 / P1 : 0 P2 : 1 / P1 : 1

In a 1 Vs. 1 game, the best strategy, the one that will always get you at least as many, and maybe more points than your opponent will be to betray. That is why the iterated version is very interesting.

The iterated version

When many people play together, everyone plays 1000 rounds agains every players, including himself, betraying is a very bad strategy, because after several round of betraying, the opponent will certainly start to also betray and you will both get a very low amount of point (1 each) instead of both cooperating and winning a higher amount of points (3 each). Therefore, other strategies are to be found.

I used, for my study, the 12 strategies presented by J.-P. DELAHAYE in his article.

Genetic algorithm

Using these 12 strategies, I tried creating new determinist strategies using a genetic algorithm that will try to get the best score possible.

The goal is to optimise a vector of continue values between 0 and 1 that determine the next move that will be played based on coefficient calculated upon the statistics of the precedent rounds.

Results

The results, although not perfect, were really promissing and open to ameliorations. Three very interesting points were to be recorded : * The average results were quite above average : ![alt tag](https://raw.githubusercontent.com/Giltho/PrisonnersDilemma/master/Images/average.png) * The best score I had after many tries, comming from a surprisingly good strain got the best score of all the strategies : ![alt tag](https://raw.githubusercontent.com/Giltho/PrisonnersDilemma/master/Images/best%20pop.png)

Here is the histogram of all the strategies scores (it's an average since it changes every time, considering that random changes everybody's score) alt tag

  • Finally, the algorithm has a very good adaptability capacity, since it got a very good score when I tried it with only two strategies, 'always betray' and 'always cooperates", where the best score you can have is 6000 (always betray) : alt tag