Genetic Prisoner: Annotated source code
This sample Python agent uses a basic genetic programming algorithm. It plays three rounds of Prisoner's Dilemma at a time, making moves according to a gene that prescribes three actions. After every three rounds it scores the gene, keeps track of the gene with the best outcome so far, and tries again with a slightly mutated copy of that best gene.

from java.lang import Runnable
from pd.PrisonersDilemmaActor.Action import COOPERATE, DEFECT
import random

You write your agent as a Python class that has a run() method (mimicking Java's Runnable). The engine instantiates the class with no arguments and sets an actor attribute on the object before calling run().
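
To try an agent without the game engine, the lifecycle described above can be imitated in plain Python. This is only a sketch: FakeActor, FakeGame, FakeResult, AlwaysCooperate and drive are hypothetical names, the payoff of 1 is arbitrary, and the strings stand in for the real Action constants. Only getGame(), isRunning(), act() and myPayoff are taken from the agent code on this page.

```python
class FakeResult(object):
    """Stand-in for IterationResult; only myPayoff is modelled."""
    def __init__(self, my_payoff):
        self.myPayoff = my_payoff


class FakeGame(object):
    """Stand-in game that runs for a fixed number of rounds."""
    def __init__(self, rounds):
        self.rounds_left = rounds

    def isRunning(self):
        return self.rounds_left > 0


class FakeActor(object):
    """Stand-in actor that records every action the agent plays."""
    def __init__(self, rounds):
        self.game = FakeGame(rounds)
        self.actions = []

    def getGame(self):
        return self.game

    def act(self, action):
        self.actions.append(action)
        self.game.rounds_left -= 1
        return FakeResult(1)  # arbitrary payoff, an assumption


class AlwaysCooperate(object):
    """Trivial agent; a real one would extend java.lang.Runnable."""
    def run(self):
        actor = self.actor
        game = actor.getGame()
        while game.isRunning():
            actor.act("COOPERATE")  # string stand-in for the Action constant


def drive(agent_class, rounds):
    agent = agent_class()            # instantiated with no arguments
    agent.actor = FakeActor(rounds)  # actor attribute set before run()
    agent.run()
    return agent.actor.actions


played = drive(AlwaysCooperate, 4)
```

Here played holds the four actions the agent submitted, which makes it easy to check an agent's behavior before handing it to the real engine.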

Here we put actor and game in local scope just to make code more legible.

class genetic_prisoner(Runnable):
    def run(self):
        actor = self.actor
        game = actor.getGame()
We chose (for clarity) to store all the state in the function scope of run(), so we initialize it here. We could instead have stored state on the agent object and set it up in __init__(). The gene is a sequence of three actions. We'll measure the score it gets and keep track of the best gene.

        gene = [COOPERATE, COOPERATE, DEFECT]
        gene_score = 0
        index = 0
        best_gene = [None, None, None]
        best_gene_score = -1  # to ensure we copy the first generation
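
As an aside, the alternative mentioned above (keeping state on the agent object) might look roughly like the following. It is a sketch, not part of the sample agent: StatefulPrisoner, next_action and record are hypothetical names, and the strings stand in for the real Action constants so the snippet runs without the engine. A real agent would still extend Runnable and define run().

```python
# String stand-ins for the pd Action constants (an assumption for this sketch).
COOPERATE, DEFECT = "COOPERATE", "DEFECT"


class StatefulPrisoner(object):
    def __init__(self):
        # No extra arguments, since the engine instantiates the class with none.
        # Same initial values as the run()-local version.
        self.gene = [COOPERATE, COOPERATE, DEFECT]
        self.gene_score = 0
        self.index = 0
        self.best_gene = [None, None, None]
        self.best_gene_score = -1  # ensures the first generation is copied

    def next_action(self):
        """Return the action the current gene prescribes for this round."""
        return self.gene[self.index]

    def record(self, payoff):
        """Add one round's payoff and advance to the next gene position."""
        self.gene_score += payoff
        self.index += 1


agent = StatefulPrisoner()
```

The trade-off is mostly one of taste: attributes make the state visible to helper methods, while locals keep everything readable in one place.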

Now we begin a loop that will execute until the game is no longer running. It calls act() with the next action (COOPERATE or DEFECT) in the gene's sequence.

act() waits for the round to be played, then returns an IterationResult that contains each player's action and score for that round.

        while (game.isRunning()):
            result = actor.act(gene[index])
            gene_score += result.myPayoff
            index += 1

Here's the genetic modification. When we reach the end of the gene we evaluate how well it did. If it's the best-performing gene, it becomes the standard model for future mutations.

Then we make a new gene by copying the standard and introducing a mutation.

            if index == len(gene):
                # end this generation and start the next
                if gene_score > best_gene_score:
                    copy_gene(gene, best_gene)
                    best_gene_score = gene_score
                copy_gene(best_gene, gene, mutate=True)
                gene_score = 0
                index = 0

Up to this point, the agent behavior is written entirely within the scope of the run function and it is fairly easy to follow along. The following utility functions, however, fit well outside the class.

copy_gene can copy a list faithfully, or copy it with a mutation. The mutation changes one action (selected at random) from COOPERATE to DEFECT or vice versa; this is the basis of the agent's adaptation strategy.

def copy_gene(original, target, mutate=False):
    if mutate:
        mutation_index = random.randrange(len(original))
    else:
        mutation_index = -1  # which will mutate nothing
    for i in range(len(original)):
        if i == mutation_index:
            target[i] = invert(original[i])
        else:
            target[i] = original[i]

def invert(action):
    if action == COOPERATE:
        return DEFECT
    if action == DEFECT:
        return COOPERATE
    raise Exception("Can't invert an unknown action %s" % action)
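
To watch the score-then-mutate cycle without the game engine, the same idea can be run against a fixed opponent. The sketch below makes several assumptions: the payoff table uses commonly quoted Prisoner's Dilemma values (the engine's real payoffs may differ), the opponent always plays the same action, strings stand in for the Action constants, and mutated_copy and evolve are hypothetical helpers rather than part of the agent.

```python
import random

# String stand-ins for the pd Action constants (an assumption for this sketch).
COOPERATE, DEFECT = "COOPERATE", "DEFECT"

# Assumed payoffs: (my_action, opponent_action) -> my score.
PAYOFF = {
    (COOPERATE, COOPERATE): 3,
    (COOPERATE, DEFECT): 0,
    (DEFECT, COOPERATE): 5,
    (DEFECT, DEFECT): 1,
}


def mutated_copy(gene, rng):
    """Return a copy of gene with one randomly chosen action inverted."""
    copy = list(gene)
    i = rng.randrange(len(copy))
    copy[i] = DEFECT if copy[i] == COOPERATE else COOPERATE
    return copy


def evolve(generations, opponent_action=DEFECT, seed=0):
    """Score a gene, keep the best one seen, then try a mutant of the best."""
    rng = random.Random(seed)
    gene = [COOPERATE, COOPERATE, DEFECT]
    best_gene, best_score = None, -1
    for _ in range(generations):
        score = sum(PAYOFF[(action, opponent_action)] for action in gene)
        if score > best_score:
            best_gene, best_score = list(gene), score
        gene = mutated_copy(best_gene, rng)
    return best_gene, best_score


best_gene, best_score = evolve(generations=200)
```

Under these assumed payoffs and an opponent that always defects, the search should settle on an all-DEFECT gene, since each DEFECT earns 1 where COOPERATE earns 0.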

This agent has plenty of shortcomings. For example, even if it finds the gene with the optimal score, it never plays that gene again: every later generation plays a mutated copy of it. You may be able to improve upon this.
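
One possible direction, offered only as a sketch, is to mutate the best gene some of the time and replay it unchanged the rest of the time. The helper next_gene and its explore_rate parameter are hypothetical, and strings stand in for the Action constants so the snippet runs on its own.

```python
import random

# String stand-ins for the pd Action constants (an assumption for this sketch).
COOPERATE, DEFECT = "COOPERATE", "DEFECT"


def invert(action):
    """Swap COOPERATE and DEFECT."""
    return DEFECT if action == COOPERATE else COOPERATE


def next_gene(best_gene, explore_rate=0.5, rng=random):
    """Return the gene to play next.

    With probability explore_rate one action is inverted (exploration);
    otherwise the best gene is replayed unchanged (exploitation).
    """
    gene = list(best_gene)
    if rng.random() < explore_rate:
        i = rng.randrange(len(gene))
        gene[i] = invert(gene[i])
    return gene


best = [DEFECT, DEFECT, COOPERATE]
replayed = next_gene(best, explore_rate=0.0)  # never mutates
explored = next_gene(best, explore_rate=1.0)  # always mutates one action
```

In the agent, a call like this would take the place of copy_gene(best_gene, gene, mutate=True). What value of explore_rate works well is an open question that depends on the opponent.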

Another Python example can be seen at PyTitForTat.