Assignment: Game Playing

Your goal is to create an agent that can accurately play Othello (also known as Reversi) using a minimax strategy with alpha-beta pruning. Here are the rules, and you can play it online. Play a few games, especially if you haven't played Othello before.

To get started with your assignment, copy the files othelloBoard.py and othelloPlayers.py into your directory. othelloBoard.py contains everything necessary to get the Othello game moving. Use Python's help facility to see what what methods are in there that you might find useful. You should not modify othelloBoard.py in any way, but I am supplying the code in the event that you find it instructional to look at. To actually start up an Othello game, run python othelloBoard.py from a command prompt. The game allows for two players, each of which can be a human player or a computer player. You'll find a class for each of these players in othelloPlayers.py.

You need to submit code for following two functions/methods inside othelloPlayers.py:

heuristic(board)

Given a particular Othello board, return a heuristic value indicating the value of the board where a large positive number indicates a good position for white, and a large negative number indicates a good position for black. I have already given you code for the function heuristic, but it is very silly. You might need to experiment with this to come up with a good heuristic function. (You may not call the chooseMove function from within heuristic. Doing so would effectively not be calculating the heuristic on the current state, but would instead be looking ahead further.)

chooseMove(self,board) (within the ComputerPlayer class)

Given an Othello board, return a tuple (row,column) indicating the move that the computer chooses to make. Note that rows and columns start at 1, not 0, in order to make the game more playable for non-programmers.

You will code this up twice: once with basic minimax, and the second time with alpha-beta pruning.

Some more details: The number of plies is stored as in instance variable in the ComputerPlayer class. For example, a value of 1 for plies means that you should consider all possible moves you might make, and merely choose the one that gets you the best heuristic value. We will test your code by supplying our own heuristic function to see if your chooseMove method provides a move to a board position that yields the same heuristic value that ours does. Therefore, code this up carefully keeping that in mind, and make sure that you use self.plies as the number of plies to work with. Similarly, make sure that you do not call your heuristic function directly; rather, use the heuristic function that was initialized in the constructor for your ComputerPlayer object (e.g., use self.heuristic).

Doing multiple parts

Part 1: Grab the code, run it, and familiarize yourself with it. Turn in some attempt at a heuristic function. This is not your final one, and not the one we will ultimately grade for some form of correctness, but turn in something reasonable. This is to motivate you to get started on the assignment. You may want to code up a non-pruning version of chooseMove to run against the heuristic, but at this stage, this is up to you.

Part 2: Turn in your program with a better heuristic function, and with chooseMove fully implemented without alpha-beta pruning.

Part 3: Turn in your program again, this time with alpha-beta pruning implemented. Do this by having _another_ ComputerPlayer class, this time called ComputerPlayerPruning. In your submitted code, chooseMove within ComputerPlayer should not have alpha-beta pruning, whereas chooseMove within ComputerPlayerPruning should. By doing so, you can debug by switching back and forth which kind of player object gets created by renaming your classes.

Grading

Here is how the assignment will be graded.

  1. The grader will test your heuristic function to see if it works and does the right sort of thing. (There is no single "correct" answer for this.)
  2. The grader will use use your ComputerPlayer class, but will use a version of the Othello game that constructs it with our own heuristic. The grader will supply a number of different Othello boards to your chooseMove method to see if you choose the correct move, and to see if you call the heuristic function the correct number of times.
  3. The grader will look at your code, and attempt to verify that you are doing alpha-beta pruning correctly.
  4. In our heuristic function, we will insert code to keep a count of the number of times that it has been called. Your alpha-beta pruning code should be checking heuristics considerably less than your non-pruning code, while still producing a move to a state with the same heuristic value than without it.
  5. If you succeed at the steps and have reasonable style, you will receive full credit for the assignment.

After all the game playing agents have been submitted, if we can we'll run a tournament on the heuristic functions to see which agents play the best. Let me know if you'd be interested in volunteering to run the tournament.