MiniMax with Alpha-Beta Cutoff [References]


We are considering two-player, perfect information games. The two players take turns and try respectively to maximize and minimize a scoring function (also called utility function). The two players are called respectively MAX and MIN. We assume that the MAX player makes the first move. We represent the game as a tree where the nodes represent the current position and the arcs represent moves. Since players take turns, successive nodes represent positions where different players must move. We call the nodes MAX or MIN nodes depending of who is the player that must move at that node. A game tree could be infinite. The leaves represent terminal positions, i.e. positions where MAX wins (score: +infinity) or MIN wins (score: -infinity). The ply of a node is the number of moves needed to reach that node (i.e. arcs from the root of the tree). The ply of a tree is the maximum of the plies of its nodes.

An imperfect information game is one where we do not expand fully the game tree and some of the leaves represent positions that could be analyzed further. For these positions, in place of a utility function value, we will have a scoring function value. Otherwise the game trees for perfect and imperfect information games will be treated alike. [There are interesting observations by Pearl on the interaction of ply and errors in the case of imperfect information game trees.]

The MINIMAX GAME STRATEGY for the MAX (MIN) player is to select the move that leads to the successor node with the highest (lowest) score. The scores are computed starting from the leaves of the tree and backing up their scores to their predecessor in accordance with the Minimax strategy. The problem with this strategy is that it explores each node in the tree.

	function MINIMAX(N) is
	begin
	   if N is a leaf then
	        return the estimated score of this leaf
	   else
	        Let N1, N2, .., Nm be the successors of N;
	        if N is a Min node then
		    return min{MINIMAX(N1), .., MINIMAX(Nm)}
	        else
		    return max{MINIMAX(N1), .., MINIMAX(Nm)}
	end MINIMAX;

ALPHA-BETA cutoff is a method for reducing the number of nodes explored in the Minimax strategy. For the nodes it explores it computes, in addition to the score, an alpha value and a beta value.

ALPHA value of a node
It is a value never greater than the true score of this node. Initially it is the score of that node, if the node is a leaf, otherwise it is -infinity. Then at a MAX node it is set to the largest of the scores of its successors explored up to now, and at a MIN node to the alpha value of its predecessor.

BETA value of a node
It is a value never smaller than the true score of this node. Initially it is the score of that node, if the node is a leaf, otherwise it is +infinity. Then at a MIN node it is set to the smallest of the scores of its successors explored up to now, and at a MAX node to the beta value of its predecessor.
It is guaranteed that:

	function MINIMAX-AB(N, A, B) is ;; Here A is always less than B
	begin
	   if N is a leaf then
	        return the estimated score of this leaf
	   else
		Set Alpha value of N to -infinity and 
                    Beta value of N to +infinity;
	        if N is a Min node then
	            For each successor Ni of N loop
		       Let Val be MINIMAX-AB(Ni, A, Min{B,Beta of N});
		       Set Beta value of N to Min{Beta value of N, Val};
		       When A >= Beta value of N then 
			   Return Beta value of N endloop;
                    Return Beta value of N;
	        else
	            For each successor Ni of N loop
		       Let Val be MINIMAX-AB(Ni, Max{A,Alpha value of N}, B);
		       Set Alpha value of N to Max{Alpha value of N, Val};
		       When Alpha value of N >= B then 
			  Return Alpha value of N endloop;
		    Return Alpha value of N;
	end MINIMAX-AB;

The MINIMAX-AB function is called with as parameters the root of the game tree, -infinity (-I) as alpha value, and +infinity (+I) as beta value. Here is an example of Alpha Beta Cutoff. And here is a C (yes C) program that tests the alpha-beta function on a couple of examples.

The alpha-beta search can be easily modified to guaranty that the maximum depth of the game tree never exceeds d.

The efficiency of the Alpha-Beta procedure depends on the order in which successors of a node are examined. If we were lucky, at a MIN node we would always consider the nodes in order from low to high score and at a MAX node the nodes in order from high to low score.
For example, a "lucky" game tree (binary, of depth 4, complete) has leaf scores, from left to right, 4,3,8,7,2,1,6,5. Of these leaves only 5 will be examined.
Another "lucky" game tree (binary, of depth 6, complete) has leaf scores, from left to right 12,11,32,31,10,9,30,29,16,15,28,27,14,13,26,25,4,3,24,23,2,1 22,21,8,7,20,19,6,5,18,17. Of these leaves only 11 will be examined. In general it can be shown that in the most favorable circumstances the alpha-beta search opens as many leaves as minimax on a game tree with double its depth.

Minimax and alpha-beta must be modified when we deal with games that involve chance. For example, in backgammon the moves of each player take place after a throw of the dices. One modifies the game tree to add after each normal node chance nodes to represent the outcomes of the throw; then the player choses moves from these chance nodes. The score of the chance nodes is computed as usual. The score of the original node is the weighted sum of the scores of its successor chance nodes weighted by their probabilities.

References

    Nillson,N.J.: Principles of Artificial Intelligence
	Tioga Publishing Co. 1980	Pages 112-125
    Rich,E.,Knight,K.: Artificial Intelligence
	McGraw-Hill, 1991		Pages 307-319
    Luger,G.F.,Stubblefield,W.A.: Artificial Intelligence: Structures
	and strategies for complex problem solving, Second Edition
	The Benjamin/Cummings Publishing Co. 1993  Pages 137-145
    Russell,S.,Norvig,P.: Artificial Intelligence, a Modern Approach
	Prentice-Hall, 1995		Pages 122-148
    Norvig,P.: Paradigms of Artificial Intelligence Programming
	Morgan-Kaufmann, 1992		Pages 596-626
    Pearl,J.: Heuristics
	Addison-Wesley, 1984		Pages 332-360

ingargiola@cis.temple.edu