Computer Bridge: A Big Win for AI Planning1

AI Magazine, 19(2):93–105, June 1998
Stephen J. J. Smith
[email protected]
Department of Mathematics and
Computer Science
Hood College
Frederick, MD, USA
Dana Nau
[email protected]
Department of Computer Science,
and Institute for Systems Research
University of Maryland
College Park, MD, USA
Tom Throop
[email protected]
Great Game Products
8804 Chalon Drive
Bethesda, MD, USA
Abstract
A computer program that uses AI planning techniques is now the world’s best program for the game of
contract bridge. As reported in The New York Times and The Washington Post, this program—a new
version of Great Game Products’ Bridge Baron program—won the Baron Barclay World Bridge Computer
Challenge, an international competition hosted in July 1997 by the American Contract Bridge League.
It is well known that the game-tree search techniques used in computer programs for games such as
chess and checkers work quite differently from how humans think about such games. In contrast, our new
version of the Bridge Baron emulates the way in which a human might plan declarer play in bridge, by
using an adaptation of Hierarchical Task Network (HTN) planning. This article gives an overview of the
planning techniques that we have incorporated into the Bridge Baron, and discusses what the program’s
victory signifies for research on AI planning and game-playing.
Keywords
bridge, computer games, fielded applications, game-tree search, HTN planning, planning, task networks
Introduction
One long-standing goal of AI research has been to build programs that play challenging games of
strategy well. The classical approach used in AI programs for games of strategy is to do a game-tree search
using the well known minimax formula:
minimax(p) =
  • our payoff at the node p, if p is a terminal node;
  • max{ minimax(q) : q is a child of p }, if it is our move at the node p;
  • min{ minimax(q) : q is a child of p }, if it is our opponent's move at the node p.
The minimax computation is basically a brute-force search: if implemented as shown in the above formula,
it would examine every node in the game tree. In practical implementations of minimax game-tree
searching, a number of techniques are used to improve the efficiency of this computation: putting a bound
on the depth of the search, using alpha-beta pruning, doing transposition-table lookup, and so forth.
However, even with enhancements such as these, minimax computations often involve examining huge
numbers of nodes in the game tree. For example, in the recent match between Deep Blue and Kasparov,
Deep Blue examined roughly 60 billion nodes per move (IBM 1997). In contrast, humans examine at most
a few dozen board positions before deciding on their next moves (Biermann 1978).
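To make the two points above concrete, here is a minimal sketch of depth-bounded minimax with alpha-beta pruning. The Node class is an illustrative stand-in for a real game representation, not code from any program discussed in this article:

```python
# A minimal sketch of depth-bounded minimax with alpha-beta pruning.
# The Node class is an illustrative stand-in for a real game state.
from dataclasses import dataclass, field

@dataclass
class Node:
    value: float = 0.0            # payoff, used only at terminal nodes
    kids: list = field(default_factory=list)
    our_move: bool = True         # True if it is our turn at this node

def minimax(node, depth, alpha=float("-inf"), beta=float("inf")):
    if not node.kids or depth == 0:
        return node.value                      # our payoff at node p
    if node.our_move:                          # max over the children
        best = float("-inf")
        for q in node.kids:
            best = max(best, minimax(q, depth - 1, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:                  # opponent avoids this line
                break
        return best
    best = float("inf")                        # opponent: min over children
    for q in node.kids:
        best = min(best, minimax(q, depth - 1, alpha, beta))
        beta = min(beta, best)
        if beta <= alpha:
            break
    return best
```

Implemented as in the formula alone, the search visits every node; the depth bound and the two cutoffs are exactly the kinds of enhancements mentioned above.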
Although computer programs have done very well in games such as chess and checkers (see Table 1),
they have not done so well in the game of contract bridge. Even the best bridge programs can be beaten by
the best players at many local bridge clubs. One reason why traditional game-tree search techniques do
not work so well in bridge is that bridge is an imperfect-information game. Since bridge players don’t know
what cards are in the other players’ hands (except for, after the opening lead, what cards are in the
1. This work was supported in part by an AT&T PhD scholarship to Stephen J. J. Smith, by Maryland Industrial Partnerships (MIPS) Grant 501.15, by ARPA grant DABT 63-95-C-0037, and by National Science Foundation Grants NSF EEC 94-02384 and IRI-9306580. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the view of the funders.
dummy’s hand), each player has only partial knowledge of the state of the world, the possible actions, and
their effects. If we were to construct a game tree that included all of the moves a player might be able to
make, the size of this tree would vary depending on the particular bridge deal—but it would include about
5.6 × 10^44 leaf nodes in the worst case (Smith 1997, p. 226), and about 2.3 × 10^24 leaf nodes in the average
case (Lopatin 1992, p. 8). Since a bridge hand is normally played in just a few minutes, there is not enough
time for a game-tree search to search enough of this tree to make good decisions.
Table 1. Computer programs in games of strategy. This is an updated and
expanded version of similar tables from (Schaeffer 1993) and (Korf 1994).
Connect Four:       solved
Go-Moku:            solved
Qubic:              solved
Nine-Men's Morris:  solved
Othello:            probably better than any human
Checkers:           better than any living human
Backgammon:         better than all but about 10 humans
Chess:              better than all but about 250 humans, possibly better?
Scrabble:           worse than best humans
Go:                 worse than best human 9-year-olds
Bridge:             worse than the best players at many local clubs
Our approach to this problem (Smith et al. 1996a; Smith et al. 1996c; Smith et al. 1996e; Smith 1997)
grows out of the observation that bridge is a game of planning. The bridge literature describes a number of
tactical schemes (finessing, ruffing, crossruffing, and so forth) that people combine into strategic plans for
how to play their bridge hands. We have taken advantage of the planning nature of bridge, by adapting and
extending some ideas from Hierarchical Task Network (HTN) planning. We have developed an algorithm
for declarer play in bridge that uses planning techniques to develop game trees whose size depends on the
number of different strategies that a player might pursue rather than the number of different possible ways
to play the cards. Since the number of sensible strategies is usually much less than the number of possible
card plays, this lets us develop game trees that are small enough to be searched completely, as shown in
Table 2.
The rest of this paper contains the following:
• overviews of the game of bridge and HTN planning;
• a discussion of how we have adapted HTN planning for declarer play in bridge;
• a synopsis of related work by others;
• a discussion of how our program won the 1997 World Bridge Computer Challenge;
• a few notes about the application of our HTN planning techniques in other domains;
• a discussion of what these things may signify for future work on both computer bridge and AI
planning.
Table 2. Game-tree size produced in bridge by a full game-tree search and by
our HTN planning approach.
                Brute-force search              Our approach
Worst case      about 5.6 × 10^44 leaf nodes    about 305,000 leaf nodes
Average case    about 2.3 × 10^24 leaf nodes    about 26,000 leaf nodes
Overview of Bridge
Bridge is a game played by four players, using a standard deck of 52 playing cards, divided into four
suits (spades ♠, hearts ♥, diamonds ♦, and clubs ♣), each containing 13 cards. The players (who are
normally referred to as North, South, East, and West), play as two opposing teams, with North and South
playing as partners against East and West. A bridge deal consists of two phases, bidding and play:
Figure 1. At the beginning of a bridge hand, the 52 cards are dealt equally among the four players.
[Figure 2 diagram: the four hands around the table, with the diamond trick in progress.]
Figure 2. The basic unit of play is the trick, in which each player places a card face-up in the middle
of the table. In this example, West leads the 6 of Diamonds, North (dummy) plays the 2 of Diamonds,
East plays the 8 of Diamonds, and South (declarer) plays the Queen of Diamonds.
1. Bidding:
• Whichever player was designated as dealer for the deal deals the cards, distributing them equally among the four players as shown in Figure 1. Each player holds her or his cards so that no other player can see them.
• The players make bids for the privilege of determining which suit is trump and what the level of the contract is. Nominally, each bid consists of two things: some number of tricks (see below) that the bidder promises to take, and which suit the bidder is proposing as the trump suit. However, various bidding conventions have been developed in which these bids are also used to convey information to the bidder's partner about how strong the bidder's hand is.
• The bidding proceeds until no player wishes to make a higher bid. At that point, the highest bid becomes the contract for the hand. In the highest bidder's team, the player who bid this suit first becomes declarer, and declarer's partner becomes dummy. The other two players become the defenders.
2. Play:
• The first time that it is dummy's turn to play a card (see below), dummy lays her or his cards on the table, face up so that everyone can see them; and during the card play, declarer plays both declarer's cards and dummy's cards.
• The basic unit of card play is the trick, in which each player in turn plays a card by placing it face-up on the table as shown in Figure 2. The first card played is the card that was led; and whenever possible, the players must follow suit, that is, play cards in the suit of the card that was led. The trick is taken by whoever played the highest card in the suit led, unless some player plays a card in the trump suit, in which case whoever played the highest trump card wins the trick.
• The card play proceeds, one trick at a time, until no player has any cards left. At that point, the bridge hand is scored according to how many tricks each team took, and whether declarer's team took as many tricks as they promised to take during the bidding.
In playing the cards, there are a number of standard tactical ploys that players can use to try to win
tricks. These have standard names (such as ruffing, cross-ruffing, finessing, cashing out, and discovery
plays); and the ability of a bridge player depends partly on how skillfully he or she can plan and execute
these ploys. This is especially true for declarer, who is responsible for playing both declarer’s cards and
dummy’s cards. In most bridge hands, declarer will spend some time at the beginning of the game
formulating a strategic plan for how to play declarer’s cards and dummy’s cards. This plan will normally
be some combination of various tactical ploys. Because of declarer’s uncertainty about what cards are in
the opponents’ hands and how the opponents may choose to play those cards, the plan will usually need to
contain contingencies for various possible card plays by the opponents.
Overview of HTN Planning
HTN planning (Sacerdoti 1977; Tate 1977; Currie and Tate 1985; Wilkins 1988) is an AI planning
methodology that creates plans by task decomposition. This is a process in which the planning system
decomposes tasks into smaller and smaller subtasks, until primitive tasks are found that can be performed
directly. HTN planning systems have knowledge bases containing methods. Each method includes a
prescription for how to decompose some task into a set of subtasks, with various restrictions that must be
satisfied in order for the method to be applicable, and various constraints on the subtasks and the
relationships among them. Given a task to accomplish, the planner chooses an applicable method,
instantiates it to decompose the task into subtasks, and then chooses and instantiates other methods to
decompose the subtasks even further. If the constraints on the subtasks or the interactions among them
prevent the plan from being feasible, the planning system will backtrack and try other methods.
travel(x, y)
  Method "travel by air" (restriction: use only for long distances):
    buy ticket(airport(x), airport(y)); travel(x, airport(x));
    fly(airport(x), airport(y)); travel(airport(y), y)
  Method "travel by taxi" (restriction: use only for short distances):
    get taxi; ride taxi(x, y); pay driver

travel(UMD, MIT)
  buy ticket(BWI, Logan)
  travel(UMD, BWI): get taxi; ride taxi(UMD, BWI); pay driver
  fly(BWI, Logan)
  travel(Logan, MIT): get taxi; ride taxi(Logan, MIT); pay driver
Figure 3. Two methods for traveling from one location to another, and how they might
be used in the task of traveling from the University of Maryland to MIT.
As an example, Figure 3 shows two methods for the task of traveling from one location to another:
traveling by air, and traveling by taxi. Traveling by air involves the subtasks of purchasing a plane ticket,
traveling to the local airport, flying to an airport close to our destination, and traveling from there to our
destination. Traveling by taxi involves the subtasks of calling a taxi, riding in it to the final destination, and
paying the driver. Each method has restrictions on when it can be used: air travel is only applicable for
long distances, and travel by taxi is only applicable for short distances.
Now, consider the task of traveling from the University of Maryland to MIT. Since this is a long
distance, the “travel by taxi” method is not applicable, so we must choose the “travel by air” method. As
shown in Figure 3, this decomposes the task into the following subtasks: purchase a ticket from Baltimore-Washington International (BWI) airport to Logan airport, travel from the University of Maryland to BWI,
fly from BWI to Logan, and travel from Logan to MIT. For the subtasks of traveling from the University
of Maryland to BWI and traveling from Logan to MIT, we could use the “travel by taxi” method to produce
additional subtasks as shown in Figure 3.
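The decomposition above can be sketched as a toy recursive program. The distance table, the airport lookup, and the 300-mile threshold for "long distance" below are all invented for the illustration, not part of any planner described in this article:

```python
# A toy HTN-style decomposer for the travel example. The distance
# table, airport lookup, and 300-mile threshold are illustrative
# assumptions.
LONG = 300  # miles; boundary between "short" and "long" trips

def distance(x, y):
    table = {("UMD", "MIT"): 450, ("UMD", "BWI"): 30, ("Logan", "MIT"): 5}
    return table.get((x, y)) or table.get((y, x), 0)

def airport(x):
    return {"UMD": "BWI", "MIT": "Logan"}.get(x, x)

def travel(x, y):
    """Recursively decompose travel(x, y) into primitive actions."""
    if distance(x, y) >= LONG:                 # "travel by air" method
        return ([("buy ticket", airport(x), airport(y))]
                + travel(x, airport(x))        # get to our airport
                + [("fly", airport(x), airport(y))]
                + travel(airport(y), y))       # airport to destination
    return [("get taxi",), ("ride taxi", x, y), ("pay driver",)]  # taxi

plan = travel("UMD", "MIT")
```

Here the recursion bottoms out at primitive actions: the plan for travel("UMD", "MIT") is a ticket purchase, a taxi ride to BWI, a flight, and a taxi ride to MIT, eight primitive actions in all.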
Solving a planning problem using HTN planning is generally much more complicated than in this
simple example. Here are some of the complications that can arise:
• The planner may need to recognize and resolve interactions among the subtasks. For example, in planning how to get to the airport, one needs to make sure one will arrive there in time to catch the plane.
• In the example in Figure 3, it was always obvious which method to use—but in general, more than one method may be applicable to a task. If it is not possible to solve the subtasks produced by one method, it may be necessary to backtrack and try another method instead.
The first HTN planners were developed more than 20 years ago (Sacerdoti 1974; Tate 1976).
However, because of the complicated nature of HTN planning, it was not until much later that researchers
began to develop a coherent theoretical basis for HTN planning. A formal characterization of HTN
planning now exists that shows it to be strictly more expressive than planning with STRIPS-style operators
(Erol et al., 1994b), and which has made it possible to establish a number of formal properties such as
soundness and completeness of planning algorithms (Erol et al. 1994a), complexity (Erol et al. 1996), and
the relative efficiency of various control strategies (Tsuneto et al. 1996; Tsuneto et al. 1997). Domain-specific HTN planners have been developed for a number of industrial problems (Aarup et al. 1994; Smith
et al. 1996; Wilkins & Desimone 1994), and a domain-independent HTN planner is available at
<http://www.cs.umd.edu/projects/plus/umcp/manual> for use in experimental studies.
Our Adaptation of HTN Planning for Bridge
We have built a computer program called Tignum 2, which uses an adaptation of HTN planning
techniques to plan declarer play in contract bridge. To represent the various tactical schemes of card-playing in bridge, Tignum 2 uses structures similar to HTN methods, but modified to represent multi-agency and uncertainty. Tignum 2 uses state information sets to represent the locations of cards about
which declarer is certain, and belief functions to represent the probabilities associated with the locations of
cards about which declarer is not certain.
Some methods refer to actions performed by the opponents. In Tignum 2, we allow these methods to
make assumptions about the cards in the opponents’ hands, and design our methods so that most of the
likely states of the world are each covered by at least one method. In any of our methods, the subtasks are
totally ordered; that is, the order in which the subtasks are listed for a method is the order in which these
subtasks must be completed. For example, Figure 4 shows a portion of our task network for finessing in
bridge. Note that it refers to actions performed by each of the players in the game.
To generate game trees, our planning algorithm uses a procedure similar to task decomposition to build
up a game tree whose branches represent moves generated by these methods. It applies all methods
applicable to a given state of the world to produce new states of the world, and continues recursively until
there are no applicable methods that have not already been applied to the appropriate state of the world.
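The generation loop just described can be sketched abstractly. The two-field method representation below (an applicability test plus a successor function) is a simplification invented for the sketch, not Tignum 2's actual data structures:

```python
# A sketch of building a game tree by method application: every
# applicable method is instantiated in every state, recursively,
# until no method applies.
class Method:
    def __init__(self, applicable, apply):
        self.applicable = applicable   # state -> bool
        self.apply = apply             # state -> list of successor states

def build_tree(state, methods):
    """Return (state, subtrees): one subtree per successor state
    produced by any applicable method."""
    subtrees = []
    for m in methods:
        if m.applicable(state):
            for nxt in m.apply(state):
                subtrees.append(build_tree(nxt, methods))
    return (state, subtrees)
```

The recursion bottoms out where no method is applicable, so the branching factor at each node is the number of applicable tactical schemes, not the number of legal card plays.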
For example, Figure 5 shows how our algorithm would instantiate the finessing method of Figure 4 for
a specific bridge hand. In Figure 5, West (declarer) is trying a finesse, a tactical ploy in which a player tries
to win a trick with a high card, by playing it after an opponent who has a higher card. If North (a defender)
has the ♠Q, but does not play it when spades are led, then East (dummy) will be able to win a trick with the
♠J, because East plays after North. (North wouldn’t play the ♠Q if she or he had any alternative, because
then East would win the trick with the ♠K and win a later trick with the ♠J.) However, if South (the other
defender) has the ♠Q, South will play it after East plays the ♠J, and East will not win the trick.
[Figure 4 diagram: the Finesse(P; S) task decomposes through methods such as FinesseTwo(P2; S) and LeadLow(P; S) into EasyFinesse(P2; S), StandardFinesse(P2; S), and BustedFinesse(P2; S) alternatives; the leaves are PlayCard actions, some of which are our primitive actions and some of which are the opponents' primitive actions.]
Figure 4: A portion of Tignum 2’s task network for finessing.
[Figure 5 diagram: the finesse method instantiated for a hand in which East (declarer, «KJ74) and West (dummy, «A2) play a contract of 3NT against South and North, with «QT98653 outstanding; West is on lead at trick 3 and leads the «2, North plays the «3 (or, in other branches, the «Q or a heart), East plays the «J, and South plays either the «5 or the «Q.]
Figure 5: An instantiation of the “finesse” method for a specific bridge hand.
[Figure 6 diagram: a game tree in which West chooses between the "finesse" line (W—«2, with branches for North's and South's possible replies) and the "cash out" line (W—«A), each branch continuing into later tactical ploys.]
Figure 6: The game tree produced from the instantiated “finesse” method of Figure 5.
[Figure 7 diagram: the game tree of Figure 6 annotated with probabilities and scores. In the "finesse" line, North's plays receive probabilities 0.9854, 0.0078, and 0.0078, and the line evaluates to +270.73; the "cash out" line evaluates to +600.]
Figure 7: Evaluating the game tree of Figure 6.
Figure 6 shows the game tree resulting from the instantiation of the finessing method. This game tree
is produced by taking the plays shown in Figure 5 and listing them in the order in which they will occur. In
Figure 6, declarer has a choice between the finessing method and the cashing-out method, in which
declarer simply plays all of the high cards that are guaranteed to win tricks.
For a game tree generated in this manner, the number of branches from each state is not the number of
moves that an agent can make (as in conventional game-tree search procedures), but instead is the number
of different tactical schemes the agent can employ. As shown in Table 2, this results in a smaller branching
factor and a much smaller search tree: our planning algorithm generates game trees small enough that it can
search them all the way to the end, to predict the likely results of the various sequences of cards that the
players might play.
To evaluate the game tree at nodes where it is declarer’s turn to play a card, our algorithm chooses the
play that results in the highest score. For example, in Figure 7, West chooses to play the ♠A that resulted
from the “cash out” method, which results in a score of +600, rather than the ♠2 that resulted from the
“finesse” method, which results in a score of +270.73.
To evaluate the game tree at nodes where it is an opponent’s turn to play a card, our algorithm takes a
weighted average of the node’s children, based on probabilities generated by our belief function. For
example, because the probability is 0.9854 that North holds at least one “low” spade—that is, at least one
spade other than the ♠Q—and because North is sure to play a low spade if North has one, our belief
function generates the probability of 0.9854 for North’s play of a low spade. North’s other two possible
plays are much less likely and receive much lower probabilities.
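The two evaluation rules can be sketched together. The plain-dictionary node format and the numbers used below are illustrative simplifications, not Tignum 2 internals:

```python
# A sketch of the evaluation rule: maximize where declarer chooses a
# tactical scheme, take a belief-weighted average where an opponent
# plays. The node format (plain dicts) is an illustrative assumption.
def evaluate(node):
    if not node["children"]:
        return node["score"]              # leaf: predicted deal score
    if node["declarer_to_play"]:          # declarer picks the best line
        return max(evaluate(c) for c in node["children"])
    # opponent node: expectation under the belief-function probabilities
    return sum(c["prob"] * evaluate(c) for c in node["children"])
```

On a tree shaped like the one in Figure 7, this rule prefers a sure line over a finesse line whose probability-weighted expectation is lower.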
Implementation and Testing
To test our implementation of Tignum 2, we played it against Great Game Products’ Bridge Baron
program. The Bridge Baron was originally developed in 1980 and has undergone many improvements since
then (Throop 1983), and it is generally acknowledged to be the best commercially available program for the
game of bridge. At the time of our tests, it had won four international computer bridge championships. In
their review of seven commercially available bridge-playing programs (Manley 1993), the American
Contract Bridge League rated the Bridge Baron to be the best of the seven, and rated the skill of the Bridge
Baron to be the best of the five that do declarer play without "peeking" at the opponents' cards.
In (Smith 1997) we reported the results of our comparison of Tignum 2’s declarer play against the
Bridge Baron’s declarer play on 1,000 randomly generated bridge deals (including both suit and no-trump
contracts). In order to do this comparison, we formed the following two teams:
• the Bridge Baron team, consisting of two copies of the Bridge Baron;
• the Tignum 2 team, consisting of two copies of the following combination of programs: Tignum 2 for use in declarer play, and the Bridge Baron for use in bidding and defender play (since Tignum 2 does not do bidding and defender play).
To eliminate the possibility of either team gaining an advantage simply by the luck of the deal, each deal
was played twice, once with the Tignum 2 team playing North and South and the Bridge Baron team
playing East and West, and once with the Tignum 2 team playing East and West and the Bridge Baron team
playing North and South. To score each deal, we used Swiss teams board-a-match scoring, in which the
winner is defined to be the team getting the higher number of total points for the deal (if the teams have the
same number of total points for a deal, then each team wins 1/2 of the deal). In our comparison, the
Tignum 2 team defeated the Bridge Baron team by 250 to 191, with 559 ties. These results were
statistically significant at the α = 0.025 level. We had never run Tignum 2 on any of these deals before this
test, so these results were free from any training-set biases.
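The board-a-match rule used above is simple enough to state as code; a minimal sketch:

```python
# Swiss teams board-a-match scoring, as used in the comparison: the
# team with the higher total points on a deal wins it; a tie splits it.
def board_a_match(points_a, points_b):
    """Return (wins_a, wins_b) for one deal: 1, 0, or 1/2 each."""
    if points_a > points_b:
        return (1.0, 0.0)
    if points_b > points_a:
        return (0.0, 1.0)
    return (0.5, 0.5)
```

Summing these per-deal results over the 1,000 deals gives the 250 to 191 totals (with 559 ties) reported above.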
These tests were performed in February 1997. Since then, we have made additional improvements to
Tignum 2 that have improved its performance considerably. Furthermore, we have worked with Great
Game Products to develop a new version of the Bridge Baron, Bridge Baron 8, that uses the Tignum 2 code
to plan its declarer play. The version of the Tignum 2 code used in Bridge Baron 8 contains 91 HTN task
names and at least 400 HTN methods.
As described later in this article, a pre-release version of Bridge Baron 8 won the latest world-championship computer bridge competition; and the official release went on sale in October 1997. In his
review of bridge programs, Jim Loy (1997) said of this new version of the Bridge Baron: “The card play is
noticeably stronger, making it the strongest program on the market.”
Other Work on Computer Bridge
Most of the successful computer programs for bridge other than ours are based on the use of domain-dependent pattern-matching techniques, without much look-ahead. However, several researchers have tried
to develop bridge programs that use adaptations of classical game-tree search. Obviously, one of the
biggest problems is how to handle the uncertainty about what cards are in the other players’ hands. The
most common approach to this problem—used, for example, late in the play in the Bridge Baron—has been
to use Monte Carlo techniques.2 The basic idea is to generate many random hypotheses for how the cards
might be distributed among the other players’ hands, generate and search the game trees corresponding to
each of the hypotheses, and average the results to determine the best move.
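The Monte Carlo idea can be sketched as follows. The functions `sample_deals` and `solve` are hypothetical stand-ins for a consistent-deal generator and a perfect-information (double-dummy) solver, not real APIs:

```python
# A sketch of the Monte Carlo approach: sample complete layouts
# consistent with what a player has seen, solve each one as a
# perfect-information game, and pick the move with the best average
# result. `sample_deals` and `solve` are hypothetical stand-ins.
from collections import defaultdict

def monte_carlo_move(known, legal_moves, sample_deals, solve, n=100):
    totals = defaultdict(float)
    for _ in range(n):
        deal = sample_deals(known)            # one random consistent layout
        for move in legal_moves:
            totals[move] += solve(deal, move) # perfect-information value
    return max(legal_moves, key=lambda m: totals[m] / n)
```

Note that each sampled deal is solved as if all hands were visible; this is exactly the assumption that Frank and Basin's critique, discussed below, takes aim at.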
This Monte Carlo approach removes the necessity of representing uncertainty about the players’ cards
within the game tree itself, thereby reducing the size of the game tree by as much as a multiplicative factor
2. One exception is Lopatin's Alpha Bridge program, which used classical game-tree search techniques more-or-less unmodified, with a 20-ply (5-trick) search. Alpha Bridge placed last at the 1992 Computer Olympiad.
of 5.2 × 10^6. However, the resulting game tree is still quite large, and thus this method by itself has not been
particularly successful.
Ginsberg (1996a) has developed a clever way to make the game tree even smaller. He starts with the
Monte Carlo approach described above, but to search each game tree he uses a tree-pruning technique
called partition search. Partition search reduces the branching factor of the game tree by combining similar
branches—for example, if a player could play the ♠6 or the ♠5, partition search would generate only one
branch for these two plays because they are basically equivalent. Like our approach, this produces game
trees that are small enough to search completely. Ginsberg has implemented this approach in a computer
program called GIB (Ginsberg, 1996b), which was highlighted at AAAI-97's “Hall of Champions” exhibit.
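The branch-combining idea can be illustrated for a single suit. The rank encoding and the run-detection rule below are our own illustrative reconstruction of the idea, not GIB's actual implementation:

```python
# An illustrative reconstruction of the branch-combining idea in
# partition search (not GIB's code): among the cards still in play,
# cards we hold that are consecutive in rank are strategically
# equivalent, so only one representative per run needs a branch.
RANKS = "23456789TJQKA"   # one suit, low to high

def representative_plays(hand, played):
    """hand, played: sets of rank characters. Return one card per
    equivalence run among the cards still in play."""
    remaining = [r for r in RANKS if r not in played]
    reps, in_run = [], False
    for r in remaining:
        if r in hand:
            if not in_run:            # first card of an equivalence run
                reps.append(r)
            in_run = True
        else:
            in_run = False            # an outstanding card breaks the run
    return reps
```

Holding the 6 and 5 of a suit yields a single branch, as in the example above; and once the king has been played, a holding of queen and ace likewise collapses into one branch.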
Frank and Basin (1995) have discovered a significant problem with Monte Carlo approaches: the use
of Monte Carlo techniques introduces a fundamental incorrectness into the reasoning procedure. By this,
we do not mean the inaccuracies that might result from random variations in the game trees—inaccuracies
due to random variation can be overcome (at the cost of additional computational effort) by generating a
large enough sample of game trees. Rather, Frank and Basin have shown that any decision-making
procedure that models an imperfect-information game as a collection of perfect-information games will be
incapable of thinking about certain situations correctly.
We do not know how common such situations are, but they can be divided into several classes. The
simplest one is what bridge players call a “discovery play”, in which a player plays a card in order to
observe what cards other players play in response, in order to gain information about what cards those
players hold. A decision-making procedure that generates and searches random game trees as described
above will be unable to reason about discovery plays correctly, because within each game tree, the
procedure already thinks it has perfect information about the players’ cards. We understand that Ginsberg
is trying to develop a solution to this problem, but we have not yet seen a good way to overcome it.
Tournament Results
The most recent world-championship competition for computer bridge programs was the Baron
Barclay World Bridge Computer Challenge, which was hosted by the American Contract Bridge League
(ACBL). The five-day competition was held in Albuquerque, New Mexico, from 28 July 1997 to 1 August
As reported in The New York Times (Truscott 1997) and The Washington Post (Chandrasekaran
1997), the winner of the competition was the Bridge Baron—more specifically, a pre-release version of
Bridge Baron 8, incorporating our Tignum 2 code. Below we describe the details of the competition.
The contenders included five computer programs: one from Germany, one from Japan, and three from
the USA. As far as we know, most of the programs were based on domain-dependent pattern-matching
techniques that did not involve a lot of look-ahead. The two exceptions included Ginsberg’s GIB program
(described in the previous section), and our new version of the Bridge Baron. For the competition, two
copies of each computer program competed together as a team, sitting East-West one of the times that a
deal was played, and North-South the other time the same deal was played.
Table 3: Results from the qualifying round of the
Baron Barclay World Bridge Computer Challenge.
The top three contenders advanced to the semi-finals;
the bottom two contenders were eliminated from the
competition.
Program         Country     Score
Q-Plus          Germany     +39.74 IMPs
MicroBridge 8   Japan       +18.00 IMPs
Bridge Baron    USA          +7.89 IMPs
Meadowlark      USA         -64.00 IMPs
GIB             USA         -68.89 IMPs
The competition began with a qualifying round, in which each of the five computer-program teams
played ten matches against different human teams. Each match consisted of four deals, so each program
played a total of 40 deals. The results, which are shown in Table 3, were scored using IMPs (International
Match Points), a measure of how much better or worse a bridge partnership’s score is in comparison to its
competitors. The bottom two programs (Meadowlark and GIB) were eliminated from the competition; the
top three programs (Q-Plus, MicroBridge 8, and the Bridge Baron) advanced to the semi-finals.
In the semi-final match, Q-Plus was given a bye, and MicroBridge 8 played a head-to-head match
against the Bridge Baron. In this match, which consisted of 22 deals, the Bridge Baron defeated
MicroBridge 8 by 60 IMPs, thus eliminating MicroBridge 8 from the competition.
For the final match, the Bridge Baron played a head-to-head match against Q-Plus. In this match,
which consisted of 32 deals, the Bridge Baron defeated Q-Plus by 22 IMPs, thus winning the competition.
The final place of each program in the competition is shown in Table 4.
Table 4: The final place of each contender in the
Baron Barclay World Bridge Computer Challenge.
Program         Country     Performance
Bridge Baron    USA         1st place
Q-Plus          Germany     2nd place
MicroBridge 8   Japan       3rd place
Meadowlark      USA         4th place
GIB             USA         5th place
Generality of Our Approach
To develop Tignum 2, we needed to extend HTN planning to include ways to represent and reason about
possible actions by other agents (such as the opponents in a bridge game), as well as uncertainty about the
capabilities of those agents (for example, lack of knowledge about what cards they have). However, to
accomplish this, we needed to restrict how Tignum 2 goes about constructing its plans. Most HTN planners
develop plans in which the actions are partially ordered, postponing some of the decisions about the order
in which the actions will be performed. In contrast, Tignum 2 is a total-order planner that expands tasks in
left-to-right order.
Because Tignum 2 expands tasks in the same order that they will be performed when the plan executes,
this means that when it plans for each task, Tignum 2 already knows the state of the world (or as much as
can be known about it in an imperfect-information game) at the time that the task will be performed.
Consequently, we can write each method's preconditions as arbitrary computer code, rather than using the
stylized logical expressions found in most AI planning systems. For example, by knowing the current state,
Tignum 2 can decide which of 19 finesse situations are applicable; with partial-order planning, it would be
much harder to decide which of them can be made applicable. The arbitrary computer code also enables us
to encode the complex numeric computations needed for reasoning about the probable locations of the
opponents' cards.
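The left-to-right, total-order expansion described above can be sketched as follows. This is a minimal illustration, not the actual Tignum 2 code: the travel domain, the function names, and the method format are all invented for the example. The key point it shows is that each method's precondition is ordinary code evaluated against a fully known current state.

```python
def seek_plan(state, tasks, methods, plan):
    """Expand tasks strictly left to right, so the current state is
    always known when each method's precondition is evaluated."""
    if not tasks:
        return plan
    task, rest = tasks[0], tasks[1:]
    if callable(task):                      # primitive action: apply it
        new_state = task(state)
        if new_state is None:               # action not executable here
            return None
        return seek_plan(new_state, rest, methods, plan + [task.__name__])
    for method in methods.get(task, []):    # compound task: try each method
        # Preconditions are arbitrary code run against the known state.
        if method["precond"](state):
            result = seek_plan(state, method["subtasks"] + rest, methods, plan)
            if result is not None:
                return result
    return None

# A toy domain: two alternative methods for the compound task "travel".
def walk(state):
    s = dict(state); s["loc"] = s["dest"]; return s

def call_taxi(state):
    s = dict(state); s["taxi"] = True; return s

def ride(state):
    s = dict(state); s["loc"] = s["dest"]; return s

methods = {
    "travel": [
        {"precond": lambda s: s["dist"] <= 2, "subtasks": [walk]},
        {"precond": lambda s: s["cash"] >= s["dist"],
         "subtasks": [call_taxi, ride]},
    ],
}

plan = seek_plan({"loc": "home", "dest": "park", "dist": 5, "cash": 20},
                 ["travel"], methods, [])
# plan is ["call_taxi", "ride"]: the walking method's precondition fails,
# so the planner falls through to the taxi method.
```

Because the planner commits to a total order, each precondition lambda can inspect the concrete state it will actually face, which is what lets preconditions be arbitrary code rather than stylized logical expressions.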
Because of the power and flexibility that this approach provides for representing planning operators
and manipulating problem data, it can be useful in other planning domains that are very different from
computer bridge. For example, consider the task of generating plans for how to manufacture complex
electronic devices such as the microwave transmit/receive module (MWM) shown in Figure 8. MWMs are
complex electronic devices operating in the 1-20 GHz range, which are used in radars and satellite
communications. Designing and manufacturing electronic devices that operate in this frequency range is
tricky, because the placement of the components and the lengths and widths of the wires can change the
electrical behavior of the devices.
Two of us have participated in the development of a system called EDAPS (Electro-mechanical Design
And Planning System) for computer-aided design and manufacturing planning for MWMs (Hebbar et al.
1996; Smith et al. 1996b; Smith et al. 1996d; Smith 1997). Under a contract with Northrop Grumman
Corporation, our group is extending EDAPS into a tool to be used in Northrop Grumman’s design and
manufacturing facility in Baltimore, MD. EDAPS integrates commercial CAD systems for electronic
design and mechanical design, along with a program that generates manufacturing process plans for
MWMs. The process planning program uses the same basic approach (and even some of the same code!)
that we used in our program for declarer play in bridge. For example, Figure 9 shows a portion of a task
network for some of the operations used in MWM manufacture.
Figure 8: The same basic approach (and even some of the same code!) that we used for planning
declarer play in the Bridge Baron is also used in a program that plans how to manufacture
microwave transmit/receive modules (MWMs) such as the one shown here.
Figure 9. A portion of a task network containing operations used in manufacturing MWMs. The task of
making the artwork has only one applicable method, which decomposes into four subtasks performed in
order: precleaning for artwork, applying photoresist, photolithography, and etching. Applying
photoresist, in turn, has four alternative methods: spindling, spraying, spreading, or painting the
photoresist.
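The decomposition in Figure 9 can be expressed as method declarations in the total-order style described in the previous section. This is a hypothetical encoding: only the task names come from the figure, and the equipment-availability preconditions are invented for illustration.

```python
# The Figure 9 task network as method declarations. Each compound task
# maps to a list of methods; each method pairs a precondition (arbitrary
# code over the current state) with an ordered list of subtasks.
mwm_methods = {
    "make-artwork": [
        # The only applicable method: four subtasks in left-to-right order.
        {"precond": lambda s: True,
         "subtasks": ["preclean-for-artwork", "apply-photoresist",
                      "photolithography", "etching"]},
    ],
    "apply-photoresist": [
        # Four alternative methods; a total-order planner tries them in
        # order and keeps the first whose precondition holds.
        {"precond": lambda s: s.get("spindler"),
         "subtasks": ["spindle-photoresist"]},
        {"precond": lambda s: s.get("sprayer"),
         "subtasks": ["spray-photoresist"]},
        {"precond": lambda s: s.get("spreader"),
         "subtasks": ["spread-photoresist"]},
        {"precond": lambda s: True,
         "subtasks": ["paint-photoresist"]},
    ],
}
```

Encoding the manufacturing knowledge this way is what lets the same planning engine drive both bridge declarer play and MWM process planning: only the method tables differ.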
Conclusions
For games such as chess and checkers, the best computer programs are based on the use of “brute force”
game-tree search techniques. In contrast, our new version of the Bridge Baron bases its declarer play on
the use of HTN planning techniques that more closely approximate how a human might plan the play of a
bridge hand.
Since computer programs still have far to go before they can compete at the level of expert human
bridge players, it is difficult to say what approach will ultimately prove best for computer bridge.
However, the Bridge Baron’s championship performance in the Baron Barclay World Bridge Computer
Challenge suggests that bridge may be a game in which HTN planning techniques can be very successful.
Furthermore, we believe that our work illustrates how AI planning is finally “coming of age” as a tool
for practical planning problems. Other AI planning researchers have begun to develop practical
applications of AI planning techniques in several other domains, such as marine oil spills (Agosta 1996),
spacecraft assembly (Aarup et al. 1994), and military air campaigns (Wilkins and Desimone 1994).
Furthermore, as discussed in the previous section, the same adaptation of HTN planning that we used for
computer bridge is also proving useful for the generation and evaluation of manufacturing plans for
microwave transmit/receive modules. Since the same approach works well in domains that are as different
as these, we are optimistic that it will be useful for a wide range of practical planning problems.
References
1. Aarup, M.; Arentoft, M. M.; Parrod, Y.; Stader, J.; and Stokes, I. 1994. OPTIMUM-AIV: A
knowledge-based planning and scheduling system for spacecraft AIV. In Fox, M. and Zweben, M.,
editors, Intelligent Scheduling, 451–469. Morgan Kaufmann, San Mateo, California.
2. Agosta, J. M. 1996. Constraining influence diagram structure by generative planning: an application
to the optimization of oil spill response. Proceedings of the 12th Conference on Uncertainty in
Artificial Intelligence, 11–19. AAAI Press, Menlo Park, California.
3. Biermann, A. 1978. Theoretical issues related to computer game playing programs. Personal
Computing, Sept. 1978, 86–88.
4. Chandrasekaran, R. 1997. Program for a better bridge game: A college partnership aids industry
research. The Washington Post, Sept. 15, 1997. Washington Business section, pp. 1, 15, 19.
5. Currie, K. and Tate, A. 1985. O-Plan: control in the open planner architecture. BCS Expert Systems
Conference, Cambridge University Press, UK.
6. Erol, K.; Hendler, J.; and Nau, D. 1994. UMCP: a sound and complete procedure for hierarchical
task-network planning. Proc. 2nd Int'l Conf. on AI Planning Systems, 249–254.
7. Erol, K.; Nau, D.; and Hendler, J. 1994. HTN planning: complexity and expressivity. Proc. AAAI-94.
8. Erol, K.; Hendler, J.; and Nau, D. 1996. Complexity results for hierarchical task-network planning.
Annals of Mathematics and Artificial Intelligence 18:69–93.
9. Frank, I. and Basin, D. 1995. Search in games with incomplete information: a case study using bridge
card play. Research Paper 780, Department of Artificial Intelligence, University of Edinburgh,
Scotland. Submitted to Artificial Intelligence.
10. Ginsberg, M. 1996a. Partition search. AAAI-96, 228–233.
11. Ginsberg, M. 1996b. GIB, Bridge Baron, and other things. Usenet newsgroup rec.games.bridge,
31 October 1996, Message-Id: <[email protected]>.
12. Great Game Products. 1997. Bridge Baron. <http://www.bridgebaron.com>.
13. Hebbar, K.; Smith, S. J. J.; Minis, I.; and Nau, D. S. 1996. Plan-based evaluation of design for
microwave modules. In ASME Design for Manufacturing Conference, p. 262 (abstract; full paper on
CD-ROM).
14. IBM. 1997. How Deep Blue works. <http://www.chess.ibm.com/meet/html/d.3.2.html>.
15. Korf, R. 1994. Presentation of “Best-First Minimax Search: Othello results” at Twelfth National
Conference on Artificial Intelligence.
16. Lopatin, A. 1992. Two combinatorial problems in programming bridge game. Computer Olympiad,
unpublished.
17. Loy, J. 1997. Review of bridge programs for PC compatibles. Usenet newsgroup rec.games.bridge,
9 October 1997, Message-Id: <[email protected]>.
18. Manley, B. 1993. Software “judges” rate bridge-playing products. The Bulletin (published monthly
by the American Contract Bridge League), 59:11, November 1993, 51–54.
19. Sacerdoti, E. D. 1974. Planning in a hierarchy of abstraction spaces. Artificial Intelligence 5:115–135.
20. Sacerdoti, E. D. 1977. A Structure for Plans and Behavior. American Elsevier Publishing Company,
New York.
21. Schaeffer, J. 1993. Presentation at plenary session, AAAI Fall Symposium.
22. Smith, S. J. J.; Nau, D. S.; and Throop, T. 1996a. A planning approach to declarer play in contract
bridge. Computational Intelligence 12:1, February 1996, 106–130. An earlier version is at
<http://www.cs.umd.edu/TR/UMCP-CSD:CS-TR-3513>.
23. Smith, S. J. J.; Nau, D. S.; Hebbar, K.; and Minis, I. 1996b. Hierarchical task-network planning for
process planning for manufacturing of microwave modules. Proceedings: Artificial Intelligence and
Manufacturing Research Planning Workshop, 189–194. AAAI Press, Menlo Park, CA.
24. Smith, S. J. J.; Nau, D. S.; and Throop, T. 1996c. Total-order multi-agent task-network planning for
contract bridge. AAAI-96, 108–113.
25. Smith, S. J. J.; Hebbar, K.; Nau, D. S.; and Minis, I. 1996d. Integrated electrical and mechanical
design and process planning. IFIP Knowledge Intensive CAD Workshop, CMU, 16-18 September
1996.
26. Smith, S. J. J.; Nau, D. S.; and Throop, T. 1996e. AI planning's strong suit. IEEE Expert 11:6,
December 1996, 4–5.
27. Smith, S. J. J. 1997. Task-Network Planning Using Total-Order Forward Search, and Applications to
Bridge and to Microwave Module Manufacture. Ph.D. Dissertation, University of Maryland at College
Park. <http://www.cs.umd.edu/users/sjsmith/phd>.
28. Tate, A. 1976. Project planning using a hierarchic non-linear planner. Tech. Report 25, Department
of Artificial Intelligence, University of Edinburgh, Scotland.
29. Tate, A. 1977. Generating project networks. IJCAI-77.
30. Throop, T. 1983. Computer Bridge. Hayden Book Company, Rochelle Park, NJ.
31. Truscott, A. 1997. Bridge. New York Times, 16 August 1997, p. A19.
32. Tsuneto, R.; Erol, K.; Hendler, J.; and Nau, D. 1996. Commitment strategies in hierarchical task
network planning. In Proc. Thirteenth National Conference on Artificial Intelligence, pp. 536-542.
33. Tsuneto, R.; Nau, D.; and Hendler, J. 1997. Plan-refinement strategies and search-space size. In Proc.
European Conference on AI Planning.
34. Wilkins, D. E. 1988. Practical Planning. Morgan Kaufmann, San Mateo, California.
35. Wilkins, D. E. and Desimone, R. V. 1994. Applying an AI planner to military operations planning. In
Fox, M. and Zweben, M., editors, Intelligent Scheduling, 685–709. Morgan Kaufmann, San Mateo,
California.