Guiding Monte Carlo Tree Search simulations through Bayesian Opponent Modeling in The Octagon Theory

Board games present a very challenging problem in the decision-making topic of Artificial Intelligence. Although classical tree search approaches have been successful in various board games, such as Chess, these approaches are still very limited by modern technology when applied to higher complexity...

Full description

Bibliographic Details
Main Author: Hugo Miguel Pimenta Lopes Fernandes (author)
Format: masterThesis
Language:eng
Published: 2013
Subjects:
Online Access:https://hdl.handle.net/10216/67642
Country:Portugal
Oai:oai:repositorio-aberto.up.pt:10216/67642
Description
Summary:Board games present a very challenging problem in the decision-making topic of Artificial Intelligence. Although classical tree search approaches have been successful in various board games, such as Chess, these approaches are still very limited by modern technology when applied to higher complexity games such as Go. In light of this, it was not until the appearance of Monte Carlo Tree Search (MCTS) methods that higher complexity games became the main focus of research, as solution perspectives started to appear in this domain. This thesis builds on the current state-of-the-art in MCTS methods, by investigating the integration of Opponent Modeling with MCTS. The goal of this integration is to guide the simulations of the MCTS algorithm according to knowledge about the opponent, obtained in real-time through Bayesian Opponent Modeling, with the intention of reducing the number of irrelevant computations that are performed in purely stochastic, domain-independent methods. For this research, the two player deterministic board game The Octagon Theory was used, as its rules, fixed problem length and board configuration, present not only a difficult challenge for both the creation of opponent models and the execution of the MCTS method itself, but also a clear benchmark for comparison between algorithms. Through the analysis of a performed computation on the gametree complexity, the large board version of the game is believed to be in the same complexity class of Shogi and the 19x19 version of Go, turning it into a suitable board game for research in this area. Throughout this report, several MCTS policies and enhancements are presented and compared with not only the proposed variation, but also standard Monte Carlo search and the best known greedy approach for The Octagon Theory. The experiments reveal that a combination of Move Groups, Decisive Moves, Upper Confidence Bounds for Trees (UCT), Limited Simulation Lengths and an Opponent Modeling based simulation policy turn a former losing MCTS agent into the best performing one in a domain with estimated game-tree complexity of 10^293, even when the provided computational budget is kept low.