Dynamic Stochastic Control (DSC)
This technique might be translated into non-statistical parlance as
'the calculus of making a sequence of optimal decisions', in particular
in situations involving quantifiable uncertainty. It applies to the control of
decision processes that are modelled as evolving randomly over time
(e.g. share dealing and drug trials), and it requires that the problem
be formulated according to a specific framework.
I concentrated on using the finite-horizon framework to
model the problem of game playing.
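In the standard finite-horizon formulation (the notation here is the usual one, not taken from the thesis itself), the aim is to choose a policy maximising the expected total payoff over T stages:

    \[
    \max_{\pi}\; \mathbb{E}\left[\sum_{t=1}^{T} r(s_t, \pi_t(s_t))\right],
    \]

where s_t is the state at stage t, \pi_t is the decision rule applied at stage t, and r is the one-step payoff.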
DSC Framework
The action space, A, is the set of actions available.
In the games I modelled, A consists of the set of game-tree searches and possible game moves.
The state space, S, corresponds to the set of all possible game states. The actions
taken influence the next state arrived at in a random fashion, according to the
transition matrix. A mapping from S -> A is termed a policy,
since it describes the action to take in each state. The thesis introduces some search games
and deduces policies which are optimal in the sense of maximising the expected payoff from
the game.
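To make the framework concrete, here is a minimal backward-induction sketch for a toy finite-horizon problem. All the numbers (two states, two actions, the transition matrix P and payoffs r) are hypothetical illustrations, not the games from the thesis:

    import numpy as np

    # Toy finite-horizon problem (hypothetical numbers): 2 states, 2 actions,
    # horizon T = 3.  P[a][s, s'] is the transition matrix under action a;
    # r[a][s] is the expected one-step payoff of taking action a in state s.
    P = np.array([[[0.8, 0.2],
                   [0.3, 0.7]],
                  [[0.5, 0.5],
                   [0.9, 0.1]]])
    r = np.array([[1.0, 0.0],
                  [0.5, 2.0]])
    T = 3

    # Backward induction: V[s] is the maximal expected payoff from state s
    # with the current number of stages remaining; policy[t] is the map
    # S -> A to follow at stage t.
    V = np.zeros(2)                    # no payoff accrues after the horizon
    policy = np.zeros((T, 2), dtype=int)
    for t in reversed(range(T)):
        Q = r + P @ V                  # Q[a, s]: immediate payoff + expected future value
        policy[t] = Q.argmax(axis=0)   # optimal action in each state at stage t
        V = Q.max(axis=0)

    print("stage-0 policy:", policy[0], "expected payoffs:", V)

Note that the optimal policy may depend on the stage as well as the state, which is characteristic of finite-horizon problems.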
Models Developed
The thesis develops 1- and 2-player search game models. The state space is typically a vector
(board position, search status, time left), allowing for
rational decisions both about how to allocate time units between the moves of a game and
about which nodes to expand. The most powerful new game developed is an extension of the
branching bandits model of Weiss, with the optimality result proved in a different fashion,
which is what permits the extra generality.
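As a hypothetical illustration of the (board position, search status, time left) idea, the sketch below allocates a budget of time units across the remaining moves of a game; the win probability p(u) for spending u units on a move is an invented diminishing-returns curve, not a model from the thesis:

    from functools import lru_cache

    def p(u):
        # Assumed diminishing returns: each extra unit halves the error rate.
        return 1.0 - 0.5 ** u

    @lru_cache(maxsize=None)
    def value(t, m):
        # Best expected number of moves got right with t time units left
        # and m moves still to make.
        if m == 0:
            return 0.0
        # Spend u units now, keeping the rest for the remaining m - 1 moves.
        return max(p(u) + value(t - u, m - 1) for u in range(t + 1))

    print(round(value(10, 4), 3))   # e.g. 10 units to split over 4 moves

The same backward-induction logic extends to richer state vectors that also track the board position and the status of the current search.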
Although optimal policies are derived for the majority of the models developed,
the models are necessarily limited in scope by the need to prove the optimality of the
policies derived, so a large gap remains between these models and the ad-hoc search methods
studied in other parts of this thesis.