Non-Deterministic Environments



Consider a 2-player game with 2 biased coins


MAX picks a coin, and flips it. Then MIN picks a coin and flips it.


MAX wins if the two flips have different outcomes.


Coin A: 75% Heads, 25% Tails
Coin B: 40% Heads, 60% Tails



How would you draw this game tree?


Expectiminimax Trees


Coin A: 75% Heads, 25% Tails
Coin B: 40% Heads, 60% Tails
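One way to evaluate this tree is sketched below. It assumes a payoff of 1 when MAX wins and 0 otherwise, and that MIN sees MAX's outcome before choosing a coin; neither detail is stated on the slide. The names `COINS`, `min_value`, `chance_value`, and `max_value` are illustrative only. The final chance layer (MIN's flip) is folded into a sum over the outcomes that differ from MAX's.

```python
# Assumed payoff: 1 if MAX wins (the outcomes differ), 0 otherwise.
COINS = {
    "A": {"H": 0.75, "T": 0.25},
    "B": {"H": 0.40, "T": 0.60},
}

def min_value(max_outcome):
    """MIN node: MIN has seen MAX's outcome (an assumption) and picks the
    coin that minimises the probability of a different outcome."""
    return min(
        sum(p for outcome, p in dist.items() if outcome != max_outcome)
        for dist in COINS.values()
    )

def chance_value(coin):
    """Chance node after MAX picks `coin`: expectation over MAX's flip."""
    return sum(p * min_value(outcome) for outcome, p in COINS[coin].items())

def max_value():
    """MAX node: pick the coin with the highest expected payoff."""
    best = max(COINS, key=chance_value)
    return best, chance_value(best)

best_coin, value = max_value()
print(best_coin, round(value, 4))
```

Under these assumptions, coin A is worth 0.75 × 0.25 + 0.25 × 0.4 = 0.2875 to MAX and coin B is worth 0.4 × 0.25 + 0.6 × 0.4 = 0.34, so MAX would pick coin B.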



Markov Models



Markov Models are a way to model sequences of events



Each event is a state

Each transition has a probability

Each sequence is a path



Markov Property


$P(X_t | X_1, X_2, \dots, X_{t-1}) = P(X_t | X_{t-1})$

Weather patterns!


Assume that the weather on any day
depends only on the previous day.


E.g., if it's overcast today,
it's likely to rain tomorrow.




Consider the following transition probability table
(each row is today's weather, each column is tomorrow's, so every row sums to 1)


0.7 0.2 0.1
0.2 0.4 0.4
0.33 0.33 0.34

What is the probability that
it will be sunny on day 3?

0.7 0.2 0.1
0.2 0.4 0.4
0.33 0.33 0.34
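A minimal sketch of the day-3 calculation follows. The table's state labels are not shown, so the states are indexed 0, 1, 2 in row order; treating state 0 as "sunny" and assuming day 1 is known to be in state 0 are both assumptions made only for illustration. The helper `step` is a hypothetical name.

```python
# Transition table from the slide; rows = today, columns = tomorrow.
P = [
    [0.70, 0.20, 0.10],
    [0.20, 0.40, 0.40],
    [0.33, 0.33, 0.34],
]

def step(dist, P):
    """Advance a distribution one day: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

day1 = [1.0, 0.0, 0.0]  # assumption: day 1 is known to be state 0
day2 = step(day1, P)
day3 = step(day2, P)

print([round(x, 3) for x in day3])
```

With this starting point, the day-3 distribution works out to (0.563, 0.253, 0.184), so the probability of being back in state 0 on day 3 would be 0.563.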

Let's see what the average weather looks like.


In general, let $\pi$ be the probability distribution over states.


$\pi = [\pi_1, \pi_2, \dots, \pi_n]$



$\pi^{(t)} = \pi^{(t-1)} P$, where $\pi^{(t)}$ is the distribution at time $t$

In the long run,


$\pi = \pi P$

This is an eigenvector equation of the form $Av=\lambda v$, with $A = P^\top$, $v = \pi^\top$, and $\lambda = 1$.


$\sum_i \pi_i = 1$
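The fixed point $\pi = \pi P$ can be approximated by repeatedly applying the update until the distribution stops changing. The sketch below does this for the weather table; the function name `stationary`, the uniform starting distribution, and the tolerance are illustrative choices, not part of the slides.

```python
P = [
    [0.70, 0.20, 0.10],
    [0.20, 0.40, 0.40],
    [0.33, 0.33, 0.34],
]

def stationary(P, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi P by iterating pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi

pi = stationary(P)
print([round(x, 4) for x in pi], round(sum(pi), 6))
```

Because every row of $P$ sums to 1, each update preserves $\sum_i \pi_i = 1$, so no separate normalization step is needed here.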


What if we don't know the transition probabilities $P$?


Markov Chain Monte Carlo
(learn through random walks)
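As a rough illustration of learning from random walks, the sketch below simulates a long walk on a chain and estimates each transition probability by counting. This is plain transition counting, not a full Markov Chain Monte Carlo sampler. `HIDDEN_P` stands in for the unknown environment (it reuses the weather table), and `random_walk` and `estimate_transitions` are hypothetical helper names.

```python
import random

def random_walk(P, start, steps, rng):
    """Simulate the chain for `steps` transitions, returning visited states."""
    states = list(range(len(P)))
    path = [start]
    for _ in range(steps):
        path.append(rng.choices(states, weights=P[path[-1]])[0])
    return path

def estimate_transitions(path, n_states):
    """Estimate P[i][j] as (# of i -> j moves) / (# of moves out of i)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(path, path[1:]):
        counts[a][b] += 1
    estimate = []
    for row in counts:
        total = sum(row)
        estimate.append([c / total if total else 0.0 for c in row])
    return estimate

# Stand-in for the unknown environment (assumption: the weather table).
HIDDEN_P = [
    [0.70, 0.20, 0.10],
    [0.20, 0.40, 0.40],
    [0.33, 0.33, 0.34],
]

rng = random.Random(0)  # fixed seed so the run is repeatable
path = random_walk(HIDDEN_P, start=0, steps=100_000, rng=rng)
P_hat = estimate_transitions(path, n_states=3)
```

The longer the walk, the closer `P_hat` should get to the hidden matrix, since every state is visited more often.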


What is the probability of the weather being
sunny, followed by rainy after exactly 1 time step?



This is simply the value $P_{\textrm{(Sunny, Rainy)}}$ from our matrix.

What is the probability of the weather being
rainy two days after it was sunny?



This is simply the value $(P^2)_{\textrm{(Sunny, Rainy)}}$.

After $k$ time steps, we have covered all $k$-hop neighbors of the starting state.


$P_{ij}(n) = (P^n)_{ij}$
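For two steps, the matrix power expands into a sum over every intermediate state $k$. As a worked example on the weather table, with states numbered 1 to 3 in row order (the slide does not show which label belongs to which row):

$$
(P^2)_{ij} = \sum_k P_{ik}\,P_{kj},
\qquad
(P^2)_{1,3} = 0.7\cdot 0.1 + 0.2\cdot 0.4 + 0.1\cdot 0.34 = 0.184 .
$$

Each term is one two-hop path from state 1 to state 3, weighted by the probability of taking it.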

What happens to $P^n$ as $n\to \infty$?


Every row of $P^n$ converges to $\pi$


Initial state doesn't matter*
(*in certain settings, e.g. when the chain is irreducible and aperiodic)

The Gambler's Ruin



Consider a conservative gambler with a fortune of $50.


He plays a betting game with two outcomes (p=0.5)


Wins or loses $25 each round.


Quits if broke or gets to $75.



Any insights?




Transition matrix (rows: current fortune, columns: fortune after one round):

From \ To    $0     $25    $50    $75
$0           1      0      0      0
$25          0.5    0      0.5    0
$50          0      0.5    0      0.5
$75          0      0      0      1

What does $P^\infty$ look like?
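One way to get a feel for $P^\infty$ is to raise the matrix to a large power numerically. The sketch below does this with repeated squaring; `mat_mul`, `mat_pow`, and the exponent 1024 are illustrative choices, and the state order is $0, $25, $50, $75 as in the transition table.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, n):
    """P raised to the n-th power by repeated squaring."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    base = [row[:] for row in P]
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result

# States in order: $0, $25, $50, $75 ($0 and $75 are absorbing).
P = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
]

P_limit = mat_pow(P, 1024)  # large power as a stand-in for the limit
for row in P_limit:
    print([round(x, 4) for x in row])
```

The middle columns shrink to 0 and all the mass ends up in the two absorbing states: starting from $50, the gambler reaches $75 with probability 2/3 and goes broke with probability 1/3. Unlike the weather chain, the rows of the limit differ, so here the initial state does matter.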




What if we started with even more money?


Let's try starting with $75 and quitting at $100. What does the model look like?





Transition matrix (rows: current fortune, columns: fortune after one round):

From \ To    $0     $25    $50    $75    $100
$0           1      0      0      0      0
$25          0.5    0      0.5    0      0
$50          0      0.5    0      0.5    0
$75          0      0      0.5    0      0.5
$100         0      0      0      0      1

What does $P^\infty$ look like?
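For a fair game the limit can also be worked out by hand. Let $h_i$ be the probability of reaching the goal before going broke when holding $i$ bets of $25, with the goal at $N = 4$ bets ($100). The symbols $h_i$ and $N$ are introduced here for illustration. Conditioning on the first round gives:

$$
h_i = \tfrac12 h_{i-1} + \tfrac12 h_{i+1}, \qquad h_0 = 0,\; h_N = 1
\quad\Longrightarrow\quad h_i = \frac{i}{N}.
$$

With the states ordered $0, $25, $50, $75, $100, this yields:

$$
P^\infty =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
3/4 & 0 & 0 & 0 & 1/4 \\
1/2 & 0 & 0 & 0 & 1/2 \\
1/4 & 0 & 0 & 0 & 3/4 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
$$

So a gambler starting with $75 reaches $100 with probability 3/4 and goes broke with probability 1/4.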




Applications


Predictive Text

Stock Market Models