RECAP



What are Markov Models?


Remember Stationary Distributions ($\pi, P^\infty$)?

Hidden Markov Models



We now see observations instead of the true states


We must reason about the true states from those observations


Hidden Markov Models



True states are hidden

Hidden Markov Models



What kind of problems can we solve?


Speech recognition (mapping audio to words)

Genetic sequence alignment

Predicting stock prices

Part-of-speech tagging

Predicting weather

Hidden Markov Models



Imagine you are Sherlock Holmes


Figure out whether my train was:
  • Very Late
  • Late
  • On Time


... by observing whether I'm irritable or happy

Hidden Markov Models



Transition matrix \( P \) (each row sums to one) and emission matrix \( B \):

\[ P = \begin{bmatrix} 0.1 & 0.3 & 0.6\\ 0.4 & 0.2 & 0.4\\ 0.1 & 0.1 & 0.8 \end{bmatrix} \qquad B = \begin{bmatrix} 0.4 & 0.6\\ 0.5 & 0.5\\ 0.9 & 0.1 \end{bmatrix} \]
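To make the model concrete, here is a minimal sketch of this HMM in Python/NumPy. The slides do not label the rows and columns, so the state order [VL, L, OT], the observation order [Happy, Sad], and the uniform initial distribution \( \pi \) are assumptions made here purely for illustration.

```python
import numpy as np

states = ["VL", "L", "OT"]   # hidden states: Very Late, Late, On Time (assumed order)
symbols = ["Happy", "Sad"]   # observations (assumed column order of B)

# Transition probabilities: P[i, j] = P(next state = states[j] | current = states[i])
P = np.array([[0.1, 0.3, 0.6],
              [0.4, 0.2, 0.4],
              [0.1, 0.1, 0.8]])

# Emission probabilities: B[i, k] = P(observe symbols[k] | state = states[i])
B = np.array([[0.4, 0.6],
              [0.5, 0.5],
              [0.9, 0.1]])

# Initial distribution: NOT given on the slides, assumed uniform here.
pi = np.full(3, 1 / 3)

# Sanity check: every row of P and B must be a probability distribution.
assert np.allclose(P.sum(axis=1), 1.0) and np.allclose(B.sum(axis=1), 1.0)
```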

Hidden Markov Models



3 types of reasoning



Probability of observing a given sequence (evaluation: the forward algorithm)


Most likely state sequence, given the observations (decoding: the Viterbi algorithm)


Finding the model parameters that best explain the observations (learning: expectation maximization)


Hidden Markov Models



Calculate the probability of observing the sequence (Happy, Sad, Happy), given:

\[ P = \begin{bmatrix} 0.1 & 0.3 & 0.6\\ 0.4 & 0.2 & 0.4\\ 0.1 & 0.1 & 0.8 \end{bmatrix} \qquad B = \begin{bmatrix} 0.4 & 0.6\\ 0.5 & 0.5\\ 0.9 & 0.1 \end{bmatrix} \]

Required Computation


\[ P(VL)\ P(Happy|VL)\ P(VL|VL)\ P(Sad|VL)\ P(VL|VL)\ P(Happy|VL) + \\ P(VL)\ P(Happy|VL)\ P(VL|VL)\ P(Sad|VL)\ P(L|VL)\ P(Happy|L) + \\ P(VL)\ P(Happy|VL)\ P(VL|VL)\ P(Sad|VL)\ P(OT|VL)\ P(Happy|OT) + \\ P(VL)\ P(Happy|VL)\ P(L|VL)\ P(Sad|L)\ P(VL|L)\ P(Happy|VL) + \\ P(VL)\ P(Happy|VL)\ P(L|VL)\ P(Sad|L)\ P(L|L)\ P(Happy|L) + \\ P(VL)\ P(Happy|VL)\ P(L|VL)\ P(Sad|L)\ P(OT|L)\ P(Happy|OT) + \\ P(VL)\ P(Happy|VL)\ P(OT|VL)\ P(Sad|OT)\ P(VL|OT)\ P(Happy|VL) + \\ P(VL)\ P(Happy|VL)\ P(OT|VL)\ P(Sad|OT)\ P(L|OT)\ P(Happy|L) + \\ P(VL)\ P(Happy|VL)\ P(OT|VL)\ P(Sad|OT)\ P(OT|OT)\ P(Happy|OT) + \\ P(L)\ P(Happy|L)\ P(VL|L)\ P(Sad|VL)\ P(VL|VL)\ P(Happy|VL) + \\ P(L)\ P(Happy|L)\ P(VL|L)\ P(Sad|VL)\ P(L|VL)\ P(Happy|L) + \\ P(L)\ P(Happy|L)\ P(VL|L)\ P(Sad|VL)\ P(OT|VL)\ P(Happy|OT) + \\ P(L)\ P(Happy|L)\ P(L|L)\ P(Sad|L)\ P(VL|L)\ P(Happy|VL) + \\ P(L)\ P(Happy|L)\ P(L|L)\ P(Sad|L)\ P(L|L)\ P(Happy|L) + \\ P(L)\ P(Happy|L)\ P(L|L)\ P(Sad|L)\ P(OT|L)\ P(Happy|OT) + \\ P(L)\ P(Happy|L)\ P(OT|L)\ P(Sad|OT)\ P(VL|OT)\ P(Happy|VL) + \\ P(L)\ P(Happy|L)\ P(OT|L)\ P(Sad|OT)\ P(L|OT)\ P(Happy|L) + \\ P(L)\ P(Happy|L)\ P(OT|L)\ P(Sad|OT)\ P(OT|OT)\ P(Happy|OT) + \\ P(OT)\ P(Happy|OT)\ P(VL|OT)\ P(Sad|VL)\ P(VL|VL)\ P(Happy|VL) + \\ P(OT)\ P(Happy|OT)\ P(VL|OT)\ P(Sad|VL)\ P(L|VL)\ P(Happy|L) + \\ P(OT)\ P(Happy|OT)\ P(VL|OT)\ P(Sad|VL)\ P(OT|VL)\ P(Happy|OT) + \\ P(OT)\ P(Happy|OT)\ P(L|OT)\ P(Sad|L)\ P(VL|L)\ P(Happy|VL) + \\ P(OT)\ P(Happy|OT)\ P(L|OT)\ P(Sad|L)\ P(L|L)\ P(Happy|L) + \\ P(OT)\ P(Happy|OT)\ P(L|OT)\ P(Sad|L)\ P(OT|L)\ P(Happy|OT) + \\ P(OT)\ P(Happy|OT)\ P(OT|OT)\ P(Sad|OT)\ P(VL|OT)\ P(Happy|VL) + \\ P(OT)\ P(Happy|OT)\ P(OT|OT)\ P(Sad|OT)\ P(L|OT)\ P(Happy|L) + \\ P(OT)\ P(Happy|OT)\ P(OT|OT)\ P(Sad|OT)\ P(OT|OT)\ P(Happy|OT) \]

Clearly not a good idea!
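The sum above is simply a brute-force enumeration of every hidden state path compatible with the observations (Happy, Sad, Happy). A direct sketch, reusing the assumed setup from the earlier snippet, makes the blow-up explicit: the number of terms is \( n^T \) (here \( 3^3 = 27 \)).

```python
from itertools import product

states = ["VL", "L", "OT"]
P = {"VL": {"VL": 0.1, "L": 0.3, "OT": 0.6},
     "L":  {"VL": 0.4, "L": 0.2, "OT": 0.4},
     "OT": {"VL": 0.1, "L": 0.1, "OT": 0.8}}
B = {"VL": {"Happy": 0.4, "Sad": 0.6},
     "L":  {"Happy": 0.5, "Sad": 0.5},
     "OT": {"Happy": 0.9, "Sad": 0.1}}
pi = {"VL": 1 / 3, "L": 1 / 3, "OT": 1 / 3}   # assumed uniform, as before

obs = ["Happy", "Sad", "Happy"]

# Sum the probability of (path, obs) over all 3**3 = 27 hidden state paths.
total = 0.0
for path in product(states, repeat=len(obs)):
    p = pi[path[0]] * B[path[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= P[path[t - 1]][path[t]] * B[path[t]][obs[t]]
    total += p

print(total)  # n**T terms: hopeless for long observation sequences
```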



Forward Algorithm
(Dynamic Programming)




\[ \alpha_1(S_i) = \pi[S_i] P(O^{(1)} | S_i) \]

\[ \alpha_t(S_i) = \sum_j \alpha_{t-1}(S_j) P(S_i|S_j) P(O^{(t)}|S_i) \]

\[ P(O^{(1)}, O^{(2)}, \dots, O^{(T)}) = \sum_{i=1}^n \alpha_T(S_i) \]
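The recursion above turns into a few lines of NumPy: each time step is one matrix-vector product, so the cost drops from \( O(n^T) \) to \( O(n^2 T) \). Same assumed setup as before; observations are passed as column indices into \( B \).

```python
import numpy as np

P = np.array([[0.1, 0.3, 0.6],
              [0.4, 0.2, 0.4],
              [0.1, 0.1, 0.8]])
B = np.array([[0.4, 0.6],
              [0.5, 0.5],
              [0.9, 0.1]])
pi = np.full(3, 1 / 3)          # assumed uniform initial distribution

def forward(pi, P, B, obs):
    """Forward algorithm: row t of the result holds alpha_{t+1}(S_i)."""
    T, n = len(obs), len(pi)
    alpha = np.zeros((T, n))
    alpha[0] = pi * B[:, obs[0]]                      # base case alpha_1
    for t in range(1, T):
        # alpha_t(S_i) = sum_j alpha_{t-1}(S_j) P(S_i|S_j) P(O^(t)|S_i)
        alpha[t] = (alpha[t - 1] @ P) * B[:, obs[t]]
    return alpha

obs = [0, 1, 0]                 # Happy, Sad, Happy (assumed symbol order)
alpha = forward(pi, P, B, obs)
print(alpha[-1].sum())          # P(O): identical to the 27-term sum above
```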

Most likely sequence!



Calculating \[ \arg\max_X P(X|O) \]


Viterbi Algorithm




\[ \delta_1(S_i) = \pi[S_i] P(O^{(1)} | S_i) \]

\[ \delta_t(S_i) = \max_j \left[ \delta_{t-1}(S_j) P(S_i|S_j) P(O^{(t)} | S_i) \right] \]
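A matching sketch of the Viterbi recursion: identical to the forward pass except that the sum over predecessors becomes a max, and backpointers are kept so the best state sequence can be read off at the end.

```python
import numpy as np

P = np.array([[0.1, 0.3, 0.6],
              [0.4, 0.2, 0.4],
              [0.1, 0.1, 0.8]])
B = np.array([[0.4, 0.6],
              [0.5, 0.5],
              [0.9, 0.1]])
pi = np.full(3, 1 / 3)          # assumed uniform initial distribution

def viterbi(pi, P, B, obs):
    """Return the most likely hidden state path (as state indices)."""
    T, n = len(obs), len(pi)
    delta = np.zeros((T, n))               # delta[t, i]: best prob ending in S_i
    back = np.zeros((T, n), dtype=int)     # back[t, i]: best predecessor of S_i
    delta[0] = pi * B[:, obs[0]]           # delta_1(S_i) = pi[S_i] P(O^(1)|S_i)
    for t in range(1, T):
        # delta_t(S_i) = max_j delta_{t-1}(S_j) P(S_i|S_j) P(O^(t)|S_i)
        scores = delta[t - 1][:, None] * P # scores[j, i]
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    # Trace the backpointers from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

states = ["VL", "L", "OT"]
print([states[i] for i in viterbi(pi, P, B, [0, 1, 0])])  # Happy, Sad, Happy
```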

Estimating the HMM Parameters



Expectation Maximization (the Baum-Welch algorithm)


Hill climbing on the likelihood (equivalently, descent on the negative log-likelihood): each iteration improves the fit, converging to a local optimum
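The slides only name the technique, so the following is a hedged sketch of one iteration of the standard EM procedure for HMMs (Baum-Welch), under the same assumed setup. The E-step computes forward and backward probabilities; the M-step re-estimates \( \pi \), \( P \), and \( B \) from the resulting expected counts. Each iteration is guaranteed not to decrease \( P(O) \), which is the hill-climbing behaviour noted above.

```python
import numpy as np

def baum_welch_step(pi, P, B, obs):
    """One EM (Baum-Welch) update of (pi, P, B) from one observation sequence."""
    T, n = len(obs), len(pi)
    # E-step: forward (alpha) and backward (beta) probabilities.
    alpha = np.zeros((T, n))
    beta = np.ones((T, n))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ P) * B[:, obs[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = P @ (B[:, obs[t + 1]] * beta[t + 1])
    likelihood = alpha[-1].sum()                  # P(O | current model)
    gamma = alpha * beta / likelihood             # P(state at t is S_i | O)
    # xi[t, i, j] = P(state t is S_i, state t+1 is S_j | O)
    xi = (alpha[:-1, :, None] * P[None]
          * (B[:, obs[1:]].T * beta[1:])[:, None, :]) / likelihood
    # M-step: re-estimate the parameters from expected counts.
    pi_new = gamma[0]
    P_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.stack([gamma[np.array(obs) == k].sum(axis=0)
                      for k in range(B.shape[1])],
                     axis=1) / gamma.sum(axis=0)[:, None]
    return pi_new, P_new, B_new

# Example: one update starting from the slide's model and an assumed uniform pi.
P = np.array([[0.1, 0.3, 0.6], [0.4, 0.2, 0.4], [0.1, 0.1, 0.8]])
B = np.array([[0.4, 0.6], [0.5, 0.5], [0.9, 0.1]])
pi = np.full(3, 1 / 3)
pi, P, B = baum_welch_step(pi, P, B, [0, 1, 0])   # Happy, Sad, Happy
```

In practice this update is iterated to convergence, and scaled or log-space versions are used to avoid numerical underflow on long observation sequences.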