Hidden Markov Model axes semantics
What are the semantics behind a Hidden Markov Model's matrices?
Previously, I wrote about the "event, batch, and sample" shape axes (see my blog post on probability distributions). Building off that, here are a few more.
First off, there's the transition matrix. It is a square matrix, and its axis semantics are `(num_states, num_states)`. From here, we already know that there are at least two tensor axes that carry a `states` semantic. The transition matrix is necessary and (I think) sufficient for specifying a Markov Model. (The "Hidden" piece needs extra machinery, which we'll get to below.)
The transition matrix is one of the inputs, and we can control it using a Dirichlet Process prior (see also: the Dirichlet Process section below).
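In a finite, truncated setting, that prior amounts to one Dirichlet draw per row; here's a minimal sketch (the concentration value is an arbitrary choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
num_states = 3

# One Dirichlet draw per row; alpha < 1 tends to concentrate mass
# on a few transitions per row (sparser rows).
alpha = np.full(num_states, 0.5)
transition = rng.dirichlet(alpha, size=num_states)  # (num_states, num_states)
assert np.allclose(transition.sum(axis=1), 1.0)
```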
There are also the emission parameters. Assume we have Gaussian-distributed observations from each of the hidden states, and that there are no autoregressive terms. Per state, we need a Gaussian central tendency vector $\mu$ of length `n_dimensions`, and a covariance matrix $\Sigma$ of shape `(n_dimensions, n_dimensions)`. Therefore, for an entire HMM, we also need to specify $\mu$ with shape `(n_states, n_dimensions)` and covariance matrix $\Sigma$ with shape `(n_states, n_dimensions, n_dimensions)`.
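Here's a shape-checking sketch of those emission parameters (identity covariances chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_dimensions = 3, 2

# Per-state Gaussian parameters, stacked along a leading states axis.
mus = rng.normal(size=(n_states, n_dimensions))     # (n_states, n_dimensions)
covs = np.stack([np.eye(n_dimensions)] * n_states)  # (n_states, n_dimensions, n_dimensions)

# Emitting from hidden state k means indexing into the states axis.
k = 1
observation = rng.multivariate_normal(mus[k], covs[k])  # (n_dimensions,)
```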
We can view the Hidden Markov Model as a probability distribution by also specifying its output shapes. If we semantically define one draw of a Hidden Markov Model as a timeseries sequence of states drawn from the model, then the draw should have an event shape of `(n_timesteps,)`. Multiple draws then stack along the sample axis, giving an overall shape of `(n_samples, n_timesteps)` with sample shape `(n_samples,)`.
Assuming we have a Gaussian emission distribution, we now have to account for the emitted observations too: one draw of observations has shape `(n_timesteps, n_dimensions)`, and multiple draws have shape `(n_samples, n_timesteps, n_dimensions)`.
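Putting the pieces together, here's a minimal forward-simulation sketch that checks these shapes; the function name `hmm_draw` and all parameter values are mine, made up for illustration:

```python
import numpy as np

def hmm_draw(transition, mus, covs, p_init, n_timesteps, rng):
    """One draw: a (n_timesteps,) state path plus its (n_timesteps, n_dimensions) emissions."""
    n_states = transition.shape[0]
    states = np.empty(n_timesteps, dtype=int)
    states[0] = rng.choice(n_states, p=p_init)
    for t in range(1, n_timesteps):
        # Row states[t - 1] of the transition matrix is the categorical
        # distribution over the next state.
        states[t] = rng.choice(n_states, p=transition[states[t - 1]])
    observations = np.stack([rng.multivariate_normal(mus[s], covs[s]) for s in states])
    return states, observations

rng = np.random.default_rng(0)
n_states, n_dimensions, n_timesteps, n_samples = 3, 2, 50, 4

transition = rng.dirichlet(np.ones(n_states), size=n_states)
p_init = np.ones(n_states) / n_states
mus = rng.normal(size=(n_states, n_dimensions))
covs = np.stack([np.eye(n_dimensions)] * n_states)

states, observations = hmm_draw(transition, mus, covs, p_init, n_timesteps, rng)
assert states.shape == (n_timesteps,)
assert observations.shape == (n_timesteps, n_dimensions)

# Multiple draws stack along a leading sample axis.
many_states = np.stack(
    [hmm_draw(transition, mus, covs, p_init, n_timesteps, rng)[0] for _ in range(n_samples)]
)
assert many_states.shape == (n_samples, n_timesteps)
```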
Dirichlet Process
What exactly is a Dirichlet process?
We start from the Dirichlet Distribution, which provides a vector of probabilities over a fixed set of categories. The probabilities must sum to 1, and they parameterize a Categorical Distribution, i.e. each entry is the probability of drawing that category.
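Concretely (a NumPy sketch with made-up concentration parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

# A Dirichlet draw is a probability vector over a fixed number of categories...
alpha = np.array([2.0, 3.0, 5.0])  # made-up concentration parameters
p = rng.dirichlet(alpha)
assert np.isclose(p.sum(), 1.0)

# ...which parameterizes a Categorical distribution we can then draw from.
category_draws = rng.choice(len(alpha), size=10, p=p)
```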
In contrast to the Dirichlet distribution, which has a fixed number of categories, the Dirichlet process describes a generative process over a countably infinite number of categories. There are a few constructive representations of it, including the Chinese Restaurant Process and the Stick Breaking Process. (The Indian Buffet Process is often mentioned alongside these, but it corresponds to the Beta process, which underlies latent feature models, rather than to the Dirichlet process.)
In practice, when computing with Dirichlet processes, we tend to use the Stick Breaking Process with a large but finite number of states.
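Here's a minimal sketch of that truncated stick-breaking construction (the function name and alpha value are made up for illustration):

```python
import numpy as np

def stick_breaking_weights(alpha, n_sticks, rng):
    """Truncated stick-breaking: break off Beta(1, alpha) fractions of what remains."""
    fractions = rng.beta(1.0, alpha, size=n_sticks)
    # Stick length remaining before each break: 1, (1 - b1), (1 - b1)(1 - b2), ...
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - fractions)[:-1]])
    return fractions * remaining

rng = np.random.default_rng(0)
weights = stick_breaking_weights(alpha=2.0, n_sticks=1000, rng=rng)
# With a large truncation, the weights sum to very nearly 1.
assert weights.sum() > 0.99
```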