Eric's Notes

Stick Breaking Process

One algorithmic protocol for generating Dirichlet Process draws.

Steps:

Start with a stick of length 1
Draw realization $i$ from a pre-configured Beta Distribution (see: What is a pre-configured probability distribution). Call this realization $p_i$.
Split the current stick (of length 1 on the first round) into two, with fraction $p_i$ of the stick on the left, and fraction $1 - p_i$ of the stick on the right.
Store the left stick as $l_i$.
Repeat this on the right stick, ad infinitum (if we're talking about it in abstract), or up to a fixed number of draws.
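
The steps above can be sketched in a few lines of Python. This is a minimal sketch rather than a reference implementation: the function name `stick_breaking` and the Beta parameters `a` and `b` (with arbitrary defaults) are assumptions made for illustration. For a Dirichlet Process with concentration parameter $\alpha$, the usual choice would be a $\text{Beta}(1, \alpha)$ distribution.

```python
import random


def stick_breaking(n_draws, a=1.0, b=3.0, seed=None):
    """Return (p, l): the Beta draws and the left sticks stored each round.

    `a` and `b` are the (assumed) parameters of the pre-configured Beta
    distribution; the defaults are arbitrary placeholders.
    """
    rng = random.Random(seed)
    remaining = 1.0  # start with a stick of length 1
    p, l = [], []
    for _ in range(n_draws):
        p_i = rng.betavariate(a, b)  # fraction of the current stick to break off
        p.append(p_i)
        l.append(p_i * remaining)    # store the left stick
        remaining *= 1.0 - p_i       # the right stick is broken next round
    return p, l


p, l = stick_breaking(n_draws=10, seed=42)
```

Here `remaining` tracks the length of the right stick, which is what gets broken in the next round.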

We'll now have a series of draws for $p_i$ and $l_i$:

$l = (l_1, l_2, l_3, \ldots, l_n)$

$p = (p_1, p_2, p_3, \ldots, p_n)$

Each $p$ came from an independent Beta Distribution draw, while each $l$ was the result of breaking whatever was leftover from the previous round of stick breaking.
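
Written out, the relationship between the two vectors is as follows (a worked form, assuming each round breaks the right stick left over from the previous round):

$$l_i = p_i \prod_{j=1}^{i-1} (1 - p_j), \qquad 1 - \sum_{i=1}^{n} l_i = \prod_{i=1}^{n} (1 - p_i)$$

For example, with $p = (0.5, 0.5, 0.5)$ we get $l = (0.5, 0.25, 0.125)$, and a stick of length $0.125$ is left over.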

If we stop after a finite number of draws, then $l$ will (almost surely) sum to less than 1, because the piece of stick left over after the last stick breaking step is never stored. To use $l$ as a valid probability vector, it must be re-normalized to sum to 1, i.e.:

$$l_{norm} = \frac{l}{\sum_{i=1}^{n} l_i}$$
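
As a small illustration of the re-normalization, here is a sketch in Python. The helper name `normalize` and the example values are assumptions chosen for illustration, not output from a real run.

```python
def normalize(l):
    """Divide every stored stick length by the total so the result sums to 1."""
    total = sum(l)
    return [x / total for x in l]


# Hypothetical truncated draw: three left sticks, with 0.125 of the stick left over.
l = [0.5, 0.25, 0.125]
l_norm = normalize(l)  # each entry is divided by 0.875
```

Dividing by the total keeps the ratios between the stick lengths unchanged while spreading the missing leftover mass proportionally over the stored sticks.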
