Beta Distribution

The Beta distribution is a probability distribution with support on the interval $[0, 1]$. It is most commonly used to express a degree of belief about the value of a probability.
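As a rough illustration of the "degree of belief" reading, the sketch below draws samples with Python's standard-library `random.betavariate`. The helper name `beta_draws` and the parameter values (2 and 8) are assumptions chosen for the example, not something these notes prescribe.

```python
import random

def beta_draws(alpha, beta, n, seed=0):
    """Return n draws from Beta(alpha, beta) using the standard library."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [rng.betavariate(alpha, beta) for _ in range(n)]

# Hypothetical belief: the probability is likely small.
# The mean of Beta(alpha, beta) is alpha / (alpha + beta), here 0.2.
draws = beta_draws(alpha=2.0, beta=8.0, n=10_000)

print(min(draws) >= 0.0 and max(draws) <= 1.0)  # every draw lies in [0, 1]
print(sum(draws) / len(draws))                  # sample mean, expected to be close to 0.2
```

Larger parameter values with the same ratio would concentrate the draws more tightly around the same mean, which corresponds to a stronger belief.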

Probability distribution

A probability distribution is an object that assigns credibility values to discrete or continuous outcomes. For parametrized distributions, there is usually a mathematical function that takes one or more parameters and returns a credibility value (a probability or a probability density) for each point in the distribution's support.

Dirichlet Distribution

The Dirichlet distribution is often described as the multi-class (or multivariate) generalization of the Beta distribution. Where a draw from the Beta distribution gives a single probability $p$ for one class (with $1 - p$ left for its complement), a draw from the Dirichlet distribution gives a vector of probabilities over multiple classes that sums to 1.
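To make the multi-class picture concrete, here is a minimal sketch that produces a Dirichlet draw by normalizing independent Gamma draws, a standard construction that is swapped in here because the Python standard library has no Dirichlet sampler. The function name `dirichlet_draw` and the concentration values are assumptions for illustration.

```python
import random

def dirichlet_draw(alphas, rng):
    """One Dirichlet(alphas) draw, built by normalizing independent Gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

rng = random.Random(0)  # seeded so the run is repeatable

# Three classes: the draw is a vector of three probabilities that sums to 1.
probs = dirichlet_draw([1.0, 2.0, 3.0], rng)
print(probs, sum(probs))

# Two classes: the first component is distributed as a Beta(2, 8) draw,
# and the second component is its complement.
two_class = dirichlet_draw([2.0, 8.0], rng)
print(two_class)
```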

Stick Breaking Process

The stick-breaking process is one algorithmic protocol for generating draws from a Dirichlet process.

Steps:

  1. Start with a stick of length 1.
  2. Draw realization $i$ from a pre-configured Beta distribution, that is, one whose parameters have been fixed in advance. Call this realization $p_i$.
  3. Split the current stick into two, with fraction $p_i$ of the stick on the left and fraction $1 - p_i$ of the stick on the right.
  4. Store the length of the left stick as $l_i$.
  5. Repeat steps 2 to 4 on the right stick, either ad infinitum (if we're talking about it in the abstract) or up to a fixed number of draws.

We'll now have a series of draws for $p_i$ and $l_i$:

  • $l = (l_1, l_2, l_3, \ldots, l_n)$
  • $p = (p_1, p_2, p_3, \ldots, p_n)$

Each $p_i$ came from an independent draw from the Beta distribution, while each $l_i$ was the result of breaking off a fraction $p_i$ of whatever was left over from the previous round of stick breaking.
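The stick-breaking steps can be sketched in Python as follows. The function name `stick_breaking`, the Beta parameters (1 and 3), and the number of draws are assumptions for illustration; these notes do not specify them. Unrolling the loop gives $l_i = p_i \prod_{j < i} (1 - p_j)$.

```python
import random

def stick_breaking(n_draws, alpha=1.0, beta=3.0, seed=0):
    """Break a unit stick n_draws times; return (p, l, leftover)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    remaining = 1.0            # step 1: start with a stick of length 1
    p, l = [], []
    for _ in range(n_draws):
        p_i = rng.betavariate(alpha, beta)  # step 2: draw from the Beta distribution
        l_i = p_i * remaining               # step 3: left piece of the current stick
        p.append(p_i)
        l.append(l_i)                       # step 4: store the left piece
        remaining -= l_i                    # step 5: keep breaking the right piece
    return p, l, remaining

p, l, leftover = stick_breaking(n_draws=10)
print(sum(l) + leftover)  # stored pieces plus the leftover make up the whole stick
print(sum(l) < 1.0)       # the stored pieces alone fall short of 1
```

Returning the leftover length alongside `p` and `l` is a design choice made here so the unbroken remainder can be inspected directly.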

If we stop after a finite number of draws, then $l$ will (almost surely) sum to less than 1, because the piece of stick left over after the last break is not included in $l$. To use $l$ as a valid probability vector, it must be re-normalized to sum to 1, i.e.:

$$l_{\text{norm}} = \frac{l}{\sum_{i=1}^{n} l_i}$$
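A small worked example of the re-normalization, using made-up stick lengths rather than real draws. The helper name `renormalize` is hypothetical.

```python
def renormalize(lengths):
    """Divide each length by the total so the result sums to 1."""
    total = sum(lengths)
    return [x / total for x in lengths]

# Hypothetical pieces from three breaks; 0.125 of the stick was left over.
l = [0.5, 0.25, 0.125]
l_norm = renormalize(l)

print(sum(l))       # 0.875
print(l_norm)       # proportions 4/7, 2/7, 1/7
print(sum(l_norm))  # sums to 1, up to floating-point rounding
```

Dividing by the total keeps the relative sizes of the pieces unchanged; it only spreads the discarded leftover proportionally across them.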