Directional Identifiability

From Jonas Peters' lecture 3 on causality at the Broad.

If we have a presumed causal model of the linear form:

$$Y = \alpha X + N_y$$

where $N_y$ is i.i.d. noise in $Y$, and $X$ and $N_y$ are independent of each other and both non-Gaussian, then we cannot find

$$X = \beta Y + N_x$$

where $N_x$ is i.i.d. noise in $X$ that also satisfies the independence constraint, i.e. $N_x$ independent of $Y$.

In simpler words, if we assume that the distributions of $X$ and $N_y$ are non-Gaussian, then we can tell that the causal model goes from $X \rightarrow Y$ and not $Y \rightarrow X$.
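To see why the Gaussian case is the problematic one, here is a short argument (paraphrasing the standard reasoning, not verbatim from the lecture). If $X$ and $N_y$ are Gaussian, then $(X, Y)$ is jointly Gaussian, and we can always construct a backward model by choosing

$$\beta = \frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(Y)}, \qquad N_x = X - \beta Y$$

By construction $\mathrm{Cov}(N_x, Y) = 0$, and for jointly Gaussian variables zero correlation implies independence, so the backward model satisfies the same constraints and the direction is not identifiable. Non-Gaussianity closes exactly this loophole.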

Let's simulate this.

import numpy as np
import matplotlib.pyplot as plt

%load_ext autoreload
%autoreload 2
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

Firstly, we will generate non-Gaussian Xs and Ys.

# Uniform distributions are non-Gaussian.
X_ng = np.random.uniform(-1, 1, size=1000)
alpha = 2
N_y_ng = np.random.uniform(-0.4, 0.4, size=1000)
# Forward (causal) model: Y = alpha * X + noise.
y_ng = alpha * X_ng + N_y_ng

Now, let's plot Y against X.

plt.scatter(X_ng, y_ng)
plt.ylabel("Y")
plt.xlabel("X")
plt.show()

Now, let's also simulate the case where $X$ and $N_y$ are Gaussian-distributed and independent.

# Same structural equation, but with Gaussian X and Gaussian noise.
X_g = np.random.normal(0, 0.5, size=1000)
alpha_g = 2
N_y_g = np.random.normal(0, 1, size=1000)
y_g = alpha_g * X_g + N_y_g
plt.scatter(X_g, y_g)
plt.xlabel("X")
plt.ylabel("Y")
plt.show()

We will now fit X as a function of Y and do a residual analysis to see whether the residuals (i.e. the implied noise $N_x$) are independent of the input (in this case $Y$). Remember, we are checking whether the inverse model can also satisfy the independence constraint.

from sklearn.linear_model import LinearRegression
lm = LinearRegression()

Firstly, we fit X as a function of Y and obtain a coefficient.

lm.fit(y_g.reshape(-1, 1), X_g)
coeff_g = lm.coef_

We then do the same for the non-Gaussian case.

lm.fit(y_ng.reshape(-1, 1), X_ng)
coeff_ng = lm.coef_

Great! Now that we have the coefficients out, let's move on to the analysis of residuals. We will be checking whether the residuals ($X - \beta Y$) are independent of $Y$.
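Before looking at the plots, it helps to write the backward residuals in terms of the original variables (a small derivation added here for clarity; substitute the forward model into the residual):

$$N_x = X - \beta Y = X - \beta(\alpha X + N_y) = (1 - \alpha \beta) X - \beta N_y$$

Both $N_x$ and $Y = \alpha X + N_y$ are linear mixtures of the independent variables $X$ and $N_y$, with non-zero coefficients on each. The Darmois-Skitovich theorem says that two such mixtures can be independent only if $X$ and $N_y$ are Gaussian, which is exactly the identifiability result we are simulating.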

First off, the Gaussian case.

residuals_g = X_g - coeff_g * y_g  # the fitted intercept is ~0 here, so we ignore it
plt.scatter(y_g, residuals_g)
plt.xlabel("Y")
plt.ylabel("residual")
plt.title("Gaussian")
plt.show()

We see that there is no trend in the residuals.

Now, the non-Gaussian case.

residuals_ng = X_ng - coeff_ng * y_ng
plt.scatter(y_ng, residuals_ng)
plt.xlabel("Y")
plt.ylabel("residuals")
plt.title("non-Gaussian")
plt.show()

We see that there is a clear trend: the residual depends on the value of $Y$ in the non-Gaussian case, whereas it does not in the Gaussian case.
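To back the visual impression with a number, here is a minimal sketch of a distance correlation check (my own addition, implemented from scratch in NumPy rather than taken from the lecture). Distance correlation is zero in the population only under independence, which matters here because OLS residuals are uncorrelated with the regressor by construction, so Pearson correlation would show nothing in either case.

def distance_correlation(x, y):
    """Naive O(n^2) sample distance correlation between two 1-D arrays.

    In the population this is zero if and only if the two variables are
    independent, so it can detect the non-linear dependence that plain
    Pearson correlation misses.
    """
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    a = np.abs(x - x.T)  # pairwise distance matrices
    b = np.abs(y - y.T)
    # Double-center each distance matrix.
    A = a - a.mean(axis=0) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2 = (A * B).mean()
    return np.sqrt(dcov2 / np.sqrt((A * A).mean() * (B * B).mean()))

print(distance_correlation(y_g, residuals_g))    # small: no detectable dependence
print(distance_correlation(y_ng, residuals_ng))  # noticeably larger: dependence

The Gaussian value should sit near zero (up to sampling noise), while the non-Gaussian value should be clearly larger, matching what the scatter plots show.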

This empirical simulation illustrates that we cannot recover an inverse model in which the noise in $X$ ($N_x$) is independent of the value of $Y$. Hence, the model is identifiable under non-Gaussian assumptions.
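As a final sanity check (an extension of my own, not from the lecture), we can run the same residual analysis in the causal direction for the non-Gaussian data. Fitting Y as a function of X should leave residuals that look independent of X, so only one direction passes the test.

# Fit the forward (causal) model Y = alpha * X + N_y on the non-Gaussian data.
lm.fit(X_ng.reshape(-1, 1), y_ng)
residuals_fwd = y_ng - lm.predict(X_ng.reshape(-1, 1))

plt.scatter(X_ng, residuals_fwd)
plt.xlabel("X")
plt.ylabel("residuals")
plt.title("non-Gaussian, causal direction")
plt.show()

print(distance_correlation(X_ng, residuals_fwd))  # should be close to zero

The forward residuals recover the uniform noise $N_y$ and show no trend against $X$, completing the asymmetry: the independence constraint can be satisfied in the causal direction but not in the anti-causal one.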