Directional Identifiability
From Jonas Peters' lecture 3 on causality at the Broad.
If we have a presumed causal model of the linear form:

$$Y = \alpha X + N_y$$

where $N_y$ is i.i.d. noise in $Y$, and $X$ and $N_y$ are independent of each other and non-Gaussian, then we cannot find a backward model

$$X = \beta Y + N_x$$

where $N_x$ is i.i.d. noise in $X$ that also satisfies the independence constraints.

In simpler words, if we assume that the distributions of $X$ and $N_y$ are non-Gaussian, then we can tell that the causal model goes from $X \rightarrow Y$ and not $Y \rightarrow X$.
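To see where Gaussianity enters, here is a sketch of the standard argument. The best backward linear coefficient, and the residual it leaves, are:

$$\beta = \frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(Y)} = \frac{\alpha \, \mathrm{Var}(X)}{\alpha^2 \, \mathrm{Var}(X) + \mathrm{Var}(N_y)}, \qquad N_x = X - \beta Y, \qquad \mathrm{Cov}(N_x, Y) = 0$$

The backward residual $N_x$ is uncorrelated with $Y$ by construction. If $X$ and $N_y$ are Gaussian, then $N_x$ and $Y$ are jointly Gaussian, and uncorrelated jointly-Gaussian variables are independent, so the backward model also satisfies the independence constraint and the direction is not identifiable. If they are non-Gaussian, uncorrelated does not imply independent, and (by the Darmois-Skitovich theorem) $N_x$ and $Y$ must remain dependent.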
Let's simulate this.
import numpy as np
import matplotlib.pyplot as plt
%load_ext autoreload
%autoreload 2
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
Firstly, we will generate non-Gaussian Xs and Ys.
X_ng = np.random.uniform(-1, 1, size=1000)
alpha = 2
N_y_ng = np.random.uniform(-0.4, 0.4, size=1000)
y_ng = alpha * X_ng + N_y_ng
Now, let's plot Y against X.
plt.scatter(X_ng, y_ng)
plt.ylabel("Y")
plt.xlabel("X")
plt.show()
Now, let's also simulate the case where X and N_y are Gaussian-distributed and independent.
X_g = np.random.normal(0, 0.5, size=1000)
alpha_g = 2
N_y_g = np.random.normal(0, 1, size=1000)
y_g = alpha_g * X_g + N_y_g
plt.scatter(X_g, y_g)
plt.xlabel("X")
plt.ylabel("Y")
plt.show()
We will now fit X as a function of Y, and do a residual analysis to see whether the residuals (i.e. the implied noise) are independent of the input (in this case Y). Remember, we are checking whether the backward model can also satisfy the independence constraint.
from sklearn.linear_model import LinearRegression
lm = LinearRegression()
Firstly, we fit X as a function of Y, and save the fitted coefficient and intercept.
lm.fit(y_g.reshape(-1, 1), X_g)
coeff_g = lm.coef_
intercept_g = lm.intercept_  # needed to compute exact residuals later
We then do the same for the non-Gaussian case.
lm.fit(y_ng.reshape(-1, 1), X_ng)
coeff_ng = lm.coef_
intercept_ng = lm.intercept_
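As a quick sanity check, the fitted coefficients should be close to the population formula from above, $\beta = \alpha \, \mathrm{Var}(X) / \mathrm{Var}(Y)$, estimated here with sample variances (a rough check, not part of the original analysis):

# Compare fitted backward coefficients against beta = alpha * Var(X) / Var(Y).
print("Gaussian:     fitted", coeff_g[0], "theory", alpha_g * X_g.var() / y_g.var())
print("non-Gaussian: fitted", coeff_ng[0], "theory", alpha * X_ng.var() / y_ng.var())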
Great! Now that we have the coefficients, let's move on to the analysis of residuals. We will check whether the residuals ($X - \beta Y$) are independent of $Y$.
First off, the Gaussian case.
# Residuals of the backward model: X minus its prediction from Y.
residuals_g = X_g - (coeff_g * y_g + intercept_g)
plt.scatter(y_g, residuals_g)
plt.xlabel("Y")
plt.ylabel("residual")
plt.title("Gaussian")
plt.show()
We see no trend in the residuals: they look independent of Y, which means the backward model also satisfies the independence constraint in the Gaussian case.
Now, the non-Gaussian case.
residuals_ng = X_ng - (coeff_ng * y_ng + intercept_ng)
plt.scatter(y_ng, residuals_ng)
plt.xlabel("Y")
plt.ylabel("residuals")
plt.title("non-Gaussian")
plt.show()
We see a clear trend: the residuals depend on the value of Y in the non-Gaussian case, whereas they do not in the Gaussian case.
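Eyeballing scatter plots can deceive, so here is a rough quantitative check. Plain Pearson correlation won't work here, because OLS residuals are uncorrelated with the regressor by construction; distance correlation, by contrast, is zero (in the population) only under independence. This sketch assumes the third-party dcor package (pip install dcor):

import dcor

# Distance correlation detects any dependence, not just linear correlation.
print("Gaussian:    ", dcor.distance_correlation(y_g, residuals_g))
print("non-Gaussian:", dcor.distance_correlation(y_ng, residuals_ng))

We would expect the Gaussian value to sit near zero (up to sampling noise) and the non-Gaussian value to be noticeably larger.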
This empirical simulation illustrates that, in the non-Gaussian case, we cannot recover a backward model in which the noise in X ($N_x$) is independent of the value of Y. Hence, we have a model whose direction is identifiable under non-Gaussian assumptions.
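As a final check (a sketch, reusing the data generated above), fitting in the true causal direction should leave residuals that look independent of the input: regressing Y on X in the non-Gaussian case recovers residuals that are essentially the uniform noise $N_y$ we generated.

# Fit in the forward (true causal) direction: Y as a function of X.
lm_fwd = LinearRegression()
lm_fwd.fit(X_ng.reshape(-1, 1), y_ng)
residuals_fwd = y_ng - lm_fwd.predict(X_ng.reshape(-1, 1))

plt.scatter(X_ng, residuals_fwd)
plt.xlabel("X")
plt.ylabel("residuals")
plt.title("non-Gaussian, forward direction")
plt.show()

No trend should be visible here, consistent with $X \rightarrow Y$ being the correct direction.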