Finding ancestors of each variable

utils.extract_ancestors implements Algorithm 1 of the RCD method [1] and extracts the ancestors of each variable. Because RCD allows for unobserved common causes, the ancestor search still works when latent confounders are present, as in the following example.

References

Import and settings

import random

import numpy as np
import pandas as pd

from sklearn.utils import check_array
from lingam.utils import make_dot, extract_ancestors
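The test data below draws from both random and np.random, so coefficients and samples change on every run. A minimal sketch for making runs repeatable, assuming a fixed seed is acceptable; the helper set_seeds and the value SEED = 0 are hypothetical and not part of the original example, so the printed results may differ from those shown on this page.

```python
import random

import numpy as np

SEED = 0  # arbitrary value chosen for illustration only


def set_seeds(seed=SEED):
    # Seed both generators used by the test data:
    # random.random() for coefficients, np.random.normal() for external effects.
    random.seed(seed)
    np.random.seed(seed)


# Drawing twice after reseeding gives identical values.
set_seeds()
first = (random.random(), np.random.normal(0.0, 0.5, 3))
set_seeds()
second = (random.random(), np.random.normal(0.0, 0.5, 3))
```

Calling set_seeds() once before building B and the samples is enough to make the whole example deterministic.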

Test data

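def get_coef():
    # Random coefficient with magnitude of at least 0.5: positive if the draw is >= 0.5, negative otherwise
    coef = random.random()
    return coef if coef >= 0.5 else coef - 1.0

def get_external_effect(n):
    # Cubing Gaussian samples yields non-Gaussian external effects
    return np.random.normal(0.0, 0.5, n) ** 3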

B = np.array([[       0.0,        0.0,        0.0,        0.0,        0.0, get_coef(),        0.0],
              [       0.0,        0.0,        0.0,        0.0,        0.0, get_coef(),        0.0],
              [get_coef(), get_coef(),        0.0,        0.0,        0.0,        0.0,        0.0],
              [       0.0,        0.0, get_coef(),        0.0,        0.0,        0.0, get_coef()],
              [       0.0,        0.0, get_coef(),        0.0,        0.0,        0.0, get_coef()],
              [       0.0,        0.0,        0.0,        0.0,        0.0,        0.0,        0.0],
              [       0.0,        0.0,        0.0,        0.0,        0.0,        0.0,        0.0]])

samples = 500
f0 = get_external_effect(samples)
f1 = get_external_effect(samples)
x0 = f0 * B[0, 5] + get_external_effect(samples)
x1 = f0 * B[1, 5] + get_external_effect(samples)
x2 = x0 * B[2, 0] + x1 * B[2, 1] + get_external_effect(samples)
x3 = x2 * B[3, 2] + f1 * B[3, 6] + get_external_effect(samples)
x4 = x2 * B[4, 2] + f1 * B[4, 6] + get_external_effect(samples)

# f0 and f1 are latent confounders
X = pd.DataFrame(np.array([x0, x1, x2, x3, x4]).T, columns=['x0', 'x1', 'x2', 'x3', 'x4'])

make_dot(B, labels=['x0', 'x1', 'x2', 'x3', 'x4', 'f0', 'f1'])
[Figure: extract_ancestors.svg, the causal graph rendered by make_dot]

Extract the ancestors of each observed variable

M = extract_ancestors(X)

for i in range(X.shape[1]):
    if len(M[i]) == 0:
        print(f'x{i} has no ancestors.')
    else:
        print(f'The ancestors of x{i} are ' + ', '.join(f'x{n}' for n in sorted(M[i])))
x0 has no ancestors.
x1 has no ancestors.
The ancestors of x2 are x0, x1
The ancestors of x3 are x0, x1, x2
The ancestors of x4 are x0, x1, x2
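
One way to sanity-check this output is to compare it with the ancestor sets implied by the true graph. The sketch below is not part of lingam: true_ancestors and pattern are hypothetical names introduced here, and pattern simply copies the nonzero layout of B above (a 1 wherever B has get_coef()). It computes the transitive closure of the graph and keeps only the observed variables, so the latent f0 and f1 are dropped from the result.

```python
import numpy as np


def true_ancestors(adj, n_observed):
    """Return the set of observed ancestors for each observed variable.

    adj[i, j] != 0 means variable j is a direct cause of variable i,
    the same convention as the matrix B above.
    """
    reach = np.asarray(adj) != 0
    # Warshall's algorithm: after the loop, reach[i, j] is True
    # whenever a directed path leads from j to i.
    for k in range(reach.shape[0]):
        reach = reach | (reach[:, [k]] & reach[[k], :])
    return [set(np.flatnonzero(reach[i, :n_observed]).tolist())
            for i in range(n_observed)]


# Nonzero pattern of B; columns 5 and 6 are the latent f0 and f1.
pattern = np.array([[0, 0, 0, 0, 0, 1, 0],
                    [0, 0, 0, 0, 0, 1, 0],
                    [1, 1, 0, 0, 0, 0, 0],
                    [0, 0, 1, 0, 0, 0, 1],
                    [0, 0, 1, 0, 0, 0, 1],
                    [0, 0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0]])

expected = true_ancestors(pattern, n_observed=5)
for i, ancestors in enumerate(expected):
    print(f'x{i}:', sorted(ancestors))
```

If M[i] holds column indices, as the loop above suggests, set(M[i]) == expected[i] can then be checked for each i. Agreement is not guaranteed for every random draw of coefficients and samples.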