MultiGroupRCD
- class lingam.MultiGroupRCD(max_explanatory_num=2, cor_alpha=0.01, ind_alpha=0.01, shapiro_alpha=0.01, MLHSICR=True, bw_method='mdbs', independence='hsic', ind_corr=0.5)
Implementation of the RCD algorithm for multiple groups.
- __init__(max_explanatory_num=2, cor_alpha=0.01, ind_alpha=0.01, shapiro_alpha=0.01, MLHSICR=True, bw_method='mdbs', independence='hsic', ind_corr=0.5)
Construct a model.
- Parameters:
max_explanatory_num (int, optional (default=2)) – Maximum number of explanatory variables.
cor_alpha (float, optional (default=0.01)) – Alpha level for Pearson correlation.
ind_alpha (float, optional (default=0.01)) – Alpha level for HSIC.
shapiro_alpha (float, optional (default=0.01)) – Alpha level for Shapiro-Wilk test.
MLHSICR (bool, optional (default=True)) – If True, use MLHSICR for multiple regression; if False, use OLS for multiple regression.
bw_method (str, optional (default='mdbs')) – The method used to calculate the bandwidth of the HSIC.
mdbs : Median distance between samples.
scott : Scott's Rule of Thumb.
silverman : Silverman's Rule of Thumb.
independence ({'hsic', 'fcorr'}, optional (default='hsic')) – Method used to determine independence. If 'hsic' is set, independence is tested by HSIC. If 'fcorr' is set, independence is determined by F-correlation.
ind_corr (float, optional (default=0.5)) – The threshold value for determining independence by F-correlation; independence is determined when the value of F-correlation is below this threshold value.
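The three `bw_method` options name common bandwidth heuristics. The sketch below shows their generic textbook forms in plain Python, for intuition only. The function names are hypothetical, and lingam's internal computation may differ in detail (for example in subsampling or scaling).

```python
import math
from itertools import combinations
from statistics import median


def median_distance_bandwidth(samples):
    """Median of pairwise Euclidean distances between samples (the 'mdbs' idea)."""
    dists = [math.dist(a, b) for a, b in combinations(samples, 2)]
    return median(dists)


def scott_factor(n_samples, n_dims):
    """Scott's rule-of-thumb factor: n ** (-1 / (d + 4))."""
    return n_samples ** (-1.0 / (n_dims + 4))


def silverman_factor(n_samples, n_dims):
    """Silverman's rule-of-thumb factor: (n * (d + 2) / 4) ** (-1 / (d + 4))."""
    return (n_samples * (n_dims + 2) / 4.0) ** (-1.0 / (n_dims + 4))


# Toy data (made up for illustration): three 2-D points on a line.
samples = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
mdbs_width = median_distance_bandwidth(samples)
```

The median-distance heuristic depends on the data itself, while the two rule-of-thumb factors depend only on the number of samples and dimensions.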
- property adjacency_matrices_
Estimated adjacency matrices.
- Returns:
adjacency_matrices_ – The list of adjacency matrices B, one for each dataset. The shape of each B is (n_features, n_features), where n_features is the number of features.
- Return type:
array-like, shape (B, …)
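A minimal sketch of how such a list of matrices might be read, using nested lists so it runs without extra dependencies. It assumes the usual LiNGAM convention that `B[i][j]` is the coefficient of the edge from variable `j` to variable `i`, and that a NaN entry marks a pair whose direction was left undetermined (in RCD, typically attributed to a shared latent confounder). Both conventions are assumptions here, and `list_edges` is a hypothetical helper, not part of lingam.

```python
import math


def list_edges(adjacency_matrices):
    """Return, per dataset, directed edges (from, to, coef) and unresolved pairs."""
    summary = []
    for B in adjacency_matrices:
        edges = []
        unresolved = set()
        for to_idx, row in enumerate(B):
            for from_idx, coef in enumerate(row):
                if math.isnan(coef):
                    # Record each undetermined pair once, regardless of orientation.
                    unresolved.add(tuple(sorted((from_idx, to_idx))))
                elif coef != 0:
                    edges.append((from_idx, to_idx, coef))
        summary.append({"edges": edges, "unresolved": sorted(unresolved)})
    return summary


# Toy matrices (made up for illustration): edge 0 -> 1, pair (0, 2) undetermined.
nan = float("nan")
B1 = [[0.0, 0.0, nan], [0.8, 0.0, 0.0], [nan, 0.0, 0.0]]
B2 = [[0.0, 0.0, nan], [0.5, 0.0, 0.0], [nan, 0.0, 0.0]]
summary = list_edges([B1, B2])
```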
- property ancestors_list_
Estimated ancestors list.
- Returns:
ancestors_list_ – The list of causal ancestor sets, where n_features is the number of features.
- Return type:
array-like, shape (n_features)
- bootstrap(X_list, n_sampling)
Evaluate the statistical reliability of the DAG by bootstrapping.
- Parameters:
X_list (array-like, shape (X, ...)) – Multiple datasets for training, where X is a dataset. The shape of X is (n_samples, n_features), where n_samples is the number of samples and n_features is the number of features.
n_sampling (int) – Number of bootstrapping samples.
- Returns:
results – Returns the results of bootstrapping for multiple datasets.
- Return type:
array-like, shape (BootstrapResult, …)
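Bootstrapping here means re-estimating the model on datasets resampled with replacement. The following is a library-independent sketch of the resampling step only. `bootstrap_resample` and its `seed` argument are hypothetical, and this reference does not state how lingam resamples internally (for example, whether each dataset is resampled independently, as assumed below).

```python
import random


def bootstrap_resample(X_list, n_sampling, seed=0):
    """Yield n_sampling resampled copies of X_list.

    Each dataset keeps its original number of rows; rows are drawn
    with replacement, independently for every dataset.
    """
    rng = random.Random(seed)
    for _ in range(n_sampling):
        yield [rng.choices(X, k=len(X)) for X in X_list]


# Toy data (made up for illustration): two datasets with two features each.
X_list = [
    [[0.1, 1.0], [0.2, 2.0], [0.3, 3.0]],
    [[1.1, 5.0], [1.2, 6.0]],
]
resamples = list(bootstrap_resample(X_list, n_sampling=3))
```

In a full bootstrap, a model would be fitted to each resampled list and the resulting graphs compared across repetitions.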
- estimate_total_effect(X_list, from_index, to_index)
Estimate total effect using causal model.
- Parameters:
X_list (array-like, shape (X, ...)) – Multiple datasets for training, where X is a dataset. The shape of X is (n_samples, n_features), where n_samples is the number of samples and n_features is the number of features.
from_index – Index of source variable to estimate total effect.
to_index – Index of destination variable to estimate total effect.
- Returns:
total_effect – Estimated total effect.
- Return type:
float
- estimate_total_effect2(from_index, to_index)
Estimate total effect using causal model.
- Parameters:
from_index – Index of source variable to estimate total effect.
to_index – Index of destination variable to estimate total effect.
- Returns:
total_effect – Estimated total effect.
- Return type:
float
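In a linear acyclic model x = Bx + e, the total effect of one variable on another is the sum, over all directed paths between them, of the products of the edge coefficients along each path. The sketch below computes that quantity from a single adjacency matrix in plain Python, assuming `B[i][j]` holds the coefficient from variable `j` to variable `i` and that B has no NaN entries. It illustrates the definition only and does not describe how either method above computes its estimate; `total_effect` is a hypothetical helper.

```python
def total_effect(B, from_index, to_index):
    """Sum of coefficient products over all directed paths from_index -> to_index.

    Assumes B describes a DAG (so the recursion terminates) and contains no NaN.
    """
    if from_index == to_index:
        return 1.0
    effect = 0.0
    for child, row in enumerate(B):
        coef = row[from_index]  # nonzero means an edge from_index -> child
        if coef != 0:
            effect += coef * total_effect(B, child, to_index)
    return effect


# Toy matrix (made up for illustration): 0 -> 1 (2.0), 1 -> 2 (3.0), 0 -> 2 (0.5).
B = [
    [0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.5, 3.0, 0.0],
]
effect_0_to_2 = total_effect(B, 0, 2)  # indirect path 2.0 * 3.0 plus direct edge 0.5
```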
- fit(X_list)
Fit the model to multiple datasets.
- Parameters:
X_list (list, shape [X, ...]) – Multiple datasets for training, where X is a dataset. The shape of X is (n_samples, n_features), where n_samples is the number of samples and n_features is the number of features.
- Returns:
self – Returns the instance itself.
- Return type:
object
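`fit` takes a list of datasets that all describe the same variables. A small pre-flight check such as the hypothetical `check_x_list` below (not part of lingam) can catch mismatched column counts before fitting. It is written for nested lists so that it runs without extra dependencies.

```python
def check_x_list(X_list):
    """Return n_features if every dataset is (n_samples, n_features) with equal n_features."""
    if not X_list:
        raise ValueError("X_list is empty")
    n_features = None
    for k, X in enumerate(X_list):
        widths = {len(row) for row in X}
        if len(widths) != 1:
            raise ValueError(f"dataset {k} is empty or has ragged rows")
        (width,) = widths
        if n_features is None:
            n_features = width
        elif width != n_features:
            raise ValueError(
                f"dataset {k} has {width} features, expected {n_features}"
            )
    return n_features


# Toy data (made up for illustration): sample counts differ, feature counts match.
X_list = [
    [[0.1, 1.0], [0.2, 2.0], [0.3, 3.0]],
    [[1.1, 5.0], [1.2, 6.0]],
]
n_features = check_x_list(X_list)
```

The datasets may have different numbers of samples; only the number of features has to agree in this check.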
- get_error_independence_p_values(X_list)
Calculate the p-value matrix of independence between error variables.
- Parameters:
X_list (array-like, shape (X, ...)) – Multiple datasets for training, where X is a dataset. The shape of X is (n_samples, n_features), where n_samples is the number of samples and n_features is the number of features.
- Returns:
independence_p_values – p-value matrix of independence between error variables.
- Return type:
array-like, shape (n_datasets, n_features, n_features)
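The returned array holds one p-value matrix per dataset. The sketch below shows one way to flag pairs of error variables whose independence is rejected at a chosen significance level, assuming that a small p-value is evidence against independence. The helper name and the 0.01 threshold are illustrative, not part of lingam.

```python
def dependent_pairs(p_values, alpha=0.01):
    """Per dataset, list (i, j) pairs with i < j whose p-value is below alpha.

    NaN entries compare as False and are therefore never flagged.
    """
    flagged = []
    for P in p_values:
        pairs = [
            (i, j)
            for i in range(len(P))
            for j in range(i + 1, len(P))
            if P[i][j] < alpha
        ]
        flagged.append(pairs)
    return flagged


# Toy p-values (made up for illustration): 2 datasets, 3 features.
p_values = [
    [[0.0, 0.40, 0.002], [0.40, 0.0, 0.75], [0.002, 0.75, 0.0]],
    [[0.0, 0.30, 0.60], [0.30, 0.0, 0.90], [0.60, 0.90, 0.0]],
]
flagged = dependent_pairs(p_values)
```

A flagged pair suggests that the fitted model's error terms for those two variables are not independent in that dataset, which is worth investigating before trusting the estimated graph.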