tools
- lingam.tools.bootstrap_with_imputation(X, n_sampling, n_repeats, prior_knowledge=None, apply_prior_knowledge_softly=False, random_state=None)
Discover causal relations in data that contains NaNs.
bootstrap_with_imputation performs causal discovery on a dataset with missing values. It creates n_sampling bootstrap samples from the dataset, creates n_repeats samples for each bootstrap sample, completes the missing values in each of them, and runs causal discovery under the assumption that the n_repeats samples share a common causal structure.
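The resampling scheme described above can be sketched with NumPy alone. This is only an illustration of the data layout, not lingam's implementation: the helper name `resample_and_impute` is hypothetical, and hot-deck imputation (drawing a random observed value from the same column) is a stand-in, since this section does not state which imputation method the library uses. The causal discovery step is left out.

```python
import numpy as np


def resample_and_impute(X, n_sampling, n_repeats, random_state=None):
    """Illustrative only: bootstrap rows, then fill NaNs n_repeats times.

    Hot-deck imputation is a stand-in chosen for brevity; it is an
    assumption of this sketch, not the imputer used by lingam.
    """
    rng = np.random.default_rng(random_state)
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape

    indices = np.empty((n_sampling, n_samples), dtype=int)
    completed = np.empty((n_sampling, n_repeats, n_samples, n_features))

    for i in range(n_sampling):
        # One bootstrap sample: row indices drawn with replacement.
        indices[i] = rng.integers(0, n_samples, size=n_samples)
        sample = X[indices[i]]

        for j in range(n_repeats):
            filled = sample.copy()
            for k in range(n_features):
                missing = np.isnan(filled[:, k])
                if missing.any():
                    # Donor pool: observed values of column k in the full data.
                    observed = X[~np.isnan(X[:, k]), k]
                    filled[missing, k] = rng.choice(observed, size=missing.sum())
            completed[i, j] = filled

    return indices, completed


X = np.array([[1.0, 2.0], [np.nan, 3.0], [4.0, np.nan], [5.0, 6.0]])
indices, completed = resample_and_impute(X, n_sampling=3, n_repeats=2, random_state=0)
print(indices.shape)    # (3, 4)
print(completed.shape)  # (3, 2, 4, 2)
```

Each of the `n_sampling * n_repeats` completed arrays would then be passed to a causal discovery step, with the `n_repeats` arrays of one bootstrap sample treated as sharing a single causal structure.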
- Parameters:
X (array-like, shape (n_samples, n_features)) – Training data, where n_samples is the number of samples and n_features is the number of features.
n_sampling (int) – Number of bootstraps.
n_repeats (int) – Number of times to complete missing values for each bootstrap sample.
prior_knowledge (array-like, shape (n_features, n_features), optional (default=None)) – Prior knowledge used for causal discovery, where n_features is the number of features.
The elements of the prior knowledge matrix are defined as follows:
0: \(x_i\) does not have a directed path to \(x_j\)
1: \(x_i\) has a directed path to \(x_j\)
-1: No prior knowledge is available to know if either of the two cases above (0 or 1) is true.
apply_prior_knowledge_softly (boolean, optional (default=False)) – If True, apply prior knowledge softly.
random_state (int, optional (default=None)) – random_state is the seed used by the random number generator.
- Returns:
causal_orders (array-like, shape (n_sampling, n_features)) – The causal order of each fitted model, where n_features is the number of features.
adj_matrices_list (array-like, shape (n_sampling, n_repeats, n_features, n_features)) – The list of adjacency matrices.
resampled_indices_ (array-like, shape (n_sampling, n_samples)) – The original indices of the resampled samples.
imputation_results (array-like, shape (n_sampling, n_repeats, n_samples, n_features)) – The result of the imputation. Elements that are not NaN are the imputed values.
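One common way to use a stack of adjacency matrices with the shape documented for adj_matrices_list is to count how often each entry is non-zero across all fits. The sketch below is a hedged illustration: `edge_frequencies` and its `threshold` parameter are hypothetical names, not part of lingam, and the input is a small synthetic array standing in for real output. It makes no assumption about which index of the matrix denotes cause or effect.

```python
import numpy as np


def edge_frequencies(adj_matrices_list, threshold=0.0):
    """Fraction of fitted matrices whose entry exceeds threshold in magnitude."""
    adj = np.asarray(adj_matrices_list, dtype=float)
    n_sampling, n_repeats, n_features, _ = adj.shape
    # Merge the bootstrap and repeat axes into one axis of fitted matrices.
    flat = adj.reshape(n_sampling * n_repeats, n_features, n_features)
    return (np.abs(flat) > threshold).mean(axis=0)


# Synthetic stand-in: n_sampling=2, n_repeats=2, n_features=2.
adj_matrices_list = np.zeros((2, 2, 2, 2))
adj_matrices_list[:, :, 1, 0] = 0.8  # entry (1, 0) is non-zero in all 4 matrices
adj_matrices_list[0, 0, 0, 1] = 0.3  # entry (0, 1) is non-zero in 1 of 4 matrices

freq = edge_frequencies(adj_matrices_list)
print(freq)
# [[0.   0.25]
#  [1.   0.  ]]
```

Entries with a frequency near 1 appear in almost every fit, while entries near 0 are rarely selected; a larger `threshold` ignores weak coefficients.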