utils

lingam.utils.print_causal_directions(cdc, n_sampling, labels=None)[source]

Print causal directions of bootstrap result to stdout.

Parameters:
  • cdc (dict) – Causal directions and their counts, sorted by count in descending order. Pass the value returned by the BootstrapResult.get_causal_direction_counts() method.
  • n_sampling (int) – Number of bootstrapping samples.
  • labels (array-like, optional (default=None)) – List of feature labels. If given, the specified labels are printed in place of the default feature names.
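
Example (illustrative sketch, not the library's implementation): the code below shows how a count of bootstrap occurrences turns into a percentage of n_sampling. The dictionary keys "from", "to" and "count", the helper name format_causal_directions, and the exact line layout are assumptions made for this sketch.

```python
def format_causal_directions(cdc, n_sampling, labels=None):
    """Return one line per direction in the form '<to> <--- <from> (<percent>%)'."""
    lines = []
    for fr, to, count in zip(cdc["from"], cdc["to"], cdc["count"]):
        # Fall back to x0, x1, ... when no labels are supplied.
        src = labels[fr] if labels else f"x{fr}"
        dst = labels[to] if labels else f"x{to}"
        lines.append(f"{dst} <--- {src} ({100 * count / n_sampling:.1f}%)")
    return lines


# Hypothetical counts, as if taken from 100 bootstrap samples.
cdc = {"from": [0, 2], "to": [1, 1], "count": [97, 45]}
lines = format_causal_directions(cdc, n_sampling=100)
for line in lines:
    print(line)
```
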
lingam.utils.print_dagc(dagc, n_sampling, labels=None)[source]

Print DAGs of bootstrap result to stdout.

Parameters:
  • dagc (dict) – Directed acyclic graphs and their counts, sorted by count in descending order. Pass the value returned by the BootstrapResult.get_directed_acyclic_graph_counts() method.
  • n_sampling (int) – Number of bootstrapping samples.
  • labels (array-like, optional (default=None)) – List of feature labels. If given, the specified labels are printed in place of the default feature names.
lingam.utils.make_prior_knowledge(n_variables, exogenous_variables=None, sink_variables=None, paths=None, no_paths=None)[source]

Make a matrix of prior knowledge.

Parameters:
  • n_variables (int) – Number of variables.
  • exogenous_variables (array-like, shape (index, ..), optional (default=None)) – List of exogenous variables (by index). Prior knowledge is created with the specified variables as exogenous variables.
  • sink_variables (array-like, shape (index, ..), optional (default=None)) – List of sink variables (by index). Prior knowledge is created with the specified variables as sink variables.
  • paths (array-like, shape ((index, index), ..), optional (default=None)) – List of variable index pairs that have a directed path. A pair (i, j) encodes the prior knowledge that xi has a directed path to xj.
  • no_paths (array-like, shape ((index, index), ..), optional (default=None)) – List of variable index pairs that have no directed path. A pair (i, j) encodes the prior knowledge that xi does not have a directed path to xj.
Returns:

prior_knowledge – Return matrix of prior knowledge used for causal discovery.

Return type:

array-like, shape (n_variables, n_variables)
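
Example (illustrative sketch under stated assumptions): the code below builds a prior-knowledge matrix from the four kinds of input described above. The encoding is an assumption of this sketch, not something stated in this reference: entry [r][c] is 1 if xc is known to have a directed path to xr, 0 if it is known not to, and -1 if nothing is known. The helper name sketch_prior_knowledge is hypothetical.

```python
def sketch_prior_knowledge(n_variables, exogenous_variables=None,
                           sink_variables=None, paths=None, no_paths=None):
    # Start with "unknown" (-1) everywhere.
    pk = [[-1] * n_variables for _ in range(n_variables)]
    for i, j in (no_paths or []):
        pk[j][i] = 0              # xi has no directed path to xj
    for i, j in (paths or []):
        pk[j][i] = 1              # xi has a directed path to xj
    for v in (sink_variables or []):
        for row in range(n_variables):
            pk[row][v] = 0        # a sink variable affects no other variable
    for v in (exogenous_variables or []):
        for col in range(n_variables):
            pk[v][col] = 0        # an exogenous variable is affected by none
    for d in range(n_variables):
        pk[d][d] = -1             # the diagonal carries no information
    return pk


# x0 is exogenous, and x0 is known to have a directed path to x2.
pk = sketch_prior_knowledge(3, exogenous_variables=[0], paths=[(0, 2)])
for row in pk:
    print(row)
```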

lingam.utils.remove_effect(X, remove_features)[source]

Create a dataset from which the effects of the specified features have been removed by linear regression.

Parameters:
  • X (array-like, shape (n_samples, n_features)) – Data, where n_samples is the number of samples and n_features is the number of features.
  • remove_features (array-like) – List of features (by index) whose effects are removed.
Returns:

X – Data after removing effects of remove_features.

Return type:

array-like, shape (n_samples, n_features)
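
Example (illustrative sketch, not the library's implementation): removing the effect of a feature by linear regression amounts to replacing every other feature with its regression residual. The code below does this for a single regressor using ordinary least squares with an intercept. Whether the library also subtracts the intercept is not stated here, so treat that detail as an assumption. The helper name residual_after_regression is hypothetical.

```python
def residual_after_regression(y, x):
    """Residuals of an ordinary least squares fit y ~ intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


# Hypothetical data: x1 depends roughly linearly on x0.
x0 = [1.0, 2.0, 3.0, 4.0]
x1 = [2.1, 3.9, 6.2, 7.8]
res = residual_after_regression(x1, x0)
print(res)
```

The residuals are uncorrelated with x0 by construction, which is the sense in which the effect of x0 has been removed from x1.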

lingam.utils.make_dot(adjacency_matrix, labels=None, lower_limit=0.01, prediction_feature_indices=None, prediction_target_label='Y(pred)', prediction_line_color='red', prediction_coefs=None, prediction_feature_importance=None, ignore_shape=False)[source]

Build directed graph source code in the DOT language from the specified adjacency matrix.

Parameters:
  • adjacency_matrix (array-like with shape (n_features, n_features)) – Adjacency matrix to make graph, where n_features is the number of features.
  • labels (array-like, optional (default=None)) – Label to use for graph features.
  • lower_limit (float, optional (default=0.01)) – Threshold for drawing a direction. Directions whose coefficients have absolute values less than lower_limit are excluded.
  • prediction_feature_indices (array-like, optional (default=None)) – Indices to use as prediction features.
  • prediction_target_label (string, optional (default='Y(pred)')) – Label to use for target variable of prediction.
  • prediction_line_color (string, optional (default='red')) – Line color to use for prediction’s graph.
  • prediction_coefs (array-like, optional (default=None)) – Coefficients to use for prediction’s graph.
  • prediction_feature_importance (array-like, optional (default=None)) – Feature importance to use for prediction’s graph.
  • ignore_shape (boolean, optional (default=False)) – Whether to skip the shape check of adjacency_matrix.
Returns:

graph – Directed graph source code in the DOT language. If the order is unknown, a double-headed arrow is drawn.

Return type:

graphviz.Digraph
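
Example (illustrative sketch, not the library's implementation): the code below shows how lower_limit filters edges when an adjacency matrix is turned into DOT text. It returns a plain string rather than a graphviz.Digraph, and it assumes that entry [to][from] holds the coefficient of the edge from -> to. Both the orientation and the helper name sketch_dot_source are assumptions of this sketch.

```python
def sketch_dot_source(adjacency_matrix, labels=None, lower_limit=0.01):
    n = len(adjacency_matrix)
    names = labels if labels else [f"x{i}" for i in range(n)]
    lines = ["digraph {"]
    for to in range(n):
        for fr in range(n):
            coef = adjacency_matrix[to][fr]
            # Skip coefficients whose absolute value is below the threshold.
            if abs(coef) >= lower_limit:
                lines.append(
                    f'    "{names[fr]}" -> "{names[to]}" [label="{coef:.2f}"]'
                )
    lines.append("}")
    return "\n".join(lines)


# Hypothetical coefficients: x0 -> x1 (0.8), x1 -> x2 (-1.5), x0 -> x2 (0.005).
B = [[0.0, 0.0, 0.0],
     [0.8, 0.0, 0.0],
     [0.005, -1.5, 0.0]]
dot = sketch_dot_source(B)
print(dot)
```

With the default lower_limit of 0.01, the weak x0 -> x2 edge is dropped and the other two edges are kept.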

lingam.utils.get_sink_variables(adjacency_matrix)[source]

Return the indices of the sink variables in the adjacency matrix.

Parameters: adjacency_matrix (array-like, shape (n_variables, n_variables)) – Adjacency matrix, where n_variables is the number of variables.
Returns: sink_variables – List of sink variables (by index).
Return type: array-like
lingam.utils.get_exo_variables(adjacency_matrix)[source]

Return the indices of the exogenous variables in the adjacency matrix.

Parameters: adjacency_matrix (array-like, shape (n_variables, n_variables)) – Adjacency matrix, where n_variables is the number of variables.
Returns: exogenous_variables – List of exogenous variables (by index).
Return type: array-like
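
Example (illustrative sketch, not the library's implementation): a sink variable has no outgoing edges and an exogenous variable has no incoming edges. The code below finds both, assuming that entry [to][from] of the adjacency matrix holds the coefficient of the edge from -> to, so that a sink has an all-zero column and an exogenous variable has an all-zero row. The orientation and the helper names are assumptions of this sketch.

```python
def sketch_sink_variables(B):
    n = len(B)
    # Column j is all zero (off the diagonal): xj affects no other variable.
    return [j for j in range(n)
            if all(B[i][j] == 0 for i in range(n) if i != j)]


def sketch_exo_variables(B):
    n = len(B)
    # Row i is all zero (off the diagonal): no other variable affects xi.
    return [i for i in range(n)
            if all(B[i][j] == 0 for j in range(n) if j != i)]


# Hypothetical chain x0 -> x1 -> x2.
B = [[0.0, 0.0, 0.0],
     [0.8, 0.0, 0.0],
     [0.0, -1.5, 0.0]]
print(sketch_sink_variables(B))
print(sketch_exo_variables(B))
```
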
lingam.utils.find_all_paths(dag, from_index, to_index, min_causal_effect=0.0)[source]

Find all paths from one variable to another in a DAG.

Parameters:
  • dag (array-like, shape (n_features, n_features)) – The adjacency matrix in which to find all paths, where n_features is the number of features.
  • from_index (int) – Index of the variable at the start of the path.
  • to_index (int) – Index of the variable at the end of the path.
  • min_causal_effect (float, optional (default=0.0)) – Threshold for detecting causal direction. Causal directions with absolute values of causal effects less than min_causal_effect are excluded.
Returns:

  • paths (array-like, shape (n_paths)) – List of found paths, where n_paths is the number of paths.
  • effects (array-like, shape (n_paths)) – List of causal effects, one per path, where n_paths is the number of paths.
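
Example (illustrative sketch, not the library's implementation): the code below enumerates directed paths with a depth-first search and reports one effect per path. It makes two assumptions that this reference does not confirm: entry [to][from] of dag holds the coefficient of the edge from -> to, and the effect of a path is the product of the coefficients along it. The helper name sketch_find_all_paths is hypothetical.

```python
def sketch_find_all_paths(dag, from_index, to_index, min_causal_effect=0.0):
    n = len(dag)
    paths, effects = [], []

    def walk(node, path, effect):
        if node == to_index:
            paths.append(path)
            effects.append(effect)
            return
        for nxt in range(n):
            coef = dag[nxt][node]  # coefficient of the edge node -> nxt
            # Ignore missing edges and edges weaker than the threshold.
            if nxt not in path and coef != 0 and abs(coef) >= min_causal_effect:
                walk(nxt, path + [nxt], effect * coef)

    walk(from_index, [from_index], 1.0)
    return paths, effects


# Hypothetical DAG: x0 -> x1 (0.5), x1 -> x2 (3.0), x0 -> x2 (2.0).
dag = [[0.0, 0.0, 0.0],
       [0.5, 0.0, 0.0],
       [2.0, 3.0, 0.0]]
paths, effects = sketch_find_all_paths(dag, 0, 2)
print(paths)
print(effects)
```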

lingam.utils.simulate_linear_sem(adjacency_matrix, n_samples, sem_type, noise_scale=1.0)[source]

Simulate samples from a linear SEM with the specified type of noise.

Parameters:
  • adjacency_matrix (array-like, shape (n_features, n_features)) – Weighted adjacency matrix of DAG, where n_features is the number of variables.
  • n_samples (int) – Number of samples. n_samples=inf mimics population risk.
  • sem_type (str) – SEM type. One of gauss, exp, gumbel, logistic, poisson.
  • noise_scale (float) – Scale parameter of the additive noise.
Returns:

X – Data generated from linear SEM with specified type of noise, where n_features is the number of variables.

Return type:

array-like, shape (n_samples, n_features)
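
Example (illustrative sketch, not the library's implementation): sampling from a linear SEM means visiting the variables in a topological order and setting each one to the weighted sum of its parents plus noise. The code below does this for Gaussian noise only. It assumes that W[i][j] is the weight of the edge xi -> xj; the orientation used by the library is not stated in this reference, so check it before relying on this. The helper name and the seed parameter are hypothetical.

```python
import random


def sketch_simulate_linear_sem(W, n_samples, noise_scale=1.0, seed=0):
    rng = random.Random(seed)
    d = len(W)

    # Topological order: repeatedly take variables with no remaining parents.
    order, remaining = [], set(range(d))
    while remaining:
        roots = [j for j in sorted(remaining)
                 if all(W[i][j] == 0 for i in remaining)]
        if not roots:
            raise ValueError("W does not describe a DAG")
        order.extend(roots)
        remaining -= set(roots)

    X = [[0.0] * d for _ in range(n_samples)]
    for row in X:
        for j in order:
            # Parents of xj were filled in earlier in the topological order.
            parent_part = sum(W[i][j] * row[i] for i in range(d))
            row[j] = parent_part + rng.gauss(0.0, noise_scale)
    return X


# Hypothetical chain x0 -> x1 (2.0), x1 -> x2 (-1.0).
W = [[0.0, 2.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 0.0, 0.0]]
X = sketch_simulate_linear_sem(W, n_samples=5)
print(len(X), len(X[0]))
```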

lingam.utils.simulate_linear_mixed_sem(adjacency_matrix, n_samples, sem_type, dis_con, noise_scale=None)[source]

Simulate mixed samples from a linear SEM with the specified type of noise.

Parameters:
  • adjacency_matrix (array-like, shape (n_features, n_features)) – Weighted adjacency matrix of DAG, where n_features is the number of variables.
  • n_samples (int) – Number of samples. n_samples=inf mimics population risk.
  • sem_type (str) – SEM type. One of gauss, mixed_random_i_dis.
  • dis_con (array-like, shape (1, n_features)) – Indicator of discrete/continuous variables, where “1” indicates a continuous variable and “0” a discrete variable.
  • noise_scale (float) – Scale parameter of the additive noise.
Returns:

X – Data generated from linear SEM with specified type of noise, where n_features is the number of variables.

Return type:

array-like, shape (n_samples, n_features)

lingam.utils.is_dag(W)[source]

Check whether W is a DAG.

Parameters: W (array-like, shape (n_features, n_features)) – Binary adjacency matrix of DAG, where n_features is the number of features.
Returns: G – True if W is a DAG, False otherwise.
Return type: boolean
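
Example (illustrative sketch, not the library's implementation): one standard way to test acyclicity is Kahn's algorithm, which repeatedly removes nodes that have no incoming edges and reports a DAG only if every node can be removed. The code below reads any nonzero W[i][j] as an edge i -> j; the result does not depend on that orientation, because reversing every edge of a DAG gives another DAG. The helper name sketch_is_dag is hypothetical.

```python
def sketch_is_dag(W):
    n = len(W)
    # Number of incoming edges per node (a self-loop counts as one).
    in_degree = [sum(1 for i in range(n) if W[i][j] != 0) for j in range(n)]
    stack = [j for j in range(n) if in_degree[j] == 0]
    removed = 0
    while stack:
        i = stack.pop()
        removed += 1
        for j in range(n):
            if W[i][j] != 0:
                in_degree[j] -= 1
                if in_degree[j] == 0:
                    stack.append(j)
    # Nodes on a cycle never reach in-degree zero, so they are never removed.
    return removed == n


chain = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # 0 -> 1 -> 2
cycle = [[0, 1], [1, 0]]                    # 0 -> 1 -> 0
print(sketch_is_dag(chain), sketch_is_dag(cycle))
```
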
lingam.utils.count_accuracy(W_true, W, W_und=None)[source]

Compute recalls and precisions for W, or optionally for CPDAG = W + W_und.

Parameters:
  • W_true (array-like, shape (n_features, n_features)) – Ground truth graph, where n_features is the number of features.
  • W (array-like, shape (n_features, n_features)) – Predicted graph.
  • W_und (array-like, shape (n_features, n_features)) – Predicted undirected edges in CPDAG, asymmetric.
Returns:

  • recall (float) – (true positive) / (true positive + false negative).
  • precision (float) – (true positive) / (true positive + false positive).
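
Example (illustrative sketch, not the library's implementation): the code below applies the two formulas above to directed edges only. It ignores W_und, counts an edge as a true positive only when it appears with the same orientation in both graphs, and returns 0.0 when a denominator is zero. All three choices, and the helper name, are assumptions of this sketch and may differ from the library's behaviour.

```python
def sketch_recall_precision(W_true, W):
    n = len(W_true)
    true_edges = {(i, j) for i in range(n) for j in range(n) if W_true[i][j] != 0}
    pred_edges = {(i, j) for i in range(n) for j in range(n) if W[i][j] != 0}
    tp = len(true_edges & pred_edges)   # predicted and present
    fn = len(true_edges - pred_edges)   # present but not predicted
    fp = len(pred_edges - true_edges)   # predicted but not present
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


# Hypothetical graphs: one edge is recovered, one is missed, one is spurious.
W_true = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
W_pred = [[0, 1, 1], [0, 0, 0], [0, 0, 0]]
recall, precision = sketch_recall_precision(W_true, W_pred)
print(recall, precision)
```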

lingam.utils.simulate_parameter(B, w_ranges=((-2.0, -0.5), (0.5, 2.0)))[source]

Simulate SEM parameters for a DAG.

Parameters:
  • B (array-like, shape (n_features, n_features)) – Binary adjacency matrix of DAG, where n_features is the number of features.
  • w_ranges (tuple) – Disjoint weight ranges.
Returns:

adjacency_matrix – Weighted adjacency matrix of DAG, where n_features is the number of features.

Return type:

array-like, shape (n_features, n_features)
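
Example (illustrative sketch, not the library's implementation): drawing weights from disjoint ranges such as (-2.0, -0.5) and (0.5, 2.0) keeps every edge weight away from zero. The code below picks one range at random for each edge and samples uniformly inside it. The sampling scheme, the seed parameter and the helper name are assumptions of this sketch.

```python
import random


def sketch_simulate_parameter(B, w_ranges=((-2.0, -0.5), (0.5, 2.0)), seed=0):
    rng = random.Random(seed)
    W = []
    for row in B:
        new_row = []
        for entry in row:
            if entry:
                # Pick one of the disjoint ranges, then a weight inside it.
                low, high = rng.choice(w_ranges)
                new_row.append(rng.uniform(low, high))
            else:
                new_row.append(0.0)
        W.append(new_row)
    return W


# Hypothetical binary DAG with two edges.
B = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
W = sketch_simulate_parameter(B)
for row in W:
    print(row)
```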

lingam.utils.simulate_dag(n_features, n_edges, graph_type)[source]

Simulate a random DAG with a given expected number of edges.

Parameters:
  • n_features (int) – Number of features.
  • n_edges (int) – Expected number of edges.
  • graph_type (str) – Graph type. One of ER, SF.
Returns:

B – Binary adjacency matrix of DAG.

Return type:

array-like, shape (n_features, n_features)
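
Example (illustrative sketch, not the library's implementation): one simple way to get a random DAG with a chosen expected number of edges is to fix a random ordering of the nodes and include each forward edge independently with probability n_edges divided by the number of possible edges. This is an Erdos-Renyi style construction; the SF option is not covered. The helper name and the seed parameter are hypothetical.

```python
import random


def sketch_simulate_er_dag(n_features, n_edges, seed=0):
    rng = random.Random(seed)
    max_edges = n_features * (n_features - 1) // 2
    # Edge probability chosen so that the expected edge count is n_edges.
    p = min(1.0, n_edges / max_edges) if max_edges else 0.0
    order = list(range(n_features))
    rng.shuffle(order)
    B = [[0] * n_features for _ in range(n_features)]
    for a in range(n_features):
        for b in range(a + 1, n_features):
            if rng.random() < p:
                # Edges only point forward in the ordering, so no cycle can form.
                B[order[a]][order[b]] = 1
    return B


B = sketch_simulate_er_dag(n_features=5, n_edges=4)
for row in B:
    print(row)
```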

lingam.utils.predict_adaptive_lasso(X, predictors, target, gamma=1.0)[source]

Predict with Adaptive Lasso.

Parameters:
  • X (array-like, shape (n_samples, n_features)) – Training data, where n_samples is the number of samples and n_features is the number of features.
  • predictors (array-like, shape (n_predictors)) – Indices of the predictor variables.
  • target (int) – Index of target variable.
Returns:

coef – Coefficients of the predictor variables.

Return type:

array-like, shape (n_features)

lingam.utils.likelihood_i(x, i, b_i, bi_0)[source]

Compute local log-likelihood of component i.

Parameters:
  • x (array-like, shape (n_features, n_samples)) – Data, where n_samples is the number of samples and n_features is the number of features.
  • i (array-like) – Variable index.
  • b_i (array-like) – The i^th column of adjacency matrix, B[i].
  • bi_0 (float) – Constant value for the i^th variable.
Returns:

ll – Local log-likelihood of component i.

Return type:

float

lingam.utils.log_p_super_gaussian(s)[source]

Compute density function of the normalized independent components.

Parameters: s (array-like, shape (1, n_samples)) – Data, where n_samples is the number of samples.
Returns: x – Density function of the normalized independent components, whose disturbances are super-Gaussian.
Return type: float
lingam.utils.variance_i(X, i, b_i)[source]

Compute empirical variance of component i.

Parameters:
  • X (array-like, shape (n_features, n_samples)) – Data, where n_samples is the number of samples and n_features is the number of features.
  • i (array-like) – Variable index.
  • b_i (array-like) – The i^th column of adjacency matrix, B[i].
Returns:

variance – Empirical variance of component i.

Return type:

float