Introduction
Learning a Bayesian network can be split into two main components: structure learning and parameter learning, both of which are implemented in bnlearn.
Structure learning: Given a set of data samples, estimate a Directed Acyclic Graph (DAG) that captures the dependencies between variables.
Parameter learning: Given a set of data samples and a DAG that captures the dependencies between variables, estimate the (conditional) probability distributions of the individual variables.
The library supports parameter learning for discrete nodes using the following methods (a sketch follows this list):
Maximum Likelihood Estimation
Bayesian Estimation
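A minimal sketch of parameter learning, assuming bnlearn's parameter_learning.fit accepts a methodtype argument that selects 'ml' (maximum likelihood) or 'bayes' (Bayesian estimation); the dataset and helper names below are assumptions about the library's interface rather than details stated in the text above.

```python
import bnlearn as bn

# Example data and a predefined DAG for the sprinkler system
df = bn.import_example('sprinkler')
dag = bn.import_DAG('sprinkler', CPD=False)

# Bayesian estimation of the CPDs; methodtype='ml' would select maximum likelihood instead
model = bn.parameter_learning.fit(dag, df, methodtype='bayes')

# Print the learned conditional probability distributions
bn.print_CPD(model)
```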
Available Functions
Structure learning
Parameter learning
Inference
Sampling
Plotting
Network comparison
Loading BIF files
Conversion of directed to undirected graphs
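Several of the functions listed above can be chained into a short workflow. The following is a rough sketch, assuming the top-level API names import_example, structure_learning.fit, parameter_learning.fit, inference.fit, and plot; treat it as illustrative rather than a definitive recipe.

```python
import bnlearn as bn

# Load an example dataset that ships with bnlearn
df = bn.import_example('sprinkler')

# Structure learning: estimate the DAG from the data
model = bn.structure_learning.fit(df)

# Parameter learning: estimate the (conditional) probability tables for the learned DAG
model = bn.parameter_learning.fit(model, df)

# Inference: query P(Wet_Grass) given evidence that it rained
query = bn.inference.fit(model, variables=['Wet_Grass'], evidence={'Rain': 1})

# Plot the learned network
bn.plot(model)
```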
Structure Learning Algorithms
bnlearn contains score-based, local discovery, Bayesian network, and constraint-based structure learning algorithms for discrete, fully observed networks.
Score-based approaches consist of two main components:
A search algorithm that explores the space of all possible DAGs
A scoring function that indicates how well the Bayesian network fits the data
Score-based algorithms can be used with the following score functions for categorical data (multinomial distribution); a sketch follows this list:
Bayesian Information Criterion (BIC)
K2 score
Bayesian Dirichlet equivalent uniform score (BDeu)
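A minimal sketch of score-based structure learning, assuming bnlearn's structure_learning.fit takes methodtype and scoretype parameters and returns a dictionary containing the learned adjacency matrix; these names are assumptions about the library's interface.

```python
import bnlearn as bn

# Example dataset with categorical variables
df = bn.import_example('sprinkler')

# Hill-climb search ('hc') scored with BIC; 'k2' and 'bdeu' are assumed
# to be accepted as alternative scoretype values
model = bn.structure_learning.fit(df, methodtype='hc', scoretype='bic')

# Inspect the learned adjacency matrix
print(model['adjmat'])
```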
Constraint-based Structure Learning Algorithms
Constraint-based structure learning constructs a DAG by identifying independencies in the dataset using hypothesis tests, such as the chi-square test. The approach relies on statistical tests of (conditional) independence among the variables in the model. The p-value of the chi-square test is the probability of observing the computed test statistic under the null hypothesis that X and Y are independent given Z, and it can be used to accept or reject independence at a given significance level. An example of a constraint-based approach is the PC algorithm, which starts with a complete, fully connected graph and removes edges based on the results of independence tests until a stopping criterion is met.
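To make this concrete, below is a small, self-contained illustration of an (unconditional) chi-square independence test on a contingency table using scipy; the counts and the 0.05 threshold are hypothetical, and bnlearn runs tests of this kind internally rather than requiring you to perform them yourself.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 contingency table of counts for two binary variables X and Y
observed = np.array([[120,  80],
                     [ 60, 140]])

chi2, p_value, dof, expected = chi2_contingency(observed)

# Treat X and Y as dependent when p falls below the chosen significance level
alpha = 0.05
if p_value < alpha:
    print(f"Dependent (chi2={chi2:.2f}, p={p_value:.4f})")
else:
    print(f"No evidence against independence (p={p_value:.4f})")
```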
In bnlearn, this approach is available as constraintsearch (cs).
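A minimal sketch of constraint-based learning, assuming bnlearn's structure_learning.fit accepts methodtype='cs' for the constraintsearch approach; the dataset name and the output key used below are assumptions about the interface.

```python
import bnlearn as bn

# Example categorical dataset
df = bn.import_example('sprinkler')

# Constraint-based (PC-style) structure learning, assumed via methodtype='cs'
model = bn.structure_learning.fit(df, methodtype='cs')

# Inspect the learned adjacency matrix
print(model['adjmat'])
```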