Introduction
Learning a Bayesian network can be split into structure learning and parameter learning, both of which are implemented in bnlearn.
Structure learning: Given a set of data samples, estimate a DAG that captures the dependencies between the variables.
Parameter learning: Given a set of data samples and a DAG that captures the dependencies between the variables, estimate the (conditional) probability distributions of the individual variables.
- The library supports parameter learning for discrete nodes:
Maximum Likelihood Estimation
Bayesian Estimation
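The difference between the two estimators can be sketched on a single discrete node with one parent. The snippet below is a minimal, standard-library illustration and not bnlearn's implementation: the sample values, the function name `estimate_cpd`, and the `pseudo_count` parameter are assumptions made for this example. With a pseudo-count of zero it returns relative frequencies (maximum likelihood); with a positive pseudo-count it returns the posterior mean under a symmetric Dirichlet prior.

```python
from collections import Counter

# Hypothetical (parent, child) observations, e.g. (Rain, WetGrass).
samples = [
    (0, 0), (0, 0), (0, 0), (0, 1),
    (1, 1), (1, 1), (1, 1), (1, 1),
]

def estimate_cpd(samples, child_states=(0, 1), pseudo_count=0.0):
    """Estimate P(child | parent) from (parent, child) pairs.

    pseudo_count == 0: maximum likelihood estimate (relative frequencies).
    pseudo_count > 0: adds that many virtual observations to every cell,
    i.e. the posterior mean under a symmetric Dirichlet prior.
    """
    joint = Counter(samples)
    parent_totals = Counter(parent for parent, _ in samples)
    cpd = {}
    for parent, total in parent_totals.items():
        denom = total + pseudo_count * len(child_states)
        cpd[parent] = {
            child: (joint[(parent, child)] + pseudo_count) / denom
            for child in child_states
        }
    return cpd

mle = estimate_cpd(samples)
bayes = estimate_cpd(samples, pseudo_count=1.0)

print("MLE:     ", mle)
print("Bayesian:", bayes)
```

In this toy data the child state 0 never occurs together with parent state 1, so the maximum likelihood estimate assigns it probability zero, while the Bayesian estimate keeps a small non-zero probability. This is the usual reason to prefer Bayesian estimation on small data sets.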
The following functions are available:
Structure learning
Parameter learning
Inference
Sampling
Plot
Comparing two networks
Loading BIF files
Conversion of directed to undirected graphs
Structure Learning algorithms
bnlearn contains score-based, local discovery, Bayesian network classifier, and constraint-based structure learning algorithms for discrete, fully observed networks.
Score-based approaches have two main components:
* The search algorithm, which explores the search space of all possible DAGs.
* The scoring function, which indicates how well the Bayesian network fits the data.
Score-based algorithms can be used with the following score functions:
- categorical data (multinomial distribution):
the Bayesian Information Criterion (bic)
the K2 score (k2)
a score-equivalent Dirichlet posterior density (bdeu)
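To make the role of a scoring function concrete, the following sketch computes a BIC score for two candidate structures on a small categorical data set. It uses the textbook definition (maximized log-likelihood minus 0.5 * log(N) per free parameter) and only the standard library; it is not bnlearn's internal code, and the data, the `bic_score` helper, and the dictionary-based structure format are assumptions made for illustration.

```python
import math
from collections import Counter

def bic_score(data, structure):
    """BIC of a discrete network: log-likelihood - 0.5 * log(N) * (#free parameters).

    data: list of dicts mapping variable name -> observed state.
    structure: dict mapping each node -> tuple of its parent names.
    State counts are taken from the states observed in the data.
    """
    n = len(data)
    score = 0.0
    for node, parents in structure.items():
        joint = Counter((tuple(row[p] for p in parents), row[node]) for row in data)
        parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
        # Maximized log-likelihood of this node given its parents.
        loglik = sum(c * math.log(c / parent_counts[pa]) for (pa, _), c in joint.items())
        # Free parameters: (r - 1) for each of the q parent configurations.
        r = len({row[node] for row in data})
        q = 1
        for p in parents:
            q *= len({row[p] for row in data})
        score += loglik - 0.5 * math.log(n) * q * (r - 1)
    return score

# Hypothetical data in which A and B agree in 36 of 40 samples.
rows = (
    [{"A": 0, "B": 0}] * 18 + [{"A": 0, "B": 1}] * 2
    + [{"A": 1, "B": 0}] * 2 + [{"A": 1, "B": 1}] * 18
)

with_edge = bic_score(rows, {"A": (), "B": ("A",)})   # A -> B
without_edge = bic_score(rows, {"A": (), "B": ()})    # no edge
reversed_edge = bic_score(rows, {"B": (), "A": ("B",)})  # B -> A

print("A -> B :", with_edge)
print("no edge:", without_edge)
```

A search algorithm such as hill climbing would evaluate scores like these for neighbouring DAGs and keep the change that improves the score most. Here the structure with the edge scores higher because the gain in likelihood outweighs the penalty for the extra parameter.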
Score-based algorithms
ExhaustiveSearch (ex)
Hill Climbing (hc)
Local discovery algorithms
Chow-Liu (cl)
Bayesian network classifiers
Naive Bayes (nb)
Tree-augmented Naive Bayes (tan)
Constraint-based structure learning algorithms
With constraint-based structure learning, a DAG is constructed by identifying independencies in the data set using hypothesis tests, such as the chi2 test. This approach relies on statistical tests of conditional independence among the variables in the model. The p-value of the chi2 test is the probability of observing a chi2 statistic at least as extreme as the computed one, given the null hypothesis that X and Y are independent given Z. This can be used to make independence judgments at a given level of significance. An example of a constraint-based approach is the PC algorithm, which starts with a fully connected graph and removes the edge between two nodes whenever a test indicates that they are (conditionally) independent, until a stopping criterion is reached.
constraintsearch (cs)
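The independence test that drives a constraint-based search can be sketched as follows. This is a minimal standard-library illustration of a Pearson chi2 test for X and Y, optionally conditioned on Z by summing the statistic over the strata of Z. It is not the code bnlearn runs: the counts, the helper names `chi2_statistic` and `chi2_p_value`, and the significance level of 0.05 are assumptions for this example. The p-value helper only covers 1 and 2 degrees of freedom, where simple closed forms exist; a statistics library would be used in practice.

```python
import math
from collections import Counter

def chi2_statistic(rows, x, y, z=None):
    """Pearson chi2 statistic and degrees of freedom for X independent of Y (given Z)."""
    strata = {}
    for row in rows:
        key = row[z] if z is not None else None
        strata.setdefault(key, []).append((row[x], row[y]))
    x_states = {row[x] for row in rows}
    y_states = {row[y] for row in rows}
    stat = 0.0
    for pairs in strata.values():
        n = len(pairs)
        joint = Counter(pairs)
        x_tot = Counter(a for a, _ in pairs)
        y_tot = Counter(b for _, b in pairs)
        for a in x_tot:
            for b in y_tot:
                expected = x_tot[a] * y_tot[b] / n
                stat += (joint[(a, b)] - expected) ** 2 / expected
    dof = (len(x_states) - 1) * (len(y_states) - 1) * len(strata)
    return stat, dof

def chi2_p_value(stat, dof):
    """Upper-tail probability of the chi2 distribution (closed forms for dof 1 and 2 only)."""
    if dof == 1:
        return math.erfc(math.sqrt(stat / 2))
    if dof == 2:
        return math.exp(-stat / 2)
    raise NotImplementedError("use a statistics library for other degrees of freedom")

# Hypothetical counts for a chain X -> Z -> Y, keyed by (x, z, y).
counts = {
    (0, 0, 0): 32, (0, 0, 1): 8, (1, 0, 0): 8, (1, 0, 1): 2,
    (0, 1, 0): 2, (0, 1, 1): 8, (1, 1, 0): 8, (1, 1, 1): 32,
}
rows = [{"X": x, "Z": z, "Y": y} for (x, z, y), c in counts.items() for _ in range(c)]

alpha = 0.05
stat_xy, dof_xy = chi2_statistic(rows, "X", "Y")
stat_xy_z, dof_xy_z = chi2_statistic(rows, "X", "Y", z="Z")
p_marginal = chi2_p_value(stat_xy, dof_xy)
p_conditional = chi2_p_value(stat_xy_z, dof_xy_z)

print("X vs Y     : chi2 =", stat_xy, "p =", p_marginal)
print("X vs Y | Z : chi2 =", stat_xy_z, "p =", p_conditional)
```

In this data X and Y are clearly dependent when tested on their own, but the test does not reject independence once Z is conditioned on. A PC-style procedure would therefore remove the edge between X and Y and keep the edges that involve Z.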