Data Conversions
Modeling your data and getting it into the right shape may require various conversions, depending on your starting point and your goal. bnlearn
provides several functions that convert to or from an adjacency matrix or a source-target vector.
Available functionalities:
adjmat2dict : Convert adjacency matrix to dictionary.
adjmat2vec : Convert adjacency matrix into vector with source and target.
vec2adjmat : Convert source-target edges with their weights into an adjacency matrix.
dag2adjmat : Convert model into adjacency matrix.
vec2df : Convert source-target edges into sparse dataframe.
Adjacency matrix
The adjacency matrix stores the relationships (edges) between source and target variables (nodes).
In graph theory, a finite graph can be represented by a square matrix whose elements indicate whether pairs of vertices are adjacent or not.
Several bnlearn functions output an adjacency matrix. Values of 0 or False indicate that two nodes are not connected, whereas pairs of vertices with a value >0 or True are connected.
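As a rough illustration of this convention, the sketch below stores a small adjacency matrix as nested dictionaries in plain Python and applies the rule that 0 or False means "not connected" while True or any value >0 means "connected". The node names A, B and C and the helper `is_connected` are hypothetical and only serve the illustration; they are not part of bnlearn.

```python
# Nodes of a small hypothetical graph; rows are sources, columns are targets.
nodes = ["A", "B", "C"]

# Square matrix as nested dicts: adjmat[source][target].
# Entries may be booleans or weights; 0 or False means "no edge".
adjmat = {
    "A": {"A": 0, "B": 2, "C": True},
    "B": {"A": 0, "B": 0, "C": False},
    "C": {"A": 0, "B": 0, "C": 0},
}


def is_connected(adjmat, source, target):
    """Return True when the entry is True or a number greater than 0."""
    return adjmat[source][target] > 0


# Collect every directed (source, target) pair that counts as an edge.
connected = [(s, t) for s in nodes for t in nodes if is_connected(adjmat, s, t)]
print(connected)
```

Because the matrix is directed, an edge from A to B does not imply an edge from B to A; each direction has its own entry.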
Importing a DAG
Extracting adjacency matrix from imported DAG:
# Import library
import bnlearn as bn
# Import DAG
model = bn.import_DAG('sachs')
# Show the retrieved adjacency matrix for Sachs:
model['adjmat']
# print
print(model['adjmat'])
Reading the table from left to right, we see that gene Erk is connected to Akt in a directed manner. This indicates that Erk influences gene Akt but not the other way around, because gene Akt does not show an edge to Erk. The table is abbreviated, so there may also be a connection at the “…”.
| | Erk | Akt | PKA | Mek | Jnk | … | Raf | P38 | PIP3 | PIP2 | Plcg |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Erk | False | True | False | False | False | … | False | False | False | False | False |
| Akt | False | False | False | False | False | … | False | False | False | False | False |
| PKA | True | True | False | True | True | … | True | True | False | False | False |
| Mek | True | False | False | False | False | … | False | False | False | False | False |
| Jnk | False | False | False | False | False | … | False | False | False | False | False |
| PKC | False | False | True | True | True | … | True | True | False | False | False |
| Raf | False | False | False | True | False | … | False | False | False | False | False |
| P38 | False | False | False | False | False | … | False | False | False | False | False |
| PIP3 | False | False | False | False | False | … | False | False | False | True | False |
| PIP2 | False | False | False | False | False | … | False | False | False | False | False |
| Plcg | False | False | False | False | False | … | False | False | True | True | False |
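The reading direction can be made explicit with a small sketch. It copies only the entries for Erk, Akt, PKA and Mek from the table above (all other nodes are left out) and shows that a row lists the outgoing edges of a node while a column lists its incoming edges. The helpers `children` and `parents` are hypothetical names used for illustration, not bnlearn functions.

```python
# Column order for the four nodes copied from the Sachs table.
nodes = ["Erk", "Akt", "PKA", "Mek"]

# rows[source][i] is the entry for target nodes[i].
rows = {
    "Erk": [False, True, False, False],
    "Akt": [False, False, False, False],
    "PKA": [True, True, False, True],
    "Mek": [True, False, False, False],
}


def children(node):
    """Read along a row: the targets that `node` points to."""
    return [target for target, flag in zip(nodes, rows[node]) if flag]


def parents(node):
    """Read down a column: the sources that point to `node`."""
    col = nodes.index(node)
    return [source for source in nodes if rows[source][col]]


print("Erk ->", children("Erk"))
print("-> Erk", parents("Erk"))
print("Akt ->", children("Akt"))
```

Within this subset, Erk points to Akt, while Akt has no outgoing edges, which matches the directed reading described above.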
Vector
The vector stores relationships as source-target pairs of variables (nodes) together with their weights. In the example below, an edge is defined when its weight is True or a number >=1.
| source | target | weight |
|---|---|---|
| Cloudy | Sprinkler | True |
| Cloudy | Rain | True |
| Sprinkler | Wet_Grass | True |
| Rain | Wet_Grass | True |
adjmat2vec
Converting an adjacency matrix into a vector with bnlearn.bnlearn.adjmat2vec()
import bnlearn as bn
# Load DAG
DAG = bn.import_DAG('Sprinkler')
# Convert adjmat to vector:
vector = bn.adjmat2vec(DAG['adjmat'])
| source | target | weight |
|---|---|---|
| Cloudy | Sprinkler | True |
| Cloudy | Rain | True |
| Sprinkler | Wet_Grass | True |
| Rain | Wet_Grass | True |
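To make the idea behind this conversion concrete, here is a minimal pure-Python sketch. It is not bnlearn's implementation: the function name `adjmat_to_edges` and the nested-dictionary matrix are assumptions chosen for illustration, and the Sprinkler edges are typed in by hand rather than loaded from the library.

```python
def adjmat_to_edges(adjmat):
    """Collect (source, target, weight) for every entry that is True or > 0."""
    edges = []
    for source, row in adjmat.items():
        for target, weight in row.items():
            if weight > 0:
                edges.append((source, target, weight))
    return edges


nodes = ["Cloudy", "Sprinkler", "Rain", "Wet_Grass"]

# Start from an empty matrix, then switch on the four Sprinkler edges.
adjmat = {s: {t: False for t in nodes} for s in nodes}
for s, t in [
    ("Cloudy", "Sprinkler"),
    ("Cloudy", "Rain"),
    ("Sprinkler", "Wet_Grass"),
    ("Rain", "Wet_Grass"),
]:
    adjmat[s][t] = True

edges = adjmat_to_edges(adjmat)
for edge in edges:
    print(edge)
```

Each printed tuple corresponds to one row of the source, target, weight table.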
vec2adjmat
Converting the created vector in the example above back into an adjacency matrix with bnlearn.bnlearn.vec2adjmat()
import bnlearn as bn
# Convert vector back to adjmat.
adjmat = bn.vec2adjmat(vector['source'], vector['target'], weights=vector['weight'])
| source | Rain | Sprinkler | Wet_Grass | Cloudy |
|---|---|---|---|---|
| Rain | 0 | 0 | 1 | 0 |
| Sprinkler | 0 | 0 | 1 | 0 |
| Wet_Grass | 0 | 0 | 0 | 0 |
| Cloudy | 1 | 1 | 0 | 0 |
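The reverse direction can be sketched in the same spirit. The function `edges_to_adjmat` below is a hypothetical plain-Python helper, not the bnlearn function: it assumes that every name occurring as a source or target becomes both a row and a column, and it sorts the names alphabetically, so the row and column order may differ from the order bnlearn returns.

```python
def edges_to_adjmat(source, target, weights=None):
    """Build a square nested-dict matrix from parallel source/target lists."""
    if weights is None:
        weights = [1] * len(source)
    # Every name seen as a source or target becomes a row and a column.
    nodes = sorted(set(source) | set(target))
    adjmat = {s: {t: 0 for t in nodes} for s in nodes}
    for s, t, w in zip(source, target, weights):
        adjmat[s][t] = int(w)  # True becomes 1, numbers are kept as integers
    return adjmat


source = ["Cloudy", "Cloudy", "Sprinkler", "Rain"]
target = ["Sprinkler", "Rain", "Wet_Grass", "Wet_Grass"]

adjmat = edges_to_adjmat(source, target, weights=[True] * 4)
for name, row in adjmat.items():
    print(name, row)
```

Wet_Grass only appears as a target, yet it still gets its own all-zero row, which keeps the matrix square.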
adjmat2dict
Convert adjacency matrix to dictionary with bnlearn.bnlearn.adjmat2dict()
# Import library
import bnlearn as bn
# Load DAG
DAG = bn.import_DAG('Sprinkler')
# Convert adjmat to dictionary:
adjmat_dict = bn.adjmat2dict(DAG['adjmat'])
# print
print(adjmat_dict)
# {'Cloudy': ['Sprinkler', 'Rain'],
# 'Sprinkler': ['Wet_Grass'],
# 'Rain': ['Wet_Grass'],
# 'Wet_Grass': []}
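The dictionary form maps each node to the list of nodes it points to. A minimal sketch of that mapping, assuming a nested-dictionary matrix and a hypothetical helper `adjmat_to_dict` (not bnlearn's own code), could look like this:

```python
def adjmat_to_dict(adjmat):
    """Map every source node to the list of targets with a True or > 0 entry."""
    return {
        source: [target for target, weight in row.items() if weight > 0]
        for source, row in adjmat.items()
    }


# Sprinkler adjacency matrix typed in by hand: adjmat[source][target].
adjmat = {
    "Cloudy": {"Cloudy": 0, "Sprinkler": 1, "Rain": 1, "Wet_Grass": 0},
    "Sprinkler": {"Cloudy": 0, "Sprinkler": 0, "Rain": 0, "Wet_Grass": 1},
    "Rain": {"Cloudy": 0, "Sprinkler": 0, "Rain": 0, "Wet_Grass": 1},
    "Wet_Grass": {"Cloudy": 0, "Sprinkler": 0, "Rain": 0, "Wet_Grass": 0},
}

adjmat_dict = adjmat_to_dict(adjmat)
print(adjmat_dict)
```

Nodes without outgoing edges, such as Wet_Grass, keep an empty list so that every node remains a key in the dictionary.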
dag2adjmat
Convert model into adjacency matrix with bnlearn.bnlearn.dag2adjmat()
# Import library
import bnlearn as bn
# Load DAG
DAG = bn.import_DAG('Sprinkler')
# Extract edges from model and store in adjacency matrix
adjmat = bn.dag2adjmat(DAG['model'])
| source | Rain | Sprinkler | Wet_Grass | Cloudy |
|---|---|---|---|---|
| Rain | 0 | 0 | 1 | 0 |
| Sprinkler | 0 | 0 | 1 | 0 |
| Wet_Grass | 0 | 0 | 0 | 0 |
| Cloudy | 1 | 1 | 0 | 0 |
vec2df
Convert edges between source and target into a dataframe based on the weight with bnlearn.bnlearn.vec2df()
For demonstration purposes, a small example is created below. It shows that the weight determines the number of rows: an edge with a weight of 2 results in 2 rows for that edge.
# Import library
import bnlearn as bn
# Create source-target edges with its weights
source = ['Cloudy', 'Cloudy', 'Sprinkler', 'Rain']
target = ['Sprinkler', 'Rain', 'Wet_Grass', 'Wet_Grass']
weights = [1, 2, 1, 3]
# Convert into sparse dataframe.
df = bn.vec2df(source, target, weights=weights)
| | Cloudy | Rain | Sprinkler | Wet_Grass |
|---|---|---|---|---|
| 0 | 1 | 0 | 1 | 0 |
| 1 | 1 | 1 | 0 | 0 |
| 2 | 1 | 1 | 0 | 0 |
| 3 | 0 | 0 | 1 | 1 |
| 4 | 0 | 1 | 0 | 1 |
| 5 | 0 | 1 | 0 | 1 |
| 6 | 0 | 1 | 0 | 1 |
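The way weights turn into rows can be traced with a short plain-Python sketch. It is an illustration of the behaviour described above rather than bnlearn's implementation: the helper `edges_to_rows` is hypothetical, it assumes integer weights, and it orders the columns alphabetically.

```python
def edges_to_rows(source, target, weights):
    """Expand weighted edges into rows with a 1 for the source and the target."""
    columns = sorted(set(source) | set(target))
    rows = []
    for s, t, w in zip(source, target, weights):
        row = {c: 0 for c in columns}
        row[s] = 1
        row[t] = 1
        # Repeat the row once per unit of weight (copy so rows stay independent).
        rows.extend(dict(row) for _ in range(w))
    return columns, rows


source = ["Cloudy", "Cloudy", "Sprinkler", "Rain"]
target = ["Sprinkler", "Rain", "Wet_Grass", "Wet_Grass"]
weights = [1, 2, 1, 3]

columns, rows = edges_to_rows(source, target, weights)
print(columns)
for index, row in enumerate(rows):
    print(index, [row[c] for c in columns])
```

With weights of 1, 2, 1 and 3, the four edges expand to seven rows in total, one row per unit of weight.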
To demonstrate the full functionality, a larger example containing 352 edges from the book A Storm of Swords can be loaded. The result is a dataframe in which 107 unique names are extracted as columns, with 4324 rows because each edge is repeated according to its weight. This dataframe can, for example, be used as input for structure learning approaches.
# Import library
import bnlearn as bn
# Load large example with source-target edges from the book A Storm of Swords
vec = bn.import_example("stormofswords")
# Convert into sparse dataframe.
df = bn.vec2df(vec['source'], vec['target'], weights=vec['weight'])
# sparse matrix:
print(df.shape)
# (4324, 107)