Data Conversions

Modeling your data and getting it into the right shape may require various conversions, depending on your starting point and your goal. bnlearn implements several functions to convert from or to an adjacency matrix or a vector.

Available functionalities:

adjmat2dict : Convert adjacency matrix to dictionary.

adjmat2vec : Convert adjacency matrix into a vector with source and target.

vec2adjmat : Convert source-target edges with their weights into an adjacency matrix.

dag2adjmat : Convert model into adjacency matrix.

vec2df : Convert source-target edges into sparse dataframe.

Adjacency matrix

The adjacency matrix stores the relationships (edges) between source and target variables (nodes). In graph theory, it is a square matrix that represents a finite graph: each element indicates whether a pair of vertices is adjacent in the graph. Several bnlearn functions output an adjacency matrix. A value of 0 or False indicates that two nodes are not connected, whereas a value >0 or True indicates that they are connected.
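The rule for reading the values can be sketched with plain pandas. This is a minimal illustration, not bnlearn code: the node names A, B, C and the weights are hypothetical, and it assumes rows are sources and columns are targets.

```python
import pandas as pd

nodes = ["A", "B", "C"]  # hypothetical node names

# Hypothetical weighted adjacency matrix (rows = source, columns = target).
weighted = pd.DataFrame(
    [
        [0, 2, 0],  # A -> B with weight 2
        [0, 0, 1],  # B -> C with weight 1
        [0, 0, 0],  # C has no outgoing edges
    ],
    index=nodes,
    columns=nodes,
)

# Any value > 0 counts as an edge; 0 means "not connected".
connected = weighted > 0
n_edges = int(connected.values.sum())

print(connected)
print("number of edges:", n_edges)
```

The boolean frame carries the same connectivity as the weighted one; only the edge strengths are lost.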

Importing a DAG

Extracting adjacency matrix from imported DAG:

# Import library
import bnlearn as bn
# Import DAG
model = bn.import_DAG('sachs')
# Show the retrieved adjacency matrix for Sachs:
model['adjmat']

# print
print(model['adjmat'])

Reading the table from left to right, we see that Erk is connected to Akt in a directed manner. This indicates that Erk influences Akt but not the other way around, because the Akt row does not show an edge to Erk. In this abbreviated view, there may be additional connections in the columns hidden behind the “…”.

	Erk	Akt	PKA	Mek	Jnk	…	Raf	P38	PIP3	PIP2	Plcg
Erk	False	True	False	False	False	…	False	False	False	False	False
Akt	False	False	False	False	False	…	False	False	False	False	False
PKA	True	True	False	True	True	…	True	True	False	False	False
Mek	True	False	False	False	False	…	False	False	False	False	False
Jnk	False	False	False	False	False	…	False	False	False	False	False
PKC	False	False	True	True	True	…	True	True	False	False	False
Raf	False	False	False	True	False	…	False	False	False	False	False
P38	False	False	False	False	False	…	False	False	False	False	False
PIP3	False	False	False	False	False	…	False	False	False	True	False
PIP2	False	False	False	False	False	…	False	False	False	False	False
Plcg	False	False	False	False	False	…	False	False	True	True	False
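Checking the direction of an edge amounts to comparing the cell (source, target) with its mirror cell (target, source). The sketch below uses only the four nodes whose values are fully visible in the table above; the helper name direction is hypothetical and not part of bnlearn.

```python
import pandas as pd

nodes = ["Erk", "Akt", "PKA", "Mek"]

# Values copied from the visible part of the table (rows = source, columns = target).
sub = pd.DataFrame(
    [
        [False, True, False, False],   # Erk
        [False, False, False, False],  # Akt
        [True, True, False, True],     # PKA
        [True, False, False, False],   # Mek
    ],
    index=nodes,
    columns=nodes,
)


def direction(adjmat, a, b):
    """Return a short description of the edge(s) between nodes a and b."""
    forward = bool(adjmat.loc[a, b])   # a -> b
    backward = bool(adjmat.loc[b, a])  # b -> a
    if forward and backward:
        return f"{a} <-> {b}"
    if forward:
        return f"{a} -> {b}"
    if backward:
        return f"{b} -> {a}"
    return f"no edge between {a} and {b}"


print(direction(sub, "Erk", "Akt"))
print(direction(sub, "Erk", "Mek"))
print(direction(sub, "Akt", "Mek"))
```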

Vector

The vector stores relationships as source-target pairs of variables (nodes), together with their weights. An example is shown below, in which an edge is defined when its weight is True or a number >=1.

source	target	weight
Cloudy	Sprinkler	True
Cloudy	Rain	True
Sprinkler	Wet_Grass	True
Rain	Wet_Grass	True

adjmat2vec

Converting an adjacency matrix into a vector with bnlearn.bnlearn.adjmat2vec():

import bnlearn as bn
# Load DAG
DAG = bn.import_DAG('Sprinkler')
# Convert adjmat to vector:
vector = bn.adjmat2vec(DAG['adjmat'])

source	target	weight
Cloudy	Sprinkler	True
Cloudy	Rain	True
Sprinkler	Wet_Grass	True
Rain	Wet_Grass	True
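To make the conversion concrete, the same result can be reproduced with plain pandas. This is a rough sketch of the idea, not the implementation used by bnlearn: it builds the Sprinkler adjacency matrix by hand (rows as sources, columns as targets) and keeps one row per connected cell.

```python
import pandas as pd

nodes = ["Cloudy", "Sprinkler", "Rain", "Wet_Grass"]

# Hand-built boolean adjacency matrix for the Sprinkler example.
adjmat = pd.DataFrame(False, index=nodes, columns=nodes)
adjmat.loc["Cloudy", ["Sprinkler", "Rain"]] = True
adjmat.loc[["Sprinkler", "Rain"], "Wet_Grass"] = True

# Walk over every (source, target) cell and keep only the connected ones.
edges = [
    (s, t, bool(adjmat.loc[s, t]))
    for s in adjmat.index
    for t in adjmat.columns
    if adjmat.loc[s, t]
]
vector = pd.DataFrame(edges, columns=["source", "target", "weight"])

print(vector)
```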

vec2adjmat

Converting the created vector in the example above back into an adjacency matrix with bnlearn.bnlearn.vec2adjmat()

import bnlearn as bn
# Convert vector back to adjmat.
adjmat = bn.vec2adjmat(vector['source'], vector['target'], weights=vector['weight'])

source	Rain	Sprinkler	Wet_Grass
Rain	0	0	1
Sprinkler	0	0	1
Wet_Grass	0	0	0
Cloudy	1	1	0
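The reverse direction can also be sketched with plain pandas. Note that the output shown above has no Cloudy column, because Cloudy never appears as a target. The sketch below is an assumption-laden illustration rather than bnlearn's implementation: it takes the union of sources and targets so that the resulting matrix is square.

```python
import pandas as pd

# The vector from the Sprinkler example.
vector = pd.DataFrame(
    {
        "source": ["Cloudy", "Cloudy", "Sprinkler", "Rain"],
        "target": ["Sprinkler", "Rain", "Wet_Grass", "Wet_Grass"],
        "weight": [True, True, True, True],
    }
)

# Use every node that appears as source or target so the matrix is square.
nodes = sorted(set(vector["source"]) | set(vector["target"]))
adjmat = pd.DataFrame(0, index=nodes, columns=nodes)

# Fill in one cell per edge (rows = source, columns = target).
for s, t, w in zip(vector["source"], vector["target"], vector["weight"]):
    adjmat.loc[s, t] = int(w)

print(adjmat)
```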

adjmat2dict

Convert adjacency matrix to dictionary with bnlearn.bnlearn.adjmat2dict()

# Import library
import bnlearn as bn
# Load DAG
DAG = bn.import_DAG('Sprinkler')
# Convert adjmat to dictionary:
adjmat_dict = bn.adjmat2dict(DAG['adjmat'])
# print
print(adjmat_dict)

# {'Cloudy': ['Sprinkler', 'Rain'],
#  'Sprinkler': ['Wet_Grass'],
#  'Rain': ['Wet_Grass'],
#  'Wet_Grass': []}
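The dictionary maps every node to the list of nodes it points to. A minimal pandas sketch of that mapping, assuming rows are sources and columns are targets (this is not bnlearn's own code), could look like this:

```python
import pandas as pd

nodes = ["Cloudy", "Sprinkler", "Rain", "Wet_Grass"]

# Hand-built boolean adjacency matrix for the Sprinkler example.
adjmat = pd.DataFrame(False, index=nodes, columns=nodes)
adjmat.loc["Cloudy", ["Sprinkler", "Rain"]] = True
adjmat.loc[["Sprinkler", "Rain"], "Wet_Grass"] = True

# For each source (row), collect the targets (columns) that are connected.
adjmat_dict = {
    s: [t for t in adjmat.columns if adjmat.loc[s, t]]
    for s in adjmat.index
}

print(adjmat_dict)
```

Nodes without outgoing edges, such as Wet_Grass, map to an empty list.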

dag2adjmat

Convert model into adjacency matrix with bnlearn.bnlearn.dag2adjmat()

# Import library
import bnlearn as bn
# Load DAG
DAG = bn.import_DAG('Sprinkler')
# Extract edges from model and store in adjacency matrix
adjmat = bn.dag2adjmat(DAG['model'])

source	Rain	Sprinkler	Wet_Grass
Rain	0	0	1
Sprinkler	0	0	1
Wet_Grass	0	0	0
Cloudy	1	1	0

vec2df

Convert edges between source and target into a dataframe based on the weight with bnlearn.bnlearn.vec2df(). For demonstration purposes, a small example is created below, which shows that the weights determine the number of rows: an edge with a weight of 2 results in 2 rows for that edge.

# Import library
import bnlearn as bn
# Create source-target edges with its weights
source=['Cloudy','Cloudy','Sprinkler','Rain']
target=['Sprinkler','Rain','Wet_Grass','Wet_Grass']
weights=[1,2,1,3]
# Convert into sparse dataframe.
df = bn.vec2df(source, target, weights=weights)

	Cloudy	Rain	Sprinkler	Wet_Grass
0	1	0	1	0
1	1	1	0	0
2	1	1	0	0
3	0	0	1	1
4	0	1	0	1
5	0	1	0	1
6	0	1	0	1
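The relation between weights and rows can be reproduced with a few lines of pandas. This sketch is not the bnlearn implementation; it only mimics the behaviour described above by emitting one row per unit of weight, with a 1 in the columns of both endpoints of the edge.

```python
import pandas as pd

source = ["Cloudy", "Cloudy", "Sprinkler", "Rain"]
target = ["Sprinkler", "Rain", "Wet_Grass", "Wet_Grass"]
weights = [1, 2, 1, 3]

# One column per unique node name.
columns = sorted(set(source) | set(target))

rows = []
for s, t, w in zip(source, target, weights):
    # Mark both endpoints of the edge with 1 and every other node with 0.
    row = {c: int(c in (s, t)) for c in columns}
    # Repeat the row once per unit of weight.
    rows.extend([row] * w)

df = pd.DataFrame(rows, columns=columns)

print(df)
print(df.shape)
```

The number of rows equals the sum of the weights (1 + 2 + 1 + 3 = 7), which matches the table above.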

To demonstrate the full functionality, a larger example can be loaded containing 352 edges from the book A Storm of Swords. The result is a dataframe with 4324 rows and 107 columns, one column per unique name. This dataframe can, for example, serve as input for structure learning approaches.

# Import library
import bnlearn as bn
# Load large example with source-target edges from the book A Storm of Swords
vec = bn.import_example("stormofswords")
# Convert into sparse dataframe.
df = bn.vec2df(vec['source'], vec['target'], weights=vec['weight'])
# sparse matrix:
print(df.shape)
# (4324, 107)