Data Conversions

Modeling your data and getting it into the right shape may require various conversions, depending on your starting point and your goal. bnlearn provides several functions to convert to and from adjacency matrices and vectors.

Available functionalities:

  • adjmat2dict : Convert adjacency matrix to dictionary.

  • adjmat2vec : Convert adjacency matrix into a vector with source and target.

  • vec2adjmat : Convert source-target edges with their weights into an adjacency matrix.

  • dag2adjmat : Convert model into adjacency matrix.

  • vec2df : Convert source-target edges into sparse dataframe.

Adjacency matrix

The adjacency matrix stores the relationships (edges) between source and target variables (nodes). In graph theory, a square matrix is used to represent a finite graph: the elements of the matrix indicate whether pairs of vertices are adjacent in the graph. Several bnlearn functions output an adjacency matrix. Values of 0 or False indicate that two nodes are not connected, whereas values >0 or True indicate that they are connected.
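
The connectivity rule can be sketched in plain Python. This is a minimal illustration and not bnlearn's implementation; the helper is_connected and the toy matrix are hypothetical names introduced here, and rows are assumed to be sources with columns as targets.

```python
# Hypothetical helper: a cell counts as an edge when it is True or a number > 0.
def is_connected(value):
    return value > 0

# Toy square matrix (illustration only): rows are sources, columns are targets.
nodes = ['A', 'B', 'C']
adjmat = [
    [0, 2, False],   # A -> B (weight 2)
    [0, 0, True],    # B -> C
    [0, 0, 0],       # C has no outgoing edges
]

# Collect every (source, target) pair whose cell passes the rule.
edges = [
    (nodes[i], nodes[j])
    for i, row in enumerate(adjmat)
    for j, value in enumerate(row)
    if is_connected(value)
]
print(edges)  # [('A', 'B'), ('B', 'C')]
```

Booleans and numbers can share one comparison because Python treats True as 1 and False as 0.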

Importing a DAG

Extracting adjacency matrix from imported DAG:

# Import library
import bnlearn as bn
# Import DAG
model = bn.import_DAG('sachs')
# Show the retrieved adjacency matrix for Sachs:
model['adjmat']

# print
print(model['adjmat'])

Reading the table from left to right, we see that gene Erk is connected to Akt in a directed manner. This indicates that Erk influences gene Akt but not the other way around, because gene Akt does not show an edge to Erk. The table is shown in abbreviated form (the rows include PKC, but its column is not shown), so there may be additional connections at the omitted “…” positions.

      Erk    Akt    PKA    Mek    Jnk    Raf    P38    PIP3   PIP2   Plcg
Erk   False  True   False  False  False  False  False  False  False  False
Akt   False  False  False  False  False  False  False  False  False  False
PKA   True   True   False  True   True   True   True   False  False  False
Mek   True   False  False  False  False  False  False  False  False  False
Jnk   False  False  False  False  False  False  False  False  False  False
PKC   False  False  True   True   True   True   True   False  False  False
Raf   False  False  False  True   False  False  False  False  False  False
P38   False  False  False  False  False  False  False  False  False  False
PIP3  False  False  False  False  False  False  False  False  True   False
PIP2  False  False  False  False  False  False  False  False  False  False
Plcg  False  False  False  False  False  False  False  True   True   False
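
As a sketch of how direction is read from the matrix, the excerpt below copies the Erk, Akt, PKA, and Mek rows and columns from the table above into a nested dictionary. The helper names children and parents are hypothetical and not part of bnlearn, and the results only cover the four nodes in the excerpt.

```python
# Excerpt of the Sachs adjacency matrix: outer key = source, inner key = target.
nodes = ['Erk', 'Akt', 'PKA', 'Mek']
excerpt = {
    'Erk': {'Erk': False, 'Akt': True,  'PKA': False, 'Mek': False},
    'Akt': {'Erk': False, 'Akt': False, 'PKA': False, 'Mek': False},
    'PKA': {'Erk': True,  'Akt': True,  'PKA': False, 'Mek': True},
    'Mek': {'Erk': True,  'Akt': False, 'PKA': False, 'Mek': False},
}

def children(node):
    """Targets reached from `node`: read along its row (hypothetical helper)."""
    return [t for t in nodes if excerpt[node][t]]

def parents(node):
    """Sources pointing to `node`: read down its column (hypothetical helper)."""
    return [s for s in nodes if excerpt[s][node]]

print(children('Erk'))  # ['Akt']
print(parents('Erk'))   # ['PKA', 'Mek']
print(children('Akt'))  # []
```

Erk has Akt as a child while Akt has no children, which is the asymmetry described above.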

Vector

The vector stores relationships as source-target pairs of variables (nodes), together with their weights. An example is shown below, in which an edge is defined when its weight is True or a number >=1.

source     target     weight
Cloudy     Sprinkler  True
Cloudy     Rain       True
Sprinkler  Wet_Grass  True
Rain       Wet_Grass  True

adjmat2vec

Convert an adjacency matrix into a vector with bnlearn.bnlearn.adjmat2vec():

import bnlearn as bn
# Load DAG
DAG = bn.import_DAG('Sprinkler')
# Convert adjmat to vector:
vector = bn.adjmat2vec(DAG['adjmat'])

source     target     weight
Cloudy     Sprinkler  True
Cloudy     Rain       True
Sprinkler  Wet_Grass  True
Rain       Wet_Grass  True
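
Conceptually, this conversion keeps one record per connected cell. The plain-Python sketch below illustrates the idea on a hand-written copy of the Sprinkler matrix; it is an assumption about the general approach, not bnlearn's actual implementation.

```python
# Sprinkler adjacency matrix as a nested dict: outer key = source, inner key = target.
adjmat = {
    'Cloudy':    {'Cloudy': False, 'Sprinkler': True,  'Rain': True,  'Wet_Grass': False},
    'Sprinkler': {'Cloudy': False, 'Sprinkler': False, 'Rain': False, 'Wet_Grass': True},
    'Rain':      {'Cloudy': False, 'Sprinkler': False, 'Rain': False, 'Wet_Grass': True},
    'Wet_Grass': {'Cloudy': False, 'Sprinkler': False, 'Rain': False, 'Wet_Grass': False},
}

# Keep one (source, target, weight) record for every connected cell.
vector = [
    (source, target, weight)
    for source, row in adjmat.items()
    for target, weight in row.items()
    if weight
]
for record in vector:
    print(record)
```

The four printed records correspond to the four rows of the vector shown above.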

vec2adjmat

Convert the vector created in the example above back into an adjacency matrix with bnlearn.bnlearn.vec2adjmat():

import bnlearn as bn
# Convert vector back to adjmat.
adjmat = bn.vec2adjmat(vector['source'], vector['target'], weights=vector['weight'])

source     Rain  Sprinkler  Wet_Grass  Cloudy
Rain       0     0          1          0
Sprinkler  0     0          1          0
Wet_Grass  0     0          0          0
Cloudy     1     1          0          0
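
The reverse direction can be sketched in the same way: start from an all-zero square matrix over every node that occurs in the vector, then fill one cell per edge. This is an illustrative sketch, not bnlearn's code; the node order here is alphabetical and may differ from the order bnlearn returns.

```python
# The Sprinkler vector, written out as three parallel lists.
source = ['Cloudy', 'Cloudy', 'Sprinkler', 'Rain']
target = ['Sprinkler', 'Rain', 'Wet_Grass', 'Wet_Grass']
weight = [True, True, True, True]

# Every node that appears as a source or a target (sorted for a stable order).
nodes = sorted(set(source) | set(target))

# All-zero square matrix: outer key = source, inner key = target.
adjmat = {s: {t: 0 for t in nodes} for s in nodes}

# Fill in one cell per edge, storing the weight as an integer.
for s, t, w in zip(source, target, weight):
    adjmat[s][t] = int(w)

print(adjmat['Cloudy'])
# {'Cloudy': 0, 'Rain': 1, 'Sprinkler': 1, 'Wet_Grass': 0}
```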

adjmat2dict

Convert adjacency matrix to dictionary with bnlearn.bnlearn.adjmat2dict()

# Import library
import bnlearn as bn
# Load DAG
DAG = bn.import_DAG('Sprinkler')
# Convert adjmat to dictionary:
adjmat_dict = bn.adjmat2dict(DAG['adjmat'])
# print
print(adjmat_dict)

# {'Cloudy': ['Sprinkler', 'Rain'],
#  'Sprinkler': ['Wet_Grass'],
#  'Rain': ['Wet_Grass'],
#  'Wet_Grass': []}
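
The dictionary form maps each node to the list of nodes it points to. A minimal plain-Python sketch of that mapping, using a hand-written copy of the Sprinkler matrix (an illustration of the idea, not bnlearn's implementation):

```python
# Sprinkler adjacency matrix as a nested dict: outer key = source, inner key = target.
adjmat = {
    'Cloudy':    {'Cloudy': False, 'Sprinkler': True,  'Rain': True,  'Wet_Grass': False},
    'Sprinkler': {'Cloudy': False, 'Sprinkler': False, 'Rain': False, 'Wet_Grass': True},
    'Rain':      {'Cloudy': False, 'Sprinkler': False, 'Rain': False, 'Wet_Grass': True},
    'Wet_Grass': {'Cloudy': False, 'Sprinkler': False, 'Rain': False, 'Wet_Grass': False},
}

# For each source, list the targets whose cell is connected.
adjmat_dict = {
    source: [target for target, connected in row.items() if connected]
    for source, row in adjmat.items()
}
print(adjmat_dict)
# {'Cloudy': ['Sprinkler', 'Rain'], 'Sprinkler': ['Wet_Grass'], 'Rain': ['Wet_Grass'], 'Wet_Grass': []}
```

Nodes without outgoing edges, such as Wet_Grass, keep an empty list so that every node remains a key.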

dag2adjmat

Convert model into adjacency matrix with bnlearn.bnlearn.dag2adjmat()

# Import library
import bnlearn as bn
# Load DAG
DAG = bn.import_DAG('Sprinkler')
# Extract edges from model and store in adjacency matrix
adjmat = bn.dag2adjmat(DAG['model'])

source     Rain  Sprinkler  Wet_Grass  Cloudy
Rain       0     0          1          0
Sprinkler  0     0          1          0
Wet_Grass  0     0          0          0
Cloudy     1     1          0          0

vec2df

Convert edges between source and target into a dataframe based on the weight with bnlearn.bnlearn.vec2df(). For demonstration purposes, a small example is created below. It shows that the weight determines the number of rows: a weight of 2 means that the row for that edge is created 2 times.

# Import library
import bnlearn as bn
# Create source-target edges with its weights
source=['Cloudy','Cloudy','Sprinkler','Rain']
target=['Sprinkler','Rain','Wet_Grass','Wet_Grass']
weights=[1,2,1,3]
# Convert into sparse dataframe.
df = bn.vec2df(source, target, weights=weights)

   Cloudy  Rain  Sprinkler  Wet_Grass
0  1       0     1          0
1  1       1     0          0
2  1       1     0          0
3  0       0     1          1
4  0       1     0          1
5  0       1     0          1
6  0       1     0          1
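
The relation between weights and rows can be sketched without bnlearn: every edge is repeated as many times as its weight, and each row marks the source and target columns with 1. This is a simplified illustration of the behavior described above, not the library's implementation, and it returns a list of dictionaries rather than a dataframe.

```python
source = ['Cloudy', 'Cloudy', 'Sprinkler', 'Rain']
target = ['Sprinkler', 'Rain', 'Wet_Grass', 'Wet_Grass']
weights = [1, 2, 1, 3]

# One column per node that appears as a source or a target.
columns = sorted(set(source) | set(target))

rows = []
for s, t, w in zip(source, target, weights):
    # Repeat the edge `w` times; each row marks the source and target with 1.
    for _ in range(w):
        rows.append({c: int(c in (s, t)) for c in columns})

print(len(rows))  # 7, the sum of the weights
print(rows[0])    # {'Cloudy': 1, 'Rain': 0, 'Sprinkler': 1, 'Wet_Grass': 0}
```

The number of rows equals the sum of the weights (1 + 2 + 1 + 3 = 7), which matches the seven rows in the table above.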

To demonstrate the full functionality, a larger example can be loaded that contains 352 edges from the book A Storm of Swords. The result is a dataframe with 107 unique names (columns) and 4324 rows, one for each repetition of an edge according to its weight. This dataframe can, for example, serve as input for structure learning approaches.

# Import library
import bnlearn as bn
# Load large example with source-target edges from the book A Storm of Swords
vec = bn.import_example("stormofswords")
# Convert into sparse dataframe.
df = bn.vec2df(vec['source'], vec['target'], weights=vec['weight'])
# sparse matrix:
print(df.shape)
# (4324, 107)