The ciflypy package
In this article, we discuss how CIfly can be used from Python with the ciflypy
package. It is available on PyPI and can be installed with pip install ciflypy
.
At its core, it provides a single function reach
that performs a CIfly algorithm for an input graph and input sets according to a specified rule table. As a simple example, consider the code below that tests amenability using the table for finding non-amenable nodes discussed in the previous article.
import ciflypy as cf
# it is assumed that X and Y are disjoint
def is_amenable(g, X, Y):
sets = {"X": X}
table_path = "./not_amenable_cpdag.txt"
reached = cf.reach(g, sets, table_path)
return not set(reached).intersection(Y)
g = {"-->": [(0, 1), (2, 1)], "---": [(2, 3)]}
X = [3]
Y = [1]
print(is_amenable(g, X, Y))
Here, cf.reach
is called with a graph g
(as a list of edges per edge type), a dictionary of sets (corresponding to the set names specified in the rule table) and a path to the text file containing the rule table (in this case assuming not_amenable_cpdag.txt
lies in the current working directory; for better options, see below). The reach
function returns a list of nodes, the output of the CIfly algorithm specified by the rule table. In the example above, these are all non-amenable nodes with respect to which are used to test amenability of a pair and . We now discuss how the arguments should be passed to reach
.
Graph
The graph is represented by a dictionary with each key describing an edge type, which maps to a list or set containing tuples of non-negative integers — the edges of this type. For example, the graph which has edges would be represented as {"-->": [(0, 1), (2, 1)], "---": [(2, 3)]}
. This is equivalent to {"-->": [(0,1)], "<--": [(1, 2)], "---": [(2,3)]}
provided the rule table contains the edge description edges: --> <--, ---
specifying -->
and <--
to be reciprocal.
For the node IDs, we strongly recommend to use the numbers for a graph with nodes. Generally, any non-negative integers can be used, however, the memory requirements of ciflypy
are linear in the largest node ID because of the internal representation of graphs and sets. Hence, unnecessarily large node numbers lead to slower run times and potential memory problems. Non-integer node names have to be stored outside of the ciflypy
graph format, e.g., in a list indexed by the node IDs.
Sets
The sets are represented by a dictionary as well with keys mapping the name of a node set specified in the rule table to its instantiation as a Python list or set of non-negative integers (a single integer can also be used). As an example, {"X": [0, 1], "Z": [3], "W": [2, 4]}
would represent the three sets , and . The same sets could also be stated as {"X": {0, 1}, "Z": 3, "W": [2, 4]}
. The names of the sets must match the names specified in the rule table.
Rule table
In its basic version, the rule table is passed to reach
by its file path as string. If used outside of simple scripts, make sure that the rule-table text file is included within the package and that its path is specified relative to the source file and not the current working directory. Better code compared to the example above would be to use pathlib
to achieve this.
import ciflypy as cf
from pathlib import Path
table_name = "not_amenable_cpdag.txt"
# path to 'not_amenable_cpdag.txt' in the same directory as the source file
table_path = str(Path(__file__).parent.resolve() / table_name)
# ...
# pass table_path to cf.reach as third argument
cf.reach(g, sets, table_path)
If you do not want rule-table text files in your project, it is also possible to directly embed tables into your Python code as multi-line strings.
not_amenable_table = """
EDGES --> <--, ---
SETS X
COLORS init, yield
START ... [init] AT X
OUTPUT ... [yield]
... [init] | --- [yield] | next not in X
... [yield] | ---, --> [yield] | next not in X
"""
# ...
# pass table as string and set keyword argument table_as_string accordingly
cf.reach(g, sets, not_amenable_table, table_as_string=True)
For this, the keyword argument table_as_string
has to be set to True
.
Output
reach
returns a list of non-negative integers, representing the nodes that are reached in an output state (as specified in the rule table) when running the reachablity algorithm on the given graph with the provided sets.
Logging
To help with debugging, it is possible to pass the keyword argument verbose=True
to reach
. This logs the steps the reachability algorithm takes during its execution.
Advanced Usage
In every reach
call, the implementation needs to parse the graph, sets and rule table. This can introduce unnecessary overhead in the case that the same graph, sets or rule table are used repeatedly. To address this, ciflypy
offers the option to parse these objects outside of the reach
function with the constructors for the classes Graph
, Sets
and Ruletable
.
The Ruletable
constructor can be called as Ruletable(table_path)
or Ruletable(table_string, table_as_string=True)
. The first argument of the Graph
and Sets
constructors are also in the same encoding as the respective reach
arguments. However, it is necessary to pass a rule table as second argument (this can be done via file path, as a multi-line string with table_as_string
set to True
or by passing a Ruletable
object). Afterwards, these objects must be used in combination with rule tables that have the same EDGES
, respectively SETS
specification as the passed rule table. All three objects are internal CIfly representations and can only be passed as their respective arguments into reach
. We offer no further methods on these classes.
As an example, consider the code below that computes all amenable nodes for every possible . Here, the same rule table and the same graph are used repeatedly. Hence, parsing them once outside of reach
avoids unnecessary overhead. We also parse sets
outside of reach
. As this is done once per reach
call it, however, offers no performance advantage in this setting.
import ciflypy as cf
table = cf.Ruletable("./not_amenable_cpdag.txt")
# p stores the number of nodes
p = 4
g = cf.Graph({"-->": [(0, 1), (2, 1)], "---": [(2, 3)]}, table)
for x in range(p):
sets = cf.Sets({"X": [x]}, table)
not_amenable = cf.reach(g, sets, table)
not_amenable.append(x)
print("amenable for", x, set(range(p)).difference(not_amenable))
Conversion to the ciflypy graph format
The ciflypy
graph format is based on a dictionary of edge lists for different edge types. Graphs can usually be converted to this format with little effort. For example, consider a directed graph in networkx
:
import networkx as nx
# construct graph
g = nx.DiGraph()
g.add_edge(1, 2)
g.add_edge(2, 3)
# convert to ciflypy format
cifly_graph = {"-->": list(g.edges)}
print(cifly_graph)
This assumes that the directed edges are referred to by the string "-->"
in the rule tables.
NumPy adjacency matrices can be converted just as easily. We use the same example graph and offer a function that works for both possible matrix orientations, i --> j
if g[(i, j)] = 1
or if g[(j, i)]
, as specified by the edge_direction
argument.
import numpy as np
# function for conversion to ciflypy graph format
def adjacencymatrix2cifly(g, edge_direction):
edges = list(zip(*np.nonzero(g)))
if edge_direction == "from row to column":
return {"-->": edges}
elif edge_direction == "from column to row":
return {"<--": edges}
raise ValueError("Invalid edge_direction")
# construct graph
g = np.zeros((3, 3), dtype=int)
g[(0, 1)] = 1
g[(1, 2)] = 1
# convert to ciflypy format using the option "from row to column"
# thus, the matrix orientation is specified as i --> j if g[(i, j)] = 1
# conversely, "from column to row" would imply i --> j if g[(j, i)] = 1
cifly_graph = adjacencymatrix2cifly(g, "from row to column")
print(cifly_graph)