ciflypy · CIfly

In this article, we discuss how CIfly can be used from Python with the ciflypy package. It is available on PyPI and can be installed with pip install ciflypy. At its core, it provides a single function reach that performs a CIfly algorithm for an input graph and input sets according to a specified rule table. As a simple example, consider the code below that tests amenability using the table for finding non-amenable nodes discussed in the previous article.

import ciflypy as cf

# it is assumed that X and Y are disjoint
def is_amenable(g, X, Y):
    sets = {"X": X}
    table_path = "./not_amenable_cpdag.txt"
    reached = cf.reach(g, sets, table_path)
    return not set(reached).intersection(Y)

g = {"-->": [(0, 1), (2, 1)], "---": [(2, 3)]}
X = [3]
Y = [1]
print(is_amenable(g, X, Y))

Here, cf.reach is called with a graph g (as a list of edges per edge type), a dictionary of sets (corresponding to the set names specified in the rule table) and a path to the text file containing the rule table (in this case assuming not_amenable_cpdag.txt lies in the current working directory; for better options, see below). The reach function returns a list of nodes, the output of the CIfly algorithm specified by the rule table. In the example above, these are all non-amenable nodes with respect to $X$ which are used to test amenability of a pair $X$ and $Y$ . We now discuss how the arguments should be passed to reach.

Graph

The graph is represented by a dictionary with each key describing an edge type, which maps to a list or set containing tuples of non-negative integers — the edges of this type. For example, the graph which has edges $0 \rightarrow 1 \leftarrow 2 - 3$ would be represented as {"-->": [(0, 1), (2, 1)], "---": [(2, 3)]}. This is equivalent to {"-->": [(0,1)], "<--": [(1, 2)], "---": [(2,3)]} provided the rule table contains the edge description edges: --> <--, --- specifying --> and <-- to be reciprocal.

For the node IDs, we strongly recommend to use the numbers $0, 1, \dots, p-1$ for a graph with $p$ nodes. Generally, any non-negative integers can be used, however, the memory requirements of ciflypy are linear in the largest node ID because of the internal representation of graphs and sets. Hence, unnecessarily large node numbers lead to slower run times and potential memory problems. Non-integer node names have to be stored outside of the ciflypy graph format, e.g., in a list indexed by the node IDs.

Sets

The sets are represented by a dictionary as well with keys mapping the name of a node set specified in the rule table to its instantiation as a Python list or set of non-negative integers (a single integer can also be used). As an example, {"X": [0, 1], "Z": [3], "W": [2, 4]} would represent the three sets $X = \{0, 1\}$ , $Z = \{3\}$ and $W = \{2, 4\}$ . The same sets could also be stated as {"X": {0, 1}, "Z": 3, "W": [2, 4]}. The names of the sets must match the names specified in the rule table.

Rule table

In its basic version, the rule table is passed to reach by its file path as string. If used outside of simple scripts, make sure that the rule-table text file is included within the package and that its path is specified relative to the source file and not the current working directory. Better code compared to the example above would be to use pathlib to achieve this.

import ciflypy as cf
from pathlib import Path

table_name = "not_amenable_cpdag.txt"
# path to 'not_amenable_cpdag.txt' in the same directory as the source file
table_path = str(Path(__file__).parent.resolve() / table_name)
# ...
# pass table_path to cf.reach as third argument
cf.reach(g, sets, table_path)

If you do not want rule-table text files in your project, it is also possible to directly embed tables into your Python code as multi-line strings.

not_amenable_table = """
EDGES --> <--, ---
SETS X
COLORS init, yield
START ... [init] AT X
OUTPUT ... [yield]

... [init]  | ---      [yield] | next not in X
... [yield] | ---, --> [yield] | next not in X
"""
# ...
# pass table as string and set keyword argument table_as_string accordingly
cf.reach(g, sets, not_amenable_table, table_as_string=True)

For this, the keyword argument table_as_string has to be set to True.

Output

reach returns a list of non-negative integers, representing the nodes that are reached in an output state (as specified in the rule table) when running the reachablity algorithm on the given graph with the provided sets.

Logging

To help with debugging, it is possible to pass the keyword argument verbose=True to reach. This logs the steps the reachability algorithm takes during its execution.

Advanced Usage

In every reach call, the implementation needs to parse the graph, sets and rule table. This can introduce unnecessary overhead in the case that the same graph, sets or rule table are used repeatedly. To address this, ciflypy offers the option to parse these objects outside of the reach function with the constructors for the classes Graph, Sets and Ruletable.

The Ruletable constructor can be called as Ruletable(table_path) or Ruletable(table_string, table_as_string=True). The first argument of the Graph and Sets constructors are also in the same encoding as the respective reach arguments. However, it is necessary to pass a rule table as second argument (this can be done via file path, as a multi-line string with table_as_string set to True or by passing a Ruletable object). Afterwards, these objects must be used in combination with rule tables that have the same EDGES, respectively SETS specification as the passed rule table. All three objects are internal CIfly representations and can only be passed as their respective arguments into reach. We offer no further methods on these classes.

As an example, consider the code below that computes all amenable nodes for every possible $x$ . Here, the same rule table and the same graph are used repeatedly. Hence, parsing them once outside of reach avoids unnecessary overhead. We also parse sets outside of reach. As this is done once per reach call it, however, offers no performance advantage in this setting.

import ciflypy as cf

table = cf.Ruletable("./not_amenable_cpdag.txt")

# p stores the number of nodes
p = 4
g = cf.Graph({"-->": [(0, 1), (2, 1)], "---": [(2, 3)]}, table)

for x in range(p):
    sets = cf.Sets({"X": [x]}, table)
    not_amenable = cf.reach(g, sets, table)
    not_amenable.append(x)
    print("amenable for", x, set(range(p)).difference(not_amenable))

Conversion to the ciflypy graph format

The ciflypy graph format is based on a dictionary of edge lists for different edge types. Graphs can usually be converted to this format with little effort. For example, consider a directed graph in networkx:

import networkx as nx

# construct graph
g = nx.DiGraph()
g.add_edge(1, 2)
g.add_edge(2, 3)

# convert to ciflypy format
cifly_graph = {"-->": list(g.edges)}
print(cifly_graph)

This assumes that the directed edges are referred to by the string "-->" in the rule tables.

NumPy adjacency matrices can be converted just as easily. We use the same example graph and offer a function that works for both possible matrix orientations, i --> j if g[(i, j)] = 1 or if g[(j, i)], as specified by the edge_direction argument.

import numpy as np

# function for conversion to ciflypy graph format
def adjacencymatrix2cifly(g, edge_direction):
    edges = list(zip(*np.nonzero(g)))
    if edge_direction == "from row to column":
        return {"-->": edges}
    elif edge_direction == "from column to row":
        return {"<--": edges}
    raise ValueError("Invalid edge_direction")

# construct graph
g = np.zeros((3, 3), dtype=int)
g[(0, 1)] = 1
g[(1, 2)] = 1

# convert to ciflypy format using the option "from row to column"
# thus, the matrix orientation is specified as i --> j if g[(i, j)] = 1
# conversely, "from column to row" would imply i --> j if g[(j, i)] = 1
cifly_graph = adjacencymatrix2cifly(g, "from row to column")
print(cifly_graph)