Finding minimal d-separators

In this article, we implement the algorithm developed by van der Zander, Liśkiewicz (2020) for finding a minimal d-separator using CIfly. The concept of d-separation lies at the center of graphical causality as a bridge between graphs and conditional independencies. We have already seen an simple CIfly algorithm for testing d-separation on the home page. However, now we consider the task of finding d-separators and, more precisely, minimal ones. A d-separator ZZ for sets of nodes XX and YY is minimal if no subset of ZZ d-separates XX and YY.

For our implementation, we follow the algorithmic techniques presented in Section 5 in (van der Zander, Liśkiewicz 2020) which apply to quite general classes of causal graph including CPDAGs and MAGs, both popular in the context of causal structure learning. However, here we only focus on acyclic directed mixed graphs (ADMGs). At the core of the algorithm lies the notion of a closure of a set of nodes in a causal graph. It is used to find a nearest d-separator that, in turn, can be used to find a minimal d-separator. We also use nearest d-separators later when implementing a sound-and-complete algorithm for finding conditional instrumental sets. Let us now discuss these concepts in more detail.

Preliminaries

Acyclic directed mixed graphs, or ADMGs for short, are graphs that include directed as well as bidirected edges and do not contain a directed cycle. A node uu is an ancestor of a node vv if there exists a directed path from uu to vv. The set of all ancestors of vv in a graph GG is denoted by anG(v)\text{an}_{G}(v). This can be generalized to sets of nodes ZZ, for which we define anG(Z)\text{an}_G(Z) as the union of the ancestors of the individual nodes, that is zZanG(z)\bigcup_{z \in Z} \text{an}_G(z). Moreover, a collider on a path is a node where the incident edges point towards it (for example v\rightarrow v \leftrightarrow) and a non-collider is a node for which this does not hold.

Closure

Based on these notions, the closure of a set of nodes can be defined as follows.

Definition [van der Zander, Liśkiewicz (2020)]
Let GG be an ADMG, XX and ZZ set of nodes, and A=anG(XY)A = \text{an}_G(X \cup Y). Then, the closure of XX with respect to ZZ, written closureG(X,Z)\text{closure}_G(X, Z), is defined as the set of all nodes vv for which there exists a path from XX to vv in GG that only contains nodes in AA and no non-collider in ZZ.

The closure can be directly encoded into a CIfly rule table that tracks paths over nodes in AA and ensures that non-colliders, here in the bottom rule, are not in ZZ.

closure_admg.txt
EDGES --> <--, <->
SETS X, Z, A
START <-- AT X
OUTPUT ...

-->, <-> | <--, <-> | next in A
...      | ...      | next in A and current not in Z

Nearest d-separators

As van der Zander, Liśkiewicz (2020) show, nearest separators can be used to compute minimal ones. We will follow this strategy, in particular because we will re-use the concept of nearest separators in a later article. Roughly speaking, a nearest separator relative to XX and YY d-separates these sets using nodes closer to XX than YY whenever possible. For a more precise definition, we refer to van der Zander, Liśkiewicz (2020).

The strategy for computing the nearest separator works as follows:

  1. determine a potential d-separator for XX and YY, namely Z0=A(XY)Z_0 = A \setminus (X \cup Y),
  2. compute XX^*, the closure of XX with respect to Z0Z_0, and
  3. if XX^* contains nodes in YY, then return \bot as there exists no d-separator, else return XZ0X^* \cap Z_0.

The idea behind this strategy is that the closure contains the nodes closest to XX that are needed to d-separate XX and YY. This is essentially the case because the search stops at non-colliders in ZZ. Due to the definition of d-separation non-colliders not in ZZ are not blocked and, because the search runs over ancestors of XX and YY, it also continues for colliders (no matter whether in ZZ or not). For the formal details, we refer to van der Zander, Liśkiewicz (2020).

This algorithm can be implemented as follows using CIfly and the rule table for computing the closure from above. Here, as in the original work, we have added II and RR as arguments which ensure that only separators ZZ which satisfy IZRI \subseteq Z \subseteq R are returned. This gives the user the option to further constrain the desired d-separator and is used below.

nearest_dsep.py
import ciflypy as cf
import ciflypy_examples.utils as utils

ruletables = utils.get_ruletable_path()
ancestors_table = cf.Ruletable(ruletables / "ancestors_admg.txt")
closure_table = cf.Ruletable(ruletables / "closure_admg.txt")

def find_nearest_separator(g, X, Y, I, R):
    A = cf.reach(g, {"X": X + Y + I}, ancestors_table)
    Z0 = set(R).intersection(set(A) - set(X + Y))
    Xstar = cf.reach(g, {"X": X, "Z": Z0, "A": A}, closure_table)
    if set(Xstar).intersection(Y):
        return None
    return list(Z0.intersection(Xstar).union(I))

Minimal d-separators

Using the algorithm for finding nearest separators as a primitive, it is possible to compute a minimal d-separator with the following strategy:

  1. Compute ZXZ_X as a nearest separator between XX and YY (or return \bot if none exists),
  2. compute ZYZ_Y as a nearest d-separator between YY and XX with R=ZXR = Z_X (or return \bot if none exists) and
  3. return ZXZYZ_X \cap Z_Y.

Here ZXZ_X is a nearest separator between XX and YY that is used to constrain the nearest separator for YY in the second step. This strategy can be directly translated to Python and R code. Again, we allow for arguments II and RR constraining the minimal d-separator ZZ to IZRI \subseteq Z \subseteq R. For more details on intuition and the correctness of this procedure, we again refer to van der Zander, Liśkiewicz (2020).

min_dsep.py
from ciflypy_examples.nearest_dsep import find_nearest_separator

def find_min_separator(g, X, Y, I, R):
    Zx = find_nearest_separator(g, X, Y, I, R)
    if not Zx:
        return None
    Zy = find_nearest_separator(g, Y, X, I, Zx)
    if not Zy:
        return None
    return set(Zx).intersection(Zy).union(I)

References