While the backdoor and frontdoor adjustment criteria provide elegant solutions for identifying causal effects P(Y∣do(X=x)) in many common scenarios described by Directed Acyclic Graphs (DAGs), they represent only a subset of the identifiable cases. Real-world systems often present complexities like unobserved confounding or intricate causal pathways that these standard criteria cannot directly handle. Fortunately, the foundational framework of Structural Causal Models (SCMs) and do-calculus offers a more comprehensive toolkit for tackling these tougher identification problems.
Recall the three rules of do-calculus, which allow us to manipulate probability distributions involving interventions (do(⋅) operators) using only observational quantities, provided certain graphical conditions are met.
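For reference, here are the three rules as Pearl states them. X, Y, Z, and W denote disjoint sets of variables; an overline on a set marks the graph with all edges into that set removed, and an underline marks the graph with all edges out of that set removed.

\[
\begin{aligned}
&\text{Rule 1 (insertion/deletion of observations):} && P(y \mid do(x), z, w) = P(y \mid do(x), w) \ \text{ if } (Y \perp\!\!\!\perp Z \mid X, W)_{G_{\overline{X}}} \\
&\text{Rule 2 (action/observation exchange):} && P(y \mid do(x), do(z), w) = P(y \mid do(x), z, w) \ \text{ if } (Y \perp\!\!\!\perp Z \mid X, W)_{G_{\overline{X}\,\underline{Z}}} \\
&\text{Rule 3 (insertion/deletion of actions):} && P(y \mid do(x), do(z), w) = P(y \mid do(x), w) \ \text{ if } (Y \perp\!\!\!\perp Z \mid X, W)_{G_{\overline{X}\,\overline{Z(W)}}}
\end{aligned}
\]

Here Z(W) is the set of nodes in Z that are not ancestors of any node in W in the graph with edges into X removed.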
The significance of do-calculus extends beyond providing sufficient conditions: the three rules are complete for nonparametric identification in DAGs. If a causal effect P(y∣do(x)) can be expressed in terms of the observational distribution P(V) implied by an SCM with graph G, then some sequence of rule applications will derive that expression. Conversely, if no sequence of applications can eliminate the do-operator, the effect is not identifiable from observational data alone without further assumptions.
Consider a scenario slightly more complex than typical backdoor or frontdoor cases:
An unobserved confounder U affects both X and Y. Z1 mediates the effect of X on Y, while Z2 is an observed confounder of X and Y. The standard backdoor criterion fails because the backdoor path through U cannot be blocked with observed variables. The frontdoor criterion fails because Z1 alone does not account for all of the dependence between X and Y: the Z2→Y path remains.
Here, identifying P(Y∣do(X=x)) requires a careful application of do-calculus rules, often involving conditioning and marginalizing over observed variables like Z1 and Z2 in sequence. For instance, one might try to first adjust for Z2 and then use properties related to the mediator Z1.
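To make the backdoor failure concrete, the short sketch below encodes this graph with networkx and checks, for every subset of the observed variables, whether it blocks all backdoor paths from X to Y (d-separation after deleting the edges leaving X). The node names simply mirror the description above; note that the d-separation helper is named is_d_separator in recent networkx releases and d_separated in older ones.

```python
import networkx as nx
from itertools import chain, combinations

# DAG from the example: U is unobserved; Z1, Z2, X, Y are measured.
G = nx.DiGraph([
    ("U", "X"), ("U", "Y"),      # unobserved confounding
    ("Z2", "X"), ("Z2", "Y"),    # observed confounding
    ("X", "Z1"), ("Z1", "Y"),    # mediated causal path
])

# Backdoor-path check: delete edges out of X, then test whether a candidate
# adjustment set d-separates X from Y in the mutilated graph.
G_bd = G.copy()
G_bd.remove_edges_from(list(G.out_edges("X")))

observed = ["Z1", "Z2"]
candidates = chain.from_iterable(combinations(observed, r) for r in range(len(observed) + 1))
for S in candidates:
    blocked = nx.is_d_separator(G_bd, {"X"}, {"Y"}, set(S))
    print(f"adjusting for {sorted(S)}: backdoor paths blocked = {blocked}")

# Every candidate fails: the path X <- U -> Y stays open unless we could
# condition on the unobserved U itself.
```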
A major challenge arises when unobserved variables confound the relationship between treatment X and outcome Y. While do-calculus signals non-identifiability in such cases from observational data alone, specific structures or auxiliary variables can sometimes rescue identification.
The classic approach is Instrumental Variables (IV). An instrument Z is a variable that affects X, affects Y only through its effect on X, and shares no common causes with Y; in particular, Z must be independent of the unobserved confounder U.
The canonical IV graph structure. Z allows identification of the effect of X on Y despite the unobserved confounder U.
Under certain assumptions (often linearity for simple estimation), the IV estimand allows recovering the causal effect. For example, in a linear, homogeneous effect setting:
\[
\beta_{IV} = \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, X)}
\]

This identifies the average causal effect of X on Y. Finding valid instruments in practice is notoriously difficult. They must satisfy the relevance condition (Z causes X) and the exclusion restriction (Z affects Y only via X and is independent of U). We explore advanced IV methods in Chapter 4.
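As a quick numerical illustration of this estimand, the following sketch simulates a small linear SCM with an unobserved confounder and compares the naive regression slope with the ratio-of-covariances IV estimate. The coefficients and noise scales are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Linear SCM with unobserved confounder U and instrument Z; true effect of X on Y is 2.0.
U = rng.normal(size=n)                      # unobserved confounder
Z = rng.normal(size=n)                      # instrument: affects Y only through X
X = 0.8 * Z + 1.5 * U + rng.normal(size=n)  # treatment
Y = 2.0 * X + 3.0 * U + rng.normal(size=n)  # outcome

# Naive regression slope of Y on X is biased by the open path X <- U -> Y.
naive = np.cov(X, Y)[0, 1] / np.var(X, ddof=1)

# IV estimand: beta_IV = Cov(Z, Y) / Cov(Z, X).
beta_iv = np.cov(Z, Y)[0, 1] / np.cov(Z, X)[0, 1]

print(f"naive slope:  {naive:.3f}")    # around 3.2, far from the true 2.0
print(f"IV estimate:  {beta_iv:.3f}")  # close to 2.0
```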
A more recent framework, Proximal Causal Inference, offers an alternative when direct instruments are unavailable but proxies for the unobserved confounders exist. Suppose U is the unobserved confounder between X and Y. Proximal inference seeks two observed variables: a treatment-inducing proxy Z, which is informative about U (and may be related to X) but has no direct effect on Y, and an outcome-inducing proxy W, which is informative about U (and may be related to Y) but is not directly affected by X or Z.
Crucially, W and Z must satisfy certain conditional independence assumptions, essentially acting as noisy measurements of the underlying confounder U.
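One common formulation of these assumptions, following the proximal causal learning literature, pairs the proxies with the conditional independences below (together with completeness conditions that make the identifying integral equations solvable); exact statements vary across papers.

\[
W \perp\!\!\!\perp (Z, X) \mid U
\qquad \text{and} \qquad
Z \perp\!\!\!\perp Y \mid (U, X).
\]

Informally: given the true confounder, the outcome proxy W carries no additional information about the treatment or the other proxy, and the treatment proxy Z affects the outcome only through U and X.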
Proximal Causal Inference setup. W and Z are observed proxies for the unobserved confounder U.
Under specific conditions, which involve solving integral equations (often requiring regularization or kernel methods in practice), proximal methods can identify the causal effect P(Y∣do(X=x)). For example, one identification strategy solves for an outcome bridge function h(w,x) satisfying E[Y∣X=x, Z=z] = E[h(W, x)∣X=x, Z=z] for all x and z; the interventional mean is then recovered as E[Y∣do(X=x)] = E[h(W, x)], and a complementary treatment bridge function supports doubly robust estimation. This is particularly relevant in high-dimensional settings where U might be complex, but high-dimensional proxies W and Z are available (e.g., text embeddings, past user behavior). We delve into this in Chapter 4.
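To give a feel for the simplest case, the sketch below simulates a linear model and applies a two-stage regression sometimes referred to as proximal two-stage least squares: regress the outcome proxy W on (Z, X), then regress Y on the fitted values and X. Under linearity and the proxy conditions above, the coefficient on X recovers the causal effect. This is only an illustration of the idea with made-up coefficients, not a general-purpose estimator.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

# Linear SCM: U is unobserved; Z and W are noisy proxies of U; true effect of X on Y is 1.0.
U = rng.normal(size=n)
Z = U + rng.normal(scale=0.5, size=n)        # treatment-inducing proxy
W = U + rng.normal(scale=0.5, size=n)        # outcome-inducing proxy
X = 0.7 * U + 0.5 * Z + rng.normal(size=n)   # treatment, confounded by U
Y = 1.0 * X + 2.0 * U + rng.normal(size=n)   # outcome, confounded by U

def ols(columns, target):
    """Least-squares fit of target on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(target))] + list(columns))
    coefs, *_ = np.linalg.lstsq(A, target, rcond=None)
    return A @ coefs, coefs

# Stage 1: project the outcome proxy W onto (Z, X); this stands in for E[U | Z, X].
W_hat, _ = ols([Z, X], W)

# Stage 2: regress Y on the projected proxy and X; the X coefficient is the causal effect.
_, coefs = ols([W_hat, X], Y)

_, naive = ols([X], Y)
print(f"naive coefficient on X:    {naive[1]:.3f}")   # biased upward by U
print(f"proximal 2SLS coefficient: {coefs[2]:.3f}")   # close to the true 1.0
```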
Identification can sometimes be achieved by adjusting for sets of variables that do not satisfy the standard backdoor criterion, but fulfill alternative graphical conditions. For example, the conditional backdoor criterion might apply if adjustment is valid only within specific strata of other variables. Other criteria might involve adjusting for descendants of the treatment variable under specific graph structures, going beyond the simpler frontdoor setup. These often arise from careful application of do-calculus or analysis of the graph structure to find alternative ways to block confounding paths.
For DAGs that may include unobserved confounders (commonly represented with bidirected edges in acyclic directed mixed graphs, or ADMGs), Tian and Pearl developed an identification algorithm, often called the ID algorithm, that was later proven complete. Given a causal query P(y∣do(x)) and a causal graph (potentially with hidden variables), the ID algorithm determines whether the query is identifiable from the observed joint distribution. If it is, the algorithm outputs the corresponding estimand as a function of the observed distribution; otherwise, it correctly reports non-identifiability.
While the algorithm itself is intricate, involving graph decompositions and recursive application of do-calculus logic, its existence is theoretically significant: it guarantees that any identifiable effect can be found systematically. Implementations exist in libraries such as DoWhy, allowing practitioners to check identifiability for complex custom graphs beyond the standard named criteria.
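As a sketch of what this looks like, the snippet below encodes the earlier example graph and asks DoWhy to identify the effect of X on Y; U appears in the graph but has no column in the data, so it is treated as unobserved. The dataframe is random placeholder data, and details such as accepted graph formats and the availability of an "id-algorithm" identification option vary across DoWhy versions, so treat this as an outline rather than a definitive recipe.

```python
import numpy as np
import pandas as pd
from dowhy import CausalModel

# Placeholder observational data: only the observed variables appear as columns.
rng = np.random.default_rng(0)
n = 1_000
df = pd.DataFrame({
    "X": rng.normal(size=n),
    "Y": rng.normal(size=n),
    "Z1": rng.normal(size=n),
    "Z2": rng.normal(size=n),
})

# Graph from the earlier example; U has no column in df, so it is treated as latent.
model = CausalModel(
    data=df,
    treatment="X",
    outcome="Y",
    graph="digraph { U -> X; U -> Y; Z2 -> X; Z2 -> Y; X -> Z1; Z1 -> Y; }",
)

# Default identification searches for backdoor/frontdoor/IV estimands; in recent
# versions, identify_effect(method_name="id-algorithm") runs the general ID algorithm.
identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)
print(identified_estimand)
```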
It's important to distinguish between nonparametric and parametric identification. Nonparametric identification means the causal effect can be written as a functional of the observed distribution without assuming any particular functional form; do-calculus and the ID algorithm operate at this level. Parametric identification instead leans on modeling assumptions such as linearity, additive noise, or specific distributional families, as in the linear IV estimand above. While nonparametric identification is generally preferred for its robustness, parametric assumptions are sometimes necessary, especially when dealing with feedback loops or specific types of unobserved confounding where nonparametric methods fail.
Moving beyond standard criteria opens up possibilities for causal inference in more challenging, realistic settings. However, these advanced strategies come with caveats: they rest on stronger assumptions that typically cannot be tested from data alone (instrument validity, proxy conditions, correctness of the assumed graph), the resulting estimands are often harder to estimate reliably, and conclusions can be fragile when any of these assumptions is violated.
Therefore, while these methods expand our capabilities, they demand careful consideration of the underlying assumptions and increased attention to sensitivity analysis (covered in the next section) to assess the robustness of conclusions to potential violations of these assumptions. Applying do-calculus systematically or using algorithmic identification tools can provide formal justification, but domain knowledge remains indispensable for validating the plausibility of the graph structure and the required identifying assumptions.