The presence of unobserved confounders poses a significant obstacle to estimating causal effects. Methods like Instrumental Variables (IV) rely on finding a variable $Z$ that influences the treatment $T$ without directly affecting the outcome $Y$ (except through $T$) and is independent of the unobserved confounder $U$. Regression Discontinuity (RDD) and Difference-in-Differences (DiD) exploit specific assignment mechanisms or data structures. Proximal Causal Inference (PCI) offers an alternative pathway to identification when these conditions are unmet but suitable "proxy" variables are available.
Introduced by Miao, Geng, and Tchetgen Tchetgen (2018), PCI provides a framework for identifying causal effects even when the treatment $T$ and the outcome $Y$ share an unobserved common cause $U$, provided we can observe two proxy variables, $Z$ and $W$, that satisfy specific conditional independence properties.
The core idea is to find variables that act as imperfect representatives, or proxies, for the unobserved confounder $U$. Specifically, we need:
A treatment proxy $Z$: a variable driven by $U$ (and possibly related to the treatment $T$) whose only connection to the outcome $Y$ runs through $U$ and $T$.
An outcome proxy $W$: a variable driven by $U$ (and possibly related to the outcome $Y$) whose only connection to the treatment $T$ runs through $U$.
Crucially, unlike an instrument in IV, these proxies $Z$ and $W$ are allowed to be associated with, and indeed caused by, the unobserved confounder $U$. Their utility comes from how they relate to the observed variables $T$ and $Y$.
The relationships assumed in the simplest PCI setting (with observed confounders $X$ also present) can be visualized using a Directed Acyclic Graph (DAG):
A DAG illustrating the core relationships in Proximal Causal Inference. The unobserved confounder $U$ affects the treatment $T$, the outcome $Y$, and both proxies $Z$ and $W$. Crucially, $Z$ affects $Y$ only via $U$ (once $T$ is considered), and $W$ affects $T$ only via $U$. Observed confounders $X$ can also affect $T$ and $Y$.
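To make this structure concrete, here is a minimal simulation sketch of data consistent with the DAG above. The structural equations, coefficient values, and variable names are illustrative assumptions, not part of the PCI framework itself; the snippet simply shows that a naive regression ignoring $U$ is biased, which is the problem PCI is designed to address.

```python
# Minimal sketch: simulate data from a DAG like the one above (assumed coefficients).
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

X = rng.normal(size=n)                      # observed confounder
U = 0.5 * X + rng.normal(size=n)            # unobserved confounder
Z = 0.9 * U + rng.normal(size=n)            # treatment-side proxy, caused by U
W = 0.8 * U + rng.normal(size=n)            # outcome-side proxy, caused by U
T = 1.0 * U + 0.4 * X + rng.normal(size=n)  # treatment, confounded by U
Y = 2.0 * T + 1.5 * U + 0.3 * X + rng.normal(size=n)  # true effect of T on Y is 2.0

# Naive regression of Y on T and X ignores U and is biased upward,
# because U pushes T and Y in the same direction.
design = np.column_stack([np.ones(n), T, X])
beta_naive = np.linalg.lstsq(design, Y, rcond=None)[0]
print(f"naive estimate of the T coefficient: {beta_naive[1]:.2f}  (true value: 2.0)")
```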
Formal identification under PCI relies on the following conditional independence assumptions, often referred to as the "proximal conditions" or "bridge function" assumptions (where $X$ represents the observed confounders being adjusted for):
Outcome Bridge (using Z): $Z \perp Y \mid T, U, X$. This means that given the treatment $T$, the unobserved confounder $U$, and observed confounders $X$, the treatment proxy $Z$ is independent of the outcome $Y$. It implies that $Z$'s connection to $Y$ is fully mediated by $U$ once $T$ and $X$ are accounted for.
Treatment Bridge (using W): $W \perp T \mid U, X$. This means that given the unobserved confounder $U$ and observed confounders $X$, the outcome proxy $W$ is independent of the treatment $T$. It implies that $W$'s connection to $T$ is fully mediated by $U$.
These assumptions essentially state that $W$ is a "sufficient proxy" for $U$'s influence on $Y$ (conditional on $T$ and $X$), and $Z$ is a "sufficient proxy" for $U$'s influence on $T$ (conditional on $X$).
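The following sketch checks these two conditions numerically in the illustrative data-generating process from earlier (re-simulated here so the snippet runs on its own). In a linear-Gaussian model, conditional independence shows up as a near-zero partial correlation after regressing out the conditioning variables; the variable names and coefficients are assumptions for illustration only.

```python
# Sketch: partial correlations approximating the two bridge conditions.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Same illustrative data-generating process as in the earlier sketch.
X = rng.normal(size=n)
U = 0.5 * X + rng.normal(size=n)
Z = 0.9 * U + rng.normal(size=n)
W = 0.8 * U + rng.normal(size=n)
T = 1.0 * U + 0.4 * X + rng.normal(size=n)
Y = 2.0 * T + 1.5 * U + 0.3 * X + rng.normal(size=n)

def residualize(v, conditioners):
    """Residual of v after least-squares projection onto the conditioners."""
    design = np.column_stack([np.ones(len(v))] + conditioners)
    return v - design @ np.linalg.lstsq(design, v, rcond=None)[0]

# Outcome bridge: Z independent of Y given (T, U, X) -> partial correlation near 0.
r_zy = np.corrcoef(residualize(Z, [T, U, X]), residualize(Y, [T, U, X]))[0, 1]
# Treatment bridge: W independent of T given (U, X) -> partial correlation near 0.
r_wt = np.corrcoef(residualize(W, [U, X]), residualize(T, [U, X]))[0, 1]
print(f"partial corr(Z, Y | T, U, X): {r_zy:.3f}")
print(f"partial corr(W, T | U, X):    {r_wt:.3f}")
```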
How do these assumptions help identify the causal effect of $T$ on $Y$? The intuition is that the observed conditional distributions involving the proxies contain enough information to reconstruct the influence of the unobserved $U$.
Consider the distribution of the outcome $Y$ given the treatment $T$, the outcome proxy $W$, and observed confounders $X$, denoted $P(Y \mid T, W, X)$. This can be expressed by marginalizing over the unobserved $U$:
$$P(Y \mid T, W, X) = \int P(Y \mid T, W, X, U=u)\, P(U=u \mid T, W, X)\, du$$
Using the conditional independence assumptions (the outcome bridge condition lets us drop $Z$ from $P(Y \mid Z, T, U, X)$, and the treatment bridge condition lets us drop $T$ from $P(W \mid T, U, X)$), PCI theory shows that the target causal effect can be identified by solving a system of integral equations.
Specifically, identification often relies on solving two Fredholm integral equations of the first kind. Let $E[Y(t)]$, the mean outcome under an intervention setting $T=t$, be the target quantity. The theory demonstrates relationships like:
$$E[Y \mid Z=z, T=t, X=x] = \int h(w, t, x)\, f(w \mid z, t, x)\, dw$$
$$\frac{1}{f(t \mid w, x)} = \int q(z, t, x)\, f(z \mid w, t, x)\, dz$$
Here $h$ and $q$ act like unknown "bridge" functions, and the conditional densities $f(w \mid z, t, x)$ and $f(z \mid w, t, x)$ are kernels involving the distributions of the proxies. PCI shows how to use the observed distributions $P(Y \mid Z, T, X)$, $P(W \mid Z, T, X)$, and $P(Z \mid W, T, X)$ (under suitable completeness conditions) to solve for these bridge functions and ultimately reconstruct the target, for example via $E[Y(t)] = E[h(W, t, X)]$.
This mathematical machinery effectively uses $Z$ and $W$ as "bridges" to account for the confounding effect of $U$ without observing $U$ directly.
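In the linear-Gaussian special case, the bridge equations reduce to linear regressions, and a simple two-stage procedure (sometimes called proximal two-stage least squares) recovers the effect. The sketch below assumes that linear setting and reuses the illustrative data-generating process from the earlier snippets; it is not a general-purpose implementation.

```python
# Sketch of proximal two-stage least squares in a linear model (assumed DGP).
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Same illustrative data-generating process as before: true effect of T on Y is 2.0.
X = rng.normal(size=n)
U = 0.5 * X + rng.normal(size=n)
Z = 0.9 * U + rng.normal(size=n)
W = 0.8 * U + rng.normal(size=n)
T = 1.0 * U + 0.4 * X + rng.normal(size=n)
Y = 2.0 * T + 1.5 * U + 0.3 * X + rng.normal(size=n)

def ols(design, target):
    """Least-squares coefficients for target ~ design."""
    return np.linalg.lstsq(design, target, rcond=None)[0]

# Stage 1: predict the outcome proxy W from (1, Z, T, X).
stage1 = np.column_stack([np.ones(n), Z, T, X])
W_hat = stage1 @ ols(stage1, W)

# Stage 2: regress Y on (1, W_hat, T, X); the T coefficient estimates the causal effect.
stage2 = np.column_stack([np.ones(n), W_hat, T, X])
beta = ols(stage2, Y)
print(f"proximal 2SLS estimate of the effect of T on Y: {beta[2]:.2f}  (true value: 2.0)")
```

The first stage plays the role of the outcome bridge function: it replaces $W$ with its projection onto $(Z, T, X)$, so the second-stage coefficient on $T$ is purged of the confounding carried by $U$.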
It's informative to contrast PCI with IV:
An IV instrument must be independent of the unobserved confounder $U$ and must affect $Y$ only through $T$ (the exclusion restriction).
The PCI proxies $Z$ and $W$ may be strongly associated with $U$; instead, they must satisfy the bridge conditions above, with $Z$ having no direct effect on $Y$ and $W$ having no direct dependence on $T$ (given $U$ and $X$).
PCI essentially trades the IV exogeneity assumption ($Z \perp U$) for the proximal conditional independence assumptions. This can be advantageous in scenarios where finding a truly exogenous instrument is difficult, but variables related to $U$ that satisfy the bridge conditions might exist.
While theoretically elegant, applying PCI presents practical challenges:
Finding valid proxies: the bridge conditions, like most causal assumptions, cannot be fully tested from observed data and require substantive domain knowledge.
Estimation: solving the integral equations is, in general, an ill-posed inverse problem; practical estimators usually impose parametric or semiparametric structure on the bridge functions.
Proxy strength: identification also requires the proxies to be sufficiently informative about $U$ (completeness conditions), which can fail if the proxies are weak.
Software: implementations are less mature than for IV or DiD; check packages like CausalPy or specific research implementations for potential tools.
Proximal Causal Inference provides a valuable addition to the toolkit for causal inference in the presence of unobserved confounding. It operates under a different set of assumptions compared to IV, RDD, or DiD, relying on the existence of suitable proxy variables $Z$ and $W$. While finding such proxies and performing estimation can be challenging, PCI opens up possibilities for causal effect identification in complex systems where traditional methods might not apply. Understanding its principles allows you, as an expert practitioner, to consider a wider range of strategies when confronting hidden bias in your machine learning applications.