Draft: fix: Python 3.10–3.12 compatibility and add z-identifiability (idz) module

Summary

Two related contributions:

  1. Three fixes for bugs that prevent ananke-causal from running cleanly under Python 3.10–3.12 with current numpy 2.x / pandas 2.x stacks
  2. New ananke/identification/idz.py module implementing the complete z-identifiability decision procedure (Bareinboim & Pearl, 2012), plus a higher-level ZID class

The idz module is a prerequisite for a pending DoWhy integration MR (py-why/dowhy#1531) that adds surrogate-experiment-based identification to DoWhy's pipeline.


Part 1: Bug fixes

ananke/estimation/counterfactual_mean.py: df.iteritems() → df.items(). iteritems was removed in pandas 2.0 and raises AttributeError on any current pandas installation.

ananke/models/binary_nested.py: Docstring """...""" → r"""...""". LaTeX sequences like \prod produce SyntaxWarning: invalid escape sequence under Python 3.12+.
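The effect of the raw-string prefix can be checked without importing ananke. The sketch below is illustrative only: the helper name escape_warnings and the sample docstring text are made up for this example and are not taken from binary_nested.py. It compiles two small module sources and collects the invalid-escape warning, which is a SyntaxWarning on Python 3.12+ and a DeprecationWarning on earlier 3.x releases.

```python
import warnings


def escape_warnings(source):
    """Compile `source` and return any invalid-escape warnings it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<docstring-demo>", "exec")
    return [
        w for w in caught
        if issubclass(w.category, (SyntaxWarning, DeprecationWarning))
    ]


# Hypothetical module sources. "\\prod" below becomes a single backslash
# followed by "prod" in the text that is handed to compile().
plain_source = 'def f():\n    """Likelihood is \\prod_i p_i."""\n'
raw_source = 'def f():\n    r"""Likelihood is \\prod_i p_i."""\n'

plain_hits = escape_warnings(plain_source)  # plain docstring: warning expected
raw_hits = escape_warnings(raw_source)      # raw docstring: no warning expected
print(len(plain_hits) > 0, len(raw_hits) == 0)
```

Because the r prefix only changes how the literal is parsed, the docstring text that Sphinx sees is the intended LaTeX in both cases.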

docs/source/conf.py: Removed misspelled autodoc_default_falgs (silently ignored by Sphinx). Replaced with correct autodoc_default_options = {"members": True}.

Part 2: Packaging and CI updates

pyproject.toml

  • python: "^3.9" → ">=3.10,<3.13" (Python 3.13 excluded pending PyTorch/pgmpy support)
  • Trove classifiers updated to 3.10, 3.11, 3.12
  • Raised lower bounds: numpy ^2.0, pandas ^2.2, scipy ^1.11, statsmodels ^0.14
  • pgmpy: pinned ">=0.1.26,<0.2.0" to stay on the stable 0.1 line
  • Added explicit sympy dependency (previously only transitive)
  • Removed unused direct dependencies: jax, jaxlib, mystic
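
The interpreter constraint ">=3.10,<3.13" is a half-open range on the (major, minor) version. As a rough sketch of what it admits, the check below mirrors that range at runtime; the helper is_supported and the two constants are hypothetical names for illustration and are not part of ananke or of this MR.

```python
import sys

# Bounds copied from the pyproject.toml constraint ">=3.10,<3.13".
MIN_SUPPORTED = (3, 10)  # inclusive lower bound
MAX_EXCLUSIVE = (3, 13)  # exclusive upper bound


def is_supported(version=None):
    """Return True if a (major, minor, ...) version lies in the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_SUPPORTED <= major_minor < MAX_EXCLUSIVE


print(is_supported((3, 12, 1)))  # inside the range
print(is_supported((3, 13, 0)))  # excluded by the upper bound
```

Tuple comparison handles the two-digit minor versions correctly, which a string comparison of "3.9" and "3.10" would not.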

tox.ini: envlist updated to py310, py311, py312

.gitlab-ci.yml: Docker image updated to python:3.12

.readthedocs.yml: Build Python updated to 3.12

Part 3: New feature — z-identifiability

ananke/identification/idz.py

Implements the complete z-ID decision procedure (Bareinboim & Pearl, 2012, Theorem 3). The query P(Y | do(X)) is z-identifiable if and only if OneLineGID succeeds given the full powerset experiment family {G_{Z'} : Z' ⊆ Z}.

```python
from ananke.identification.idz import idz_id

result = idz_id(
    graph=G,  # ananke ADMG
    treatments=["X"],
    outcomes=["Y"],
    surrogates=["Z"],
)
# True if z-identifiable, False otherwise
```
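
The powerset experiment family {G_{Z'} : Z' ⊆ Z} contains one experiment per subset of the surrogate set, so it has 2^n members for n surrogates. The sketch below only enumerates the subsets Z'. It is a standalone illustration using itertools, and the helper name surrogate_subsets is hypothetical; it is not the module's experiments_for_all_subsets and does not build any graphs.

```python
from itertools import chain, combinations


def surrogate_subsets(surrogates):
    """Return every subset Z' of the surrogate set Z, smallest subsets first."""
    items = list(surrogates)
    return [
        frozenset(combo)
        for combo in chain.from_iterable(
            combinations(items, size) for size in range(len(items) + 1)
        )
    ]


subsets = surrogate_subsets(["Z1", "Z2"])
# The empty subset is included; it corresponds to no surrogate intervention.
print(len(subsets))
```

The exponential growth in n is inherent to using the full powerset, which is worth keeping in mind for queries with many surrogates.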

ananke/identification/z_id.py

Higher-level ZID class with input validation and an is_standard_id property:

```python
from ananke.identification import ZID

zid = ZID(graph=G, treatments=["X"], outcomes=["Y"], surrogates=["Z"])
zid.id()            # True / False
zid.is_standard_id  # True if identifiable without surrogates
zid.functional()    # identifying functional string (standard-ID only)
```

Tests

tests/identification/test_idz.py — 18 tests covering idz_id and experiments_for_all_subsets:

  • Experiment family size (2^n for n surrogates)
  • Immutability of original graph
  • Standard-ID cases
  • Non-identifiable cases
  • z-ID rescue cases verified against a 500-graph oracle benchmark

tests/identification/test_z_id.py — 16 tests covering the ZID class API, input validation, and functional().

Full suite: 191/191 passing on CPython 3.12 with numpy 2.x, pandas 2.x, scipy 1.11+, pgmpy 0.1.x.


References

Bareinboim, E. & Pearl, J. (2012). Causal Inference by Surrogate Experiments: z-Identifiability. UAI.

Richardson, T.S., Robins, J.M., & Shpitser, I. (2017). Markovian Acyclic Directed Mixed Graphs for Representing Interventional Distributions.
