Incorrect refcount in petsc4py
Compiling Python with --with-pydebug enables additional reference-counting checks, which can trigger runtime assertion failures. The failure below seems to be triggered when the Python garbage collector runs and two objects point to a shared member without increasing its refcount. An example assertion follows.
refcount assertion
../Modules/gcmodule.c:116: gc_decref: Assertion "gc_get_refs(g) > 0" failed: refcount is too small
Enable tracemalloc to get the memory block allocation traceback
object address : 0x7fffd5cb0110
object refcount : 1
object type : 0x998160
object type name: dict
object repr : {'__function__': (<function _SNESContext.form_function at 0x7fffdedfbf50>, (), {}), '__jacobian__': (<function _SNESContext.form_jacobian at 0x7fffdee03050>, (), {})}
Fatal Python error: _PyObject_AssertFailed: _PyObject_AssertFailed
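To act on the "Enable tracemalloc" hint in the output above, a minimal sketch along these lines may help. The helper names is_debug_build and start_allocation_tracing are hypothetical and not part of petsc4py; the sketch only assumes a standard CPython interpreter.

```python
import sys
import sysconfig
import tracemalloc


def is_debug_build() -> bool:
    """Return True if this interpreter looks like a --with-pydebug build."""
    # sys.gettotalrefcount is only defined in debug builds of CPython;
    # the Py_DEBUG config variable is checked as a second indicator.
    return hasattr(sys, "gettotalrefcount") or bool(
        sysconfig.get_config_var("Py_DEBUG")
    )


def start_allocation_tracing(frames: int = 25) -> None:
    """Start recording allocation tracebacks, keeping up to `frames` frames each."""
    if not tracemalloc.is_tracing():
        tracemalloc.start(frames)


if __name__ == "__main__":
    print("debug build:", is_debug_build())
    start_allocation_tracing()
    print("tracing:", tracemalloc.is_tracing())
```

The same tracing can be switched on without code changes by setting the PYTHONTRACEMALLOC environment variable or passing -X tracemalloc=25 to the interpreter, so that it is already active when petsc4py is imported.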
petsc4py objects and members
An object in petsc4py wraps an underlying PETSc object, self.oval, and also has a pointer member, self.obj, that refers to that same object:
cdef class Object:
    def __cinit__(self):
        self.oval = NULL
        self.obj = &self.oval
In the SNES constructor, for example, obj is then pointed at a different member:
cdef class SNES(Object):
    def __cinit__(self):
        self.obj = <PetscObject*> &self.snes
        self.snes = NULL
    def create(self, comm=None):
        cdef MPI_Comm ccomm = def_Comm(comm, PETSC_COMM_DEFAULT)
        cdef PetscSNES newsnes = NULL
        CHKERR( SNESCreate(ccomm, &newsnes) )
        PetscCLEAR(self.obj); self.snes = newsnes
        return self
Next, an object can have a Python context associated with it. Setting an attribute creates a dictionary on demand and attaches it to the underlying PETSc object, together with a destroy callback so that the dictionary is released when the PETSc object itself is cleaned up:
cdef object set_attr(self, char name[], object attr):
    return PetscSetPyObj(self.obj[0], name, attr)
...
cdef object PetscSetPyObj(PetscObject o, char name[], object p):
    cdef object dct
    if p is not None:
        dct = PetscGetPyDict(o, True)
...
cdef object PetscGetPyDict(PetscObject obj, bint create):
    if obj.python_context != NULL:
        return <object>obj.python_context
    if create:
        obj.python_destroy = PetscDelPyDict
        obj.python_context = <void*>PyDict_New()
        return <object>obj.python_context
    return None
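The lazy-creation pattern above can be mimicked in plain Python to show why two wrappers around one PETSc object end up seeing the same dictionary. This is only an illustrative model: FakePetscObject, Wrapper, get_py_dict and set_py_obj are hypothetical names, and the body of set_py_obj past the lookup is an assumption, since the original snippet elides it.

```python
from typing import Any, Optional


class FakePetscObject:
    """Hypothetical stand-in for the C-level PETSc object struct."""

    def __init__(self) -> None:
        # Plays the role of the python_context pointer (NULL until first use).
        self.python_context: Optional[dict] = None


def get_py_dict(obj: FakePetscObject, create: bool) -> Optional[dict]:
    """Return the context dict, creating it on first use if requested."""
    if obj.python_context is not None:
        return obj.python_context
    if create:
        obj.python_context = {}
        return obj.python_context
    return None


def set_py_obj(obj: FakePetscObject, name: str, value: Any) -> None:
    """Assumed behaviour: store the value under `name` in the context dict."""
    if value is not None:
        get_py_dict(obj, True)[name] = value


class Wrapper:
    """Hypothetical Python-side wrapper pointing at a shared underlying object."""

    def __init__(self, petsc_obj: FakePetscObject) -> None:
        self.petsc_obj = petsc_obj


shared = FakePetscObject()
first = Wrapper(shared)
second = Wrapper(shared)  # e.g. a wrapper created later around the same object

set_py_obj(first.petsc_obj, "__function__", "callback")

# Both wrappers reach the very same dictionary object.
print(get_py_dict(second.petsc_obj, False) is get_py_dict(first.petsc_obj, False))
```

In this model the dictionary is owned once, by the underlying object, yet it is reachable through every wrapper, which is the situation the rest of this report is concerned with.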
GC traversal
Objects that participate in Python's cyclic garbage collection implement a tp_traverse() method. In our case, the python_context member is traversed if it is allocated. During garbage collection, each object's reference count is copied, the copy is decremented once for every reference found by traversal, and any object whose copy drops to 0 during this process is tentatively unreachable. Here, however, the copied count for python_context is decremented below zero during the traversal. This seems to stem from a call to SNES_Function(), which starts by creating a new object that refers to the same underlying SNES object:
cdef inline SNES ref_SNES(PetscSNES snes):
    cdef SNES ob = <SNES> SNES()
    ob.snes = snes
    PetscINCREF(ob.obj)
    return ob
However, this doesn't increment the reference count of ob.obj.python_context on the Python side, which I believe leads to the issue. I got a little tangled up in the different memory-management layers in petsc4py while investigating this, so I wasn't confident in suggesting a fix.
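To make the suspected mechanism concrete, here is a toy model of the collector's first pass, written in plain Python. It is not CPython's implementation and does not touch petsc4py; Node and subtract_internal_refs are hypothetical names, and the scenario (one dictionary with a single owned reference, traversed by two wrappers) is my reading of the situation described above.

```python
from typing import Dict, List


class Node:
    """Stand-in for a GC-tracked object: its refcount and what its traverse visits."""

    def __init__(self, name: str, refcount: int) -> None:
        self.name = name
        self.refcount = refcount
        self.visits: List["Node"] = []


def subtract_internal_refs(tracked: List[Node]) -> Dict[str, int]:
    """Copy each refcount, then subtract one for every traversed reference."""
    gc_refs = {node.name: node.refcount for node in tracked}
    for node in tracked:
        for target in node.visits:
            # A debug build asserts the copied count is > 0 before this decrement.
            gc_refs[target.name] -= 1
    return gc_refs


# One context dict holding a single reference, but visited by two wrappers.
context = Node("python_context", refcount=1)
wrapper_a = Node("snes_a", refcount=1)
wrapper_b = Node("snes_b", refcount=1)
wrapper_a.visits.append(context)
wrapper_b.visits.append(context)

result = subtract_internal_refs([context, wrapper_a, wrapper_b])
print(result["python_context"])  # -1: the second visit is the one that fails
```

If the dictionary's refcount were 2 in this model, one per wrapper that traverses it, the copied count would end at 0 rather than going negative. Whether petsc4py should take an extra reference or stop traversing the member from the second wrapper is a separate question that this sketch does not answer.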