Resolve "Tango-Server crashes on Restart Command"
Closes #545 (closed)
Lengthy description of the problem
The segfaults were caused by this call, because at that point the DeviceClass* pointer was already corrupted. Even earlier, the size returned by this call was negative, so this iteration was operating on invalid data.
The reason was that the device_class pointer created in the PyTango layer during server startup here was being destroyed by CppTango (at the beginning of the RestartServer command) here, and Python was unaware of this. In the next step, Python would "try" to delete the created DeviceClass instances on this line. Normally that would cause the boost reference counter to go to zero and cause a segfault, because the underlying object was already destroyed, but Python was still holding a reference hidden in _device_class_instance here, so the garbage collection was not triggered. In the next step, Python allocated a new DeviceClass instance and continued here. This assignment finally caused the reference counter of the first DeviceClass instance (already deleted) to go to 0, and boost triggered the destructor. Because the new instance had been allocated at exactly the same memory location, this didn't segfault; instead it destroyed the DeviceClass instance newly allocated by Python. From this point on, any operation on class_list[i] was working with a dangling pointer.
We allocate the DeviceClass instance in Python, and when CppTango deletes it, there's no simple way to inform Python about this. The proposed solution is to wrap the delete class_list[i] calls in CppTango in a wrapper that can be overridden in PyTango. The override would then skip the deletion if the object was allocated in the Python layer. The proposed wrapper uses a pointer to function.
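To make the idea concrete, here is a rough sketch of such a function-pointer hook. It is not the actual patch: DeviceClass is a tiny stand-in for the real class, and python_owned, class_deleter, binding_class_deleter, clear_class_list and the destroyed counter are all hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the real DeviceClass; only what the sketch needs.
struct DeviceClass {
    bool python_owned = false;  // assumption: some flag tells who allocated it
    static int destroyed;       // demo-only counter of destructor calls
    virtual ~DeviceClass() { ++destroyed; }
};
int DeviceClass::destroyed = 0;

// The hook is a plain pointer to a function that disposes of one DeviceClass.
using ClassDeleter = void (*)(DeviceClass*);

// Default behaviour of the C++ layer: an unconditional delete.
inline void default_class_deleter(DeviceClass* cls) { delete cls; }

// Accessor for the currently installed hook (hypothetical name).
inline ClassDeleter& class_deleter() {
    static ClassDeleter hook = &default_class_deleter;
    return hook;
}

// What the binding layer could install: leave Python-allocated objects alone,
// because the Python side still holds a reference and will free them itself.
inline void binding_class_deleter(DeviceClass* cls) {
    if (cls != nullptr && cls->python_owned) {
        return;
    }
    delete cls;
}

// Replacement for the bare `delete class_list[i]` loop in the C++ layer.
inline void clear_class_list(std::vector<DeviceClass*>& class_list) {
    assert(class_deleter() != nullptr);  // a hook must always be installed
    for (DeviceClass* cls : class_list) {
        class_deleter()(cls);
    }
    class_list.clear();
}
```

The binding layer would call something like `class_deleter() = &binding_class_deleter;` once at startup. With the default hook left in place, the C++ layer behaves exactly as before.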
If this solution is too ugly, we can investigate other possibilities:
- Using intrusive_ptr instead of shared_ptr to let boost manage the memory. That is said to allow custom deallocation mechanisms. We could check whether the object was already deleted and skip the deletion in that case.
- Maybe it's possible to reimplement DServer to work on shared_ptr instead of raw pointers, so that we could pass the shared pointer managed by boost to the C++ layer.
- (not working) I tried creating a virtual deleter method of DServer that I would override in the boost wrapper, but it didn't work, because DServer is allocated in C++ here, so it will not know about any overrides.
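For the first alternative, the "skip the deletion if the object is already gone" check could look roughly like the sketch below. Note that it swaps in std::shared_ptr with a custom deleter instead of boost's intrusive_ptr, just to stay self-contained. DeviceClass is again a stand-in, and the alive token, the destroyed counter and adopt_for_binding are hypothetical. The check uses a liveness token rather than the address, since an address comparison would be fooled by a new instance landing at the same memory location.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the real DeviceClass.
struct DeviceClass {
    // Liveness token: released when this object is destroyed, so it remains
    // correct even if a new object is later allocated at the same address.
    std::shared_ptr<int> alive = std::make_shared<int>(0);
    static int destroyed;  // demo-only counter of destructor calls
    virtual ~DeviceClass() { ++destroyed; }
};
int DeviceClass::destroyed = 0;

// Hypothetical helper: wrap a raw pointer for the binding layer so that the
// final release does nothing if the C++ layer has already deleted the object.
inline std::shared_ptr<DeviceClass> adopt_for_binding(DeviceClass* raw) {
    assert(raw != nullptr);
    std::weak_ptr<int> token = raw->alive;
    return std::shared_ptr<DeviceClass>(raw, [token](DeviceClass* p) {
        if (token.expired()) {
            return;  // already destroyed elsewhere; p must not be touched
        }
        delete p;
    });
}
```

This only guards the final release. Any other access through the shared pointer after the C++ side has deleted the object is still invalid, so it would not remove the need to coordinate ownership between the two layers.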