Avoid duplicate events when old clients subscribe

I have had some fun reproducing this issue for different event types. The tests in the first commit all fail without the subsequent fixes.

Some things I have noticed:

  • DEVICE_INTERFACE_CHANGE, ATTR_DATA_READY and PIPE events don't send different versions of the event data so don't suffer from this issue
  • ATTR_CONF events seem to confuse the version of the device with the version of the client in places, which results in events always being sent on the idl5_attr_conf topic, even if they are AttributeConfig_3 objects. This means that old (IDLv4) clients never get any events and the new client gets a "duplicated" event which it fails to marshal.
  • ALARM events obviously don't have this problem as only new clients can subscribe to them[1].
  • Clients actually discards duplicate events based on the event counter, however, it only does this once for some reason. If it did it every time, we wouldn't have this problem
  • The server is not incrementing the event counter at the correct time when there are multiple client libs. They are incrementing at the end of the first client lib, meaning that the first client lib gets a different event counter than the rest. It should be done at the end of the last client lib instead. Unfortunately, this doesn't fix the problem because the client only discards once and there are sometimes 3 duplicated events due to the fix for #369 (closed)

This is all pretty complicated and a bit of a mess to be honest.

As this is so complicated, I think it is best to try and fix it for the patch releases in as straight-forward a way as possible and to resist the urge to fix everything.

My fix here just introduces a mapping from "client idl version" to "event data version" as we discussed in the cppTango meeting to fix the AttributeValue events. For the AttributeConfig event, I also fixed issue [1]. For main, I think we should try to re-write a lot of this to be clearer.

After spending some time thinking about this, here is how I think it should work.

The client_libs vectors should hold values which correspond to which member of SuppliedEventData should be active when we call push_event. I.e. from the (client_lib, event_type) pair we should be able to work out which SuppliedEventData to fill in.

I get the impression from the code, that the ZMQ event system was introduced with IDLv4. So, I think that the "idl5_" prefix should be interpreted as "send the IDLv5 version of the event data structure" and there isn't a problem in the ZMQ event system with IDLv3 and IDLv4 clients both subscribing as IDLv3 clients don't know about ZMQ.

What I don't quite understand is that we seem to handle AttributeValue_3 and AttributeValue_4, but if ZMQ was introduced with IDLv4 I don't see how an IDLv3 client could make a ZMQ event subscription. Perhaps this means that I don't understand something still. It is quite hard to work this is out as it depends on what previous versions of the code were doing...

Either way, I think the fixes here should work for the patch releases and I can tidy things up more for the main when I forward port this.

Closes #1521 (closed).

Edited by Thomas Ives

Merge request reports

Loading