Multiple secmod instances cause issues with roaming users
When roaming is enabled (deny-roaming is set to false), the following scenario can occur:
- A user connects to ocserv and is assigned to a random secmod instance (e.g., secmod-1).
- The user authenticates and receives a cookie from secmod-1.
- Later, the user obtains a new IP address via DHCP.
- Upon reconnecting, the user is assigned to a different secmod instance (e.g., secmod-2) due to the changed source IP.
- The user presents the cookie issued by secmod-1 and connects successfully. Cookie authentication succeeds because the main process can extract the original secmod instance number from the cookie and validate the cookie against that instance.
However, the worker is now associated with secmod-2, which has no knowledge of the user's session. As a result, periodic communication between the worker and its secmod fails with errors like:
ocserv[3953]:worker[user]: 11.22.33.44 sending message 'sm: worker cli stats' to secmod
ocserv[8454]:sec-mod: cmd [size=71] sm: worker cli stats
ocserv[8454]:sec-mod: session stats but with non-existing SID
ocserv[8454]:sec-mod: error processing 'sm: worker cli stats' command (-1)
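The sequence above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not ocserv code: `pick_secmod`, the `SecMod` class, and the cookie layout are hypothetical names invented for illustration, and the "last octet modulo instance count" rule merely stands in for whatever IP-dependent selection ocserv actually uses.

```python
from dataclasses import dataclass, field

NUM_SECMODS = 2  # assumption: two instances, as in the scenario above


def pick_secmod(source_ip: str, count: int = NUM_SECMODS) -> int:
    """Hypothetical stand-in for IP-dependent instance selection."""
    return int(source_ip.rsplit(".", 1)[1]) % count


@dataclass
class SecMod:
    """Toy secmod instance that only knows the sessions it created itself."""
    index: int
    sessions: dict = field(default_factory=dict)

    def authenticate(self, user: str) -> dict:
        sid = f"sid-{user}"
        self.sessions[sid] = user
        # The cookie records which instance issued it.
        return {"sid": sid, "instance": self.index}

    def validate_cookie(self, cookie: dict) -> bool:
        return cookie["sid"] in self.sessions

    def cli_stats(self, sid: str) -> int:
        # Mirrors the "(-1)" result in the log when the SID is unknown.
        return 0 if sid in self.sessions else -1


def simulate_roaming(ip_before: str, ip_after: str) -> dict:
    secmods = [SecMod(i) for i in range(NUM_SECMODS)]

    # First connection: authenticate with the instance chosen for the old IP.
    first = pick_secmod(ip_before)
    cookie = secmods[first].authenticate("user")

    # Reconnect from the new IP: the worker is bound to another instance.
    second = pick_secmod(ip_after)

    return {
        "first_instance": first,
        "second_instance": second,
        # Main process validates against the instance named in the cookie.
        "cookie_ok": secmods[cookie["instance"]].validate_cookie(cookie),
        # Worker reports stats to the instance it is bound to.
        "stats_rc": secmods[second].cli_stats(cookie["sid"]),
        # For contrast only: the issuing instance does know the session.
        "stats_rc_issuer": secmods[cookie["instance"]].cli_stats(cookie["sid"]),
    }


result = simulate_roaming("11.22.33.44", "11.22.33.45")
print(result)
```

In this model the cookie check passes while the stats call fails, because the two requests reach different instances. The `stats_rc_issuer` entry is included only to show that the issuing instance still holds the session; it is not meant to describe the fix in the upcoming MR.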
This is the same underlying issue that led to the revert of !288 (merged) in 26258d7c. The revert only narrowed the problem from all users to roaming users. Once this issue is resolved, !288 (merged) can be safely reapplied.
I will prepare an MR to address this; I just want it tracked as a separate issue.