Idea: Use C++ Protothreads
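Using the C++ port (https://github.com/benhoyt/protothreads-cpp) fixes a lot of the limitations of the C version. Specifically, you can have "thread-local"/"stack" variables by making them members of the protothread object, so they keep their values across yields (plain locals in the C version do not). Of course there is a slight overhead, since every call goes through the this-pointer and each protothread is an object.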
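To illustrate what I mean, here is a rough, self-contained sketch of the idea. Note that the `SKETCH_*` macros and the `Counter` class are hypothetical stand-ins I made up so the snippet compiles on its own; they are not the library's actual header, which ships its own base class and macros. The point is only that `i_` and `sum_` live in the object and therefore survive each yield.

```cpp
#include <cassert>

// Hypothetical stand-in macros (NOT the library's API), using the usual
// switch/__LINE__ trick to resume where the last call left off.
#define SKETCH_BEGIN() switch (line_) { case 0:
#define SKETCH_YIELD() do { line_ = __LINE__; return true; case __LINE__:; } while (0)
#define SKETCH_END()   } line_ = 0; return false;

class Counter {
public:
    // Returns true while the protothread is still running, false when done.
    bool Run() {
        SKETCH_BEGIN();
        for (i_ = 0; i_ < 3; ++i_) {   // i_ is a member, so it survives the yield
            sum_ += i_;
            SKETCH_YIELD();            // give control back to the caller
        }
        SKETCH_END();
    }

    int sum() const { return sum_; }

private:
    int line_ = 0;  // resume point
    int i_ = 0;     // "local" loop counter, stored in the object
    int sum_ = 0;   // "local" accumulator, stored in the object
};

// Usage: the scheduler (here just a loop) calls Run() until it returns false.
inline int DriveToCompletion(Counter& c) {
    int yields = 0;
    while (c.Run()) ++yields;
    return yields;
}

inline void Demo() {
    Counter c;
    int yields = DriveToCompletion(c);
    assert(yields == 3);     // one yield per loop iteration
    assert(c.sum() == 3);    // 0 + 1 + 2
}
```

With a plain local `int i` instead of `i_`, the value would be lost every time `Run()` returns, which is exactly the limitation in the C version.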
If you are already aware of that lib: Sorry for the noise. :)