Update DRL algos
Two updates are required due to changes in core palaestrai:
- The muscle's propose actions method now returns two lists (the first goes to the environment, the second to the brain): palaestrai!195 (diffs)
- Muscle init now receives the brain ID: palaestrai!205 (diffs) -> accept `*args`/`**kwargs` in the muscle constructors to prevent such breakage in the future!
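
A minimal sketch of both changes, for illustration only. `BaseMuscle`, `ZeroMuscle`, the `propose_actions` name, and all argument names are hypothetical stand-ins, not the actual palaestrai API; the real signatures are in the linked merge requests.

```python
class BaseMuscle:
    """Hypothetical stand-in for the core muscle class (not the real palaestrai API)."""

    def __init__(self, uid, brain_id=None):
        self.uid = uid
        self.brain_id = brain_id  # assumed name for the newly passed brain ID


class ZeroMuscle(BaseMuscle):
    """Example algorithm muscle that survives changes to the base constructor."""

    def __init__(self, *args, **kwargs):
        # Forward everything unchanged, so arguments added to the base
        # class later (such as the brain ID) do not break this subclass.
        super().__init__(*args, **kwargs)

    def propose_actions(self, sensors, actuators_available):
        # First list: values sent to the environment.
        env_actions = [0.0 for _ in actuators_available]
        # Second list: values sent to the brain (here simply a copy).
        brain_actions = list(env_actions)
        return env_actions, brain_actions


muscle = ZeroMuscle("muscle-1", brain_id="brain-1")
to_env, to_brain = muscle.propose_actions(sensors=[], actuators_available=["a", "b"])
```

Forwarding `*args`/`**kwargs` keeps the subclass independent of the exact base signature; the trade-off is that a misspelled keyword is only caught by the base class.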