When the Env ends the trial, the last reward is not received by the Agents
Reproduction steps:
Create or use an Environment that ends the trial itself and returns a reward on its final step.
Current behavior:
On the last step, the Agents do not receive the final reward.
Expected behavior:
When the Env ends the trial, the Agents should receive the last reward.
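
The difference between the current and expected behavior can be illustrated with a small, framework-independent sketch. This is not the project's actual API: `ToyEnvironment`, `ToyAgent`, `run_trial`, and the `deliver_final_reward` flag are hypothetical names invented for the illustration, and the loop only models the symptom described above (the reward emitted on the terminating step being dropped).

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Hypothetical agent that records every reward it is given."""
    rewards: list = field(default_factory=list)

    def on_reward(self, value):
        self.rewards.append(value)


class ToyEnvironment:
    """Hypothetical environment that ends the trial itself after `max_ticks`
    and returns a reward on every step, including the last one."""

    def __init__(self, max_ticks=3):
        self.max_ticks = max_ticks
        self.tick = 0

    def step(self):
        self.tick += 1
        reward = float(self.tick)
        done = self.tick >= self.max_ticks
        return reward, done


def run_trial(env, agent, deliver_final_reward):
    """Toy trial loop. With deliver_final_reward=False it mimics the reported
    behavior: the reward attached to the terminating step is discarded."""
    while True:
        reward, done = env.step()
        if done:
            if deliver_final_reward:
                agent.on_reward(reward)  # expected: final reward still delivered
            break
        agent.on_reward(reward)
    return agent.rewards


current = run_trial(ToyEnvironment(), ToyAgent(), deliver_final_reward=False)
expected = run_trial(ToyEnvironment(), ToyAgent(), deliver_final_reward=True)
print(current)   # [1.0, 2.0]
print(expected)  # [1.0, 2.0, 3.0]
```

In this sketch the agent misses exactly one reward, the one produced on the step where the environment ends the trial, which matches the report. A real reproduction would replace the toy classes with an actual Environment and Agent implementation.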