POC: Full Learned Sokoban experiment
This is a proof-of-concept MR (not intended to be merged) that replicates a full experiment with online learning of the Sokoban model.