Sparse reward tasks to learn behaviour priors

A set of sparse-reward, multi-task domains that require a common set of behaviors to solve. The tasks are set up as 'True/False' predicates, which provide the reward signal: the agent is rewarded only when a predicate holds. For instance, predicates such as going to a target or moving a box to a target can be used to encourage particular behaviors in agents. These tasks have been used in "Behavior Priors for Efficient Reinforcement Learning", "Exploiting Hierarchy for Learning and Transfer in KL-Regularized RL" and "Information asymmetry in KL-regularized RL".
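As a rough illustration of the predicate idea, the sketch below shows how a True/False predicate over the environment state might be turned into a sparse reward. It is not the code of these tasks: the state layout, the names `near`, `sparse_reward`, `go_to_target` and `move_box_to_target`, and the distance tolerance are all assumptions made for the example.

```python
from typing import Callable, Dict, Tuple

# Hypothetical state layout: entity name -> (x, y) position.
Position = Tuple[float, float]
State = Dict[str, Position]
Predicate = Callable[[State], bool]


def near(a: str, b: str, tolerance: float = 0.5) -> Predicate:
    """Build a predicate that is True when entities a and b are within tolerance."""
    def predicate(state: State) -> bool:
        (ax, ay), (bx, by) = state[a], state[b]
        return (ax - bx) ** 2 + (ay - by) ** 2 <= tolerance ** 2
    return predicate


def sparse_reward(predicate: Predicate, state: State) -> float:
    """Return 1.0 only when the predicate holds, 0.0 otherwise (no shaping)."""
    return 1.0 if predicate(state) else 0.0


# Two illustrative tasks that share the same underlying movement behaviors.
go_to_target = near("agent", "target")
move_box_to_target = near("box", "target")

# Example state (assumed values): the agent is at the target, the box is not.
state = {"agent": (0.0, 0.0), "box": (2.0, 2.0), "target": (0.2, 0.1)}
print(sparse_reward(go_to_target, state))
print(sparse_reward(move_box_to_target, state))
```

Because both tasks are built from the same predicate constructor and differ only in which entities they refer to, an agent solving them needs overlapping skills, which is the property that makes such domains useful for learning behavior priors.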


26 Jul 2021