Relative Variational Intrinsic Control

In the absence of external rewards, agents can still learn useful behaviors by identifying and mastering a set of diverse skills they can achieve within their environment. Existing skill-learning methods use mutual information objectives to incentivize each skill to be diverse and distinguishable from the rest. However, if care is not taken to constrain the ways in which the skills are diverse, trivially diverse skills can arise. With Relative Variational Intrinsic Control, we propose a novel skill-learning objective that incentivizes agents to learn skill sets that tile the space of affordances and therefore achieve diversity in a more useful way. We visually analyze the behavior of the skills in multiple environments and show that, in a hierarchical reinforcement learning setup, these initial-state-relative skills are more useful than skills discovered by existing methods.

Authors' notes