Collect & Infer - a fresh look at data-efficient Reinforcement Learning

This position paper proposes a fresh look at Reinforcement Learning (RL) from the perspective of data-efficiency. Data-efficient RL has gone through three major stages: pure on-line RL, where every data-point is considered only once; RL with a replay buffer, where additional learning is done on a portion of the experience; and finally transition-memory-based RL, where, conceptually, all transitions are stored and re-used in every update step. While inferring knowledge from all explicitly stored experience has led to a tremendous gain in data-efficiency, we argue that the question of how this data is collected has been vastly understudied. We propose to make this question explicit via a paradigm that we call 'Collect and Infer', which explicitly models RL as two separate but interconnected processes for collecting data and inferring knowledge from it.

This paper is based on a keynote talk of the first author at EWRL 2018.
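The three stages of data reuse mentioned in the abstract can be illustrated with a toy sketch. The code below is not from the paper; it is an assumed, minimal illustration in which a dummy "update" merely counts how often each transition participates in learning, making the difference between the regimes visible.

```python
import random
from collections import deque

def online_rl(stream):
    """Pure on-line RL: each transition is used for learning exactly once."""
    usage = []
    for _t in stream:
        usage.append(1)  # one update on the fresh transition, then discard it
    return usage

def replay_rl(stream, capacity=4, batch=2):
    """Replay-buffer RL: each update re-learns from a sampled portion
    of recently stored experience (here, a bounded FIFO buffer)."""
    buffer = deque(maxlen=capacity)
    usage = {i: 0 for i in range(len(stream))}
    for i, _t in enumerate(stream):
        buffer.append(i)
        for j in random.sample(list(buffer), min(batch, len(buffer))):
            usage[j] += 1
    return usage

def transition_memory_rl(stream):
    """Transition-memory RL: conceptually, every stored transition
    participates in every update step."""
    memory = []
    usage = {i: 0 for i in range(len(stream))}
    for i, _t in enumerate(stream):
        memory.append(i)
        for j in memory:
            usage[j] += 1
    return usage
```

Running the three variants on the same short stream of transitions shows usage counts of exactly one per transition for on-line RL, a bounded and stochastic number for replay, and counts that grow with memory size for the transition-memory regime.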

Authors' notes