Beyond Bayes-optimality: meta-learning what you know you don’t know

Meta-training agents with memory has been shown to culminate in Bayes-optimal agents, which casts Bayes-optimality as the solution to an optimization problem rather than an a priori modeling assumption. Bayes-optimal agents are risk-neutral, since they attune solely to the expected return, and ambiguity-neutral, since they act in new situations as if the uncertainty were known. This is in contrast to risk-sensitive agents, which additionally exploit the higher-order moments of the return, and ambiguity-sensitive agents, which act differently when they recognize that they don't know. How can we extend the meta-learning protocol to generate risk- and ambiguity-sensitive agents? The goal of this work is to fill this gap in the literature by showing that, under modified meta-training mechanisms, risk- and ambiguity-sensitivity also emerge as the result of an optimization problem rather than being an a priori modeling assumption. We empirically test our proposed meta-training mechanisms on agents exposed to foundational classes of decision-making experiments and demonstrate that the agents become sensitive to risk and ambiguity.