Paper
Publication
An adaptive and efficient multi-goal exploration
Marginalized operators for off-policy reinforcement learning
Hierarchical Bayesian Bandits
Double Control Variates for Gradient Estimation in Discrete Latent Variable Models
Confident Least Square Value Iteration with Local Access to a Simulator
A Functional Mirror Ascent View of Policy Gradient Methods with Function Approximation
On The Effect of Auxiliary Tasks on Representation Dynamics
Confident Off-Policy Evaluation and Selection through Self-Normalized Importance Weighting
A kernel-based approach to non-stationary reinforcement learning in metric spaces
Fixed-Confidence Guarantees for Bayesian Best-Arm Identification
A single algorithm for both restless and rested rotting bandits
Non-exchangeable feature allocation models with sublinear growth of the feature sizes
A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
Learning Nash Equilibrium for General-Sum Markov Games from Batch Data
Toward Minimax Off-policy Value Estimation