Efron-Stein PAC-Bayesian Inequalities

We prove semi-empirical concentration inequalities for random variables which are given as possibly nonlinear functions of independent random variables. These inequalities characterize the concentration of the random variable in terms of the data/distribution-dependent Efron-Stein (ES) estimate of its variance and they do not require any additional assumptions on the moments. In particular, this allows us to state semi-empirical Bernstein inequalities for general functions of unbounded random variables, which gives user-friendly concentration bounds for cases where related methods (entropy method / bounded differences) might be more challenging to apply. We extend these results to Efron-Stein PAC-Bayesian inequalities which hold for arbitrary probability kernels that define a random, data-dependent choice of the function of interest. Finally, we demonstrate a number of applications, including PAC-Bayesian generalization bounds for unbounded loss functions, empirical Bernstein-type generalization bounds, new truncation-free bounds for off-policy evaluation with Weighted Importance Sampling (WIS), and off-policy PAC-Bayesian learning with WIS.

Authors' notes