Borja Ibarz,
Vitaly Kurin,
George Papamakarios,
Kyriacos Nikiforou,
Mehdi Bennani,
Róbert Csordás,
Andrew Dudzik,
Matko Bošnjak,
Alex Vitvitskyi,
Yulia Rubanova,
Andreea-Ioana Deac,
Beatrice Bevilacqua,
Yaroslav Ganin,
Charles Blundell,
Petar Veličković
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models
Maribeth Rauh,
John Mellor,
Jonathan Uesato,
Po-Sen Huang,
Johannes Welbl,
Laura Weidinger,
Sumanth Dathathri,
Mia Glaese,
Geoffrey Irving,
Iason Gabriel,
William Isaac,
Lisa Anne Hendricks
Fine-tuning language models to find agreement among humans with diverse preferences
Michiel Bakker,
Martin Chadwick,
Hannah Sheahan,
MH Tessler,
Lucy Campbell-Gillingham,
Jan Balaguer,
Nat McAleese,
Mia Glaese,
John Aslanides,
Matt Botvinick,
Christopher Summerfield
Controlling Commercial Cooling Systems Using Reinforcement Learning
Jerry Luo,
Cosmin Paduraru,
Octavian Voicu,
Yuri Chervonyi,
Jerry Li,
Praneet Dutta,
Daniel J. Mankowitz,
Jared Quincy Davis,
Deeni Fatiha,
Molly Carlin,
Sims Witherspoon,
Crystal Qian *,
Ningjia Wu *,
Xingwei Yang *,
Chu-Ming Chang *,
Ted Li *,
Rob Rose *,
Mingyan Fan *,
Hootan Nakhost *,
Tinglin Liu *,
Neil Satra *,
Juliet Rothenberg *,
Satish Tallapaka *,
David Parish *,
Peter Dolan *,
Chenyu Zhao *,
Scott Munns *
A Generalist Agent
Scott Reed,
Konrad Zolna,
Emilio Parisotto,
Sergio Gomez,
Alexander Novikov,
Gabe Barth-Maron,
Yury Sulsky,
Mai Giménez,
Jackie Kay,
Jost Tobias Springenberg,
Tom Eccles,
Jake Bruce,
Ali Razavi,
Ashley Edwards,
Nicolas Heess,
Yutian Chen,
Raia Hadsell,
Oriol Vinyals,
Mahyar Bordbar,
Nando de Freitas
Diagnosing failures of fairness transfer across distribution shift in real-world medical settings
Jessica Schrouff *,
Natalie Harris *,
Oluwasanmi Koyejo *,
Ibrahim Alabdulmohsin *,
Eva Schnider *,
Krista Opsahl-Ong *,
Alex Brown *,
Subhrajit Roy *,
Diana Mincu *,
Christina Chen *,
Awa Dieng *,
Yuan Liu *,
Vivek Natarajan *,
Alan Karthikesalingam *,
Katherine Heller *,
Silvia Chiappa,
Alexander D'Amour *