Offline Reinforcement Learning Workshop

Neural Information Processing Systems (NeurIPS)

December 12, 2020

@OfflineRL · #OFFLINERL2020


Each paper's PDF can be accessed by clicking on its title.

Oral Presentations
Title (PDF) | Authors | Video
Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets | Seunghyun Lee, Younggyo Seo, Kimin Lee, Pieter Abbeel, Jinwoo Shin | Video
PLAS: Latent Action Space for Offline Reinforcement Learning | Wenxuan Zhou, Sujay Bajracharya, David Held | Video
COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning | Avi Singh, Albert Yu, Jonathan H. Yang, Jesse Zhang, Aviral Kumar, Sergey Levine | Video
DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs | Aayam Shrestha, Stefan Lee, Prasad Tadepalli, Alan Fern | Video
Addressing Extrapolation Error in Deep Offline Reinforcement Learning | Caglar Gulcehre, Sergio Gomez, Jakub Sygnowski, Ziyu Wang, Tom Le Paine, Konrad Zolna, Razvan Pascanu, Yutian Chen, Matt Hoffman | Video
Distilled Thompson Sampling: Practical and Efficient Thompson Sampling via Imitation Learning | Hongseok Namkoong*, Samuel Daulton*, Eytan Bakshy | Video
What are the Statistical Limits of Offline RL with Linear Function Approximation? | Ruosong Wang, Dean Foster, Sham Kakade | Video
Batch-Constrained Distributional Reinforcement Learning for Session-based Recommendation | Diksha Garg, Priyanka Gupta, Pankaj Malhotra, Lovekesh Vig, Gautam Shroff | Video

Poster Presentations
Title (PDF) | Authors | Video
Large-scale Open Dataset, Pipeline, and Benchmark for Off-Policy Evaluation | Yuta Saito, Shunsuke Aihara, Megumi Matsutani, Yusuke Narita | Video
On the Convergence Rate of Density Ratio Learning Based Off-Policy Policy Gradient Methods | Jiawei Huang*, Nan Jiang | Video
The Importance of Pessimism in Fixed-Dataset Policy Optimization | Jacob Buckman, Carles Gelada, Marc G. Bellemare | Video
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL | Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Gu
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning | Shangtong Zhang, Bo Liu, Shimon Whiteson | Video
Batch Reinforcement Learning Through Continuation Method | Yijie Guo, Shengyu Feng, Nicolas Le Roux, Ed Chi, Honglak Lee, Minmin Chen | Video
M³Rec: An Offline Meta-level Model-based Reinforcement Learning Approach for Cold-Start Recommendation | Yanan Wang, Yong Ge, Li Li, Rui Chen, Tong Xu | Video
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization | Tatsuya Matsushima, Hiroki Furuta, Yutaka Matsuo, Ofir Nachum, Shixiang Gu | Video
On Sampling Error in Batch Action-Value Prediction Algorithms | Brahma S. Pavse, Josiah P. Hanna, Ishan Durugkar, Peter Stone | Video
Offline Meta-Reinforcement Learning with Advantage Weighting | Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn | Video
Offline Learning from Demonstrations and Unlabeled Experience | Konrad Zolna, Alexander Novikov, Ksenia Konyushova, Caglar Gulcehre, Ziyu Wang, Yusuf Aytar, Misha Denil, Nando de Freitas, Scott Reed | Video
Parameter-based Value Functions | Francesco Faccio, Louis Kirsch, Juergen Schmidhuber | Video
Reset-Free Lifelong Learning with Skill-Space Planning | Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch | Video
MARS-Gym: Offline Reinforcement Learning for Recommender Systems in Marketplaces | Marlesson Santana*, Luckeciano Melo*, Fernando Camargo*, Bruno Brandão, Anderson Soares, Renan Oliveira, Sandor Caetano | Video
Q-Value Weighted Regression: Reinforcement Learning with Limited Data | Piotr Kozakowski, Łukasz Kaiser, Henryk Michalewski, Afroz Mohiuddin, Katarzyna Kańska | Video
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning | Anurag Ajay, Aviral Kumar, Pulkit Agrawal, Sergey Levine, Ofir Nachum | Video
Model-Based Visual Planning with Self-Supervised Functional Distances | Stephen Tian, Suraj Nair, Frederik Ebert, Sudeep Dasari, Benjamin Eysenbach, Chelsea Finn, Sergey Levine | Video
Optimal Mixture Weights for Off-Policy Evaluation with Multiple Behavior Policies | Jinlin Lai, Lixin Zou, Jiaxing Song | Video
Uncertainty Weighted Offline Reinforcement Learning | Yue Wu, Shuangfei Zhai, Nitish Srivastava, Joshua M. Susskind, Jian Zhang, Ruslan Salakhutdinov, Hanlin Goh | Video
Offline Policy Optimization with Variance Regularization | Riashat Islam, Samarth Sinha, Homanga Bharadhwaj, Samin Yeasar Arnob, Zhuoran Yang, Animesh Garg, Zhaoran Wang, Lihong Li, Doina Precup
Bridging the Imitation Gap by Adaptive Insubordination | Luca Weihs*, Unnat Jain*, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing | Video
Variance-Reduced Off-Policy Memory-Efficient Policy Search | Daoming Lyu, Qi Qi, Mohammad Ghavamzadeh, Hengshuai Yao, Tianbao Yang, Bo Liu | Video
Semi-supervised reward learning for offline reinforcement learning | Ksenia Konyushova, Konrad Zolna, Yusuf Aytar, Alexander Novikov, Scott Reed, Serkan Cabi, Nando de Freitas | Video
Sample-Efficient Reinforcement Learning via Counterfactual-Based Data Augmentation | Chaochao Lu, Biwei Huang, Ke Wang, José Miguel Hernández-Lobato, Kun Zhang, Bernhard Schölkopf | Video
Risk-Averse Offline Reinforcement Learning | Núria Armengol-Urpí, Sebastian Curi, Andreas Krause | Video
POPO: Pessimistic Offline Policy Optimization | Qiang He, Xinwen Hou, Yu Liu | Video
Offline Policy Evaluation with New Arms | Ben London, Thorsten Joachims
Batch Reinforcement Learning in the Real World: A Survey | Yuwei Fu, Di Wu, Benoit Boulet | Video
Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose? | Balázs Kégl, Gabriel Hurtado, Albert Thomas | Video
Offline Hyperparameter Selection for Offline Reinforcement Learning | Tom Le Paine*, Cosmin Paduraru*, Andrea Michi, Caglar Gulcehre, Konrad Zolna, Alexander Novikov, Ziyu Wang, Nando de Freitas | Video
Double Explore-then-Commit: Asymptotic Optimality and Beyond | Tianyuan Jin, Pan Xu, Xiaokui Xiao, Quanquan Gu | Video
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning | Ming Yin, Yu Bai, Yu-Xiang Wang | Video
Gradient Analysis and Approximations for Off-policy Optimization | Ramki Gummadi, Dale Schuurmans | Video
Offline Reinforcement Learning Hands-On | Jakub Kmec, Louis Monier, Alexandre Laterre, Thomas Pierrot, Valentin Courgeau, Olivier Sigaud, Karim Beguir | Video
Batch Exploration with Examples for Scalable Robotic Reinforcement Learning | Annie S Chen*, HyunJi Nam*, Suraj Nair*, Chelsea Finn | Video
Recurrent Open-loop Control in Offline Reinforcement Learning | Alex Lewandowski, Vincent Zhang, Dale Schuurmans | Video
Abstraction-Guided Policy Recovery from Expert Demonstrations | Canmanie T. Ponnambalam, Frans A. Oliehoek, Matthijs T. J. Spaan | Video
Shaping Control Variates for Off-Policy Evaluation | Sonali Parbhoo, Omer Gottesman, Finale Doshi-Velez | Video
Counterfactual Policy Evaluation and the Conditional Monte Carlo Method | Michel Ma, Pierre-Luc Bacon | Video
Semi-Supervised Learning for Doubly Robust Offline Policy Evaluation | Aaron Sonabend, Nilanjana Laha, Rajarshi Mukherjee, Tianxi Cai | Video
You Only Evaluate Once -- a Simple Baseline Algorithm for Offline RL | Wonjoon Goo, Scott Niekum | Video
Offline Reinforcement Learning From Images with Latent Space Models | Rafael Rafailov*, Tianhe Yu*, Aravind Rajeswaran, Chelsea Finn | Video
Fine-Tuning Offline Reinforcement Learning with Model-Based Policy Optimization | Adam Villaflor, John Dolan, Jeff Schneider | Video
Towards Exploiting Geometry and Time for Fast Off-Distribution Adaptation in Multi-Task Robot Learning | K.R. Zentner, Ryan Julian, Ujjwal Puri, Yulun Zhang, Gaurav Sukhatme | Video
Conservative Objective Models: A Simple Approach to Effective Model-Based Optimization | Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine | Video