DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning
|
Daniel Seita, Abhinav Gopal, Zhao Mandi, John Canny
|
TiKick: Towards Playing Multi-agent Football Full Games from Single-agent Demonstrations
|
Shiyu Huang, Wenze Chen, Longfei Zhang, Shizhen Xu, Ziyang Li, Fengming Zhu, Deheng Ye, Ting Chen, Jun Zhu
|
Video
|
d3rlpy: An Offline Deep Reinforcement Learning Library
|
Takuma Seno, Michita Imai
|
Video
|
Latent Geodesics of Model Dynamics for Offline Reinforcement Learning
|
Guy Tennenholtz, Nir Baram, Shie Mannor
|
Video
|
Domain Knowledge Guided Offline Q Learning
|
Xiaoxuan Zhang, Sijia Zhang, Yen-Yun Yu
|
Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning
|
Kajetan Schweighofer, Markus Hofmarcher, Marius-Constantin Dinu, Philipp Renz, Angela Bitto-Nemling, Vihang Patil, Sepp Hochreiter
|
Video
|
Unsupervised Learning of Temporal Abstractions using Slot-based Transformers
|
Anand Gopalakrishnan, Kazuki Irie, Juergen Schmidhuber, Sjoerd van Steenkiste
|
Counter-Strike Deathmatch with Large-Scale Behavioural Cloning
|
Tim Pearce, Jun Zhu
|
Video
|
Modern Hopfield Networks for Sample-Efficient Return Decomposition from Demonstrations
|
Michael Widrich, Markus Hofmarcher, Vihang Patil, Angela Bitto-Nemling, Sepp Hochreiter
|
Video
|
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
|
Masatoshi Uehara, Wen Sun
|
Video
|
Importance of Representation Learning for Off-Policy Fitted Q-Evaluation
|
Xian Wu, Nevena Lazic, Dong Yin, Cosmin Paduraru
|
Video
|
Offline Contextual Bandits for Wireless Network Optimization
|
Miguel Suau, Alexandros Agapitos, David Lynch, Derek Farrell, Mingqi Zhou, Aleksandar Milenovic
|
Video
|
Robust On-Policy Data Collection for Data-Efficient Policy Evaluation
|
Rujie Zhong, Josiah P. Hanna, Lukas Schäfer, Stefano V. Albrecht
|
Video
|
Doubly Pessimistic Algorithms for Strictly Safe Off-Policy Optimization
|
Sanae Amani, Lin F. Yang
|
Video
|
Offline RL With Resource Constrained Online Deployment
|
Jayanth Reddy Regatti, Aniket Anand Deshmukh, Frank Cheng, Young Hun Jung, Abhishek Gupta, Urun Dogan
|
Video
|
Personalization for Web-based Services using Offline Reinforcement Learning
|
Pavlos A. Apostolopoulos, Zehui Wang, Hanson Wang, Chad Zhou, Kittipat Virochsiri, Norm Zhou, Igor L. Markov
|
Offline Reinforcement Learning with Implicit Q-Learning
|
Ilya Kostrikov, Ashvin Nair, Sergey Levine
|
Pessimistic Model Selection for Offline Deep Reinforcement Learning
|
Chao-Han Huck Yang*, Zhengling Qi*, Yifan Cui, Pin-Yu Chen
|
Video
|
BATS: Best Action Trajectory Stitching
|
Ian Char*, Viraj Mehta*, Adam Villaflor, John M. Dolan, Jeff Schneider
|
Single-Shot Pruning for Offline Reinforcement Learning
|
Samin Yeasar Arnob, Riyasat Ohib, Sergey Plis, Doina Precup
|
Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization
|
Thanh Nguyen-Tang, Sunil Gupta, A. Tuan Nguyen, Svetha Venkatesh
|
Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions
|
Bogdan Mazoure, Ilya Kostrikov, Ofir Nachum, Jonathan Tompson
|
Video
|
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning
|
Yi Zhao, Rinu Boney, Alexander Ilin, Juho Kannala, Joni Pajarinen
|
Video
|
Quantile Filtered Imitation Learning
|
David Brandfonbrener, William Whitney, Rajesh Ranganath, Joan Bruna
|
Video
|
Benchmarking Sample Selection Strategies for Batch Reinforcement Learning
|
Yuwei Fu, Di Wu, Benoit Boulet
|
Video
|
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning
|
Utkarsh A. Mishra*, Soumya R. Samineni*, Prakhar Goel, Chandravaran Kunjeti, Himanshu Lodha, Aman Singh, Aditya Sagi, Shalabh Bhatnagar, Shishir Kolathaya
|
Video
|
MBAIL: Multi-Batch Best Action Imitation Learning utilizing Sample Transfer and Policy Distillation
|
Di Wu, Tianyu Li, David Meger, Michael Jenkin, Xue Liu, Gregory Dudek
|
Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters
|
Vladislav Kurenkov, Sergey Kolesnikov
|
Video
|
Offline Reinforcement Learning with Munchausen Regularization
|
Hsin-Yu Liu, Balaji Bharathan, Rajesh Gupta, Dezhi Hong
|
Video
|
Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning
|
Samin Yeasar Arnob, Riashat Islam, Doina Precup
|
Discrete Uncertainty Quantification Approach for Offline RL
|
Javier Corrochano, Javier García, Rubén Majadas, Cristina Ibanez-Llano, Sergio Pérez, Fernando Fernández
|
Video
|
Pretraining for Language-Conditioned Imitation with Transformers
|
Aaron (Louie) Putterman, Kevin Lu, Igor Mordatch, Pieter Abbeel
|
Stateful Offline Contextual Policy Evaluation and Learning
|
Nathan Kallus, Angela Zhou
|
Learning Value Functions from Undirected State-only Experience
|
Matthew Chang*, Arjun Gupta*, Saurabh Gupta
|
Video
|
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations
|
Haoran Xu, Xianyuan Zhan, Honglei Yin, Huiling Qin
|
Video
|
Model-Based Offline Planning with Trajectory Pruning
|
Xianyuan Zhan, Xiangyu Zhu, Haoran Xu
|
Video
|
TRAIL: Near-Optimal Imitation Learning with Suboptimal Data
|
Sherry Yang, Sergey Levine, Ofir Nachum
|
Offline Meta-Reinforcement Learning for Industrial Insertion
|
Tony Z. Zhao*, Jianlan Luo*, Oleg Sushkov, Rugile Pevceviciute, Nicolas Heess, Jon Scholz, Stefan Schaal, Sergey Levine
|
Video
|
Sim-to-Real Interactive Recommendation via Off-Dynamics Reinforcement Learning
|
Junda Wu, Zhihui Xie, Tong Yu, Qizhi Li, Shuai Li
|
Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters
|
Seyed Kamyar Seyed Ghasemipour, Shixiang (Shane) Gu, Ofir Nachum
|
Video
|
Example-Based Offline Reinforcement Learning without Rewards
|
Kyle Hatch*, Tianhe Yu*, Rafael Rafailov, Chelsea Finn
|
The Reflective Explorer: Online Meta-Exploration from Offline Data in Realistic Robotic Tasks
|
Rafael Rafailov, Varun Kumar, Tianhe Yu, Avi Singh, Mariano Phielipp, Chelsea Finn
|