Last week I had the pleasure of participating in the International Conference on Learning Representations (ICLR), an event dedicated to research on all aspects of representation learning, commonly known as deep learning. The conference went virtual due to the coronavirus pandemic, and thanks to the huge effort of its organizers, the event attracted an even bigger audience than last year. Their goal was for the conference to be inclusive and interactive, and from my point of view as an attendee, it definitely was!
Inspired by the presentations from over 1300 speakers, I decided to create a series of blog posts summarizing the best papers in four main areas. You can catch up with the first post about the best deep learning papers here, and today it’s time for the 15 best reinforcement learning papers from ICLR.
The Best Reinforcement Learning Papers
1. Never Give Up: Learning Directed Exploration Strategies
We propose a reinforcement learning agent to solve hard exploration games by learning a range of directed exploratory policies.
We identify and formalize the memorization problem in meta-learning and solve it with a novel meta-regularization method, which greatly expands the domain in which meta-learning is applicable and effective.
9. Making Sense of Reinforcement Learning and Probabilistic Inference
Popular algorithms that cast “RL as Inference” ignore the role of uncertainty and exploration. We highlight the importance of these issues and present a coherent framework for RL and inference that handles them gracefully.
12. A Generalized Training Approach for Multiagent Learning
This paper studies and extends Policy-Space Response Oracles (PSRO), a population-based learning method that uses game theory principles. The authors extend the method so that it’s applicable to multi-player games, while providing convergence guarantees in multiple settings.
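To make the idea of a population-based, game-theoretic loop more concrete, here is a minimal sketch of a PSRO-style iteration on rock-paper-scissors. It is only an illustration under strong assumptions, not the paper's method: the game, the uniform meta-solver, the exact best-response oracle, and all function names (`uniform_meta_solver`, `best_response`, `psro_sketch`) are hypothetical choices made for this example. With a uniform meta-solver the loop behaves like fictitious play; the paper's own meta-solver and its multi-player extension are not reproduced here.

```python
import numpy as np

# Row player's payoff in rock-paper-scissors (zero-sum, symmetric).
# Actions: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = np.array([[0.0, -1.0, 1.0],
                   [1.0, 0.0, -1.0],
                   [-1.0, 1.0, 0.0]])


def uniform_meta_solver(population):
    """Assumed meta-solver: put equal weight on every population member."""
    return np.full(len(population), 1.0 / len(population))


def best_response(opponent_mix):
    """Exact oracle: the pure action with the highest expected payoff."""
    return int(np.argmax(PAYOFF @ opponent_mix))


def psro_sketch(iterations=10, initial_action=0):
    """Grow a population by repeatedly best-responding to the meta-strategy."""
    population = [initial_action]
    for _ in range(iterations):
        meta_strategy = uniform_meta_solver(population)
        # Turn the distribution over population members into a
        # distribution over the game's three actions.
        opponent_mix = np.zeros(PAYOFF.shape[0])
        for action, weight in zip(population, meta_strategy):
            opponent_mix[action] += weight
        population.append(best_response(opponent_mix))
    return population


if __name__ == "__main__":
    print(psro_sketch())
```

Starting from rock only, the population picks up paper first and scissors a few iterations later. In a real setting the oracle would be an RL training run rather than an `argmax` over a payoff matrix, and the choice of meta-solver is exactly where variants of the method differ.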
13. Implementation Matters in Deep RL: A Case Study on PPO and TRPO
Sometimes an implementation detail may play a role in your research. Here, two policy search algorithms were evaluated: Proximal Policy Optimization (PPO) and Trust Region Policy Optimization (TRPO). “Code-level optimizations” should be negligible for the learning dynamics. Surprisingly, it turns out that these optimizations have a major impact on agent behavior.
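To show how small such a detail can look in code, below is a rough NumPy sketch of PPO's clipped surrogate loss next to one example of a code-level optimization, value-function clipping. The function names, the default `clip_eps=0.2`, and the exact form of the value clipping (the `max` variant found in some open-source PPO implementations) are assumptions made for this example; they are not taken from the paper's code.

```python
import numpy as np


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped surrogate policy loss (a quantity to minimize)."""
    ratio = np.exp(new_logp - old_logp)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Pessimistic bound: take the smaller objective, then negate it.
    return -np.mean(np.minimum(unclipped, clipped))


def clipped_value_loss(values, old_values, returns, clip_eps=0.2):
    """Value loss with value clipping (assumed 'max' variant)."""
    # Keep the new value prediction within clip_eps of the old one.
    values_clipped = old_values + np.clip(values - old_values,
                                          -clip_eps, clip_eps)
    loss_unclipped = (values - returns) ** 2
    loss_clipped = (values_clipped - returns) ** 2
    return np.mean(np.maximum(loss_unclipped, loss_clipped))


if __name__ == "__main__":
    old_logp = np.log(np.array([0.5, 0.5]))
    new_logp = np.log(np.array([0.9, 0.1]))
    advantages = np.array([1.0, -1.0])
    print(ppo_clip_loss(new_logp, old_logp, advantages))
    print(clipped_value_loss(np.array([2.0]), np.array([1.0]),
                             np.array([1.0])))
```

The first function is the part usually presented as "the algorithm". The second is the kind of addition that tends to live only in the code, which is why comparing two algorithms fairly requires knowing which of these extras each implementation switches on.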
The depth and breadth of the ICLR publications are quite inspiring. Here, I presented just the tip of the iceberg, focusing on the “reinforcement learning” topic. However, as you can read in this analysis, there were four main areas discussed at the conference.
To create a more complete overview of the top papers at ICLR, we are building a series of posts, each focused on one of these areas. You may want to check them out as well.
Feel free to share with us other interesting papers on reinforcement learning and we will gladly add them to the list.
The Best Reinforcement Learning Papers from the ICLR 2020 Conference