The Best Tools for Reinforcement Learning in Python You Actually Want to Try

Posted November 17, 2020

Nowadays, Deep Reinforcement Learning (RL) is one of the hottest topics in the Data Science community. The fast development of RL has resulted in a growing demand for RL tools that are easy to understand and convenient to use.

In recent years, plenty of RL libraries have been developed. These libraries were designed to have all the necessary tools to both implement and test Reinforcement Learning models.

Still, they differ quite a lot. That’s why it is important to pick a library that will be quick, reliable, and relevant for your RL task.

In this article we will cover:

  • Criteria for choosing a Deep Reinforcement Learning library,
  • RL libraries: KerasRL, Pyqlearning, Tensorforce, RL_Coach, TFAgents, Stable Baselines, MushroomRL.

Python libraries for Reinforcement Learning

There are a lot of RL libraries, so choosing the right one for your case might be a complicated task. We need to form criteria to evaluate each library.

Criteria

Each RL library in this article will be analyzed based on the following criteria:

  1. Number of state-of-the-art (SOTA) RL algorithms implemented – the most important one in my opinion
  2. Official documentation, availability of simple tutorials and examples
  3. Readable code that is easy to customize 
  4. Number of supported environments – a crucial decision factor for Reinforcement Learning library
  5. Logging and tracking tools support – for example, Neptune or TensorBoard
  6. Vectorized environment (VE) feature – a method for multiprocess training. Using parallel environments, your agent will experience way more situations than with one environment (see the short sketch after this list)
  7. Regular updates – RL develops quite rapidly and you want to use up-to-date technologies
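
To make the vectorized environment idea concrete, here is a small sketch using OpenAI Gym's built-in vectorized API (the environment name and the number of copies are arbitrary choices):

import gym

# Run 4 copies of CartPole in parallel; observations and rewards come back batched.
envs = gym.vector.make('CartPole-v1', num_envs=4)
observations = envs.reset()                  # batched: shape (4, observation_dim)
for _ in range(100):
    actions = envs.action_space.sample()     # one random action per environment copy
    observations, rewards, dones, infos = envs.step(actions)
envs.close()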

We will talk about the following libraries:

  1. KerasRL
  2. Pyqlearning
  3. Tensorforce
  4. RL_Coach
  5. TFAgents
  6. Stable Baselines
  7. MushroomRL

KerasRL

KerasRL is a Deep Reinforcement Learning Python library. It implements some state-of-the-art RL algorithms, and integrates seamlessly with the deep learning library Keras.

Moreover, KerasRL works with OpenAI Gym out of the box. This means you can evaluate and play around with different algorithms quite easily.

To install KerasRL simply use a pip command:

pip install keras-rl
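
Once installed, you can have a DQN agent learning CartPole in a few lines. Here is a minimal sketch adapted from the keras-rl README example (the network architecture and hyperparameters are illustrative choices, not recommendations):

import gym
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.memory import SequentialMemory
from rl.policy import BoltzmannQPolicy

env = gym.make('CartPole-v0')
nb_actions = env.action_space.n

# A small feed-forward Q-network built with plain Keras.
model = Sequential()
model.add(Flatten(input_shape=(1,) + env.observation_space.shape))
model.add(Dense(16, activation='relu'))
model.add(Dense(nb_actions, activation='linear'))

dqn = DQNAgent(model=model, nb_actions=nb_actions,
               memory=SequentialMemory(limit=50000, window_length=1),
               nb_steps_warmup=10, target_model_update=1e-2,
               policy=BoltzmannQPolicy())
dqn.compile(Adam(lr=1e-3), metrics=['mae'])

dqn.fit(env, nb_steps=50000, visualize=False, verbose=2)  # train
dqn.test(env, nb_episodes=5, visualize=False)             # evaluate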

Let’s see if KerasRL fits the criteria:

  1. Number of SOTA RL algorithms implemented

As of today, KerasRL has the following algorithms implemented:

  • Deep Q-Learning (DQN) and its improvements (Double and Dueling)
  • Deep Deterministic Policy Gradient (DDPG)
  • Continuous DQN (CDQN or NAF)
  • Cross-Entropy Method (CEM)
  • Deep SARSA

As you may have noticed, KerasRL misses two important agents: Actor-Critic Methods and Proximal Policy Optimization (PPO).

  2. Official documentation, availability of tutorials and examples

The code is easy to read and full of comments, which is quite useful. Still, the documentation seems incomplete: it is missing parameter explanations and tutorials. Practical examples also leave much to be desired.

  3. Readable code that is easy to customize

Very easy. All you need to do is create a new agent following the example and then add it to rl.agents.

  4. Number of supported environments

KerasRL was made to work only with OpenAI Gym. Therefore, you need to modify the agent if you want to use any other environment.

  5. Logging and tracking tools support

Logging and tracking tools support is not implemented. Nevertheless, you can use Neptune to track your experiments.
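
As a rough illustration, you can call a tracking tool from your own training loop. Below is a hypothetical sketch using the neptune-client package (the project name, experiment name, and the episode_rewards list are all placeholders, not part of KerasRL):

import neptune

# Placeholders: point this at your own Neptune project.
neptune.init(project_qualified_name='my_workspace/rl-experiments')
neptune.create_experiment(name='keras-rl-dqn-cartpole')

episode_rewards = [10.0, 42.0, 128.0]  # assume your own loop collected these
for reward in episode_rewards:
    neptune.log_metric('episode_reward', reward)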

  6. Vectorized environment feature

KerasRL includes a vectorized environment feature.

  7. Regular updates

The library no longer seems to be maintained, as the last updates were made more than a year ago.

To sum up, KerasRL has a good set of implementations. Unfortunately, it misses valuable points such as visualization tools, new architectures and updates. You should probably use another library. 

Pyqlearning

Pyqlearning is a Python library to implement RL. It focuses on Q-Learning and multi-agent Deep Q-Network.

Pyqlearning provides components for designers rather than end-user, state-of-the-art black boxes, which makes it a tough library to use. You can use it to design information search algorithms, for example, game AI or web crawlers.

To install Pyqlearning simply use a pip command:

pip install pyqlearning

Let’s see if Pyqlearning fits the criteria:

  1. Number of SOTA RL algorithms implemented

As of today, Pyqlearning has the following algorithms implemented:

  • Deep Q-Learning (DQN) and its variants (Epsilon-Greedy and Boltzmann)

As you may have noticed, Pyqlearning has only one important agent. The library leaves much to be desired.

  2. Official documentation, availability of tutorials and examples

Pyqlearning has a couple of examples for various tasks, and two tutorials featuring maze solving and the pursuit-evasion game, both solved with a Deep Q-Network. You may find them in the official documentation. The documentation seems incomplete, as it focuses on the math rather than the library's description and usage.

  3. Readable code that is easy to customize

Pyqlearning is an open-source library, and its source code can be found on GitHub. The code lacks comments, so customizing it may be a complicated task. Still, the tutorials might help.

  4. Number of supported environments

Since the library is environment-agnostic, it's relatively easy to add it to any environment.

  5. Logging and tracking tools support

The author uses a simple logging package in the tutorials. Pyqlearning does not support other logging and tracking tools, for example, TensorBoard.

  6. Vectorized environment feature

Pyqlearning does not support the vectorized environment feature.

  7. Regular updates

The library is maintained, and the last update was made two months ago. Still, the development process seems slow-going.

To sum up, Pyqlearning leaves much to be desired. It is not a library that you will use commonly. Thus, you should probably use something else.

Tensorforce


Tensorforce is an open-source Deep RL library built on Google's TensorFlow framework. It's straightforward to use and has the potential to be one of the best Reinforcement Learning libraries.

Tensorforce has key design choices that differentiate it from other RL libraries:

  • Modular component-based design: Feature implementations, above all, tend to be as generally applicable and configurable as possible.
  • Separation of RL algorithm and application: Algorithms are agnostic to the type and structure of inputs (states/observations) and outputs (actions/decisions), as well as the interaction with the application environment.

To install Tensorforce simply use a pip command:

pip install tensorforce
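
After installation, the act-and-observe interface looks like this. Here is a minimal sketch based on the Tensorforce quickstart (the PPO hyperparameters are illustrative):

from tensorforce import Agent, Environment

environment = Environment.create(
    environment='gym', level='CartPole-v1', max_episode_timesteps=500)
agent = Agent.create(
    agent='ppo', environment=environment, batch_size=10, learning_rate=1e-3)

for _ in range(300):  # training episodes
    states = environment.reset()
    terminal = False
    while not terminal:
        actions = agent.act(states=states)
        states, terminal, reward = environment.execute(actions=actions)
        agent.observe(terminal=terminal, reward=reward)

agent.close()
environment.close()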

Let’s see if Tensorforce fits the criteria:

  1. Number of SOTA RL algorithms implemented

As of today, Tensorforce has the following set of algorithms implemented:

  • Deep Q-Learning (DQN) and its improvements (Double and Dueling)
  • Vanilla Policy Gradient (PG)
  • Deep Deterministic Policy Gradient (DDPG)
  • Continuous DQN (CDQN or NAF)
  • Actor Critic (A2C and A3C)
  • Trust Region Policy Optimization (TRPO)
  • Proximal Policy Optimization (PPO)

As you may have noticed, Tensorforce misses the Soft Actor-Critic (SAC) implementation. Besides that, the set is close to complete.

  2. Official documentation, availability of tutorials and examples

It is quite easy to start using Tensorforce thanks to the variety of simple examples and tutorials. The official documentation seems complete and convenient to navigate through.

  3. Readable code that is easy to customize

Tensorforce benefits from its modular design. Each part of the architecture, for example, networks, models, and runners, is distinct. Thus, you can easily modify them. However, the code lacks comments, and that could be a problem.

  4. Number of supported environments

Tensorforce works with multiple environments, for example, OpenAI Gym, OpenAI Retro and DeepMind Lab. It also has documentation to help you plug into other environments.

  5. Logging and tracking tools support

The library supports TensorBoard and other logging/tracking tools.

  6. Vectorized environment feature

Tensorforce supports the vectorized environment feature.

  7. Regular updates

Tensorforce is regularly updated. The last update was just a few weeks ago.

To sum up, Tensorforce is a powerful RL tool. It is up-to-date and has all necessary documentation for you to start working with it.

RL_Coach


Reinforcement Learning Coach (Coach) by Intel AI Lab is a Python RL framework containing many state-of-the-art algorithms. 

It exposes a set of easy-to-use APIs for experimenting with new RL algorithms. The components of the library (for example, algorithms, environments, and neural network architectures) are modular. Thus, extending and reusing existing components is fairly painless.

To install Coach simply use a pip command:

pip install rl_coach

Still, you should check the official installation tutorial as a few prerequisites are required.
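
Once everything is installed, Coach is typically driven from the command line through its bundled presets. For example, the following command (the preset name comes from the Coach repository) trains a DQN agent on CartPole and renders it:

coach -r -p CartPole_DQN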

Let’s see if Coach fits the criteria:

  1. Number of SOTA RL algorithms implemented

As of today, RL_Coach implements an extensive set of algorithms spanning value-based, policy-based, and imitation learning methods, including DQN and its many variants, A3C, DDPG, PPO and Clipped PPO, SAC, and TD3; please refer to the official documentation for the full list.

As you may have noticed, RL_Coach has a variety of algorithms. It's the most complete library of all covered in this article.

  2. Official documentation, availability of tutorials and examples

The documentation is complete. Also, RL_Coach has a set of valuable tutorials. It will be easy for newcomers to start working with it. 

  3. Readable code that is easy to customize

RL_Coach is an open-source library. It benefits from its modular design, but the code lacks comments, so customizing it may be a complicated task.

  4. Number of supported environments

Coach supports the following environments:

  • OpenAI Gym
  • ViZDoom
  • Roboschool
  • GymExtensions
  • PyBullet
  • CARLA
  • And others

For more information, including installation and usage instructions, please refer to the official documentation.

  5. Logging and tracking tools support

Coach supports various logging and tracking tools. It even has its own visualization dashboard.

  6. Vectorized environment feature

RL_Coach supports the vectorized environment feature. For usage instructions, please refer to the documentation.

  7. Regular updates

The library seems to be maintained. However, the last major update was almost a year ago.

To sum up, RL_Coach has a perfect up-to-date set of algorithms implemented. And it’s newcomer friendly. I would strongly recommend Coach.

TFAgents

TFAgents is a Python library designed to make implementing, deploying, and testing RL algorithms easier. It has a modular structure and provides well-tested components that can be easily modified and extended.

TFAgents is currently under active development, but even the current set of components makes it the most promising RL library.

To install TFAgents simply use a pip command:

pip install tf-agents
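
After installation, assembling an agent from TFAgents components looks like this. Here is a sketch condensed from the official DQN tutorial (the layer sizes and learning rate are illustrative):

import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.environments import suite_gym, tf_py_environment
from tf_agents.networks import q_network
from tf_agents.utils import common

# Wrap a Gym environment so it works with TensorFlow tensors.
env = tf_py_environment.TFPyEnvironment(suite_gym.load('CartPole-v0'))

q_net = q_network.QNetwork(
    env.observation_spec(), env.action_spec(), fc_layer_params=(100,))

agent = dqn_agent.DqnAgent(
    env.time_step_spec(), env.action_spec(),
    q_network=q_net,
    optimizer=tf.compat.v1.train.AdamOptimizer(learning_rate=1e-3),
    td_errors_loss_fn=common.element_wise_squared_loss)
agent.initialize()

# From here, the tutorial adds a replay buffer and drivers to collect data and train.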

Let’s see if TFAgents fits the criteria:

  1. Number of SOTA RL algorithms implemented

As of today, TFAgents has the following set of algorithms implemented:

  • Deep Q-Learning (DQN) and its improvements (Double)
  • Deep Deterministic Policy Gradient (DDPG)
  • TD3
  • REINFORCE
  • Proximal Policy Optimization (PPO)
  • Soft Actor Critic (SAC)

Overall, TFAgents has a great set of algorithms implemented.

  2. Official documentation, availability of tutorials and examples

TFAgents has a series of tutorials on each major component. Still, the official documentation seems incomplete; I would even say there is none. The tutorials and simple examples do their job, but the lack of well-written documentation is a major disadvantage.

  3. Readable code that is easy to customize

The code is full of comments and the implementations are very clean. TFAgents seems to have the best library code.

  4. Number of supported environments

The library is environment-agnostic. That is why it's easy to plug it into any environment.

  5. Logging and tracking tools support

Logging and tracking tools are supported.

  6. Vectorized environment feature

Vectorized environment is supported.

  7. Regular updates

As mentioned above, TFAgents is currently under active development. The last update was made just a couple of days ago.

To sum up, TFAgents is a very promising library. It already has all necessary tools to start working with it. I wonder what it will look like when the development is over.

Stable Baselines


Stable Baselines is a set of improved implementations of Reinforcement Learning (RL) algorithms based on OpenAI Baselines. OpenAI Baselines lacked a unified structure, code comments, and thorough documentation, which is why Stable Baselines was created as a cleaner, better-documented fork.

Stable Baselines features a unified structure for all algorithms, a visualization tool, and excellent documentation.

To install Stable Baselines simply use a pip command:

pip install stable-baselines

Still, you should check the official installation tutorial as a few prerequisites are required.
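
After that, training an agent takes only a few lines. Here is a minimal sketch based on the quickstart from the documentation (the TensorBoard log directory is a placeholder):

import gym
from stable_baselines import PPO2
from stable_baselines.common.vec_env import DummyVecEnv

# Stable Baselines expects a vectorized environment, even for a single copy.
env = DummyVecEnv([lambda: gym.make('CartPole-v1')])

model = PPO2('MlpPolicy', env, verbose=1, tensorboard_log='./ppo_cartpole_tb/')
model.learn(total_timesteps=10000)

obs = env.reset()
for _ in range(1000):
    action, _states = model.predict(obs)
    obs, rewards, dones, info = env.step(action)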

Let’s see if Stable Baselines fits the criteria:

  1. Number of SOTA RL algorithms implemented

As of today, Stable Baselines has the following set of algorithms implemented:

  • A2C
  • ACER
  • ACKTR
  • DDPG
  • DQN
  • HER
  • GAIL
  • PPO1 and PPO2
  • SAC
  • TD3
  • TRPO

Overall, Stable Baselines has a great set of algorithms implemented.

  2. Official documentation, availability of tutorials and examples

The documentation is complete and excellent. The set of tutorials and examples is also really helpful.

  3. Readable code that is easy to customize

On the other hand, modifying the code can be tricky. But because Stable Baselines provides a lot of useful comments in the code and awesome documentation, the modification process is less complex than it might be.

  4. Number of supported environments

Stable Baselines provides good documentation about how to plug in your custom environment; however, you need to wrap it in the OpenAI Gym interface.

  5. Logging and tracking tools support

Stable Baselines has TensorBoard support implemented.

  6. Vectorized environment feature

Vectorized environment feature is supported by a majority of the algorithms. Please check the documentation in case you want to learn more.
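
For example, swapping DummyVecEnv for SubprocVecEnv runs each environment copy in its own process (a sketch; four workers is an arbitrary choice):

import gym
from stable_baselines import PPO2
from stable_baselines.common.vec_env import SubprocVecEnv

# On some platforms, multiprocess code must run under `if __name__ == '__main__':`.
env = SubprocVecEnv([lambda: gym.make('CartPole-v1') for _ in range(4)])
model = PPO2('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)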

  7. Regular updates

The last major updates were made almost two years ago, but the library is maintained as the documentation is regularly updated.

To sum up, Stable Baselines is a library with a great set of algorithms and awesome documentation. You should consider using it as your RL tool.

MushroomRL

MushroomRL is a Python Reinforcement Learning library whose modularity allows you to use well-known Python libraries for tensor computation and RL benchmarks. 

It enables RL experiments by providing both classical and deep RL algorithms. The idea behind MushroomRL is to offer most RL algorithms through a common interface, so that you can run them without doing too much work.

To install MushroomRL simply use a pip command:

pip install mushroom_rl
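
After installation, a classical tabular experiment takes a handful of lines. Here is a sketch adapted from the MushroomRL README example (the grid size and parameter values are illustrative):

from mushroom_rl.algorithms.value import QLearning
from mushroom_rl.core import Core
from mushroom_rl.environments import GridWorld
from mushroom_rl.policy import EpsGreedy
from mushroom_rl.utils.parameters import Parameter

mdp = GridWorld(width=3, height=3, goal=(2, 2), start=(0, 0))

# Epsilon-greedy Q-learning on the grid world.
agent = QLearning(mdp.info, EpsGreedy(epsilon=Parameter(value=1.)),
                  learning_rate=Parameter(value=.6))

core = Core(agent, mdp)   # the Core couples the agent and the environment
core.learn(n_steps=10000, n_steps_per_fit=1)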

Let’s see if MushroomRL fits the criteria:

  1. Number of SOTA RL algorithms implemented

As of today, MushroomRL has the following set of algorithms implemented:

  • Q-Learning
  • SARSA
  • FQI
  • DQN
  • DDPG
  • SAC
  • TD3
  • TRPO
  • PPO

Overall, MushroomRL has everything you need to work on RL tasks.

  2. Official documentation, availability of tutorials and examples

The official documentation seems incomplete. It misses valuable tutorials, and simple examples leave much to be desired.

  3. Readable code that is easy to customize

The code lacks comments and parameter descriptions, which makes it really hard to customize. Then again, MushroomRL never positioned itself as a library that is easy to customize.

  4. Number of supported environments

MushroomRL supports the following environments:

  • OpenAI Gym
  • DeepMind Control Suite
  • MuJoCo

For more information, including installation and usage instructions, please refer to the official documentation.

  5. Logging and tracking tools support

MushroomRL supports various logging and tracking tools. I would recommend using TensorBoard as the most popular one.

  6. Vectorized environment feature

Vectorized environment feature is supported.

  7. Regular updates

The library is maintained. The last updates were made just a few weeks ago.

To sum up, MushroomRL has a good set of algorithms implemented. Still, it misses tutorials and examples which are crucial when you start to work with a new library.

Final thoughts

In this article, we figured out what to look for when choosing RL tools, which RL libraries are out there, and what features they have.

To my knowledge, the best publicly available libraries are Tensorforce, Stable Baselines, and RL_Coach. You should consider picking one of them as your RL tool. All of them can be considered up-to-date, have a great set of algorithms implemented, and provide valuable tutorials as well as complete documentation. If you want to experiment with different algorithms, you should use RL_Coach. For other tasks, please consider using either Stable Baselines or Tensorforce.

Hopefully, with this information, you will have no problems choosing the RL library for your next project.

Resources

  1. https://pypi.org/project/pyqlearning/
  2. https://github.com/keras-rl/keras-rl
  3. https://github.com/tensorforce/tensorforce
  4. https://mushroomrl.readthedocs.io/en/latest/
  5. https://github.com/NervanaSystems/coach
  6. https://github.com/tensorflow/agents
  7. https://github.com/hill-a/stable-baselines