The book “Deep Reinforcement Learning Hands-On” was published in June 2018 and got a warm welcome (56 ratings on Amazon, 4.3 out of 5 stars; the code repository on GitHub has 1.2K stars):

Half a year ago I started working on a second edition of the book, and two weeks ago it was finally published:

In this post I’m going to give a quick overview of what has changed since the first edition.

There are two classes of changes:

  • Fixes for the mistakes found, software version…


Half a year has passed since my book “Deep Reinforcement Learning Hands-On” saw the light. It took me almost a year to write the book, and after some rest from writing I discovered that explaining RL methods and turning theoretical papers into working code is a lot of fun for me, and I don’t want to stop. Luckily, the RL domain is evolving, so there are lots of topics to write about.


In popular perception, Deep Reinforcement Learning is a tool used mostly for game playing. …


Almost a year ago I was contacted by the publisher Packt with a proposal to write a practical book about modern Deep Reinforcement Learning. For me, being just a self-educated Deep RL enthusiast, it was a bit of a scary decision, but after some hesitation I agreed, optimistically thinking “it’s gonna be a fun experience”.

It took almost a year and it was much more than that. Not only lots of fun, but lots of new knowledge about the field, tons of papers studied, methods implemented and experimented with.

I’m not going to say it was a completely smooth experience, of course…

Modern Deep Learning is not possible without GPUs; even simple tutorials on the MNIST dataset show a 10- to 100-fold speedup when running on a modern GPU versus a CPU. But how are all those teraflops used when you’re not optimizing anything?

As Bitcoin is skyrocketing, you may consider utilizing those idle resources for something profitable. In fact, it’s not that hard: all you need to do is set up a wallet, choose what to mine, set up miner software and run it. …

Original article

This article from DeepMind addresses one fundamental problem of Reinforcement Learning: the exploration/exploitation dilemma. The issue arises from the fact that our agent needs to keep a balance between exploring the environment and using what it has learned from this exploration.

There are two “classic” approaches to this task:

  1. entropy…

Article link

The paper introduces an external memory, called “Neural Map”, suitable for Reinforcement Learning agents, especially in 2D or 3D environments. The map is at least 2D; the agent’s current location is used for local addressing and has to be provided. The map supports the following operations:

  • context read: from the current state, the memory and the feature vector from the global read, it produces a query vector, which is convolved with every memory location to obtain weights (very similar to an attention mechanism). …
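
The context read described above can be sketched as an attention-style lookup over the map. This is only a minimal illustration, not the paper's implementation: the function name `context_read`, the use of a plain dot product as the score and a softmax to normalize the weights are assumptions, and the query vector is taken as given rather than computed from the state and the global read.

```python
import math


def context_read(memory, query):
    """Attention-style read over a 2D memory map (illustrative sketch).

    memory: H x W grid, each cell a list of C floats.
    query:  list of C floats (assumed to be produced elsewhere).
    Returns (weights, context): an H x W grid of weights summing to 1
    and the weighted sum of the cell vectors.
    """
    # Score every location by its dot product with the query.
    scores = [[sum(q * m for q, m in zip(query, cell)) for cell in row]
              for row in memory]
    # Softmax over all locations (subtract the max for numerical stability).
    top = max(max(row) for row in scores)
    exps = [[math.exp(s - top) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    weights = [[e / total for e in row] for row in exps]
    # Context vector: weighted sum of the memory cells.
    context = [0.0] * len(query)
    for w_row, m_row in zip(weights, memory):
        for w, cell in zip(w_row, m_row):
            for i, value in enumerate(cell):
                context[i] += w * value
    return weights, context


# Hypothetical 2 x 2 map with 2 features per cell.
memory = [[[1.0, 0.0], [0.0, 1.0]],
          [[1.0, 1.0], [0.0, 0.0]]]
weights, context = context_read(memory, query=[2.0, 0.0])
```

Cells whose features align with the query receive larger weights, so the returned context vector is dominated by the matching locations.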

For more than a year my “An article a week” activity has been abandoned, but now I have decided to return to it, with slightly looser time bounds. So, no weekly commitments: I’ll just publish summaries of the articles I read.

Topics: Natural Language Processing, Reinforcement Learning, Deep Learning, Computer Vision, Machine Learning

So, let’s get started.

Article, similar papers

The authors applied DRQN networks (DQN nets with RNNs to track state) to first-person shooters (Doom using ViZDoom) and shared their findings (DRQN article).

Interesting findings were:

Article link:

The article is about an approach proposed by Toronto researchers to speed up reinforcement learning of “multitask models”.

Multitask models are used when we’re trying to teach a single NN to handle two (or more) different tasks. Examples are games in which arcade levels alternate with levels of a different kind (fighting, for instance). In such situations, the value function and the skills that need to be developed are totally different. This problem can be handled by having two different pre-trained NNs switched appropriately during the game, but this approach needs third-party logic to switch games…

Original article:

The article is quite large and mostly philosophical.


Deep NNs have made significant progress in AI, but they use a very different approach than people do.

List of ideas which are missing in modern NN systems:

  • using physics and psychology as a basis of learning
  • using composition and learning how to learn (metalearning?) to generalize knowledge.

To demonstrate those ideas, the authors have used two examples in which deep networks can achieve very successful results, but NNs are using a totally different, much less effective approach to…

Being inspired by, I’ll try to do the same — read a scientific paper I’m interested in and write a short review of it.

With this, I’m going to achieve:

  • Practice in writing English
  • A record of papers with short gists of their ideas

I’m starting today

List of articles:

Max Lapan
