The book “Deep Reinforcement Learning Hands-On” was published in June 2018 and got a warm welcome (56 ratings on Amazon with 4.3 out of 5 stars; the code repository on GitHub has 1.2K stars): https://www.amazon.com/Deep-Reinforcement-Learning-Hands-Q-networks/dp/1788834240
Half a year ago I started working on a second edition of the book, and finally, two weeks ago, it was published: https://www.amazon.com/Deep-Reinforcement-Learning-Hands-optimization/dp/1838826998
In this post, I’m going to give a quick overview of what has changed since the first edition.
There are two classes of changes:
Half a year has passed since my book “Deep Reinforcement Learning Hands-On” saw the light. It took me almost a year to write it, and after some rest from writing I discovered that explaining RL methods and turning theoretical papers into working code is a lot of fun for me, and I don’t want to stop. Luckily, the RL domain keeps evolving, so there are lots of topics to write about.
Almost a year ago I was contacted by the publisher Packt with a proposal to write a practical book about modern Deep Reinforcement Learning. For me, being just a self-educated Deep RL enthusiast, it was a bit of a scary decision, but after some hesitation I agreed, optimistically thinking, “it’s gonna be a fun experience”.
It took almost a year, and it was much more than that: not only lots of fun, but also lots of new knowledge about the field, tons of papers studied, and methods implemented and experimented with.
I’m not going to say it was a completely smooth experience, of course…
Modern Deep Learning is not possible without GPUs; even simple tutorials on the MNIST dataset show a 10- to 100-fold speed-up on a modern GPU versus a CPU. But how are all those teraflops used when you’re not optimizing anything?
As Bitcoin is skyrocketing, you may consider putting those idle resources to profitable use. In fact, it’s not that hard: what you need to do is set up a wallet, choose what to mine, set up miner software, and run it. …
The article from DeepMind addresses one fundamental problem of Reinforcement Learning: the exploration/exploitation dilemma. The issue arises from the fact that our agent needs to keep a balance between exploring the environment and using what it has learned from that exploration.
There are two “classic” approaches to this task:
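One of those classic approaches, ε-greedy action selection, fits in a few lines. This is a minimal self-contained sketch (function name and values are mine, not from the paper): with probability ε the agent explores by acting randomly, otherwise it exploits its current value estimates.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon take a random action (explore),
    otherwise pick the action with the highest Q-value (exploit)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# With epsilon=0 the agent always exploits the best-known action
print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0))  # -> 1
```

Typically ε starts near 1.0 and is annealed toward a small value as training progresses, shifting the balance from exploration to exploitation.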
The paper introduces an external memory, called “Neural Map”, suitable for Reinforcement Learning agents, especially in 2D or 3D environments. The map is at least 2D; the current agent location, which has to be provided, is used to perform local addressing. The map supports the following operations:
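To give a flavor of the local addressing, here is a toy sketch of reading and writing the feature vector at the agent’s current map coordinates. This is my own simplification for illustration: in the paper the write vector is produced by a learned network, and reads also include global and context (attention-based) variants.

```python
# Toy "neural map": an H x W grid, each cell holding a C-dim feature vector
C, H, W = 4, 8, 8
neural_map = [[[0.0] * C for _ in range(W)] for _ in range(H)]

def local_write(memory, xy, write_vector):
    """Store a feature vector at the agent's (x, y) cell.
    The paper computes write_vector with a learned network;
    here it is simply passed in."""
    x, y = xy
    memory[y][x] = list(write_vector)

def local_read(memory, xy):
    """Read back the feature vector at the agent's current location."""
    x, y = xy
    return memory[y][x]

local_write(neural_map, (3, 5), [1.0] * C)
print(local_read(neural_map, (3, 5)))  # [1.0, 1.0, 1.0, 1.0]
```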
For more than a year my “An article a week” activity has been abandoned, but now I’ve decided to return to it, with slightly looser time bounds. So, no weekly commitments; I’ll just publish summaries of the articles I read.
Topics: Natural Language Processing, Reinforcement Learning, Deep Learning, Computer Vision, Machine Learning
So, let’s get started.
The authors applied DRQN networks (DQN networks with RNNs to track state) to first-person shooters (Doom, using ViZDoom) and share their findings (DRQN article).
Interesting findings were:
Article link: http://arxiv.org/abs/1511.06342
The article describes an approach proposed by Toronto researchers to speed up reinforcement learning of “multitask models”.
Multitask models are used when we’re trying to teach one single NN to handle two (or more) different tasks. Examples include games where arcade levels are interleaved with levels of a different kind (fighting, for instance). In such situations, the value function and the skills that need to be developed are totally different. This problem can be handled by having two different pre-trained NNs switched appropriately during the game, but that approach needs third-party logic to switch games…
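The alternative the paper explores is to train the single multitask network to mimic each pre-trained expert via a policy-regression (cross-entropy) objective. A rough stdlib-only sketch, assuming a hand-rolled softmax and a temperature value of my choosing (the exact loss and hyperparameters are in the paper):

```python
import math

def softmax(xs, temperature=1.0):
    """Boltzmann distribution over a list of scores."""
    exps = [math.exp(x / temperature) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def mimic_loss(expert_q_values, student_logits):
    """Cross-entropy between the expert's Boltzmann policy (built from
    its Q-values) and the student's policy: minimizing it pushes the
    single multitask network to act like the expert on this task."""
    expert_policy = softmax(expert_q_values, temperature=0.1)
    student_policy = softmax(student_logits)
    return -sum(p * math.log(q)
                for p, q in zip(expert_policy, student_policy))

# The loss is lower when the student already agrees with the expert
agree = mimic_loss([5.0, 1.0, 1.0], [5.0, 1.0, 1.0])
disagree = mimic_loss([5.0, 1.0, 1.0], [1.0, 5.0, 1.0])
print(agree < disagree)  # True
```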
Original article: https://arxiv.org/abs/1604.00289
The article is quite large and mostly philosophical.
Deep NNs have made significant progress in AI, but they use a very different approach than people do.
A list of ideas that are missing in modern NN systems:
To demonstrate those ideas, the authors use two examples in which deep networks can achieve significantly successful results, but NNs take a totally different, much less effective approach to…
Inspired by https://github.com/shagunsodhani/papers-I-read, I’ll try to do the same: read a scientific paper I’m interested in and write a short review of it.
With this, I’m going to achieve:
I’m starting today.
List of articles: