RL algorithms that could potentially scale to real-world problems


Limitations of Online RL

Reinforcement learning has grown rapidly in the past few years, from tabular methods that can only solve simple toy problems to powerful algorithms that tackle incredibly complex problems such as playing Go, learning robotic manipulation skills or controlling autonomous vehicles. Unfortunately, adoption of RL for real-world applications has been somewhat slow. While current RL methods have proven their ability to find high-performing policies for challenging problems with high-dimensional raw observations (such as images), actually using them is often difficult or impractical. This is in stark contrast to supervised learning methods, which are highly prevalent in many fields of industry and research and are used with great success. …



Overfitting in Supervised Learning

Machine learning is a discipline in which, given some training data or environment, we would like to find a model that optimizes some objective, with the intent of performing well on data that the model has never seen during training. This is usually referred to as generalization: the ability to learn something that is useful beyond the specifics of the training environment.

For this to be possible, we usually require that the training data distribution be representative of the real data distribution on which we actually want to perform well. We split our data into training and test sets, and try to make sure that both sets represent the same distribution. …


Reviewing recent advances in model-based reinforcement learning.


Introduction

Deep reinforcement learning has gained much fame in recent years due to some amazing successes in video games such as Atari, in simulated robotic control environments such as MuJoCo, and in games such as Chess, Go and Poker. A distinct feature of most RL success stories is the use of simulated environments that enable highly efficient data generation through trial and error. …


Ideas from the literature on RL for real-world robot control


Robots — The Promise

Robots are pervasive throughout modern industry. Contrary to what most science-fiction works of the previous century imagined, humanoid robots are still not doing our dirty dishes and taking out the trash, nor are Schwarzenegger-looking terminators fighting on the battlefields (at least for now…). But in almost every manufacturing facility, robots are doing the kind of tedious and demanding work that human workers used to do just several decades ago. …


Learning strategies to tackle difficult optimization problems using Deep Reinforcement Learning and Graph Neural Networks.


Why is Optimization Important?

From as early as humankind’s beginning, millions of years ago, every innovation in technology and every invention that improved our lives and our ability to survive and thrive on earth, has been devised by the cunning minds of intelligent humans. From the fire to the wheel, and from electricity to quantum mechanics, our understanding of the world and the complexity of things around us have increased to the point that we often have difficulty grasping them intuitively.

Today, designers of airplanes, cars, ships, satellites, complex structures and many other endeavors rely heavily on the ability of algorithms to make them better, often in subtle ways that humans could simply never achieve. In addition to design, optimization plays a crucial role in everyday things such as network routing (Internet and mobile), logistics, advertising, social networks and even medicine. In the future, as our technology continues to improve and grow more complex, the ability to solve difficult problems of immense scale is likely to be in much higher demand, and will require breakthroughs in optimization algorithms. …


Inside AI

Exploiting relational inductive bias to improve generalization and control


Machine learning is helping to transform many fields across diverse industries, as anyone interested in technology undoubtedly knows. Fields like computer vision and natural language processing have changed dramatically in the past few years thanks to deep learning algorithms, and the effects of that change are seeping into our daily lives. One of the fields in which artificial intelligence is expected to make drastic changes is robotics. Decades ago, science fiction writers envisioned robots powered by artificial intelligence interacting with human society and either helping solve humanity’s problems or trying to destroy humankind. Our reality is far from it, and we understand today that creating intelligent robots is a harder challenge than was expected back in those days. …


Neural Networks that can learn to mimic planning algorithms

Reactive Policies

The first major achievement of deep reinforcement learning was the famous DQN algorithm’s human-level performance in various Atari video games, in which a neural network learned to play the game using the raw screen pixels as input. In reinforcement learning we wish to learn a policy that maps states to actions such that it maximizes the accumulated rewards. In the DQN paper, for example, the policy is represented by a Convolutional Neural Network that takes the screen image as input and outputs scores for the possible actions.
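As a rough sketch of what "maximizes the accumulated rewards" usually means, the objective can be written as an expected discounted return. The discount factor $\gamma$ and the symbols $G_t$, $r_t$, $Q(s,a)$ and $\pi$ are standard notation assumed here for illustration; they do not appear in the text above.

```latex
% Discounted return from time step t (gamma in [0, 1) is an assumed discount factor)
G_t = \sum_{k=0}^{\infty} \gamma^{k} \, r_{t+k+1}

% The policy we are looking for maximizes the expected return
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ G_t \right]

% With per-action scores Q(s, a), as in DQN, acting greedily means
a_t = \arg\max_{a} \; Q(s_t, a)
```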

While reinforcement learning algorithms are designed so that this policy should learn to pick actions that have a long-term benefit, the information we get from our policy applies to the current state only. This is called a reactive policy, which is a policy that maps the current state to the action that should be taken right now, or to a probability distribution over the actions.
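The idea of a reactive policy can be illustrated with a minimal sketch. It assumes we already have a list of per-action scores for the current state (for example, the outputs of a network like the one described above); the function names and the example scores are hypothetical and not taken from any particular library.

```python
import math
import random


def greedy_action(scores):
    """Deterministic reactive policy: index of the highest-scoring action."""
    return max(range(len(scores)), key=lambda a: scores[a])


def softmax_policy(scores, temperature=1.0):
    """Stochastic reactive policy: map scores to a probability distribution."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(probs, rng):
    """Draw one action index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical scores for three actions in the current state only
scores = [1.0, 2.5, 0.3]

best = greedy_action(scores)            # action to take right now
probs = softmax_policy(scores)          # distribution over actions
action = sample_action(probs, random.Random(0))
```

Both functions look only at the scores of the current state, which is what makes the policy reactive: nothing in the output describes what to do in later steps.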



Learning from Demonstration

Reinforcement learning holds a lot of promise for the future; recent achievements have shown its ability to solve problems at a superhuman level, such as playing board games, controlling robotic arms and playing real-time strategy games at a professional level. These achievements demonstrate the capability to discover new strategies that are superior to those we humans can devise, which is an exciting prospect.

Another possible use of RL is automating human decision making, in cases where human performance is good enough. In this setting we would like our agent to imitate the strategies employed by a human expert, which provides our agent with demonstrations of the “right” way to do the task. …


Sparse and Binary Rewards

Reinforcement learning has gained a lot of popularity in recent years due to some spectacular successes, such as defeating the Go world champion and (very recently) winning matches against top professionals in the popular real-time strategy game StarCraft 2. One of the impressive aspects of achievements such as that of AlphaZero (the latest Go-playing agent) is that it learns from sparse binary rewards: it either wins or loses the game. Having no intermediate rewards during the episodes makes learning extremely difficult in most cases, as the agent might never actually win, and would therefore have no feedback on how to improve its performance. Apparently, games such as Go and StarCraft 2 (at least the way it was played in the matches) have some unique qualities that make it possible to learn with these binary rewards: they are symmetric zero-sum games. …


Ever since the seminal DQN work by DeepMind in 2013, in which an agent successfully learned to play Atari games at a level higher than that of an average human, Reinforcement Learning (RL) has been making headlines frequently. From Atari games to robotics, and the amazing defeat of world Go champion Lee Sedol by AlphaGo, it seemed as though RL was about to take the world by storm.

In reality, while most Atari games can now be learned with very good results, on some of the games relatively little progress has been made until very recently. These games include the infamous Montezuma’s Revenge, Pitfall! …

Or Rivlin
